#1799 completed 8.2 UKV reconfiguration and run jobs on MONSooN (xkyib and c) ros fd001150

I'm having a couple of problems running my 8.2 UKV reconfiguration and run jobs on MONSooN (xkyib and c).

When I submit I have the following message:

./REMCOMMS.36620: line 18: 5: Bad file descriptor
./REMCOMMS.36620: line 19: 5: Bad file descriptor

I get round this by manually submitting stage_1_submit. Is there another way round it?

Also, when I run xkyib I get the following error message (note that my reconfiguration and build steps run fine):

? Error in routine: UM_SETUP
? Error Code: 2
? Error Message: READHIST: Read ERROR on history file for namelist NLCHISTG
? Error generated from processor: 0
? This run generated 0 warnings

The full .leave file is located at /home/dflack/output/xkyib000.xkyib.d16032.t151629.leave

Any ideas on how to fix this?
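When a .leave file is long, it can help to pull out just the error fields before digging further. Here is a minimal sketch in Python, assuming each field is introduced by a "?" marker and a label exactly as in the excerpt in this ticket; the function name `parse_um_error` is hypothetical and the label list may need adjusting for other UM versions.

```python
import re

# Labels copied from the error excerpt in this ticket (an assumption for other runs).
FIELDS = (
    "Error in routine",
    "Error Code",
    "Error Message",
    "Error generated from processor",
)

def parse_um_error(text):
    """Return a dict of the '? Label: value' fields found in leave-file text."""
    result = {}
    for label in FIELDS:
        # A value runs up to the next '?' marker or the end of the line.
        match = re.search(r"\?\s*" + re.escape(label) + r":\s*([^?\n]*)", text)
        if match:
            result[label] = match.group(1).strip()
    return result
```

To use it, read the .leave file into a string and pass that string to `parse_um_error`; an empty dict means none of the labels were found.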

#1587 answered 8.2 dump in 7.5 model reconfiguration job um_support chollow


I'm testing a 7.5 reconfiguration job (for a limited-area 12-km model) for Fadzil to see if it will work for him. However, it looks like the UM start dump I am using as my input file (which was reconfigured from an ECMWF grib2 file with a special set of scripts) is version 8.2 (I assume this from the rcf.leave file; I've copied some output below). This may be why my reconfiguration is failing, although there is a more specific complaint in the leave file.

The job is xlned on ARCHER

The output file is: /home/n02/n02/chollow/output/xlned000.xlned.d15159.t182008.rcf.leave

The error is:

Rank 138 [Mon Jun 8 20:33:55 2015] [c3-1c0s13n3] application called MPI_Abort(MPI_COMM_WORLD, 9) - process 138
Rank 120 [Mon Jun 8 20:33:55 2015] [c3-1c0s13n3] application called MPI_Abort(MPI_COMM_WORLD, 9) - process 120
Error Code:- 2
Error Message:- Cant find required STASH item 376 section 0 model 1 in STASHmaster
Error generated from processor 0

I looked up this stash item (in an 8.2 job) and it is: SNOW DEPTH ON GROUND IN TILES (M)

There are a few related items, 377-386, which also concern snow and which I don't think are in version 7.5 (or at least not all of them).

Is there a way around this? Or is it never advisable to have your input dump be from a later model version?

Thanks, Chris
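One way to check which of items 376-386 a given version knows about is to scan its STASHmaster file. The sketch below is in Python and rests on an assumption about the file layout: that each item starts with a record of the form `1| model | section | item | name |`. The function names are hypothetical, and the layout should be checked against the actual STASHmaster file before relying on the result.

```python
def stash_items_present(stashmaster_text, model, section):
    """Collect item numbers for one model/section from STASHmaster-style text."""
    items = set()
    for line in stashmaster_text.splitlines():
        parts = [p.strip() for p in line.split("|")]
        # Assumed layout of the first record of each item:
        # 1| model | section | item | name |
        if len(parts) >= 5 and parts[0] == "1":
            try:
                m, s, i = int(parts[1]), int(parts[2]), int(parts[3])
            except ValueError:
                continue  # not a numeric record, skip it
            if m == model and s == section:
                items.add(i)
    return items

def missing_items(stashmaster_text, wanted, model=1, section=0):
    """Return the wanted item numbers that the STASHmaster text does not define."""
    present = stash_items_present(stashmaster_text, model, section)
    return sorted(set(wanted) - present)
```

Calling `missing_items(text, range(376, 387))` on the contents of the 7.5 STASHmaster file would list whichever of the snow items are absent for model 1, section 0.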

#1443 fixed 8.4 Run Crash Problem grenville pliojop


I have a version 8.4 job (xkmwb) running on Archer. It is a copy of Grenville's Archer test job, xgwtq (HadGEM3 atmosphere only with GA4). At present it runs for about 5 hours (about 6 months' worth of model time) before crashing with the error:

UM Executable : /work/n02/n02/japope/um/xkmwb/bin/xkmwb.exe *

mkdir:: File exists
[NID 04011] 2015-01-19 16:37:25 Apid 12632862: initiated application termination
[NID 04011] 2015-01-19 16:37:28 Apid 12632862: OOM killer terminated this process.
xkmwb: Run failed

I had already encountered this error and had changed the job from running on 6 NS processors to 12, based on a review of other tickets with OOM errors. However, the error persisted.

There is another error in the file which may be linked to this, namely the recurrence of:


Conservation enforcement failed Run continuing using best estimate


non-conservation for field 4

in the leave file. I'm not sure whether they are linked in any way.
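To get a feel for how often the conservation message recurs before the crash, a small script can count it and confirm whether the OOM killer line is present. This is a minimal sketch in Python, assuming the two message strings appear in the leave file exactly as quoted in this ticket; the function name `summarise_leave` is hypothetical.

```python
# Message strings copied from the excerpts quoted in this ticket.
CONSERVATION_MSG = "Conservation enforcement failed"
OOM_MSG = "OOM killer terminated this process"

def summarise_leave(text):
    """Count conservation warnings and report whether the OOM killer fired."""
    return {
        "conservation_failures": text.count(CONSERVATION_MSG),
        "oom_killed": OOM_MSG in text,
    }
```

Running it on the leave files from several runs would show whether the number of conservation warnings tracks the crashes, though a count alone does not establish that the two are linked.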

The leave file for this run is




