#1414 fixed HadGEM2 failure annette charlie

My job that we discussed a few weeks ago has been running, but it has reached a certain point (February 1978, to be exact) and won't go any further. I have tried restarting from several of my start dumps, e.g. 1 February 1978 and 1 January 1978, but it always gets to the same point, then stops.

I don't understand what's gone wrong. I have looked at the .leave file (attached) and although I can see errors, I don't know which one is the important one.

One of the reasons I don't understand this problem is that I am currently running 4 jobs at once: xkmna-d. These correspond to 4 ensemble members of the same job, so they are absolutely identical apart from the initial start file used - they all start in 1971, but xkmna reconfigures the 1971 start file, xkmnb the 1972 start file, etc. The problem is only occurring with xkmna - all of the other jobs have got past February 1978 and are running fine. So given that they are all identical, why are the others working but not xkmna?

Further to my last message, I've just discovered my attachment is too large. You can find it at /home/n02/n02/cjrw09/um/umui_out/xkmna000.xkmna.d14335.t170822.leave
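One way to narrow down which error in a .leave file matters is to list the first few error-like lines, since later errors are often knock-on effects of the first. This is only a minimal sketch, assuming Python is available; the marker strings and the function name first_error_lines are my own guesses for illustration, not an official list of UM error messages.

```python
import re

# Assumed markers (hypothetical, not an official UM list of error strings).
ERROR_PATTERN = re.compile(r"error|abort|failed|no such file", re.IGNORECASE)


def first_error_lines(lines, limit=5):
    """Return up to `limit` (line_number, text) pairs that match ERROR_PATTERN.

    `lines` can be any iterable of strings, e.g. an open file handle.
    Line numbers are 1-based so they match what a text editor shows.
    """
    hits = []
    for number, text in enumerate(lines, start=1):
        if ERROR_PATTERN.search(text):
            hits.append((number, text.rstrip("\n")))
            if len(hits) == limit:
                break
    return hits
```

To use it on a real file, open the .leave file with open(path, errors="replace") and pass the handle to first_error_lines; the errors="replace" option avoids a crash if the log contains odd bytes.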

Thanks a lot,


#1418 answered Pre-industrial control version of HadGEM3-A? annette pliojop


I am looking for a pre-industrial HadGEM3-A simulation (ideally version 8.4) at N96 resolution with a start dump. I have searched the UMUI but couldn't find any such simulations, so I thought I would ask here in case someone knows of one.

I will be running on Archer, but wouldn't mind converting a run from another platform to run on Archer.

Many thanks


#1419 fixed Model crash relating to .xhist / .thist files annette James

I'm running UM-UKCA vn7.3 (JobID: xkpib) and the model's falling over with the following error message written to the .leave file.


sys-2 : UNRECOVERABLE error on system request

No such file or directory

Encountered during an OPEN of unit 12
Fortran unit 12 is not connected
_pmiu_daemon(SIGCHLD): [NID 00484] [c2-0c1s9n0] [Thu Dec 11 03:35:10 2014] PE RANK 0 exit signal Aborted
[NID 00484] 2014-12-11 03:35:10 Apid 12200682: initiated application termination
diff: /work/n02/n02/jgl22/tmp/tmp.mom4.9816/xkpib.xhist: No such file or directory
qsexecute: Copying /work/n02/n02/jgl22/um/xkpib/xkpib.thist to backup thist file /work/n02/n02/jgl22/um/xkpib/xkpib.thist_keep
xkpib: Run failed


It seems the model is compiling OK but falling over close to the start of the run, and I'm at a bit of a loss as to what's wrong.
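To see at a glance which files the log reports as missing, the "No such file or directory" lines can be pulled out with a short script. This is a rough sketch, assuming the messages follow the "path: No such file or directory" shape seen in the output above; the function name missing_paths is hypothetical.

```python
import re

# Assumes messages of the form "<path>: No such file or directory",
# as in the diff line of the .leave output quoted above.
MISSING = re.compile(r"(\S+): No such file or directory")


def missing_paths(log_text):
    """Return paths reported as missing, in order of first appearance, without duplicates."""
    seen = []
    for path in MISSING.findall(log_text):
        if path not in seen:
            seen.append(path)
    return seen
```

A bare "No such file or directory" line with no path in front of it (like the one straight after the sys-2 message) is skipped, so this only lists files the log actually names.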

Luke Abraham kindly took a look at the output with me and suggested I raise a ticket. I'd really appreciate your help.



