#1435 answered weird glitch in submitting jobs to ARCHER annette swr04ojb

I'm finding today that when I submit jobs to ARCHER, the job will sometimes fail. It fails at a point where it is trying to ssh something to ARCHER, but exactly which point seems to vary. If, upon encountering the error, I do nothing other than close the information window (by hitting OK) and press submit again, then sometimes it fails with the same type of error but at a different point (e.g. doing an ssh for UMRECON rather than for UMSCRIPTS) and sometimes it goes through. Typically it doesn't take more than 3 attempts to work, so it's not crippling, but it is a bit weird. Is this something to do with my passwordless ssh setup, something on PUMA, something on ARCHER, or something else entirely?

For info I've been mainly submitting xini#f and xini#g today.
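
One way to narrow down whether the intermittent failures come from the ssh connection itself, rather than from the submission tool, is to run a trivial remote command several times in a row and count how often it fails. The following is only a minimal sketch under stated assumptions: the host string is a placeholder, the helper names `probe_ssh` and `summarise` are hypothetical, and it assumes the standard OpenSSH client is on the path.

```python
import subprocess


def probe_ssh(host, attempts=5, runner=subprocess.run):
    """Run a trivial remote command several times and collect the exit codes.

    `host` is a placeholder such as "user@archer-login" (hypothetical).
    `runner` is injectable so the function can be exercised without a network.
    """
    codes = []
    for _ in range(attempts):
        # BatchMode=yes makes ssh fail instead of prompting for a password,
        # so a broken passwordless setup shows up as a non-zero exit code.
        result = runner(
            ["ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=10", host, "true"],
            capture_output=True,
        )
        codes.append(result.returncode)
    return codes


def summarise(codes):
    """Return (successes, failures) for a list of exit codes."""
    failures = sum(1 for code in codes if code != 0)
    return len(codes) - failures, failures
```

If every probe succeeds while submissions still fail, the problem is less likely to be the passwordless setup; if some probes fail, the connection between the two machines is the more likely suspect.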

#1446 fixed Run failing, no error message annette webber24

Dear CMS,

The job xkyor, which I submitted last night and again this morning, has failed to complete both times, but when I look in pe_output on archer:/work/n02/n02/webber24/xkyor and in my .leave files there is no mention of the words "error" or "fail". Do you have any ideas as to why this could be failing?
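
A search for only "error" or "fail" can miss messages that use different wording or capitalisation. The sketch below scans output for a wider keyword list, ignoring case. The keyword list is a guess at what a scheduler or runtime might print, not a set of strings confirmed for this job, and the function names are hypothetical.

```python
import re

# Hypothetical keyword list: guesses only, not strings confirmed for this job.
DEFAULT_PATTERNS = (
    "error", "fail", "abort", "killed", "walltime", "exceeded", "segmentation",
)


def scan_lines(lines, patterns=DEFAULT_PATTERNS):
    """Return (line_number, text) pairs for lines matching any pattern, ignoring case."""
    regex = re.compile("|".join(re.escape(p) for p in patterns), re.IGNORECASE)
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if regex.search(line)
    ]


def scan_file(path, patterns=DEFAULT_PATTERNS):
    """Scan one output file; undecodable bytes are replaced rather than raising."""
    with open(path, errors="replace") as handle:
        return scan_lines(handle, patterns)
```

Calling `scan_file` on each .leave file and on the files under pe_output would list any line that matches, together with its line number.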



#1451 answered Model long run crashed annette fcentoni


I have been running two jobs (xkzra, xkzrb) without any problem for the past 2 days (run length 10 years). At some point in one of the last runs, the model crashed with this error (see output files xkzra004.xkzra.d15027.t220300.leave and xkzrb004.xkzrb.d15027.t220708.leave):

UM ERROR (Model aborting) :
 Routine generating error: U_MODEL
 Error code:  1
 Error message:
DUMPCTL : Fail to open output dump - may already exist

Could you help me fix this? There are only 2 years left to run.

Many thanks. Federico

