Opened 3 years ago

Closed 3 years ago

#1783 closed error (answered)

Previously working job now not creating start dump

Reported by: ee10hp
Owned by: ros
Component: UM Model
Keywords:
Cc:
Platform: ARCHER
UM Version: 7.3

Description

Dear CMS helpdesk,

Thanks in advance for your help.
In mid-December my job xlylv successfully ran on ARCHER. It is a v7.3 UM-UKCA NRUN with the extended nitrate version of GLOMAP-Mode.

Now, after the winter break, I have made a copy of this job (xmhha). However, the following error is returned in the xmhha .leave output:

 ERROR!!! in reconfiguration in routine Rcf_Files_Init
 Error Code:-  10
 Error Message:- Failed to Open Start Dump
 Error generated from processor  0

I can see that the start dump $DATAW/$RUNID.astart hasn't been created for job xmhha. Can you please advise me on what may have gone wrong here? I'm confused because xlylv worked before the break! The only other thing I've done recently has been to reset my SSH key between puma and ARCHER and that seems to be working fine.

Diff'ing these 2 jobs in the umui confirms that they are essentially identical:

No differences

Job xlylv Title cp -s 2 day test run -- but hourly diags to UPJ
Job xmhha Title cp xlylv base 2day run w hourly diags at sites
No differences

The comp.leave and .leave files are here:
/home/n02/n02/ee10hp/output/xmhha000.xmhha.d16004.t124845.comp.leave
/home/n02/n02/ee10hp/output/xmhha000.xmhha.d16004.t124845.leave

(Files for previously working xlylv are here:
/home/n02/n02/ee10hp/output/xlylv000.xlylv.d15350.t140945.comp.leave
/home/n02/n02/ee10hp/output/xlylv000.xlylv.d15350.t140945.leave)

Any advice or help with this would be much appreciated.

Many thanks,
Hannah

Change History (4)

comment:1 Changed 3 years ago by ros

  • Owner changed from um_support to ros
  • Status changed from new to accepted

Hi Hannah,

The start dump you are trying to use was in Karthee's directory: /work/n02/n02/karthee/um/xjnjj/xjnjj.astart (see Atmosphere → Ancil & input files → Start dump). That file is no longer there, so the reconfiguration cannot open it.

As your 2 jobs are identical you could use the start dump from your xlylv job.

Regards,
Ros.
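
A quick way to catch this before submitting is to confirm that the start dump named in the job still exists and is readable. The following is only a minimal sketch, not part of the UM or UMUI: the helper name check_start_dump is hypothetical, and the two paths are simply the ones quoted in this ticket.

```python
import os


def check_start_dump(path):
    """Return (ok, message) describing whether the file at `path` looks usable.

    Hypothetical helper: it only checks that the file exists, is readable
    and is not empty. It does not validate the dump contents.
    """
    if not os.path.isfile(path):
        return False, f"missing: {path}"
    if not os.access(path, os.R_OK):
        return False, f"not readable: {path}"
    if os.path.getsize(path) == 0:
        return False, f"empty: {path}"
    return True, f"ok: {path}"


if __name__ == "__main__":
    # Paths quoted in this ticket; substitute your own job's start dump.
    for dump in (
        "/work/n02/n02/karthee/um/xjnjj/xjnjj.astart",
        "/work/n02/n02/ee10hp/um/xlylv/xlylv.astart",
    ):
        ok, message = check_start_dump(dump)
        print(message)
```

Run it on ARCHER itself, since the /work paths are only visible there.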

comment:2 Changed 3 years ago by ee10hp

Hi Ros,

Thanks for helping me track down the cause of that error. To change the start dump, is it as simple as changing the file in Atmosphere → Ancil & input files → Start dump?

I ask as I changed the dump to /work/n02/n02/ee10hp/um/xlylv/xlylv.astart and now have the following error messages:

*********************************************************
UM Executable : /work/n02/n02/ee10hp/um/xmhha/bin/xmhha.exe
*********************************************************

diff: /work/n02/n02/ee10hp/tmp/tmp.mom4.6923/xmhha.xhist: No such file or directory
qsexecute: Copying /work/n02/n02/ee10hp/um/xmhha/xmhha.thist to backup thist file /work/n02/n02/ee10hp/um/xmhha/xmhha.thist_keep
=========================================================
xmhha: qsserver failure at Tue Jan 5 12:27:29 GMT 2016
=========================================================

and later

/work/n02/n02/ee10hp/um/xmhha/bin/qsfinal: Model xmhha - Error: No history files

The .leave file is: /home/n02/n02/ee10hp/output/xmhha000.xmhha.d16005.t105114.leave

Thanks again for your help,
Hannah

comment:3 Changed 3 years ago by ros

Hi Hannah,

Just changing the start dump as you have done is fine. The reconfiguration has run ok and your new reconfigured start dump has been created successfully.

For some reason the model run is looking for history files, which won't exist because you've not run this job before and it's an NRUN anyhow. I found that if you switch off post-processing (Post Processing → Main switch + General Questions), the job then ran the NRUN ok for me. If you need the post-processing, try switching it back on for the CRUN. I'm not sure why it's having problems with this now; vn7.3 does have some idiosyncrasies.

Regards,
Ros.
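
To find the relevant lines in a long .leave file, it can help to search for the messages quoted in this ticket. This is a rough sketch only: scan_leave and the SIGNATURES table are hypothetical names, the search strings are copied from the output above, and the hints are a summary of this thread rather than official UM diagnostics.

```python
# Search strings copied from the .leave output quoted in this ticket.
# The hints summarise what was found in this thread (an assumption for
# any other job, not a general diagnosis).
SIGNATURES = {
    "Failed to Open Start Dump": "check the start dump path set in the job",
    "qsserver failure": "model run step failed; read the surrounding lines",
    "Error: No history files": "seen here on an NRUN with post-processing on",
}


def scan_leave(text):
    """Return a list of (line_number, signature, hint) for each match in `text`."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for signature, hint in SIGNATURES.items():
            if signature in line:
                hits.append((number, signature, hint))
    return hits


if __name__ == "__main__":
    import sys

    # Usage: python scan_leave.py <path to a .leave file>
    if len(sys.argv) > 1:
        with open(sys.argv[1], errors="replace") as handle:
            for number, signature, hint in scan_leave(handle.read()):
                print(f"line {number}: {signature} ({hint})")
```

For example, pass it the path of the .leave file from the failed run and it prints the line number of each matching message.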

comment:4 Changed 3 years ago by ros

  • Resolution set to answered
  • Status changed from accepted to closed

Closed due to inactivity.
