Opened 3 months ago

Last modified 3 weeks ago

#2129 new help

UKCA vn8.4 job does not run further than 20 days

Reported by: dilshadshawki
Owned by: um_support
Priority: normal
Component: UM Model
Keywords: UKCA, UMUI, resubmission, walltime
Cc:
Platform: Monsoon2
UM Version: 8.4

Description

Hello helpdesk,

I am picking up an old error from ticket #1976.

I have a UKCA job, xlqqg, which I know worked on the old ibm02 machine. I managed to make it run on the Cray machine, changing only the diagnostics in STASH, but it runs no further than 20 days after I reduced the resubmission period to 10 days as instructed in ticket #1976.

When I run it as a normal run (NRUN), it works fine and outputs 10 days of data. When I then run it as a continuation run (CRUN), it outputs only 10 more days and stops at 20 days of output, reporting that the walltime was exceeded in the .leave file:

/home/d02/dshawk/output/xlqqg000.xlqqg.d17081.t154216.leave

There is very little information here.

Can anyone please help me with this issue?

Many thanks,
Dill

Change History (6)

comment:1 Changed 3 months ago by willie

Hi Dill,

The NRUN

/home/d02/dshawk/output/xlqqg000.xlqqg.d16272.t155641.leave

from Sep 28, 2016 did not complete and has lots of namelist errors. Am I looking at the right thing?

Regards
Willie

comment:2 Changed 6 weeks ago by willie

Hi Dill,

Is this still an issue?

Willie

comment:3 Changed 6 weeks ago by dilshadshawki

Hi Willie,

Yes, this is still an issue, and the latest .leave file is:

/home/d02/dshawk/output/xlqqg000.xlqqg.d17081.t154216.leave

I'm not sure if the original one was Sep 28th as I tried running this job many times after that date.

The output files are in

/projects/ukca-imp/dshawk/xlqqg

Please forgive me for the slow response. I would really appreciate your help on this.

Best wishes,
Dill
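
Editor's note: the two .leave files mentioned in this thread differ only in their date and time stamps, so it may help to decode them when checking which run is being discussed. The sketch below assumes the file name ends in `.dYYDDD.tHHMMSS.leave` (two-digit year, day of year, then time of day). That pattern is inferred from the comments above, where the `d16272` file is dated Sep 28, 2016; it is not taken from UM documentation. The function name `leave_timestamp` is hypothetical.

```python
import re
from datetime import datetime

# Assumed naming pattern (inferred from this ticket, not from UM docs):
#   <name>.dYYDDD.tHHMMSS.leave
LEAVE_RE = re.compile(r"\.d(\d{2})(\d{3})\.t(\d{6})\.leave$")


def leave_timestamp(path):
    """Return the datetime encoded in a .leave file name (hypothetical helper)."""
    match = LEAVE_RE.search(path)
    if match is None:
        raise ValueError(f"unrecognised .leave file name: {path}")
    yy, ddd, hhmmss = match.groups()
    # %y = two-digit year, %j = day of year, %H%M%S = time of day
    return datetime.strptime(f"{yy} {ddd} {hhmmss}", "%y %j %H%M%S")


if __name__ == "__main__":
    for name in (
        "/home/d02/dshawk/output/xlqqg000.xlqqg.d16272.t155641.leave",
        "/home/d02/dshawk/output/xlqqg000.xlqqg.d17081.t154216.leave",
    ):
        print(name, "->", leave_timestamp(name))
    # -> 2016-09-28 15:56:41
    # -> 2017-03-22 15:42:16
```

Under that assumption, the file Willie examined dates from September 2016, while the file reported in this ticket dates from March 2017.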

comment:4 Changed 3 weeks ago by willie

Hi Dill,

You have archiving switched on, but you've switched off Jeff's archiving branch: it is listed as a working copy but is not enabled. You need to put this entry in the main table, switch it on and rebuild the code.

Regards,
Willie

comment:5 Changed 3 weeks ago by willie

Hi Dill,

You also refer to the script in

$UMDIR/archiving/bin

This doesn't exist on Monsoon. What is it you're trying to do here?

Regards
Willie

comment:6 Changed 3 weeks ago by willie

Hi Dill,

The archiving instructions for Monsoon can be found at http://collab.metoffice.gov.uk/twiki/bin/view/Support/CrayUMInstall#Archiving

The scripts you need are in /common/moci/archiving/bin.

Sorry for the confusion

Regards
Willie
