Opened 5 months ago

Closed 5 months ago

#3218 closed help (worksforme)

atmos_main running out of time

Reported by: dgalea
Owned by: um_support
Component: UM Model
Keywords:
Cc:
Platform: ARCHER
UM Version: 11.0

Description

Hi,

I am trying to run a copy of the training suite u-ba799 at N96 resolution on ARCHER, transferring the data to JASMIN by following the instructions at http://cms.ncas.ac.uk/wiki/Archer/Transition2020/PPTransfer. The model runs well with a cycling frequency of 1 hour and a run length of 1 day when postproc and pptransfer are turned off. When I turn postproc and pptransfer on and change the cycling frequency to one day, as instructed on page 33 of the training exercises, the atmos_main task fails because it runs out of time. I have tried wallclock times of 10 minutes, 20 minutes and 1 hour, but none of these work. Could you help me figure this out? My suite id is u-bs383.

Finally, I would eventually like to run the model with a cycling frequency of 6 hours. Will postproc and pptransfer work with that frequency?

Thanks.

Attachments (1)

postproc.txt (2.1 KB) - added by dgalea 5 months ago.
Error file of postproc

Change History (12)

comment:1 Changed 5 months ago by grenville

Please run
chmod -R g+rX
on both your /home and /work spaces on ARCHER.

Grenville
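
For readers following along, the permission change requested above can be sketched as a small shell helper. This is only an illustration: the function name open_to_group is hypothetical, and the actual locations of the /home and /work spaces depend on the user's account, so no real paths are filled in here. The g+rX mode gives the group read access everywhere, and the capital X adds execute/search permission only on directories and on files that are already executable.

```shell
# Hypothetical helper: recursively give the group read access to each
# directory passed as an argument. The capital X adds search permission
# on directories (and execute on files that are already executable),
# so ordinary data files do not become executable.
open_to_group() {
    chmod -R g+rX "$@"
}
```

It would be called with the two top-level directories as arguments, for example open_to_group "$HOME" followed by the path of the /work space, which is an assumption to be replaced with the real location.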

comment:2 Changed 5 months ago by dgalea

I have just changed the permissions as requested.

comment:3 Changed 5 months ago by grenville

Switching on postproc and data transfer has no bearing on the runtime for atmos_main. You should be able to use any cycling frequency - the run length should be a day or more for archiving to work.

Could you simply run rose suite-run --new? I see no reason why the model should not run in ~5 mins.

If that doesn't work, we can rethink

Grenville

Changed 5 months ago by dgalea

Error file of postproc

comment:4 Changed 5 months ago by dgalea

I have done that and changed the cycle time to PT1H and the wallclock time to PT10M, and atmos_main succeeded. However, the postproc app is now stuck retrying with the error shown in the attached postproc.txt, which is why I started off with a cycle time of a day. I'm running it again with a cycle time of P1D to see what happens.

comment:5 Changed 5 months ago by dgalea

As expected, atmos_main fails with a cycle time of P1D, a run time of P1D and a wallclock time of PT10M

comment:6 Changed 5 months ago by grenville

That's odd - we'll try it

comment:7 Changed 5 months ago by grenville

My copy of u-ba799 works OK irrespective of cycling frequency - I'm not sure it's worth spending time figuring out where your suite differs.

Why are you using a copy of u-ba799? That suite has been mangled for training. It is probably better to use a standard suite.

Grenville

comment:8 Changed 5 months ago by dgalea

I was using a copy of u-ba799 just because it had the option of running for a variety of resolutions with a fairly recent version of the UM. What standard suite do you recommend?

comment:10 Changed 5 months ago by dgalea

Great. I'll give that a go and reply back if I have a problem with it.

comment:11 Changed 5 months ago by dgalea

  • Resolution set to worksforme
  • Status changed from new to closed