#3024 closed help (fixed)

wrong restart file in the middle of the run

Reported by: xd904476 Owned by: um_support
Component: UM Model Keywords:
Cc: Platform:
UM Version:

Description

Hi, I am running suite u-bl536 on Archer. Yesterday I hit my disk quota, so the coupled task at year 2071 failed because files could not be written to disk.
I now have more space, but when I restarted the suite something went wrong with the timestamps: there are some extra start dumps in the output directory, and I guess the model is picking up June as the start date for the ice and the ocean but January for the atmosphere:
The UM restart data does not match the current cycle time
. Cycle time is 20710101
UM restart time is 20710601

[WARN] The NEMO restart data does not match the current cycle time
. Cycle time is 20710101
NEMO restart time is 20710601

[WARN] Automatically removing NEMO dumps ahead of the current cycletime, and pick up the dump at this time

[WARN ]The CICE restart data does not match the current cycle time
. Cycle time is 20710101
CICE restart time is 20710601
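
As a rough illustration of what these warnings report, the sketch below pulls the cycle time and each component's restart time out of a log excerpt and lists the components that disagree. It is not part of the suite or the drivers: the function name `restart_mismatches` is hypothetical, and the code assumes the excerpt contains a single cycle time and messages worded as above.

```python
import re


def restart_mismatches(log_text):
    """Return {component: restart_date} for components whose restart
    date differs from the cycle time found in the log excerpt.

    Assumption: the excerpt refers to one cycle time only, and the
    messages look like 'Cycle time is YYYYMMDD' and
    '<NAME> restart time is YYYYMMDD'.
    """
    cycle = re.search(r"Cycle time is (\d{8})", log_text)
    if cycle is None:
        return {}
    cycle_date = cycle.group(1)
    found = re.findall(r"(\w+) restart time is (\d{8})", log_text)
    return {name: date for name, date in found if date != cycle_date}


sample = """The UM restart data does not match the current cycle time
. Cycle time is 20710101
UM restart time is 20710601
[WARN] The NEMO restart data does not match the current cycle time
. Cycle time is 20710101
NEMO restart time is 20710601
"""
print(restart_mismatches(sample))
```
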

If I hold the suite, could I manually delete the extra start dumps for 2071 on Archer, restart the suite in January, and then retrigger the coupled task? Or should I kill the suite and restart it with a new initial start dump?

thanks
dani

Change History (7)

comment:1 Changed 13 months ago by grenville

Dani

Possibly. It sounds like the same problem as in ticket #3020?

Grenville

comment:2 Changed 13 months ago by xd904476

Hi Grenville, it does.
I'll try pointing to the right restart files and hope this won't mess up the control run.
thanks

comment:3 Changed 13 months ago by xd904476

Hi Grenville,
I went through the instructions, but I'd like to make sure I'm following them properly before making unrecoverable mistakes with the timestamps.

I would like my run to be set back to 20710101T0000Z.

Following the instructions in the Met Office link in #3020, I see (instructions are in italic):

Make a note of the failing cycletime.

Cycle time is 20710101
UM restart time is 20710601

Note the UM checkpoint dump in the file $DATAM/<runid>.xhist matches the current cycletime - eg:

&NLCHISTG CHECKPOINT_DUMP_IM = '/home/d00/haden/cylc-run/u-ae085_test/share/data/History_Data/gc3aoa.da19790301_00

for me this is
&NLCHISTG CHECKPOINT_DUMP_IM = '/work/n02/n02/dflocco/cylc-run/u-bl536/share/data/History_Data/bl536a.da20711201_00

If the atmosphere is in advance of the cycletime then you will need to source an xhist file from the previous cycle. Previous history states can be found in $CYLC_TASK_WORK_DIR/<previous cycle>/coupled/history_archive/temp_hist.. Most likely the most suitable file will be the latest file in that directory. Ensure that you have a temp_hist file in which the datestamps for CHECKPOINT_DUMP_IM and the current cycletime match. Copy this file to $DATAM/<runid>.xhist

Do I actually need to copy any u-bl536.hist file, or can I simply edit the existing entry from

&NLCHISTG CHECKPOINT_DUMP_IM = '/work/n02/n02/dflocco/cylc-run/u-bl536/share/data/History_Data/bl536a.da20711201_00

to

&NLCHISTG CHECKPOINT_DUMP_IM = '/work/n02/n02/dflocco/cylc-run/u-bl536/share/data/History_Data/bl536a.da20710101_00 ?
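
For reference, the history-file step quoted above (find a temp_hist file whose CHECKPOINT_DUMP_IM datestamp matches the cycle time) could be sketched roughly as below. This is only an illustration under stated assumptions: the helper names are hypothetical, the glob pattern `temp_hist*` is a guess at how the archived files are named, and "latest" is taken to mean most recently modified.

```python
import re
from pathlib import Path

# Matches the 8-digit datestamp in e.g. '.../bl536a.da20711201_00'
DUMP_RE = re.compile(r"CHECKPOINT_DUMP_IM\s*=\s*'[^'\n]*\.da(\d{8})_\d{2}")


def checkpoint_datestamp(hist_text):
    """Return the YYYYMMDD datestamp of CHECKPOINT_DUMP_IM, or None."""
    match = DUMP_RE.search(hist_text)
    return match.group(1) if match else None


def find_matching_hist(archive_dir, cycle, pattern="temp_hist*"):
    """Return the newest archived history file whose checkpoint dump
    datestamp equals `cycle` (YYYYMMDD), or None if there is none.

    Assumption: archived files match `pattern` and are plain text.
    """
    candidates = sorted(
        Path(archive_dir).glob(pattern),
        key=lambda p: p.stat().st_mtime,
        reverse=True,  # most recently modified first
    )
    for path in candidates:
        if checkpoint_datestamp(path.read_text()) == cycle:
            return path
    return None
```

The file returned by `find_matching_hist` would then be the candidate to copy to $DATAM/<runid>.xhist, after checking its contents by eye.
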

Note the latest datestamp available for the NEMO restart file(s) - often found in $DATAM/NEMOhist - and compare with current cycletime. eg: ls $DATAM/NEMOhist/*restart* which produces a listing where the last file is NEMOhist/gc3aoo_icebergs_19790301_restart_0467.nc.

If the latest NEMO datestamp is in advance of the current cycletime, then simply remove (rename, or move elsewhere) NEMO restart files with later datestamps.

The latest iceberg restart file for me is
/work/n02/n02/dflocco/cylc-run/u-bl536/share/data/History_Data/NEMOhist/bl536o_icebergs_20711201_restart_0071.nc

The oldest in the same directory is bl536o_icebergs_20701101_restart_0000.nc

Shall I delete them all, since I want to restart in January 2071?
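
One cautious way to apply the NEMO instruction quoted above is to list the restart files dated after the cycle time and move them elsewhere rather than delete them. The sketch below is a minimal illustration, not an official tool: the function names and the holding directory are hypothetical, and it assumes the datestamp appears in the file name as `_YYYYMMDD_restart`, as in the file names shown here.

```python
import re
import shutil
from pathlib import Path

# Datestamp as it appears in e.g. 'bl536o_icebergs_20711201_restart_0071.nc'
STAMP_RE = re.compile(r"_(\d{8})_restart")


def nemo_files_ahead(nemohist_dir, cycle):
    """Return restart files whose datestamp is later than `cycle` (YYYYMMDD)."""
    ahead = []
    for path in sorted(Path(nemohist_dir).glob("*restart*")):
        match = STAMP_RE.search(path.name)
        # YYYYMMDD strings compare correctly as plain strings
        if match and match.group(1) > cycle:
            ahead.append(path)
    return ahead


def move_aside(files, holding_dir, dry_run=True):
    """Move files into `holding_dir`; with dry_run=True only report them."""
    holding = Path(holding_dir)
    if not dry_run:
        holding.mkdir(parents=True, exist_ok=True)
    for path in files:
        print(("would move " if dry_run else "moving ") + str(path))
        if not dry_run:
            shutil.move(str(path), str(holding / path.name))
```

Running `move_aside(nemo_files_ahead(nemohist, "20710101"), holding)` with the default `dry_run=True` only prints what would be moved, so the list can be checked before anything changes on disk.
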

Ensure that the CICE restart file datestamp matches the current cycletime. Usually the restart file is a string which can be edited in the ${CICEDATA}/ice.restart_file pointer file. For the above example, the restart file should be: /home/d00/haden/cylc-run/u-ae085_test/share/data/History_Data/CICEhist/gc3aoi.restart.1979-03-01-00000.nc

I edited the ice.restart_file pointer file so that it points to
/work/n02/n02/dflocco/cylc-run/u-bl536/share/data/History_Data/CICEhist/bl536i.restart.2071-01-01-00000.nc
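
A quick sanity check on the edited pointer can be sketched as below: extract the date from the restart file name and compare it with the cycle time. The function names are hypothetical, and the code assumes the file name carries the date as `.restart.YYYY-MM-DD-SSSSS`, as in the paths shown in this ticket.

```python
import re

# Date as it appears in e.g. 'bl536i.restart.2071-01-01-00000.nc'
CICE_RE = re.compile(r"\.restart\.(\d{4})-(\d{2})-(\d{2})-\d{5}")


def cice_pointer_date(pointer_text):
    """Return the restart date in the pointer text as YYYYMMDD, or None."""
    match = CICE_RE.search(pointer_text)
    return "".join(match.groups()) if match else None


def cice_pointer_matches(pointer_text, cycle):
    """True if the pointer's restart date equals `cycle` (YYYYMMDD)."""
    return cice_pointer_date(pointer_text) == cycle


pointer = ("/work/n02/n02/dflocco/cylc-run/u-bl536/share/data/"
           "History_Data/CICEhist/bl536i.restart.2071-01-01-00000.nc")
print(cice_pointer_matches(pointer, "20710101"))
```
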

Re-trigger the task.

Hopefully soon!
thanks

comment:4 Changed 13 months ago by grenville

Dani

Best to follow the instructions, so copy the appropriate history file.
Delete the NEMO restart files with 20711201 timestamps.

Grenville

comment:5 Changed 12 months ago by xd904476

  • Resolution set to fixed
  • Status changed from new to closed

comment:6 Changed 12 months ago by xd904476

  • Resolution fixed deleted
  • Status changed from closed to reopened

comment:7 Changed 12 months ago by xd904476

  • Resolution set to fixed
  • Status changed from reopened to closed