Opened 2 months ago

Closed 2 months ago

#3326 closed error (answered)

Failure to restart due to tracer dump

Reported by: seg
Owned by: um_support
Component: NEMO/CICE
Keywords: Tracers
Cc:
Platform: NEXCS
UM Version: 10.7

Description

Hi

My job is failing due to the model having issues with the tracer restart dump.

My initial job has nrstdt (nn_rstctl) set to zero, and the first month runs fine. On restart, the model stops with an error.

Specific section from ocean.output:

  *** Info read in restart :
    previous time-step                               :  960
  *** restart option
  nrstdt = 2 : calendar parameters read in restart ===>>> : E R R O R
         ===========  ===>>>> : problem with nit000 for the restart
  verify the restart file or rerun with nrstdt = 0 (namelist)
  *** Info used values :
    date ndastp                                      :  20000130
    number of elapsed days since the begining of run :  30.

The code producing this error is:

            ! Control of date
            IF( nit000 - NINT( zkt ) /= 1 .AND. nrstdt /= 0 )                                         &
                 &   CALL ctl_stop( ' ===>>>> : problem with nit000 for the restart',                 &
                 &                  ' verify the restart file or rerun with nrstdt = 0 (namelist)' )


where zkt is 960. So nit000 - zkt != 1 (the error diagnostics do not seem to provide the value of nit000 … just to be annoying probably).

Elsewhere in the output file I have:

       number of the first time step   nn_it000   =  2881
       number of the last time step    nn_itend   =  3840
       initial calendar date aammjj    nn_date0   =  20000101
       leap year calendar (0/1)        nn_leapy   =  30

So, I assume from this nit000 is probably 3841?
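The check quoted above can be reproduced outside the model to see which values trip it. This is only a minimal Python sketch: the function names are hypothetical, and it assumes nit000 takes its value from nn_it000 in the namelist (which is what comment 4 below ends up pointing to).

```python
import math


def nint(x):
    """Round to the nearest integer, halves away from zero (like Fortran NINT)."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


def restart_check_fails(nit000, zkt, nrstdt):
    """Mirror of the quoted condition: True means ctl_stop would be called.

    nit000 : first time step of this run (assumed equal to nn_it000)
    zkt    : previous time step read from the restart file
    nrstdt : restart control option (0 skips the check)
    """
    return nit000 - nint(zkt) != 1 and nrstdt != 0


# Values from ocean.output: previous time-step 960, nn_it000 2881, nrstdt 2
print(restart_check_fails(2881, 960.0, 2))  # True: this is the reported error
print(restart_check_fails(961, 960.0, 2))   # False: consecutive steps pass
print(restart_check_fails(2881, 960.0, 0))  # False: nrstdt = 0 bypasses the check
```

With a restart written at step 960, the only first step that passes with nrstdt = 2 is 961.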

Change History (7)

comment:1 Changed 2 months ago by seg

Hmmmm. I thought this was a passive tracer issue (restart dump), but I switched them off. This is an issue with the main ocean restart dump.

comment:2 Changed 2 months ago by seg

And perhaps it is related to ticket #1166?

Is this guidance still valid?

https://www.ukca.ac.uk/wiki/index.php/Restarting_an_Atmos-Ocean_Integration#4._CICE_restart

My initial atmos, ocean and ice dumps are extracted using MOOSE, all files named as:

bg467a.da20000101_00
bg467i.restart.2000-01-01-00000.nc
bg467o_20000101_restart.nc
bg467o_20000101_restart_trc.nc
bg467o_icebergs_20000101_restart.nc

I'm initially telling the job not to use dates etc. from the NC files, but do I still have to modify them a bit (before use)?

comment:3 Changed 2 months ago by seg

(I realise that guidance is directed at UMUI jobs)

comment:4 Changed 2 months ago by seg

So. The problem is with the namrun namelist. The values defined for the second month are incorrect.

nn_it000=2881,
nn_itend=3840,

when it should be

nn_it000=961,
nn_itend=1920,
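The expected values follow from simple arithmetic on the step counts. The sketch below is illustrative only: the helper names are hypothetical, and the cycle length of 960 steps is inferred from the ranges quoted in this ticket (3840 - 2881 + 1 = 960), not read from the suite configuration.

```python
def next_cycle_steps(prev_last_step, steps_per_cycle):
    """Return (nn_it000, nn_itend) for the cycle that follows prev_last_step."""
    first = prev_last_step + 1
    last = prev_last_step + steps_per_cycle
    return first, last


def completed_cycles_implied(nn_it000, steps_per_cycle):
    """Number of whole cycles that a given nn_it000 assumes are already done."""
    return (nn_it000 - 1) // steps_per_cycle


# Restart file was written at step 960; each cycle is assumed to be 960 steps.
print(next_cycle_steps(960, 960))           # (961, 1920)
print(completed_cycles_implied(2881, 960))  # 3
```

Under these assumptions the faulty namelist was numbered as though three cycles had already completed, whereas the restart file records only one.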

comment:5 Changed 2 months ago by seg

namrun is found in namelist_cfg

comment:6 Changed 2 months ago by seg

I think I have sorted it out. Looks like the rose framework is a bit shit at keeping track of things.

comment:7 Changed 2 months ago by grenville

  • Resolution set to answered
  • Status changed from new to closed

Steve

see
https://code.metoffice.gov.uk/trac/moci/wiki/tips_CRgeneral

where there is a reference to restarting a run by copying the namelist and $HIST_FILE files from the previous completed cycle…

so, yes it's a bit painful restarting a coupled suite.

Grenville
