#2974 closed help (answered)

ACUMPS: Diagnostic error

Reported by: akpandeyjnu Owned by: um_support
Component: UKCA Keywords:
Cc: luke, mdalvi Platform: ARCHER
UM Version: 11.0



I am running the suite u-bj506 (nudged UKCA11.0 with transient). It was running along happily enough but then fell over producing the following error message in the job.error file:

??!!!???!!!???!!!???!!!???!!! ERROR ???!!!???!!!???!!!???!!!???!!!
? Error code: 4
? Error from routine: U_MODEL_4A
? Error message: ACUMPS: Diagnostic error. See output for item no.
? Error from processor: 0
? Error number: 40

The job.out file has the following error message:


ERROR: checksum failure in climate mean

The path of job.err and job.out file is:

I am unable to fix this error. Please help.

Regards, Alok

Change History (5)

comment:1 Changed 18 months ago by dcase


As a first guess, it looks as though the climate meaning can't be done because data is missing. Your error is from month 08 of 2015, but your logs show monthly runs up to month 07 of 2014. There is also a log for 2019 in another place.
It's possible that I'm missing things, but have you skipped a year? Have you been restarting manually, or did you just start at 1997 and this is the first problem that you've had?


comment:2 Changed 18 months ago by mdalvi

Hi Alok,

I am afraid this is a known issue when a run with climate meaning is restarted after failing to complete successfully (the checksum file written at the end of the run becomes corrupted).
Your log information (atmos_main/09) shows you have tried to resubmit this multiple times after an initial failure?
In this case the run might have to be started fresh, without Reconfiguration if this is in the middle of a long run.


comment:3 Changed 18 months ago by akpandeyjnu

Hi Dave, Mohit

Thanks for the responses.

@Dave: I have not skipped a year. I started from a dump file for the year 1997, ran it for a few months, and then manually restarted the run with the "rose suite-run --restart" command after changing the run length. I have manually restarted it several times in chunks of 1 or 2 years.

@Mohit: Thanks for the suggestions. This run (u-bj506) crashed in almost the final months of the long run, as emissions files are available only up to Dec 2015.

To start fresh, I have taken a dump file from Dec 2014 to reinitialise the run, and I plan to follow these steps:

  1. um -> namelist -> Model Input and output -> Dumping and Meaning -> astart (/work/n02/n02/alok/bj506a.da20141201_00)
  2. Suite conf -> Run Initialisation and Cycling -> Model basis time (20151201)
  3. Suite conf -> Tasks -> Run Reconfiguration (switch off)
  4. Suite conf -> Tasks -> Other run length (P15M)
  5. Save and then submit job via terminal: ‘rose suite-run’

Am I doing the right things, or do I have to do something else? Do I have to do ‘rose suite-clean’ before submitting the job?

Thank you

Regards, Alok

comment:4 Changed 18 months ago by mdalvi

Hi Alok,

1 astart (/work/n02/n02/alok/bj506a.da20141201_00)

I would recommend making this file read-only (using chmod), otherwise it will get overwritten in case you turn Reconfiguration On in this suite or its future copy.
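
A minimal sketch of the chmod step, assuming a POSIX shell. The helper name protect_dump is hypothetical (it is not part of Rose or the UM); it only removes write permission and lists the file so the result can be checked.

```shell
# protect_dump is a hypothetical helper name, not a Rose or UM command.
protect_dump() {
    dump="$1"
    # Refuse to continue if the path is not an existing regular file.
    if [ ! -f "$dump" ]; then
        echo "start dump not found: $dump" >&2
        return 1
    fi
    # Remove write permission for user, group and others.
    chmod a-w "$dump"
    # List the file so the new permissions can be checked by eye.
    ls -l "$dump"
}
```

For the file in step 1 this would be called as protect_dump /work/n02/n02/alok/bj506a.da20141201_00. Running chmod u+w on the file later restores write access for the owner if the dump ever needs to be replaced.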

3 Model basis time (20151201)

As you have noted, the emissions data is only available up to the end of 2015, so the model will not run beyond 10-15 days, unless you switch to using 'cyclic' emissions files like those in /work/y07/y07/umshared/cmip6/ancils/n96e/timeslice_2014.
(The official CMIP6 emission data is only available up to Dec 2014, and the data for year 2015 in the timeseries_1850_2014/ files is a repeat of 2014, to be able to run the model till 31-Dec-2014.)

rose suite-clean will also remove the compiled code, so the Build step will have to be re-run which is not really necessary. Manually deleting the ~/cylc-run/u-bj506/work and ~/cylc-run/u-bj506/share/data folders should also be sufficient.
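
The manual clean-up could be sketched as below, assuming a POSIX shell and that the suite is not running at the time. The helper name clean_run_dirs is hypothetical (it is not a Rose or cylc command); it deletes only the two folders named above and leaves the rest of the run directory in place.

```shell
# clean_run_dirs is a hypothetical helper name, not a Rose or cylc command.
clean_run_dirs() {
    run_dir="$1"
    # Refuse to continue without an existing suite run directory.
    if [ -z "$run_dir" ] || [ ! -d "$run_dir" ]; then
        echo "suite run directory not found: $run_dir" >&2
        return 1
    fi
    # Delete only work/ and share/data/; everything else is left untouched.
    rm -rf "$run_dir/work" "$run_dir/share/data"
}
```

For this suite the call would be clean_run_dirs "$HOME/cylc-run/u-bj506". The deletion is not reversible, so it is worth checking first that nothing still needed lives under those two folders.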

comment:5 Changed 17 months ago by grenville

  • Resolution set to answered
  • Status changed from new to closed

closed for inactivity
