INCOMPASS suite fail

Reported by: amenon
Component: UM Model Keywords: Nesting suite
Cc: Platform: ARCHER
UM Version: 10.4


Hi Ros,

Problem 1

Reconfiguration jobs "INCOMPASS_k4p4_2_v2p1_qcons_um_recon" and "INCOMPASS_km4p4_v2p1_qcons_um_recon" are failing with the following error (I have attached a screenshot of the suite in cylc):

???!!!???!!!???!!!???!!!???!!!       ERROR        ???!!!???!!!???!!!???!!!???!!!
?  Error code: 10
?  Error from routine: Calc_nlookups
?  Error message: Ancillary files have not been found - Check output for details
?  Error from processor: 0
?  Error number: 0

When I check the job.out file, it shows that the following ancillary files are missing:

File No    6 /work/n02/n02/amenon/cylc-run/u-af584/share/cycle/20150601T0000Z/INCOMPASS/km4p4/v2p1_qcons/ics//ostia_seaice.anc
Ancillary File does not exist.
File : /work/n02/n02/amenon/cylc-run/u-af584/share/cycle/20150601T0000Z/INCOMPASS/km4p4/v2p1_qcons/ics//ostia_seaice.anc
Stashcode :    31
File No    7 /work/n02/n02/amenon/cylc-run/u-af584/share/cycle/20150601T0000Z/INCOMPASS/km4p4/v2p1_qcons/ics//ostia_sst.anc
Ancillary File does not exist.
File : /work/n02/n02/amenon/cylc-run/u-af584/share/cycle/20150601T0000Z/INCOMPASS/km4p4/v2p1_qcons/ics//ostia_sst.anc
Stashcode :    24

I checked those directories and these files don't exist. In the OSTIA directory in my ARCHER work folder (/work/n02/n02/amenon/suite/INC4P4/OSTIA) I see that the files are named "20150531_ostia_seaice.anc", "20150531_ostia_sst.anc", "20150601_ostia_seaice.anc" and "20150601_ostia_sst.anc".
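The comparison above can be sketched in Python. This is only a minimal illustration, not part of the suite: the function names `missing_ancils` and `dated_candidates` are hypothetical, and the parser assumes the report format shown in the excerpt, where a "File : <path>" line directly follows each "Ancillary File does not exist." line.

```python
import posixpath
import re

def missing_ancils(job_out_text):
    """Return the paths reported as missing in a recon job.out excerpt."""
    paths = []
    lines = job_out_text.splitlines()
    for i, line in enumerate(lines):
        if "Ancillary File does not exist" in line and i + 1 < len(lines):
            m = re.match(r"\s*File\s*:\s*(\S+)", lines[i + 1])
            if m:
                # normpath collapses the doubled slash in "ics//ostia_sst.anc"
                paths.append(posixpath.normpath(m.group(1)))
    return paths

def dated_candidates(missing_path, filenames):
    """Return names of the form <prefix>_<basename>, e.g. 20150601_ostia_sst.anc."""
    base = posixpath.basename(missing_path)
    return sorted(f for f in filenames if f.endswith("_" + base))
```

Passing the result of `os.listdir()` on the OSTIA directory as `filenames` would show which date-prefixed files correspond to each missing ancillary.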

Problem 2

Forecast jobs "INCOMPASS_k4p4_2_v2p1_qcons_um_fcst_000" and "INCOMPASS_km4p4_v2p1_qcons_um_fcst_000" fail with an error from routine io:buffin, as shown below:

???!!!???!!!???!!!???!!!???!!!       ERROR        ???!!!???!!!???!!!???!!!???!!!
?  Error code: 25
?  Error from routine: io:buffin
?  Error message: Error in buffin errorCode= 0.00 len=524288/937984
?  Error from processor: 0
?  Error number: 20
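For comparing the two failures side by side, the banner fields can be pulled into a dictionary. The following is a rough Python sketch under the assumption that every field line has the form "?  Key: value" as in the excerpts above; `parse_um_error` and `buffin_lengths` are hypothetical helper names. The ticket does not say what the two numbers after "len=" mean, so the sketch only extracts them.

```python
import re

def parse_um_error(banner_text):
    """Collect the '?  Key: value' lines of a UM error banner into a dict."""
    fields = {}
    for line in banner_text.splitlines():
        # The "???!!!" header line is skipped: it has no space after the first "?".
        m = re.match(r"\?\s+([^:]+):\s*(.*)", line)
        if m:
            fields[m.group(1).strip()] = m.group(2).strip()
    return fields

def buffin_lengths(message):
    """Return the two integers from 'len=A/B' in a buffin message, or None."""
    m = re.search(r"len=(\d+)/(\d+)", message)
    return (int(m.group(1)), int(m.group(2))) if m else None
```

For the message above, the two values are 524288 and 937984, so the lengths clearly disagree.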

Suite id: u-af584

Any thoughts on this?


Change History (6)

comment:1

  • Reporter changed from ros to amenon

comment:2

Update from Arathy:

Stu is back. I contacted him regarding the error. This was his reply:

So the error is actually in the task that should create the missing files, i.e. it's in the *_um_surf_ostia task.

The error is in the job.out file (on PUMA it's in file
p1_qcons_um_surf_ostia/01/job.out`) and reads

 "XALT Error: unable to find aprun" .

So, can you try the following?

(1) on PUMA edit your site/ncas-cray-xc30/suite-adds.rc file, deleting the 4 lines below the line [[SURF_OSTIA]] , i.e. these ones…



(2) I presume that your suite will have shut down by then, so run rose suite-run --restart and then retrigger one of the failed um_surf_ostia tasks.

(3) Let me know what that yields!
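Step (1) can be done by hand in an editor; purely as an illustration, the same edit can be sketched in Python. The helper name `drop_lines_after` is hypothetical, the four lines to delete are not reproduced in this ticket, and the sketch assumes the marker appears on a line of its own. It would be sensible to run it on a backup copy of suite-adds.rc first.

```python
def drop_lines_after(text, marker, count):
    """Remove `count` lines immediately following the first line whose
    stripped content equals `marker`; the marker line itself is kept."""
    lines = text.splitlines(keepends=True)
    for i, line in enumerate(lines):
        if line.strip() == marker:
            del lines[i + 1 : i + 1 + count]
            break
    return "".join(lines)
```

Called as `drop_lines_after(contents, "[[SURF_OSTIA]]", 4)`, it returns the file contents with the four lines below the section header removed; if the marker is not found, the text comes back unchanged.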

I did this, but ended up with the same error. Stu then suggested that I contact NCAS-CMS, as the problem might be more technical.


comment:3

Hi Arathy,

I found the offending aprun line in one of the SURF scripts and removed it. SURF then fails because I built it for the compute nodes rather than the serial nodes, as I didn't realise it was run there. I'm rebuilding it now and I'll let you know when it's good to try again.


comment:4

Hi Arathy,

After a few fights with ARCHER I think we now have a working SURF executable. At least, my copy of your INCOMPASS_km4p4_v2p1_qcons_um_surf_ostia and its sibling have both supposedly succeeded.

The directory for SURF has now changed as it's built for a different part of ARCHER with a different architecture, so you'll need to change the SURF source in


It's just a change in the name from ivybridge to x86_64.


I also had to increase a couple of wallclock times because they were exceeded, and I would recommend you do the same.


Changed from 00:10:00 to 00:20:00 under:

        -l walltime = 00:20:00

Changed from 01:00:00 to 02:00:00
        -l walltime = 02:00:00
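Both limits above were doubled. If other tasks need the same treatment, the arithmetic can be sketched in Python; `scale_walltime` is a hypothetical helper, not something provided by Rose or cylc, and it assumes the HH:MM:SS format used above.

```python
def scale_walltime(hhmmss, factor):
    """Multiply an HH:MM:SS walltime by `factor` and return it as HH:MM:SS."""
    h, m, s = (int(part) for part in hhmmss.split(":"))
    total = round((h * 3600 + m * 60 + s) * factor)
    h, rem = divmod(total, 3600)
    m, s = divmod(rem, 60)
    return f"{h:02d}:{m:02d}:{s:02d}"
```

With a factor of 2, "00:10:00" becomes "00:20:00" and "01:00:00" becomes "02:00:00", matching the two changes listed above.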


comment:5

Hi Ros,

Excellent! Thanks a lot. My suite was not running when I got your mail. I made all these changes, changed the wall clock time too and restarted the suite. The INCOMPASS_km4p4_v2p1_qcons_um_surf_ostia succeeded for me too. Thanks.


comment:6

  • Resolution set to fixed
  • Status changed from new to closed
