Opened 5 years ago
Closed 5 years ago
#1961 closed help (fixed)
INCOMPASS suite fail
| Reported by: | amenon | Owned by: | um_support |
|---|---|---|---|
| Component: | UM Model | Keywords: | Nesting suite |
| Cc: | | Platform: | ARCHER |
| UM Version: | 10.4 | | |
Description
Hi Ros,
Problem 1
Reconfiguration jobs "INCOMPASS_k4p4_2_v2p1_qcons_um_recon" and "INCOMPASS_km4p4_v2p1_qcons_um_recon" are failing with the following error (I have attached a screenshot of the cylc suite):
???!!!???!!!???!!!???!!!???!!! ERROR ???!!!???!!!???!!!???!!!???!!!
? Error code: 10
? Error from routine: Calc_nlookups
? Error message: Ancillary files have not been found - Check output for details
? Error from processor: 0
? Error number: 0
When I check the job.out file, it shows that the following ancillary files are missing:
File No 6
/work/n02/n02/amenon/cylc-run/u-af584/share/cycle/20150601T0000Z/INCOMPASS/km4p4/v2p1_qcons/ics//ostia_seaice.anc
Ancillary File does not exist.
File : /work/n02/n02/amenon/cylc-run/u-af584/share/cycle/20150601T0000Z/INCOMPASS/km4p4/v2p1_qcons/ics//ostia_seaice.anc
Stashcode : 31
File No 7
/work/n02/n02/amenon/cylc-run/u-af584/share/cycle/20150601T0000Z/INCOMPASS/km4p4/v2p1_qcons/ics//ostia_sst.anc
Ancillary File does not exist.
File : /work/n02/n02/amenon/cylc-run/u-af584/share/cycle/20150601T0000Z/INCOMPASS/km4p4/v2p1_qcons/ics//ostia_sst.anc
Stashcode : 24
I checked those directories and the files do not exist there. In the OSTIA directory in my ARCHER work folder (/work/n02/n02/amenon/suite/INC4P4/OSTIA), the files are instead named with a date prefix: "20150531_ostia_seaice.anc", "20150531_ostia_sst.anc", "20150601_ostia_seaice.anc" and "20150601_ostia_sst.anc".
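The manual check described above can be sketched in Python. This is only an illustration of comparing the names the reconfiguration expects with the date-prefixed names in the OSTIA directory; the function names `missing_ancils` and `dated_candidates` are hypothetical and are not part of the suite or the UM.

```python
from pathlib import Path


def missing_ancils(ics_dir, expected):
    """Return the expected file names that are absent from ics_dir."""
    ics = Path(ics_dir)
    return [name for name in expected if not (ics / name).is_file()]


def dated_candidates(source_dir, name):
    """List files in source_dir named <prefix>_<name>, e.g. 20150601_ostia_sst.anc."""
    return sorted(p.name for p in Path(source_dir).glob(f"*_{name}"))


if __name__ == "__main__":
    # Paths taken from the ticket; adjust for your own suite.
    ics = ("/work/n02/n02/amenon/cylc-run/u-af584/share/cycle/"
           "20150601T0000Z/INCOMPASS/km4p4/v2p1_qcons/ics")
    ostia = "/work/n02/n02/amenon/suite/INC4P4/OSTIA"
    for name in missing_ancils(ics, ["ostia_seaice.anc", "ostia_sst.anc"]):
        print(name, "missing; dated copies:", dated_candidates(ostia, name))
```

This only reports the mismatch. It does not create the missing files, which is the job of the suite's own tasks.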
Problem 2
Forecast jobs "INCOMPASS_k4p4_2_v2p1_qcons_um_fcst_000" and "INCOMPASS_km4p4_v2p1_qcons_um_fcst_000" fail with an error from routine io:buffin, as shown below:
???!!!???!!!???!!!???!!!???!!! ERROR ???!!!???!!!???!!!???!!!???!!!
? Error code: 25
? Error from routine: io:buffin
? Error message: Error in buffin errorCode= 0.00 len=524288/937984
? Error from processor: 0
? Error number: 20
Suite id: u-af584
Any thoughts on this?
Cheers,
Arathy
Change History (6)
comment:1 Changed 5 years ago by ros
- Reporter changed from ros to amenon
comment:2 Changed 5 years ago by ros
comment:3 Changed 5 years ago by ros
Hi Arathy,
I found the offending aprun line in one of the SURF scripts and removed it. SURF then failed because I had built it for the compute nodes rather than the serial nodes, not realising that is where it runs. I'm rebuilding it now and will let you know when it's good to try again.
Cheers,
Ros.
comment:4 Changed 5 years ago by ros
Hi Arathy,
After a few fights with ARCHER, I think we now have a working SURF executable. At least, my copies of your INCOMPASS_km4p4_v2p1_qcons_um_surf_ostia task and its sibling have both supposedly succeeded.
The directory for SURF has now changed as it's built for a different part of ARCHER with a different architecture, so you'll need to change the SURF source in
~/roses/u-af584/app/install_cold/opt/rose-app-ncas-cray-xc30.conf
It's just a change in the name from ivybridge to x86_64.
[file:$ROSE_SUITE_DIR/share/fcm_make_surf]
mode=symlink
source=/work/n02/n02/ros/SURF/SURF31.2.0/share/fcm_make_surf_xc30_x86_64_ifort_opt
I also had to increase a couple of wallclock times because they were being exceeded, and I recommend you do the same in:
~/roses/u-af584/site/ncas-cray-xc30/suite-adds.rc
Changed from 00:10:00 to 00:20:00 under:
[[HOST_HPC]]
....
-l walltime = 00:20:00
Changed from 01:00:00 to 02:00:00 under:
[[BUILD_HPC]]
....
-l walltime = 02:00:00
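To confirm that the new limits are in place after editing, a short Python sketch along these lines could list the walltime found under each section. It assumes each family appears as a `[[NAME]]` header with a `walltime = HH:MM:SS` line somewhere beneath it. The function name `walltimes` is hypothetical, and the `[[[directives]]]` lines in the sample are an assumed layout standing in for the part elided as "...." above.

```python
import re

# Matches a two-bracket family header such as [[HOST_HPC]], but not
# three-bracket subsections such as [[[directives]]].
SECTION = re.compile(r"^\s*\[\[([^\[\]]+)\]\]\s*$")
WALLTIME = re.compile(r"walltime\s*=\s*(\d{2}:\d{2}:\d{2})")


def walltimes(text):
    """Map each [[SECTION]] name to the walltime found beneath it."""
    found = {}
    current = None
    for line in text.splitlines():
        header = SECTION.match(line)
        if header:
            current = header.group(1)
            continue
        limit = WALLTIME.search(line)
        if limit and current is not None:
            found[current] = limit.group(1)
    return found


# Sample text only; the [[[directives]]] lines are an assumed layout.
sample = """
[[HOST_HPC]]
    [[[directives]]]
        -l walltime = 00:20:00
[[BUILD_HPC]]
    [[[directives]]]
        -l walltime = 02:00:00
"""

for name, limit in walltimes(sample).items():
    print(name, limit)
```

In practice the text would be read from the suite-adds.rc file rather than from an inline sample.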
Cheers,
Ros.
comment:5 Changed 5 years ago by ros
Hi Ros,
Excellent! Thanks a lot. My suite was not running when I got your mail. I made all these changes, changed the wall clock time too and restarted the suite. The INCOMPASS_km4p4_v2p1_qcons_um_surf_ostia succeeded for me too. Thanks.
Cheers,
Arathy
comment:6 Changed 5 years ago by ros
- Resolution set to fixed
- Status changed from new to closed
Update from Arathy:
Stu is back. I contacted him regarding the error. This was his reply:
So the error is actually in the task that should create the missing files, i.e. it's in the *_um_surf_ostia task.
I did this, but ended up with the same error. Stu then suggested that I contact NCAS-CMS, as the problem might be more technical.
Thanks,
Arathy