Opened 2 years ago
Closed 2 years ago
#2866 closed help (answered)
North/South halos too small for advection.
Reported by: | ChrisWells | Owned by: | um_support |
---|---|---|---|
Component: | UM Model | Keywords: | UKESM |
Cc: | | Platform: | NEXCS |
UM Version: | 11.2 | | |
Description
Hi,
I've had a suite of UKESM (u-bg601) running for ~95 years, and it just failed on 21081001, with
[0] ????????????????????????????????????????????????????????????????????????????????
[0] ???!!!???!!!???!!!???!!!???!!! ERROR ???!!!???!!!???!!!???!!!???!!!
[0] ? Error code: 15
[0] ? Error from routine: LOCATE_HDPS
[0] ? Error message: North/South halos too small for advection.
[0] ? See the following URL for more information:
[0] ? https://code.metoffice.gov.uk/trac/um/wiki/KnownUMFailurePoints
[0] ? Error from processor: 218
[0] ? Error number: 320
[0] ????????????????????????????????????????????????????????????????????????????????
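To pull the key fields out of a banner like the one above (for example, to see which processor reported the failure), a minimal Python sketch might look like this. The function name `parse_um_error` is hypothetical, and the field labels are taken only from the banner quoted in this ticket, so the layout may differ in other UM versions.

```python
import re

# Field labels exactly as they appear in the banner quoted in this ticket.
# Assumption: other UM versions may word or order these differently.
FIELDS = {
    "Error code": "code",
    "Error from routine": "routine",
    "Error message": "message",
    "Error from processor": "processor",
    "Error number": "number",
}


def parse_um_error(text):
    """Return the labelled banner fields found in `text` as a dict of strings.

    Hypothetical helper: works whether the banner is on one line or split
    across several lines, each prefixed with a "[rank]" marker.
    """
    result = {}
    for label, key in FIELDS.items():
        # The value runs up to the next "[rank]" prefix or the end of the line.
        pattern = r"\?\s+" + re.escape(label) + r":\s*(.*?)\s*(?=\[\d+\]|$)"
        match = re.search(pattern, text, re.MULTILINE)
        if match:
            result[key] = match.group(1)
    return result


if __name__ == "__main__":
    banner = (
        "[0] ? Error code: 15\n"
        "[0] ? Error from routine: LOCATE_HDPS\n"
        "[0] ? Error message: North/South halos too small for advection.\n"
        "[0] ? Error from processor: 218\n"
        "[0] ? Error number: 320\n"
    )
    print(parse_um_error(banner))
```

The same function can be pointed at the contents of job.err; it only reads the banner and does not interpret the max wind diagnostics.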
I followed the advice on https://code.metoffice.gov.uk/trac/um/wiki/KnownUMFailurePoints and turned on Max wind output and High diagnostics.
But I'm unsure what to look at now. Looking in cylc-run/u-bg601 I can see the job.err file in /log, but I can't see anything there to go on.
I can see in /share/data/History_Data that the model got to 20181201.
Do you know where I should look to see the max wind and other diagnostics? And how I could use that to fix my problem and continue this run?
Cheers,
Chris
Change History (3)
comment:1 Changed 2 years ago by ChrisWells
Hi,
Just an update on this: I copied my suite to u-bh765, downloaded the restart files for 21080101 from my other suite, and ran from there. It seems to have worked (it's on 21091001 now), but this confuses me further. In hindsight, I wouldn't have expected this to work: I thought the model would be bit-reproducible and would therefore fail at the same point?
So my issue seems to be fixed, but I don't understand what's happened here; I'll leave this open as hopefully someone will be able to shed some light on this.
Cheers,
Chris
comment:2 Changed 2 years ago by willie
- Keywords UKESM added
- Platform set to NEXCS
- UM Version set to 11.2
comment:3 Changed 2 years ago by grenville
- Resolution set to answered
- Status changed from new to closed
closed through inactivity