Opened 9 months ago
Closed 9 months ago
#3319 closed help
North/South halos too small for advection error
Reported by: | ggxmy | Owned by: | annette |
---|---|---|---|
Component: | UM Model | Keywords: | |
Cc: | | Platform: | Monsoon2 |
UM Version: | 10.7 | | |
Description
Dear Helpdesk,
I tried to run my UM v10.7 GC3.1 suite (u-bv563), which is based on a previously running suite (u-br927), but the process 'coupled' crashes after a few minutes of running with the error below.
? Error from routine: LOCATE_HDPS
? Error message: North/South halos too small for advection.
? See the following URL for more information:
? https://code.metoffice.gov.uk/trac/um/wiki/KnownUMFailurePoints
The Wiki page contains some explanations and instructions on this error. But before trying that I tried running u-br927, which ran OK a few months ago (as shown in http://cms.ncas.ac.uk/ticket/3203 ), and got the same (halo) error. Isn't this strange? Has any change been made on Monsoon recently that can cause a problem like this?
Although these suites never show as fully committed because they contain an additional file, they should basically be up to date. I may have made some minor changes, though. So far I have reduced the domain decomposition slightly (36→28), but the result doesn't change.
Masaru
Change History (5)
comment:1 Changed 9 months ago by ros
comment:2 Changed 9 months ago by ggxmy
If there has been no change on Monsoon, what could have caused the change? I followed the instructions but they don't seem to give any helpful information. Or maybe I don't know where to look? Could you please check and see if there is any clue?
comment:3 Changed 9 months ago by annette
- Owner changed from um_support to annette
- Status changed from new to assigned
comment:4 Changed 9 months ago by annette
Hi Masaru,
If you are running a global model with the standard halo sizes, then this error usually means that the model has become unstable with unphysically large winds. And given that it fails straight away, it points to an issue with the input data files.
I can't see any logs from your previous runs for this suite so it is hard to see whether anything is different. As Ros says it seems unlikely that a system change would cause a problem like this. It is more likely that something has changed in the suite, or in the input data you are using.
I can see from #3203 that you had problems with the start dump before.
The dump you are using in the suite is symlinked to this file:
xcslc0 um$ ls -l /projects/ukca-leeds/myosh/dumps/bg466a.da20150101_00
lrwxrwxrwx 1 myosh ukca-leeds 47 Mar 27 08:47 /projects/ukca-leeds/myosh/dumps/bg466a.da20150101_00 -> /projects/asci/myosh/dumps/bg466a.da20150101_00
Redoing the diff between the file that Ros retrieved and the dump you are using shows that they are different:
xcslc0 um$ diff /projects/umadmin/rhatcher/u-bs160_test/bg466a.da20150101_00 /projects/asci/myosh/dumps/bg466a.da20150101_00
Files /projects/umadmin/rhatcher/u-bs160_test/bg466a.da20150101_00 and /projects/asci/myosh/dumps/bg466a.da20150101_00 differ
This is the issue you had before, so maybe try again using Ros' version of the file.
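If you want to repeat this check before resubmitting, something like the following would do it. This is only a minimal sketch: `same_dump` is a hypothetical helper name, not a UM or Monsoon tool, and `readlink -f` assumes GNU coreutils. It resolves any symlinks first, as with the `ls -l` above, and then does a silent byte-for-byte comparison.

```shell
#!/bin/sh
# same_dump FILE_OR_LINK REFERENCE   (hypothetical helper, for illustration only)
# Exit status: 0 if the files are byte-identical, 1 if they differ,
# 2 if either file is missing or unreadable.
same_dump() {
    # Resolve symlinks so the comparison uses the real files.
    target=$(readlink -f "$1") || return 2
    reference=$(readlink -f "$2") || return 2
    # Bail out early if either resolved path cannot be read.
    [ -r "$target" ] && [ -r "$reference" ] || return 2
    # cmp -s prints nothing; it returns 0 when identical, 1 when different.
    cmp -s "$target" "$reference"
}

# Example call, using the paths from this ticket:
#   same_dump /projects/ukca-leeds/myosh/dumps/bg466a.da20150101_00 \
#             /projects/umadmin/rhatcher/u-bs160_test/bg466a.da20150101_00 \
#       && echo "dump matches reference" || echo "dump differs or is unreadable"
```

A non-zero status means the dump should be copied again from the reference before the suite is rerun.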
Best wishes,
Annette
comment:5 Changed 9 months ago by ggxmy
- Status changed from assigned to closed
Thank you Annette. I don't understand why the data was corrupted again, but I copied it again and the suite ran. I hope it will not be corrupted again.
Masaru
Hi Masaru,
There haven't been any changes to Monsoon recently that would cause this to our knowledge. I would suggest trying the instructions on the Met Office page to try and diagnose the problem.
Regards,
Ros.