Opened 3 years ago

Closed 3 years ago

#1884 closed help (answered)

Pressure solver convergence failure

Reported by: bharvey
Owned by: um_support
Component: UM Model
Keywords: BICGSTAB omg
Cc:
Platform: MONSooN
UM Version: 10.4

Description

I'm using Stu Webster's nesting suite to run a case study from 2009, including running the global model from an analysis file. However, the global model keeps crashing a few hours in with the error:

????????????????????????????
? Error code: 11
? Error from routine: EG_BICGSTAB_MIXED_PREC
? Error message: Convergence failure in BiCGstab, omg is too small
? Error from processor: 0
? Error number: 29
????????????????????????????
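For readers unfamiliar with the message, here is a minimal sketch of a textbook, unpreconditioned BiCGstab iteration, showing where a stabilisation parameter usually written omega is computed and checked. It assumes that `omg` in the UM message refers to this quantity. It is not the UM's EG_BICGSTAB_MIXED_PREC routine, which is preconditioned and mixed precision. The function name, the `omega_min` threshold and the test matrix are all hypothetical.

```python
import numpy as np


def bicgstab_sketch(A, b, tol=1e-10, max_iter=200, omega_min=1e-14):
    """Textbook unpreconditioned BiCGstab for A x = b.

    Returns (x, iterations). Raises RuntimeError if omega becomes
    smaller than omega_min (a hypothetical threshold) or if the
    iteration limit is reached.
    """
    x = np.zeros_like(b, dtype=float)
    r = b - A @ x
    r_hat = r.copy()                 # fixed "shadow" residual
    rho = alpha = omega = 1.0
    v = np.zeros_like(r)
    p = np.zeros_like(r)
    b_norm = np.linalg.norm(b)

    for it in range(1, max_iter + 1):
        rho_new = r_hat @ r
        beta = (rho_new / rho) * (alpha / omega)
        p = r + beta * (p - omega * v)
        v = A @ p
        alpha = rho_new / (r_hat @ v)
        s = r - alpha * v            # residual after the BiCG half-step
        if np.linalg.norm(s) <= tol * b_norm:
            return x + alpha * p, it

        t = A @ s
        omega = (t @ s) / (t @ t)    # least-squares stabilisation step
        if abs(omega) < omega_min:
            # The next beta divides by omega, so the recurrence
            # cannot continue safely once omega collapses.
            raise RuntimeError("omega is too small")

        x = x + alpha * p + omega * s
        r = s - omega * t
        if np.linalg.norm(r) <= tol * b_norm:
            return x, it
        rho = rho_new

    raise RuntimeError("no convergence within max_iter")


# Small, well-conditioned nonsymmetric test system (illustrative only).
n = 30
A = 4.0 * np.eye(n) - 1.0 * np.eye(n, k=1) - 2.0 * np.eye(n, k=-1)
b = np.ones(n)
x, iterations = bicgstab_sketch(A, b)
print("iterations:", iterations)
print("residual norm:", np.linalg.norm(b - A @ x))
```

In this textbook form, omega is the step length that minimises the residual along `t = A s`. It tends towards zero when `s` and `A s` are nearly orthogonal, which is the generic breakdown mode that an "omega too small" check guards against.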

This is running with vn10.4 at N320, suite id is u-ad543. The stdout file is here on monsoon:
/work/home/bharv/cylc-run/u-ad543/log/job/20091124T0900Z/glm_um_fcst_000/01/job.out

I've run this producing lots of dumps to see what's happening, and it seems to be a problem with the winds in the top few levels over the N pole. E.g. see this file:
/home/bharv/cylc-run/u-ad543/share/cycle/20091124T0900Z/glm/um/umglaa_d200911241136
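As a rough illustration of why strong winds near the pole are demanding on a regular latitude-longitude grid, the sketch below estimates the zonal Courant number u*dt/dx at a few latitudes. It is not a diagnosis of this run. The timestep and wind speed are assumed values that do not come from the suite, the 640-point longitude count is the usual N320 figure, and the latitude of the row nearest the pole is also an assumption.

```python
import math

EARTH_RADIUS_M = 6.371e6  # mean Earth radius


def zonal_spacing_m(lat_deg, n_lon):
    """Zonal grid spacing in metres on a regular lat-lon grid."""
    circumference = 2.0 * math.pi * EARTH_RADIUS_M * math.cos(math.radians(lat_deg))
    return circumference / n_lon


def zonal_courant(u_ms, dt_s, lat_deg, n_lon):
    """Zonal Courant number |u| * dt / dx at a given latitude."""
    return abs(u_ms) * dt_s / zonal_spacing_m(lat_deg, n_lon)


n_lon = 640    # N320 has 2 * 320 = 640 longitudes
dt_s = 600.0   # assumed timestep in seconds (not taken from the suite)
u_ms = 100.0   # assumed zonal wind in m/s (not taken from the dump)

# 89.8125 is an assumed latitude for the grid row nearest the pole.
for lat in (0.0, 60.0, 85.0, 89.8125):
    c = zonal_courant(u_ms, dt_s, lat, n_lon)
    print(f"lat {lat:8.4f}  dx {zonal_spacing_m(lat, n_lon):10.1f} m  Courant {c:8.2f}")
```

Because dx shrinks with cos(latitude), the same wind speed gives a Courant number that grows rapidly towards the pole.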

I've tried a number of things, including a smaller timestep, lower resolution and a reduced gcr_tol, none of which have worked so far.

One thing I noticed - the MO analyses changed configuration only a few days before this case study (including raising the top), so perhaps the upper stratosphere is noisy in my file - the winds do seem quite large:
/home/bharv/data/tnawdex3/analysis_files/20091124_qwqu06.T+3

Any suggestions on how to get this running?

Change History (6)

comment:1 Changed 3 years ago by grenville

Ben

Please try running without OMP threads and let us know what happens. You won't get IO servers without threads, but it may give us a clue.

Grenville

comment:2 Changed 3 years ago by bharvey

Hi Grenville,

Thanks for that. Could you explain how to do that? I've just tried running with UM_THREAD_LEVEL=SINGLE; is that what you meant? If so, it crashed with the same error, and the output is in the same location as before.

Thanks

Ben

comment:3 Changed 3 years ago by grenville

Ben

In rose edit, search for "thread" - you should see an entry

Atmosphere: Number of OpenMP threads — set it to zero.

Grenville

comment:4 Changed 3 years ago by grenville

Ben

Please see http://cms.ncas.ac.uk/wiki/UM/Configurations/GA7.0-GC3.0 for an N216 suite - maybe try your start file through that as another straw to clutch.

Grenville

comment:5 Changed 3 years ago by bharvey

Hi Grenville,

Thanks for that. I found another recent N320 suite, which I had set up for something else, and used it to test this case - that also failed in the same way.

I've tried a range of start dumps from around the date I need. The problem is specific to a period of only a few days, and seems to be related to particularly strong winds in the stratosphere at that time.

I have got one setup to run to completion, by turning on the polar filter, and that appears to be a satisfactory solution for now, so no need for further action here.

Thanks again for your help,

Ben

comment:6 Changed 3 years ago by grenville

  • Resolution set to answered
  • Status changed from new to closed

Ben

Thanks for the update - knowing your solution/workaround may come in handy for others too.

Grenville
