#2345 closed help (fixed)

Error in routine: eg_sl_helmholtz

Reported by: nfreychet
Owned by: um_support
Component: UM Model
Keywords:
Cc:
Platform: ARCHER
UM Version: 8.5

Description

Hello,

I am running two nudged runs on ARCHER (nudged to ERA-Interim winds; the only difference between the two is the aerosol emissions). They were fine for the first 30 years, but then both runs stopped with the same error message:

????????????????????????????????????????????????????????????????????????????????
???!!!???!!!???!!!???!!!???!!!???!!! ERROR ???!!!???!!!???!!!???!!!???!!!???!!!?
? Error in routine: eg_sl_helmholtz
? Error Code: 1
? Error Message: Convergence failure in BiCGstab
? Error generated from processor: 0
? This run generated 54 warnings
????????????????????????????????????????????????????????????????????????????????

I read on other tickets that this could be due to numerical instability. But as both runs crash on exactly the same day, I'm wondering what could create this instability.
Also, one suggestion seems to be to change the time step of the model, but I don't know where to do that in the UMUI.

Cheers,
Nicolas

Change History (9)

comment:1 Changed 16 months ago by nfreychet

PS: the runs are xnbre and xnbrd.

comment:2 Changed 16 months ago by grenville

Nicolas

In the UMUI, go to model selection→atmosphere→scientific parameter…→time-stepping.

You could try to diagnose the problem by writing out startdumps close to the point of failure, and/or writing out increments to help to isolate where the instability is developing.

Grenville

comment:3 Changed 16 months ago by nfreychet

Hi Grenville,

Thanks, I will try to do that.
Also, I forgot to say that the runs stop at the very beginning of a new year (2013 01 01), so I suspect it might also be due to an error in the forcing data, or something to do with its calendar.

Nico

comment:4 Changed 16 months ago by grenville

Nico

I did look at the forcing data and didn't see anything pathological. I did notice that Jan 2013 appears to have a warm N pole (and Jun 2004, for example, had a warm S pole).

Grenville

comment:5 Changed 16 months ago by nfreychet

Hi Grenville,

I tried to change the time step of the model but I still have the same problem.

I looked at the last increments for some variables (u, v, theta…) but I didn't notice anything unusual. (The last dump can be accessed at /work/n02/n02/nfreyche/xnbrd/xnbrda.da20130101_00 )
Also, in the output, just before the time step where it crashes, everything looks fine:

 EG_SISL_Resetcon: calculate reference profile
 ********************************************
 *    Linear solve for Helmholtz problem    *
 *   ====================================   *
 * Inner iter  1                            *
 * No. Of linear solver iterations   12     *
 * Initial error  0.100000E+01               *
 *   Final error  0.870522E-03               *
 *   Min exner prime -0.304858E-02          *
 *   Max exner prime  0.181341E-02          *
 * Inner iter  2                            *
 * No. Of linear solver iterations    5     *
 * Initial error  0.122262E+00               *
 *   Final error  0.958504E-03               *
 *   Min exner prime -0.413351E-02          *
 *   Max exner prime  0.143016E-02          *
 ********************************************

 ********************************************
 *    Linear solve for Helmholtz problem    *
 *   ====================================   *
 * Inner iter  1                            *
 * No. Of linear solver iterations    5     *
 * Initial error  0.762387E+00               *
 *   Final error  0.588338E-03               *
 *   Min exner prime -0.333058E-02          *
 *   Max exner prime  0.990215E-02          *
 * Inner iter  2                            *
 * No. Of linear solver iterations    2     *
 * Initial error  0.332354E+00               *
 *   Final error  0.812022E-03               *
 *   Min exner prime -0.320480E-02          *
 *   Max exner prime  0.501388E-02          *
 ********************************************

Atm_Step_4A: L_USE_CARIOLLE = F
 Atm_Step_4A: Cariolle scheme not called
NUDGING_MAIN: Entering routine
 Leaving NUDGING_MAIN


 Minimum theta level 1 for timestep  824041
                This timestep                         This run
   Min theta1     proc          position            Min theta1 timestep
      230.12      93    82.5deg W      77.5deg N       230.12824041
  Largest negative delta theta1 at minimum theta1
 This timestep =    -8.34K. At min for run =    -8.34K

  Maximum vertical velocity at timestep  824041       Max w this run
    w_max   level  proc         position             run w_max level timestep
   0.263E+01  84     89  172.5deg E     75.6deg N    0.263E+01   84824041


********************************************************************************

Atm_Step: Timestep   824042   Model time:   2013-01-01 00:40:00
 EG_SISL_Resetcon: calculate reference profile
 ********************************************
 *    Linear solve for Helmholtz problem    *
 *   ====================================   *

????????????????????????????????????????????????????????????????????????????????
???!!!???!!!???!!!???!!!???!!!???!!! ERROR ???!!!???!!!???!!!???!!!???!!!???!!!?
? Error in routine: eg_sl_helmholtz
? Error Code:     1
? Error Message: Convergence failure in BiCGstab
? Error generated from processor:     0
? This run generated  53 warnings
????????????????????????????????????????????????????????????????????????????????

Yet as both runs stop at the same time, it must have something to do with either the nudging or forcing.
I will try to kill the nudging for this day and see if there is any change.

Nico
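For anyone wanting to scan a longer stretch of output than the excerpt above, here is a minimal Python sketch that pulls the Helmholtz solver statistics out of the log text. It assumes only the line formats visible in this ticket ("Atm_Step: Timestep ... Model time: ...", "No. Of linear solver iterations", "Final error"); the function name solver_history is made up for illustration and is not part of the UM.

```python
import re

# Patterns based solely on the log lines quoted in this ticket.
TIMESTEP_RE = re.compile(r"Atm_Step: Timestep\s+(\d+)\s+Model time:\s+(\S+ \S+)")
ITER_RE = re.compile(r"No\. Of linear solver iterations\s+(\d+)")
FINAL_RE = re.compile(r"Final error\s+([0-9.]+E[+-]\d+)")


def solver_history(lines):
    """Return a list of (timestep, model_time, iterations, final_error).

    timestep and model_time are None for solves that appear before the
    first "Atm_Step: Timestep" header in the supplied lines.
    """
    history = []
    timestep, model_time, iterations = None, None, None
    for line in lines:
        m = TIMESTEP_RE.search(line)
        if m:
            timestep, model_time = int(m.group(1)), m.group(2)
            continue
        m = ITER_RE.search(line)
        if m:
            iterations = int(m.group(1))
            continue
        m = FINAL_RE.search(line)
        if m and iterations is not None:
            history.append((timestep, model_time, iterations, float(m.group(1))))
            iterations = None
    return history


# Lines copied from the output above (the excerpt starts partway through a timestep).
sample = """\
 * Inner iter  1                            *
 * No. Of linear solver iterations   12     *
 * Initial error  0.100000E+01               *
 *   Final error  0.870522E-03               *
 * Inner iter  2                            *
 * No. Of linear solver iterations    5     *
 * Initial error  0.122262E+00               *
 *   Final error  0.958504E-03               *
Atm_Step: Timestep   824042   Model time:   2013-01-01 00:40:00
""".splitlines()

history = solver_history(sample)
for record in history:
    print(record)
```

A sudden rise in the iteration count or final error in the steps leading up to the failure would hint at where an instability is developing. In the excerpt above the counts (12, 5, 5, 2) look unremarkable, which matches the impression that everything is fine right up to the crash.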

comment:6 Changed 16 months ago by nfreychet

Hi Grenville,

So I tried turning off the nudging, and also changed the time step to 24. I recompiled everything and tried to restart from the last dump, but I still get exactly the same error.

comment:7 Changed 16 months ago by grenville

Nico

I think the problem is with the Specification of trace gases (model selection→atmosphere→scientific parameters..→Spec of trace gases), for which the input data is only specified up to 2013.

Grenville

comment:8 Changed 16 months ago by nfreychet

Hi Grenville,

Indeed, I extended the input data beyond 2013 and that solved the problem. Thanks a lot!

Nico

comment:9 Changed 16 months ago by grenville

  • Resolution set to fixed
  • Status changed from new to closed

Nico

Great - it's a pity there is no checking for this in the job setup.

Grenville
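The kind of pre-run check wished for above could be sketched roughly as follows in Python. This is not a UM or UMUI facility: the function name uncovered_series, the dictionary layout and all the years below are hypothetical, and the sketch assumes the years of each time-varying input have already been read out of the job by hand.

```python
def uncovered_series(series_years, run_start, run_end):
    """Return {name: (first_year, last_year)} for every input series
    whose years do not span the whole run_start..run_end period."""
    problems = {}
    for name, years in series_years.items():
        first, last = min(years), max(years)
        if first > run_start or last < run_end:
            problems[name] = (first, last)
    return problems


# Hypothetical values for illustration only (not taken from xnbrd/xnbre).
series = {
    "CO2": [1980, 1990, 2000, 2013],
    "CH4": [1980, 2000, 2020],
}

print(uncovered_series(series, run_start=1981, run_end=2015))
# {'CO2': (1980, 2013)}
```

Running such a check before submitting a long job would flag any series that ends before the run does, rather than leaving it to surface years into the run as a solver failure.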
