Opened 11 months ago

Closed 9 months ago

#2555 closed help (fixed)

Convergence failure on first time step

Reported by: shakka
Owned by: willie
Component: UM Model
Keywords: convergence failure, omgbicgstab
Cc:
Platform: Monsoon2
UM Version: 10.4



I'm having the same problem with two suites that I am currently trying to run (u-aw620 and u-ai781), both of which successfully ran recently (April/May?). I've been away for a few weeks, but I'm fairly certain I haven't changed anything since then. In both runs, I get a convergence failure at the first timestep:

???!!!???!!!???!!!???!!!???!!! ERROR ???!!!???!!!???!!!???!!!???!!!
? Error code: 11
? Error from routine: EG_BICGSTAB
? Error message: Convergence failure in BiCGstab, omg is too small
? Error from processor: 0
? Error number: 23

I haven't changed the time step, and the cycle dates are the same as before when the suites both ran successfully. The only difference between the two versions of u-ai781 is that I am now using a prescribed cloud droplet number concentration in the cloud scheme (the default UM setting), rather than the climatological aerosol-generated droplet number that was previously being used. I haven't changed suite u-aw620.

Do you have any idea what is going on?


Change History (8)

comment:1 Changed 11 months ago by grenville


Please point me to log.out files for the successful and failing runs


comment:2 Changed 11 months ago by grenville

Correction: that should say job.out files.

comment:3 Changed 11 months ago by shakka

Hi Grenville,

All the forecasts in this folder definitely ran: ~/cylc-run/u-ai781/log.20180406T152019Z/job/20110118T0000Z/

e.g. Peninsula_km1p5_PC2_tnuc_um_fcst_003/NN

But the most recent ones did not, e.g. ~/cylc-run/u-ai781/log.20180723T094345Z/job/20110117T0000Z/Peninsula_km4p0_Smith_tnuc_prog_aer_um_fcst_000/NN



comment:4 Changed 11 months ago by shakka

(The most recent run has failed on the recon step with a 'PP HEADERS DO NOT MATCH' error, much to my bemusement, but the last one's log contains the error this ticket is about.)

comment:5 Changed 11 months ago by shakka

Hi Grenville, any chance you've been able to look at this? I'm getting the same error with suite u-aw620 using completely different dates, too.

comment:6 Changed 10 months ago by willie

Hi Ella,

I'll just focus on u-ai781 here. It is failing in Peninsula_km4p0_Smith_tnuc_presc_aer_um_fcst_000 at the very first cycle time, 20110117T0000Z, with the error you report above. It fails at the first time step, which generally, but not always, indicates a problem with the start dump or the ancillary files.

Only a few files have been changed. fcm status shows

?       app/um/STASHexport.ini
M       app/um/rose-app.conf
?       install_glm_startdata[-PT12H]
M       rose-suite.conf
M       site/monsoon-cray-xc40/suite-adds.rc

The ones with the question mark are spurious files which are not used in the suite. The ones with the M are the modified ones.
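As a rough illustration of reading that listing, here is a minimal Python sketch that groups the status lines by their first character. It assumes only the first status column is set, as in the output above; the function name group_by_status is made up for this example and is not part of fcm.

```python
from collections import defaultdict

# Status listing copied from the fcm status output in this comment.
STATUS_OUTPUT = """\
?       app/um/STASHexport.ini
M       app/um/rose-app.conf
?       install_glm_startdata[-PT12H]
M       rose-suite.conf
M       site/monsoon-cray-xc40/suite-adds.rc
"""


def group_by_status(text):
    """Group paths by the status character in the first column.

    Assumption: only the first column carries a flag, as in the
    listing above. Other flag columns are not handled here.
    """
    groups = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        groups[line[0]].append(line[1:].strip())
    return dict(groups)


groups = group_by_status(STATUS_OUTPUT)
print("modified:", groups.get("M", []))
print("unversioned:", groups.get("?", []))
```

The "M" group is the set of files worth diffing against the last known working revision.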

If we assume that revision 78735 (23rd May 2018) was working, then the main differences can be found using the command

fcm diff -g

I think this shows more changes than just the cloud droplet number scheme.
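If a graphical diff is hard to scan, a setting-by-setting comparison can help. The following is only a sketch in Python, assuming you have two copies of a config file as text (for example the working revision and your current copy). The section and setting names in OLD and NEW are invented placeholders, not taken from either suite, and multi-line values are not handled.

```python
def read_settings(text):
    """Return {(section, key): value} for simple INI-style text.

    Assumption: one key=value per line; continuation lines are
    not handled in this sketch.
    """
    settings = {}
    section = ""
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
        elif "=" in line:
            key, value = line.split("=", 1)
            settings[(section, key.strip())] = value.strip()
    return settings


def diff_settings(old_text, new_text):
    """List (name, old_value, new_value) for every setting that differs."""
    old, new = read_settings(old_text), read_settings(new_text)
    changes = []
    for name in sorted(set(old) | set(new)):
        if old.get(name) != new.get(name):
            changes.append((name, old.get(name), new.get(name)))
    return changes


# Hypothetical example content, for illustration only.
OLD = """\
[namelist:example]
setting_a=1
setting_b=.true.
"""
NEW = """\
[namelist:example]
setting_a=2
setting_b=.true.
setting_c='x'
"""

changes = diff_settings(OLD, NEW)
for (section, key), before, after in changes:
    print(f"[{section}] {key}: {before} -> {after}")
```

A value of None on either side means the setting exists in only one of the two files.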

I hope that helps.


PS It is a good idea to commit your changes, once you have things working.

comment:7 Changed 10 months ago by willie

  • Owner changed from um_support to willie
  • Status changed from new to accepted

comment:8 Changed 9 months ago by willie

  • Resolution set to fixed
  • Status changed from accepted to closed