Opened 2 years ago

Closed 2 years ago

#2131 closed help (fixed)

Activation of Nudging for dates after 2013 causes NaNs in BICGstab

Reported by: mhollaway
Owned by: um_support
Component: UKCA
Keywords: Nudging, UKCA
Cc:
Platform: Monsoon2
UM Version: 10.6

Description

Hi,

I am having some issues running a nudged version of my suite (u-ak609). I am nudging the model towards ERA-INT reanalysis data, and the suite runs fine while the model time stamp is before 2013. As soon as the date reaches 1st January 2013 or later, the model crashes with the following error:

???!!!???!!!???!!!???!!!???!!! ERROR ???!!!???!!!???!!!???!!!???!!!
? Error code: 1
? Error from routine: EG_BICGSTAB
? Error message: NaNs in error term in BiCGstab
? This is a common point for the model to fail if it
? has ingested or developed NaNs or infinities
? elsewhere in the code.
? See the following URL for more information:
? https://code.metoffice.gov.uk/trac/um/wiki/KnownUMFailurePoints
? Error from processor: 239
? Error number: 19

From reading the online documentation, I understand that this error means NaNs have been introduced by either the physics scheme or bad input. I have checked the reanalysis files and they seem fine. I cannot find any other tickets reporting this issue, or anything about it in the nudging UMDP. Could this be linked to a change in file format for anything with a model date of 2013 onwards? Have I missed something in the nudging setup?
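For anyone repeating the input check described above, a minimal sketch of scanning a set of values for NaNs or infinities is given here. It assumes the field has already been read into a plain Python sequence of floats; the function name and the sample values are hypothetical and are not taken from the suite or from the UM.

```python
import math


def first_non_finite(values):
    """Return (index, value) of the first NaN or infinity, or None if all are finite."""
    for index, value in enumerate(values):
        if not math.isfinite(value):
            return index, value
    return None


# Hypothetical sample standing in for one field from an input file.
sample = [271.3, 268.9, float("nan"), 270.1]
print(first_non_finite(sample))
```

A clean result here only rules out that one input. As the rest of this ticket shows, the bad values can come from a different input altogether.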

I attach the job.err and job.out files for reference.

Best Wishes,

Michael.

Attachments (2)

job.err (342.4 KB) - added by mhollaway 2 years ago.
job.out (558.2 KB) - added by mhollaway 2 years ago.

Change History (6)

comment:1 Changed 2 years ago by mhollaway

Update: I have tried a number of tests to resolve this issue, but the above error keeps occurring. At first I thought my new updates had caused the error, so I turned them off and removed the link to the branch. I also tried running the suite without nudging. My final test was to run with nudging to the end of 2012 (which completed successfully up to 31st Dec 2012) and then start a new run from the restart dump that setup produced.

As soon as the model basis time moves into 2013 the model crashes with the above BiCGstab error.

Am I missing something silly in my setup that could be causing this error?

Best Wishes,

Michael.

comment:2 Changed 2 years ago by willie

Hi Michael,

There are 81 errors/warnings in the suite. I ran Metadata > Autofix all configurations, which resolved all but one of them. The remaining one is ukca_mode_seg_size, which the help advises setting to 4. So these can all be repaired.

The fact that the model runs for so long suggests the general setup including nudging is fine.

Which files have changed format?

There is no need to include the job.err/out files in the ticket as we can see them on the HPC.

Regards
Willie

comment:3 Changed 2 years ago by mhollaway

Hi Willie,

Apologies for the delayed response; I was away at a project meeting last week. As it happens, I have now managed to resolve the issue. It turns out that the time-varying prescribed greenhouse gas values only ran up to 2012, so when the model ran into 2013 it generated instabilities in the radiation scheme. I was able to solve the problem by running with fixed values instead.
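In case it helps anyone else, here is a minimal sketch of the kind of check that would have caught this earlier: comparing the years covered by a prescribed time series against the years of the run. The function name, the year range and the run dates are hypothetical, chosen only to mirror this case, and are not read from the suite.

```python
from datetime import date


def uncovered_years(series_years, run_start, run_end):
    """Return the run years that have no entry in the prescribed time series."""
    available = set(series_years)
    return [
        year
        for year in range(run_start.year, run_end.year + 1)
        if year not in available
    ]


# Hypothetical series with one entry per year, ending in 2012.
ghg_years = range(1979, 2013)

# Hypothetical run that crosses into 2013 and beyond.
missing = uncovered_years(ghg_years, date(2012, 1, 1), date(2014, 12, 31))
print(missing)  # [2013, 2014]
```

An empty list means every run year has a prescribed value. Anything else is worth fixing before the run starts, either by extending the series or by switching to fixed values.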

As I suspected, a silly error on my part.

You can close this ticket now.

Best wishes,

Michael.

comment:4 Changed 2 years ago by willie

  • Resolution set to fixed
  • Status changed from new to closed