Opened 5 months ago

Closed 2 months ago

#3393 closed help (fixed)

Problems with adapting nudging code

Reported by: jlgarcia Owned by: um_support
Component: UM Model Keywords: nudging, atmos_main
Cc: Platform:
UM Version: 10.9



My suite u-by341 has been failing at the atmos_main task and I can't find a way to fix it.
This suite stems from my suite u-bx823 which itself stems from u-as019 which is a standard GA7.1 suite for vn10.9. Note: u-bx823 runs fine.

I then copied the suite to u-by341 and tried to turn on the nudging in this suite and found that the nudging code was not adapted to nudge in specific regions bounded by latitude and longitude.

Matt Brown, in my group, has several branches for vn10.3 that do this (see e.g. branches/dev/mattbrown/vn10.3_um10.3_nudge_sep_regions@52914).

I attempted to adapt his code to the nudging code of u-bx823 in u-by341, and was able to compile and get the fcm_make task to work without error.
However, the atmos_main task fails and I don't know why. The error log file is full of messages like this one:

lib-4211 : UNRECOVERABLE library error
  A WRITE operation tried to write a record that was too long.

Encountered during a sequential formatted WRITE to an internal file (character variable)

And the final bit of the log file says:

ATP Stack walkback for Rank 425 starting:

I suppose there is a problem with nudging_filename_mod.F90, but I didn't touch that file. This makes me wonder whether Matt's nudging implementation is incompatible with vn10.9, and whether it would be best to just use a suite from the same UM version as his branches.

Could you advise on this?

All the best,

Change History (6)

comment:1 Changed 5 months ago by jeff


If you look at file nudging_filename_mod.F90, line 131, you can see it's trying to use a write statement to copy three strings into another string. My guess is the strings are too long for the size of dataname1. You could print out the strings to see what is happening here.

As an aside, the use of a write statement here is unnecessary; you could copy the strings directly (making sure they are the right size, of course). A write statement like this is usually used for copying integer or real values into a string.
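
For illustration, a minimal sketch of copying the strings directly rather than through an internal write. Only dataname1 is a name from the ticket; datapath, filebase, extension, the file name and all the lengths are hypothetical and not taken from the UM source. The length check is there because a plain assignment silently truncates on the right if the target is too short, whereas the internal write aborts.

```fortran
program copy_strings_demo
  implicit none
  ! Hypothetical names and lengths; only dataname1 appears in the ticket.
  character(len=256) :: datapath
  character(len=16)  :: filebase
  character(len=8)   :: extension
  character(len=64)  :: dataname1
  integer :: needed

  datapath  = '/projects/ukca-admin/analyses/era-in'
  filebase  = '/example_file'   ! made-up file name
  extension = '.nc'

  ! Length the joined string needs once trailing blanks are removed.
  needed = len_trim(datapath) + len_trim(filebase) + len_trim(extension)

  if (needed > len(dataname1)) then
    ! Report the problem instead of letting the assignment truncate.
    write(*,'(a,i0,a,i0)') 'Too long: need ', needed, ', have ', len(dataname1)
  else
    ! Direct concatenation; no internal WRITE is involved.
    dataname1 = trim(datapath) // trim(filebase) // trim(extension)
    write(*,'(a)') trim(dataname1)
  end if
end program copy_strings_demo
```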


comment:2 Changed 5 months ago by jlgarcia

Hi Jeff,

Yes, the problem is with the strings for the nudging data files, in this case ERA-Interim.
I am using the same string value for the nudging data, '/projects/ukca-admin/analyses/era-in', in other suites and it works fine.

Printing the three strings in the write statement shows the following lines:

 /projects/ukca-admin/analyses/era-in                                                                                                                                                                                            ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@~@������^@^@^@^@^@^@^@^@/                                    
 /projects/ukca-admin/analyses/era-in                                                                                                                                                                                            ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@~@������^@^@^@^@^@^@^@^@/

I don't really know whether the ^@ characters are supposed to be there; perhaps the TRIM function is not working properly here. I am a bit lost.
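
A note on TRIM: it only removes trailing blanks, so NUL characters (shown as ^@) at the end of a string still count towards its length. The sketch below imitates that situation with a hypothetical 40-character buffer called path (not the UM variable) whose tail is filled with NULs, and shows one way to find where the valid text ends.

```fortran
program trim_nul_demo
  implicit none
  character(len=40) :: path   ! hypothetical buffer, not the UM variable
  integer :: first_nul

  ! Imitate a buffer whose tail was never set: fill it with NUL characters,
  ! then place the 36-character path at the start.
  path = repeat(achar(0), len(path))
  path(1:36) = '/projects/ukca-admin/analyses/era-in'

  ! LEN_TRIM ignores trailing blanks only, so the trailing NULs are counted.
  write(*,'(a,i0)') 'len_trim = ', len_trim(path)

  ! INDEX returns the position of the first NUL, or 0 if there is none.
  first_nul = index(path, achar(0))
  write(*,'(a,i0)') 'first NUL at position ', first_nul

  ! The text before the first NUL is the usable part of the string.
  if (first_nul > 0) then
    write(*,'(a)') 'clean part: ' // trim(path(1:first_nul-1))
  end if
end program trim_nul_demo
```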

comment:3 Changed 5 months ago by jeff


The directory path string is read from a namelist on PE0 and then broadcast to all the other PEs. As far as I can tell, this all seems correct in the nudging code. The error happens when the correct string on PE0 is broadcast to the other PEs: it doesn't seem to send the last 32 characters, and they are left as undefined NULL characters. Why this happens I don't know; maybe it's a problem with the Cray MPI library.

There is an easy fix, in file nudging_input_mod.F90 change these lines

  my_nml % ndg_datapath = ndg_datapath

to be

  my_nml % ndg_datapath = ndg_datapath


  my_nml % ndg_datapath = ' '


This should get around the problem by setting the string to spaces before the string on PE0 is copied over it.
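
To show why setting the string to spaces first helps, here is a small stand-alone sketch. It does not use MPI or the UM code: the incomplete broadcast is imitated by copying only the first part of a string, and the names (source, unprepared, prepared) and sizes (a 64-character buffer with 40 characters delivered) are made up for the example.

```fortran
program blank_before_copy_demo
  implicit none
  ! Hypothetical sizes: a 64-character buffer of which only 40 arrive.
  integer, parameter :: full_len = 64, copied = 40
  character(len=full_len) :: source, unprepared, prepared

  source = '/projects/ukca-admin/analyses/era-in'

  ! Stand-in for undefined memory in the receiving string.
  unprepared = repeat(achar(0), full_len)
  prepared   = repeat(achar(0), full_len)

  ! The workaround: set the whole receiving string to spaces first.
  prepared = ' '

  ! Stand-in for a transfer that delivers only the first `copied` characters.
  unprepared(1:copied) = source(1:copied)
  prepared(1:copied)   = source(1:copied)

  ! Without blanking, the NUL tail is counted; with blanking, only the path is.
  write(*,'(a,i0)') 'len_trim without blanking: ', len_trim(unprepared)
  write(*,'(a,i0)') 'len_trim with blanking:    ', len_trim(prepared)
end program blank_before_copy_demo
```

With the blanked string, TRIM gives back just the path, so a later write into a fixed-size string is no longer pushed past its length by the undefined tail.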


comment:4 Changed 5 months ago by pmcguire

  • Summary changed from Problems with adapting nuding code to Problems with adapting nudging code

comment:5 Changed 3 months ago by jlgarcia

Thanks Jeff,
We tried this in the group and it appears to have worked.

All the best.

comment:6 Changed 2 months ago by ros

  • Resolution set to fixed
  • Status changed from new to closed