Opened 7 months ago

Closed 8 weeks ago

#2165 closed error (fixed)

Nesting Suite UM frame job error

Reported by: dsergeev Owned by: willie
Priority: high Component: UM Model
Keywords: CreateBC Cc:
Platform: Monsoon2 UM Version: 10.2

Description

Hi,

In one of my UM runs, I increased the run length from 48 to 72 hours. Almost everything works fine, but the um_frame job fails at forecast period 9 (see screenshot).

Below is the stderr message of that frame job:

Warning in umPrintMgr: umPrintLoadOptions : Failed to get filename for IO control file from environment
Warning in umPrintMgr: umPrintSetLevel : Problem reading $PRINT_STATUS, output level will stay as    2
forrtl: severe (18): too many values for NAMELIST variable, unit 10, file /projects/accacia/deser/cylc-run/u-ae616/work/20080301T1200Z/NorwegianSea_km2p2_ukv_default_um_frame_009/input.nl, line 68, position 81
Image              PC                Routine            Line        Source             
um-createbc.exe    00000000005EA997  Unknown               Unknown  Unknown
um-createbc.exe    000000000061433A  Unknown               Unknown  Unknown
um-createbc.exe    00000000005003E0  lbc_grid_namelist         138  lbc_grid_namelist_file_mod.f90
um-createbc.exe    0000000000401B19  MAIN__                    178  createbc.f90
um-createbc.exe    000000000040050E  Unknown               Unknown  Unknown
um-createbc.exe    00000000006BEEF1  Unknown               Unknown  Unknown
um-createbc.exe    00000000004003E9  Unknown               Unknown  Unknown
[FAIL] um-createbc.exe input.nl # return-code=18
Received signal EXIT
2017-05-04T15:06:02Z CRITICAL - Task job script received signal EXIT
2017-05-04T15:06:02Z CRITICAL - failed

What could be the reason for this error? I've tried restarting the suite from scratch a couple of times, but it still fails. Occasionally the error appears even at the first forecast period.

Thanks,
Denis

Attachments (1)

um_frame_error.png (273.8 KB) - added by dsergeev 7 months ago.


Change History (13)

Changed 7 months ago by dsergeev

comment:1 Changed 6 months ago by willie

Hi Denis,

Is this still an issue? The namelist file has been deleted.

Regards
Willie

comment:2 Changed 6 months ago by dsergeev

  • Resolution set to answered
  • Status changed from new to closed

Hi Willie,

I'm running the suite for different cases at the moment, so I'm going to close this issue for now. That problem seemed to be persistent, so if I decide to make longer runs again and get the same error, I'll reopen the ticket.

Denis

comment:3 Changed 3 months ago by dsergeev

  • Resolution answered deleted
  • Status changed from closed to reopened

Hi Willie,

I needed to rerun one of my cases for a longer period, and I ran into the same error.

Should I change the frequency of LBC files somehow?

I think the problem is that max_input_files is set to 60 in src/utility/createbc/lbc_output_control_mod.f90, and that limit is reached at the frame 009 stage (see /projects/accacia/deser/cylc-run/u-ai883/work/20070119T1200Z/NorwegianSea_km2p2_ukv_default_um_frame_009/input.nl).
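One way to test this hypothesis is to count how many file names a given namelist variable holds in input.nl and compare that with the limit of 60. The following is only a rough sketch: the function names are made up for illustration, the variable name passed in (for example `input_data`) is an assumption about the CreateBC namelist rather than something confirmed in this ticket, and the parser assumes one assignment per line with quoted file names.

```python
import re


def count_namelist_values(namelist_text, variable):
    """Count the quoted string values assigned to `variable` in namelist text.

    Assumes each assignment starts on its own line and the values are
    quoted strings, possibly continued over several lines.
    """
    # Capture everything after "variable =" up to the next "name =" line,
    # the closing "/" of the namelist group, or the end of the text.
    pattern = re.compile(
        rf"\b{re.escape(variable)}\s*=(.*?)(?=\n\s*\w+\s*=|\n\s*/|\Z)",
        re.IGNORECASE | re.DOTALL,
    )
    match = pattern.search(namelist_text)
    if match is None:
        return 0
    # Each single- or double-quoted string counts as one value.
    return len(re.findall(r"'[^']*'|\"[^\"]*\"", match.group(1)))


def exceeds_limit(namelist_text, variable, limit=60):
    """Return True if `variable` holds more values than `limit`.

    The default of 60 is the max_input_files value quoted in this ticket.
    """
    return count_namelist_values(namelist_text, variable) > limit
```

Reading the failing input.nl into a string and calling `exceeds_limit(text, "input_data")` would then show whether the file list is longer than the array the executable can accept, provided the variable name is adjusted to whatever appears at line 68 of the file.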

Now the suite id is u-ai883, and the forecast cycle is 20070119T1200Z.

Could you please look into this?

Best regards,
Denis

comment:4 Changed 3 months ago by willie

  • Owner changed from um_support to willie
  • Status changed from reopened to accepted

comment:5 Changed 3 months ago by willie

Hi Denis,

CreateBC is designed to handle up to 1,000 files so it's not that. I'll look further …

Willie

comment:6 Changed 3 months ago by willie

Hi Denis,

Oh dear, you're quite right. The UM10.2 CreateBC can only handle 60 input files; the UM10.3 version can handle 1,000 files.

The solution, I think, is to change the um_createbc metadata to um-createbc/vn10.3 in rose edit, and then run again.

Regards
Willie

comment:7 Changed 3 months ago by willie

  • Keywords CreateBC added; nesting suite removed
  • Resolution set to fixed
  • Status changed from accepted to closed

comment:8 Changed 2 months ago by dsergeev

  • Resolution fixed deleted
  • Status changed from closed to reopened

Hi Willie,

Sorry for the slow reply. I changed the line in um_createbc as you suggested, and now all frame jobs fail with errors like

Warning in umPrintMgr: umPrintLoadOptions : Failed to get filename for IO control file from environment
Warning in umPrintMgr: umPrintSetLevel : Problem reading $PRINT_STATUS, output level will stay as    2
forrtl: severe (18): too many values for NAMELIST variable, unit 10, file /projects/accacia/deser/cylc-run/u-ai883/work/20070119T1200Z/NorwegianSea_km2p2_ukv_default_um_frame_000/input.nl, line 68, position 81
Image              PC                Routine            Line        Source             
um-createbc.exe    00000000005EA997  Unknown               Unknown  Unknown
um-createbc.exe    000000000061433A  Unknown               Unknown  Unknown
um-createbc.exe    00000000005003E0  lbc_grid_namelist         138  lbc_grid_namelist_file_mod.f90
um-createbc.exe    0000000000401B19  MAIN__                    178  createbc.f90
um-createbc.exe    000000000040050E  Unknown               Unknown  Unknown
um-createbc.exe    00000000006BEEF1  Unknown               Unknown  Unknown
um-createbc.exe    00000000004003E9  Unknown               Unknown  Unknown
[FAIL] um-createbc.exe input.nl # return-code=18
Received signal EXIT
2017-09-27T08:29:49Z CRITICAL - Task job script received signal EXIT
2017-09-27T08:29:49Z CRITICAL - failed

Did I do something wrong here?

Cheers,
Denis

comment:9 Changed 2 months ago by dsergeev

Hi Willie,

Could you please look at this error again?
I guess something else needs to be changed in um-createbc?

Cheers,
Denis

comment:10 Changed 2 months ago by willie

Hi Denis,

Sorry, I naively thought that would work. It turns out the metadata for 10.2 and 10.3 is the same, so changing it has no effect on which executable is used.

You need to add a new environment variable to app/install_cold/rose-app.conf

UM_VNMBC=10.3

and then in app/install_cold/opt/rose-app-monsoon.conf change

source=/projects/um1/vn${UM_VN}/xc40/utilities

to

source=/projects/um1/vn${UM_VNMBC}/xc40/utilities

So this will keep the same version of UM, but switch to 10.3 for all the utilities.
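To illustrate the effect of the change, here is a small sketch of how the two source lines expand. It uses Python's `string.Template` purely as a stand-in for the variable substitution done by Rose, and it assumes UM_VN is 10.2 (the UM version given for this ticket); the helper name `expand` is invented for the example.

```python
from string import Template

# The two forms of the "source" setting quoted in this comment.
ORIGINAL_SOURCE = "/projects/um1/vn${UM_VN}/xc40/utilities"
UPDATED_SOURCE = "/projects/um1/vn${UM_VNMBC}/xc40/utilities"


def expand(source, env):
    """Substitute ${NAME} placeholders in `source` using the mapping `env`."""
    return Template(source).substitute(env)


# Assumed environment: the model stays at 10.2, the utilities move to 10.3.
env = {"UM_VN": "10.2", "UM_VNMBC": "10.3"}

print(expand(ORIGINAL_SOURCE, env))  # utilities directory before the change
print(expand(UPDATED_SOURCE, env))   # utilities directory after the change
```

Only the directory the utilities are installed from changes; anything else that refers to UM_VN keeps pointing at the original model version.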

regards,

Willie

comment:11 Changed 8 weeks ago by dsergeev

Thank you, Willie!
This seems to do the job.

comment:12 Changed 8 weeks ago by dsergeev

  • Resolution set to fixed
  • Status changed from reopened to closed