Opened 7 years ago

Closed 7 years ago

#1037 closed help (fixed)

Problems with C-run: diagnostics have no data => file cannot be created and cannot be stored on MASS, hence job falls over!

Reported by: RPope
Owned by: um_support
Component: UM Model
Keywords:
Cc: nicholas.savage@…
Platform: MONSooN
UM Version: 8.2

Description

Hi,

I am trying to run a year-long simulation of 2006 using the AQUM on MONSooN. The compile, reconfiguration and N-runs work, but after 8 days in the C-run the model falls over with the following messages:

Reading SW spectral files
Namelist file: /projects/um1/vn8.2/ctldata/spectral/ga3_1/spec_sw_ga3_0
Namelist file: /projects/um1/vn8.2/ctldata/spectral/ga3_1/spec_sw_cloud3_0

????????????????????????????????????????????????????????????????????????????????
??????????????????????????????????? WARNING ????????????????????????????????????
? Warning in routine: r2_sw_specin
? Warning Code: -1
? Warning Message: * warning: the sw spectrum contains no data for aerosols.
? Warning generated from processor: 0
????????????????????????????????????????????????????????????????????????????????

So the diagnostic has no data, which I think leads to the next message:

??????????????????????????????????? WARNING ????????????????????????????????????
? Warning in routine: Init_PP_Crun
? Warning Code: -10
? Warning Message: Error: Not enough data in file to hold fixed header, need 256 words.
? Warning generated from processor: 0
????????????????????????????????????????????????????????????????????????????????

So the file is apparently not created, and therefore not saved on MASS via MOOSE (I think):

run MOOSE_JOB_SNAP
MOOSE_JOB_SNAP: tar command is OK
MOOSE_JOB_SNAP: gzip command is OK
MOOSE_JOB_SNAP run is OK
qsserver: Thu Mar 14 19:54:59 GMT 2013: xiimaa.pa20060104_06 ARCHIVE PPNOCHART
xiimaa.pa20060104_06 is zero length or does not exist:
next request
qsserver: Thu Mar 14 19:54:59 GMT 2013: xiimaa.pa20060104_06 DELETE
xiimaa.pa20060104_06 is zero length or does not exist:
next request
qsserver: Thu Mar 14 19:54:59 GMT 2013: xiimaa.pd20060103 ARCHIVE PPNOCHART
xiimaa.pd20060103 is zero length or does not exist:
next request
qsserver: Thu Mar 14 19:54:59 GMT 2013: xiimaa.pd20060103 DELETE
xiimaa.pd20060103 is zero length or does not exist:
next request
qsserver: Thu Mar 14 19:54:59 GMT 2013: xiimaa.pe20060104_00 ARCHIVE PPNOCHART
xiimaa.pe20060104_00 is zero length or does not exist:
next request
qsserver: Thu Mar 14 19:54:59 GMT 2013: xiimaa.pe20060104_00 DELETE
xiimaa.pe20060104_00 is zero length or does not exist:
next request
qsserver: EOF on PIPE but model still executing - waiting
Server process:… Ending

Thanks for your help.

Richard
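
The archiving messages above report files that are "zero length or does not exist", and Init_PP_Crun reports a file too short to hold the 256-word fixed header. A minimal sketch of a check for both conditions might look like the following. It assumes 8-byte (64-bit) words, which this ticket does not state, and the function name `check_output_file` is hypothetical; it is not part of the UM or MOOSE.

```python
import os

FIXED_HEADER_WORDS = 256  # taken from the Init_PP_Crun warning message
WORD_BYTES = 8            # assumption: 64-bit words (not stated in the ticket)


def check_output_file(path, word_bytes=WORD_BYTES):
    """Classify an output file as 'missing', 'empty', 'short' or 'ok'."""
    if not os.path.isfile(path):
        return "missing"
    size = os.path.getsize(path)
    if size == 0:
        return "empty"
    # Too small to hold even the fixed header.
    if size < FIXED_HEADER_WORDS * word_bytes:
        return "short"
    return "ok"


if __name__ == "__main__":
    # File names taken from the qsserver log in this ticket.
    for name in ("xiimaa.pa20060104_06",
                 "xiimaa.pd20060103",
                 "xiimaa.pe20060104_00"):
        print(name, check_output_file(name))
```

A result of "ok" only means the file is large enough to contain a header; it says nothing about whether the data inside are valid.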

Change History (2)

comment:1 in reply to: ↑ description Changed 7 years ago by jeff

Replying to RPope:

Hi Richard

Hi,

I am trying to run a year-long simulation of 2006 using the AQUM on MONSooN. The compile, reconfiguration and N-runs work, but after 8 days in the C-run the model falls over with the following messages:

Reading SW spectral files
Namelist file: /projects/um1/vn8.2/ctldata/spectral/ga3_1/spec_sw_ga3_0
Namelist file: /projects/um1/vn8.2/ctldata/spectral/ga3_1/spec_sw_cloud3_0

????????????????????????????????????????????????????????????????????????????????
??????????????????????????????????? WARNING ????????????????????????????????????
? Warning in routine: r2_sw_specin
? Warning Code: -1
? Warning Message: * warning: the sw spectrum contains no data for aerosols.
? Warning generated from processor: 0
????????????????????????????????????????????????????????????????????????????????

This warning message appears in all your runs and is not the cause of the problem.

So the diagnostic has no data, which I think leads to the next message:

This is not a diagnostic; the warning message refers to the spectral namelist files listed above.

??????????????????????????????????? WARNING ????????????????????????????????????
? Warning in routine: Init_PP_Crun
? Warning Code: -10
? Warning Message: Error: Not enough data in file to hold fixed header, need 256 words.
? Warning generated from processor: 0
????????????????????????????????????????????????????????????????????????????????

So the file is apparently not created, and therefore not saved on MASS via MOOSE (I think):

run MOOSE_JOB_SNAP
MOOSE_JOB_SNAP: tar command is OK
MOOSE_JOB_SNAP: gzip command is OK
MOOSE_JOB_SNAP run is OK
qsserver: Thu Mar 14 19:54:59 GMT 2013: xiimaa.pa20060104_06 ARCHIVE PPNOCHART
xiimaa.pa20060104_06 is zero length or does not exist:
next request
qsserver: Thu Mar 14 19:54:59 GMT 2013: xiimaa.pa20060104_06 DELETE
xiimaa.pa20060104_06 is zero length or does not exist:
next request
qsserver: Thu Mar 14 19:54:59 GMT 2013: xiimaa.pd20060103 ARCHIVE PPNOCHART
xiimaa.pd20060103 is zero length or does not exist:
next request
qsserver: Thu Mar 14 19:54:59 GMT 2013: xiimaa.pd20060103 DELETE
xiimaa.pd20060103 is zero length or does not exist:
next request
qsserver: Thu Mar 14 19:54:59 GMT 2013: xiimaa.pe20060104_00 ARCHIVE PPNOCHART
xiimaa.pe20060104_00 is zero length or does not exist:
next request
qsserver: Thu Mar 14 19:54:59 GMT 2013: xiimaa.pe20060104_00 DELETE
xiimaa.pe20060104_00 is zero length or does not exist:
next request
qsserver: EOF on PIPE but model still executing - waiting
Server process:… Ending

These errors are from the file xiima000.xiima.d13073.t195450.leave, but the run actually first fails in the previous job; see the file xiima009.xiima.d13073.t174618.leave. If you examine the dump file xiimaa.da20060104_12 and look at, for example, the surface temperature field, you can see that it has non-physical values, so your model has blown up. You need to fix this, and the above errors should then go away.

Jeff.
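
The check Jeff describes on the dump can be sketched as a scan for non-finite or out-of-range values. This is only an illustration: reading the field out of the dump file is not shown, the function name `nonphysical_points` is hypothetical, and the 150 K to 350 K bounds for surface temperature are an assumed plausibility range, not values from this ticket.

```python
import math


def nonphysical_points(values, lower, upper):
    """Return (index, value) pairs that are NaN, infinite, or outside [lower, upper]."""
    bad = []
    for i, v in enumerate(values):
        if not math.isfinite(v) or v < lower or v > upper:
            bad.append((i, v))
    return bad


if __name__ == "__main__":
    # Made-up sample values in kelvin, for illustration only.
    sample = [288.1, 291.4, float("nan"), 1.0e12, 275.0]
    # Assumed plausibility bounds for surface temperature.
    for index, value in nonphysical_points(sample, lower=150.0, upper=350.0):
        print(f"point {index}: {value}")
```

Running the same scan on successive dumps would show roughly when the field first goes wrong, which narrows down where the model starts to blow up.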

comment:2 Changed 7 years ago by grenville

  • Resolution set to fixed
  • Status changed from new to closed