Opened 3 years ago

Closed 3 years ago

#2024 closed help (duplicate)

UKCA job runs for 16 days and then walltime is exceeded

Reported by: dilshadshawki
Owned by: um_support
Component: UM Model
Keywords:
Cc:
Platform: MONSooN
UM Version: 8.4

Description

Dear Helpdesk,

I have a job, xlqqg, which produces output for 16 days but then stops because the walltime is exceeded. The .leave file does not provide much more detail. Is there a way of finding out why it does not run to completion?

/home/dshawk/output/xlqqg000.xlqqg.d16327.t162342.leave

The job worked when it was run on the IBM system in the past. I have since made the appropriate changes as per ticket #1976 and http://www.ukca.ac.uk/wiki/index.php/MONSooN_IBM_to_Cray_Transition

Those changes should not be the problem, though, since the job is able to run for a few days.

Any idea what could be happening?

This is a new ticket that I've created after dealing with the issue from ticket #1976.

Many thanks,
Dill

Change History (2)

comment:1 Changed 3 years ago by willie

Hi Dill,

I looked at the pe_output files and there are lots of messages like

Too many negatives (>2000) in spimpmjp (iter  7)  lon=  1  lat=132; halving step

in the ASAD chemical solver. I don't know if that is significant, but it might slow the model down.
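
To judge whether these messages are frequent enough to matter, one could count them per grid point across the pe_output files. The following is only a minimal Python sketch: it assumes the message format is exactly as quoted above, and the function names (count_halvings, count_in_files) are made up for illustration and are not part of any UM tooling.

```python
import re
from collections import Counter

# Pattern modelled on the single message quoted above (an assumption:
# other UM versions may word or space the message differently).
NEG_RE = re.compile(
    r"Too many negatives \(>\s*\d+\) in spimpmjp \(iter\s*(\d+)\)\s*"
    r"lon=\s*(\d+)\s*lat=\s*(\d+); halving step"
)


def count_halvings(lines):
    """Return a Counter mapping (lon, lat) to the number of matching messages."""
    counts = Counter()
    for line in lines:
        match = NEG_RE.search(line)
        if match:
            lon, lat = int(match.group(2)), int(match.group(3))
            counts[(lon, lat)] += 1
    return counts


def count_in_files(paths):
    """Sum the per-grid-point counts over several pe_output files."""
    total = Counter()
    for path in paths:
        # errors="replace" so stray non-UTF-8 bytes in a log do not abort the scan
        with open(path, errors="replace") as handle:
            total += count_halvings(handle)
    return total
```

Calling count_in_files on the list of pe_output paths and printing total.most_common(10) would show whether the step halving is concentrated at a few grid points or spread across the domain.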

Did xmwsd work? If so, the differences between the two jobs, excluding STASH, are listed below.

Job xlqqh Title cp xmwsd cp xlqqc - vn8.4 UKCA CheST+GLOMAP-mode nudged run (CRAY)
Job xmwsd Title cp xlqqc - vn8.4 UKCA CheST+GLOMAP-mode nudged run
Difference in window subindep_FileDir
 -> Model Selection
   -> Input/Output Control and Resources
     -> Time Convention and SCRIPT Environment Variables.
Differences in Table Defined Environment Variables for Directories
 5,6c5,6
<  UKCA_EMISS /projects/ukca/inputs/ancil/N96L85/emiss
<  UKCA_INIT /projects/ukca/inputs/initial
---
>  UKCA_EMISS /projects/ukca-admin/inputs/ancil/N96L85/emiss
>  UKCA_INIT /projects/ukca-admin/inputs/initial


Difference in window subindep_HandEdit
 -> Model Selection
   -> Input/Output Control and Resources
     -> User hand edit files
Differences in Table Hand edits
 5c5
<  ~ukca/hand_edits/VN8.4/raderv2.1_vn84_MONSooN.ed Y
---
>  ~ukca/hand_edits/VN8.4/raderv2.1_vn84_MONSooN_XC40.ed Y


Difference in window subindep_FCM_UM_Opt
 -> Model Selection
   -> FCM Configuration
     -> FCM Options for Atmosphere and Reconfiguration
Differences in Table Central Script Modifications
 1c1
<  fcm:um_br/dev/jeff/vn8.4_hector_monsoon_archiving/src Blank Y
---
>  fcm:um_br/dev/jeff/vn8.4_hector_monsoon_archiving/src Blank N


Difference in window subindep_PostProc_Gen
 -> Model Selection
   -> Post Processing
     -> Main Switch + General Questions
Entry box: Monsoon project group name
 Job xlqqh: Entry is set to 'ukca'
 Job xmwsd: Entry is set to 'ukca-imp'
Radio button: Specify archiving system required
 Job xlqqh: Entry is set to 'MONSooN NERC disk archive '
 Job xmwsd: Entry is set to 'The new system (MOOSE) '
Entry box: Path to the archiving script
 Job xlqqh: Entry is inactive
 Job xmwsd: Entry is set to '$UMDIR/archiving/bin'

Difference in window atmos_Science_Section_UKCA_Rad
 -> Model Selection
   -> Atmosphere
     -> Scientific Parameters and Sections
       -> Section by section choices
         -> Section 34: UKCA Chemistry and Aerosols
           -> UKCA Chemistry Coupling
             -> MODE Aerosols in Radiation Scheme
Entry box: Look-up table for accumulation-mode aerosols in the longwave:
 Job xlqqh: Entry is set to 'nml_ac_lw_new'
 Job xmwsd: Entry is set to 'nml_ac_lw'
Entry box: Look-up table for accumulation-mode aerosols in the shortwave:
 Job xlqqh: Entry is set to 'nml_ac_sw_new'
 Job xmwsd: Entry is set to 'nml_ac_sw'
Entry box: Look-up table for coarse-mode aerosols in the longwave:
 Job xlqqh: Entry is set to 'nml_cr_lw_new'
 Job xmwsd: Entry is set to 'nml_cr_lw'
Entry box: Look-up table for coarse-mode aerosols in the shortwave:
 Job xlqqh: Entry is set to 'nml_cr_sw_new'
 Job xmwsd: Entry is set to 'nml_cr_sw'

Difference in window atmos_InFiles_PAncil_SST
 -> Model Selection
   -> Atmosphere
     -> Ancillary and input data files
       -> Climatologies & potential climatologies
         -> Sea surface temperatures
Entry box: and file name
 Job xlqqh: Entry is set to 'sst_rgd_200908-201201.n96'
 Job xmwsd: Entry is set to 'sst_mon_1981-2012_prf.n96'

Difference in window atmos_InFiles_PAncil_Seaice
 -> Model Selection
   -> Atmosphere
     -> Ancillary and input data files
       -> Climatologies & potential climatologies
         -> Sea ice fields
Entry box: and file name
 Job xlqqh: Entry is set to 'ice_rgd_200908-201201.n96'
 Job xmwsd: Entry is set to 'ice_mon_1981-2012_prf.n96'

Difference in window atmos_InFiles_PAncil_UserS
 -> Model Selection
   -> Atmosphere
     -> Ancillary and input data files
       -> Climatologies & potential climatologies
         -> User single-level ancillary file & fields
Entry box: and file name
 Job xlqqh: Entry is set to 'surf_lvl_emsns_nmv1.anc'
 Job xmwsd: Entry is set to 'surf_level_emsns_nmv.anc'

<STASH differences>

Difference in window subindep_ResQsub
Path to subindep_ResQsub not in navigation tree
Difference in variable CJTLIM2
 Job xlqqh: 
 Job xmwsd: 10800
Check box: Use Hyperthreads
 Job xlqqh: Entry is unset
 Job xmwsd: Entry is set to 'ON'
Radio button: Class name
 Job xlqqh: Entry is unset
 Job xmwsd: Entry is set to 'normal'

Regards
Willie

comment:2 Changed 3 years ago by ros

  • Resolution set to duplicate
  • Status changed from new to closed

Closing ticket as this problem is now being dealt with in the original ticket #1976
