Opened 8 years ago

Closed 8 years ago

#1004 closed help (fixed)

Error in global run

Reported by: rm024650 Owned by: um_support
Component: UM Model Keywords:

Description

Hi

I am trying to run a UM job (xiax) in global configuration. Check setup reports no problems, and the 12-hour run (xiax.a) worked fine. When I increased the run length to 200 hours (xiax.b), I got the following error in the .leave file:

aprun: Apid 3282282: Caught signal Terminated, sending to application
/var/spool/PBS/mom_priv/jobs/1034021.sdb.SC[338]: .: line 252: 24563: Terminated
-ksh: line 1: 24540: Terminated
_pmiu_daemon(SIGCHLD): [NID 02344] [c5-1c2s4n2] [Mon Dec 24 17:33:53 2012] PE RANK 35 exit signal Terminated
_pmiu_daemon(SIGCHLD): [NID 02347] [c5-1c2s5n3] [Mon Dec 24 17:33:53 2012] PE RANK 3 exit signal Terminated
_pmiu_daemon(SIGCHLD): [NID 02345] [c5-1c2s4n3] [Mon Dec 24 17:33:53 2012] PE RANK 64 exit signal Terminated


The job xiax.c is the same as xiax.b but with the following options turned on; it failed in a similar way:

flush print buffer, operational prints in DIAG_PRN
subroutine timer diagnostics, extra diagnostic messages in output option

I hope you can help me get the run beyond the first 12 hours.

Regards

Mike Wong

Change History (2)

comment:1 Changed 8 years ago by grenville

Mike

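Your job is running out of time: it hits the job time limit before the 200 hours finish and is then killed, which is why the processes in the .leave file exit with signal Terminated. Submit it for longer. Go to Model Selection→Input/Output..→Job Submission.. and increase the Job time limit.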
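
As a rough guide to choosing the new limit, the wallclock time of the successful 12-hour run can be scaled up to 200 hours with some headroom. The sketch below assumes the cost per model hour stays roughly constant, which may not hold once extra output or diagnostics are switched on. The function names, the 600 s wallclock figure and the 1.5 safety factor are all hypothetical; the real wallclock time should be taken from the .leave file of xiax.a.

```python
import math


def estimate_time_limit(short_wallclock_s, short_model_hours,
                        target_model_hours, safety=1.5):
    """Scale a measured wallclock time linearly to a longer run.

    Assumes the cost per model hour is constant; `safety` pads the
    result so that slower steps or extra output do not hit the limit.
    """
    if short_model_hours <= 0 or target_model_hours <= 0:
        raise ValueError("run lengths must be positive")
    per_model_hour = short_wallclock_s / short_model_hours
    return math.ceil(per_model_hour * target_model_hours * safety)


def as_hms(seconds):
    """Format a number of seconds as HH:MM:SS."""
    hours, rem = divmod(int(seconds), 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


# Hypothetical figure: suppose the 12-hour run (xiax.a) took 600 s.
limit_s = estimate_time_limit(600, 12, 200)
print(limit_s, as_hms(limit_s))
```

If the estimate is longer than the queue allows, the run would need to be split into shorter resubmitted chunks instead.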

Grenville

comment:2 Changed 8 years ago by grenville

  • Resolution set to fixed
  • Status changed from new to closed