Opened 9 years ago

Closed 8 years ago

#634 closed help (fixed)

Failure of budget access/allocation validation

Reported by: oma
Owned by: um_support
Component: UM Model
Keywords:
Cc:
Platform:
UM Version: 7.3

Description

Hello,

I've been trying to run the UM with a case that usually runs fine. Until now I had been requesting up to 1 hr to complete the job. Yesterday I requested 4 hrs and it didn't work. Then I requested 1 hr again, but the same sort of error message keeps appearing. The following lines are the entire .leave file:

--------------------------------------------------------------------------------
*** oma   Job: 243574.sdb   starts: 05/06/11 15:26:18   host: phase2b ***
*** oma   Job: 243574.sdb   starts: 05/06/11 15:26:18   host: phase2b ***
*** oma   Job: 243574.sdb   starts: 05/06/11 15:26:18   host: phase2b ***
*** oma   Job: 243574.sdb   starts: 05/06/11 15:26:18   host: phase2b ***

No time in budget n02-weat

Job terminated on failure of budget access/allocation validation
--------------------------------------------------------------------------------

Resources requested: mpparch=XT,mppnppn=24,mppwidth=72,ncpus=1,place=pack,walltime=00:30:00
Resources allocated: 

*** oma   Job: 243574.sdb   ends: 05/06/11 15:26:19   queue: par:4n_1h ***
*** oma   Job: 243574.sdb   ends: 05/06/11 15:26:19   queue: par:4n_1h ***
*** oma   Job: 243574.sdb   ends: 05/06/11 15:26:19   queue: par:4n_1h ***
*** oma   Job: 243574.sdb   ends: 05/06/11 15:26:19   queue: par:4n_1h ***
--------------------------------------------------------------------------------

Is it possible that n02-weat does not have any time left? What might I be doing wrong?

Thanks in advance for your help,

Oscar

Change History (6)

comment:1 Changed 9 years ago by grenville

Oscar

The time allocation for n02-weat has been increased. Your job should go through now.

Regards

Grenville

comment:2 Changed 9 years ago by oma

Hi Grenville,

Thank you for increasing the time allocation for n02-weat. However, I've tried to submit the job again and I got the following weird message:

/work/n02/n02/oma/xfwbs/bin/qsexecute: Executing dump reconfiguration program

*********************************************************
RCF Executable : /work/n02/n02/oma/xfwbs/bin/qxreconf
*********************************************************


apsched: the confirmed user ID is different from this claim's user ID
/work/n02/n02/oma/xfwbs/bin/qsexecute: Error in dump reconfiguration - see OUTPUT
*****************************************************************
   Ending script   :   qsexecute
   Completion code :   1
   Completion time :   Tue Jun  7 11:00:33 BST 2011
*****************************************************************

Do you know what happened here and how I can solve it?

Thanks,

Oscar

comment:3 Changed 9 years ago by lois

Just re-submit your job, Oscar, and it should run. It is a known issue:

http://ncas-cms.nerc.ac.uk/index.php/hpc-faqs/1544-error-messages-and-solution-on-the-hector-phase2b-cray-xe6-service

I will inform HECToR that it is still happening.

Lois

comment:4 Changed 9 years ago by grenville

Oscar

This is a known problem on HECToR and the workaround is simply to resubmit.
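
The two failures in this ticket call for different actions: the budget failure needs the allocation to be topped up, while the apsched failure only needs a resubmission. A minimal sketch of telling them apart, assuming the two messages appear in the job output exactly as quoted above; the function name and the returned labels are hypothetical and are not part of the UM scripts or the HECToR tooling:

```python
# Hypothetical helper: suggest an action from the text of a failed job's output.
# The two message strings are copied from the output quoted in this ticket.
BUDGET_ERROR = "Job terminated on failure of budget access/allocation validation"
APSCHED_ERROR = "apsched: the confirmed user ID is different from this claim's user ID"


def suggest_action(output_text):
    """Return a label naming the suggested next step for a failed job."""
    if BUDGET_ERROR in output_text:
        # The budget has no time left; resubmitting will not help.
        return "ask-support"
    if APSCHED_ERROR in output_text:
        # Known transient problem; the workaround is to resubmit.
        return "resubmit"
    # Anything else needs a closer look at the output.
    return "inspect"
```

For example, passing the text of the .leave file from the description would return "ask-support", and passing the output from comment:2 would return "resubmit".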

Regards

Grenville

comment:5 Changed 9 years ago by oma

Dear Lois and Grenville,

The job is running after resubmission.

Thanks,

Oscar

comment:6 Changed 8 years ago by willie

  • Resolution set to fixed
  • Status changed from new to closed