Opened 6 years ago

Closed 6 years ago

#1124 closed help (fixed)

submission of UM jobs from Puma command line

Reported by: cwright Owned by: simon
Component: PUMA Keywords: ensemble, resubmission, Puma, SUBMIT
Cc: sosprey@… Platform: PUMA
UM Version: 6.6.3

Description

Hi,

I'm trying to set up an ensemble of HadGEM runs on Hector via Puma, where the varying initial conditions are generated by:

(a) running the model for a few days
(b) using the output generated from this as the start dump, and then running again from the original date
(c) repeating as many times as I need ensemble members

As far as I'm aware, this should only involve:

(A) replacing the (n-1)th iteration's start dump with the output from the nth iteration (retaining the (n-1)th start dump for later use/comparison in a separate directory)
(B) modifying the job's RECONA file to have the right start date
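
As a rough illustration of step (A), here is a minimal sketch of the dump rotation as a shell function. The function name, the argument order and the ".memberN" suffix on the archived copy are all assumptions made for the example; they are not UM conventions and nothing in this ticket prescribes them.

```shell
# Sketch only: file names and layout are assumptions, not UM conventions.
# rotate_start_dump START_DUMP NEW_DUMP ARCHIVE_DIR MEMBER
#   START_DUMP  : the dump the job currently starts from
#   NEW_DUMP    : output dump from the run that has just finished
#   ARCHIVE_DIR : where earlier start dumps are kept for later comparison
#   MEMBER      : ensemble member number used to label the archived copy
rotate_start_dump() {
    start_dump=$1
    new_dump=$2
    archive_dir=$3
    member=$4

    # Refuse to do anything unless both dumps exist and are non-empty.
    [ -s "$start_dump" ] || { echo "missing start dump: $start_dump" >&2; return 1; }
    [ -s "$new_dump" ]   || { echo "missing new dump: $new_dump" >&2; return 1; }

    mkdir -p "$archive_dir" || return 1

    # Keep the old start dump, labelled by member number.
    archived="$archive_dir/$(basename "$start_dump").member$member"
    mv "$start_dump" "$archived" || return 1

    # Copy (rather than move) the new dump so the run output stays intact.
    cp "$new_dump" "$start_dump"
}
```

Copying the new dump instead of moving it costs disk space but leaves the finished run's output untouched, which may matter if the run needs to be inspected later.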

The easiest way I can see to do this is with a shell script (running locally at Oxford, so as to avoid consuming Puma resources) which will:

  1. log into Hector
  2. check if the previous run was successful (the remaining steps assume this is the case)
  3. replace the existing start dump with the new one (copying the old one elsewhere)
  4. log out of Hector
  5. log into Puma
  6. modify the RECONA job appropriately
  7. resubmit the job to Hector
  8. log out of Puma
  9. repeat as necessary

(with appropriate time delays and file checking to make sure I don't have too many jobs running concurrently, don't use too much disk space, etc, but this is the basic idea).
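
The outline above could be sketched as a driver along the following lines. This is only a skeleton under stated assumptions: the host names are placeholders, the four remote helper scripts (check_run.sh, rotate_dump.sh, edit_recona.sh, resubmit.sh) are hypothetical and would have to be written, and the waiting and disk-space checks are left out. With DRY_RUN=1 (the default here) it only prints the ssh commands it would run.

```shell
# Sketch only. HECTOR_HOST and PUMA_HOST are placeholders, and the remote
# helper scripts named below are hypothetical; none of them come from the ticket.
HECTOR_HOST=${HECTOR_HOST:-hector.example}
PUMA_HOST=${PUMA_HOST:-puma.example}
DRY_RUN=${DRY_RUN:-1}

# Run a command on a remote host, or just print it when DRY_RUN=1.
remote() {
    host=$1; shift
    if [ "$DRY_RUN" = 1 ]; then
        echo "ssh $host $*"
    else
        ssh "$host" "$@"
    fi
}

# One pass through steps 1-8 for a single job and ensemble member.
run_member() {
    job=$1
    member=$2
    # Steps 1-4: on Hector, check the previous run and rotate the start dump.
    remote "$HECTOR_HOST" "./check_run.sh $job" || return 1
    remote "$HECTOR_HOST" "./rotate_dump.sh $job $member" || return 1
    # Steps 5-8: on Puma, edit RECONA and resubmit.
    remote "$PUMA_HOST" "./edit_recona.sh $job" || return 1
    remote "$PUMA_HOST" "./resubmit.sh $job"
}

# Step 9: repeat for each member of each job. A real driver would also wait
# for each run to finish before starting the next pass for the same job.
run_ensemble() {
    members=$1; shift
    for job in "$@"; do
        m=1
        while [ "$m" -le "$members" ]; do
            run_member "$job" "$m" || return 1
            m=$((m + 1))
        done
    done
}

# Example (dry run): run_ensemble 10 xiwa xiwb xiwc xiwd
```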

However, I'm not sure how to do step (7) automatically, i.e. without logging in to the UMUI. I have four jobs and want at least 10 ensemble members for each, maybe more, so this will get a bit tedious to do by hand! (each ensemble member is only 60 days, so it shouldn't be too large a load on Hector overall)

So, my question: given the job won't otherwise change, except in ways the shell script will handle, is there any way to just resubmit an existing job from the command line?

I've had a look at the SUBMIT script in /umui_jobs/(job name)/, but when I try to run it from the command line (as 'source ./SUBMIT'), I just get ksh errors. See e.g. ~cwright/umui_jobs/xiwaz, which is the only job I've tried to submit from the command line so far. The eventual intention is to apply this to jobs xiw(a,b,c,d), but I figured I should start with a short run first!

Change History (6)

comment:1 Changed 6 years ago by simon

  • Owner changed from um_support to simon
  • Status changed from new to assigned

Hi,

You can do this without logging back in to puma. When you submit the run via the umui, all it does is copy the contents of ~/umui_jobs/runid over to hector, process the files to produce a submission script, and then submit it. Have a look in ~/umui_runs/ on hector. The job's directory is runid-xxxxxxxxx. Inside you'll find the submission script, which is used to submit the job on hector via qsub. Sometimes this directory gets deleted after a successful run, so it's a good idea to copy it elsewhere. You can edit the RECONA inside this directory and resubmit using the submission script.

Simon.
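
To illustrate the "copy it elsewhere" advice, a minimal sketch follows. It assumes that the most recently modified runid-* directory under ~/umui_runs is the one wanted; the function name and the destination layout are made up for the example.

```shell
# Sketch only: the destination layout is an assumption.
# save_run_dir RUNID DEST [RUNS_ROOT]
#   Copies the most recently modified RUNID-* directory under RUNS_ROOT
#   (default ~/umui_runs) into DEST and prints the path of the copy.
save_run_dir() {
    runid=$1
    dest=$2
    runs_root=${3:-$HOME/umui_runs}

    # ls -dt sorts newest first; take the first match.
    latest=$(ls -dt "$runs_root/$runid"-*/ 2>/dev/null | head -n 1)
    [ -n "$latest" ] || { echo "no run directory for $runid in $runs_root" >&2; return 1; }
    latest=${latest%/}

    mkdir -p "$dest" || return 1
    cp -R "$latest" "$dest/" || return 1
    echo "$dest/$(basename "$latest")"
}

# Example: save_run_dir xiwaz "$HOME/start/ensemblegen"
```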

comment:2 Changed 6 years ago by cwright

Hi Simon,

I get the idea, and I've implemented all the supporting parts of a script to handle it, but I'm having trouble working out how to fire the run itself off from Hector.

I've copied the contents of the relevant runid-xxxxx directory to ~cwright/start/ensemblegen, and I've been trying to figure out how to fire the new run off from the SUBMIT file in that directory. The furthest I've got is to use

qsub -N xiwxc000 \
-l cput=3600 \
-o /home/n02/n02/cwright/um/umui_out/test.comp.leave \
-A (ACCOUNT CODE HERE) \
-S /bin/ksh /home/n02/n02/cwright/start/ensemblegen/SUBMIT

but this just generates an error file, e.g. ~cwright/start/xiwxc000.e1682762

It's fairly obvious I'm doing something wrong - is this in my qsub call syntax, or in which file I'm trying to use? As far as I can tell the run should work in and of itself, as when I send it from the umui it works fine!

comment:3 Changed 6 years ago by simon

Hi,

Can you change the permissions on the error file so that I can read it?

Sorry for the delay, but I've been in SE Asia for the last fortnight.

Simon.

comment:4 Changed 6 years ago by cwright

Should all be visible now, sorry about that. Hope you had a nice trip!

comment:5 Changed 6 years ago by simon

Hi,

The SUBMIT script isn't supposed to be submitted. It's the script that generates the submission script. Have a look at umuisubmit_run in the same directory. This contains all of the PBS control commands, so there's no need for the command line arguments.

Simon.
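
A minimal sketch of resubmitting this way is given below. It assumes the PBS control commands in umuisubmit_run appear as the usual "#PBS" directive lines, and that submitting from inside the run directory is acceptable. The function name and the QSUB override are illustrative only; setting QSUB=echo lets the function be tried without a batch system.

```shell
# Sketch only. QSUB can be overridden (e.g. QSUB=echo) for a trial run.
QSUB=${QSUB:-qsub}

# resubmit_run RUN_DIR
#   Submits RUN_DIR/umuisubmit_run with no extra qsub options, after checking
#   that the script carries its own "#PBS" directives.
resubmit_run() {
    run_dir=$1
    script="$run_dir/umuisubmit_run"

    [ -f "$script" ] || { echo "no umuisubmit_run in $run_dir" >&2; return 1; }
    grep -q '^#PBS' "$script" || { echo "no #PBS directives in $script" >&2; return 1; }

    # Submit from inside the run directory in case the script uses relative paths.
    ( cd "$run_dir" && "$QSUB" umuisubmit_run )
}

# Example: resubmit_run "$HOME/start/ensemblegen"
```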

comment:6 Changed 6 years ago by ros

  • Resolution set to fixed
  • Status changed from assigned to closed