Running the UM on ARCHER

To run a UM experiment on ARCHER you need:

  1. to have an account on PUMA in order to access the UMUI (UM User Interface) (see PUMA service to request an account)
  2. to have an account on ARCHER (see the FAQ How do I get an ARCHER user account?)
  3. to have completed the instructions for setting up your ARCHER environment

Running a Sample UM Job

There are sample ARCHER jobs for all supported versions of the UM on the PUMA UMUI under the userid umui. Use these sample jobs to check that a standard job runs before running your own UM experiments.

  • Carry out the minimum changes needed to the job:
    • Model Selection→User Information and Target Machine→General Details:
      • User id
      • Mail id
      • Account name/Tic code
    • Model Selection→FCM Configuration→FCM Extract and Build directories and Output levels:
      • Target machine root extract directory: Set this to a location in your home directory on ARCHER, e.g. /home/n02/n02/ros/um
      • Local machine root extract directory: Check that this is set to $HOME/um/um_extracts
  • Click save and then process the job in the UMUI; this creates the UM scripts locally.
  • Click the submit button to submit the job to ARCHER.
    A window will pop up detailing the progress of the job submission; always check it for errors before closing it. It also tells you where your compilation and model output will be sent on ARCHER.
  • Log on to ARCHER and run the command qstat -u <username> to follow the job in the queues.
    • the compile stage will run in the ARCHER serial queue.
    • the reconfiguration (if requested) and the run stage will be automatically submitted after the compilation stage and will run in the parallel queues.

Wait until both your serial job (the compilation and build phase) and your parallel job have finished. If nothing appears when you run qstat -u <username>, then ARCHER has finished doing what you asked it to do.
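
The waiting step can be automated rather than checked by hand. The following is a minimal Python sketch, not part of the UM or UMUI tooling. It assumes, as described above, that qstat -u <username> prints nothing once no jobs remain. The function names, the polling interval and the run_qstat hook are hypothetical and are introduced only for illustration.

```python
import subprocess
import time


def queue_is_empty(qstat_output):
    """Return True when qstat printed nothing, i.e. no jobs remain."""
    return qstat_output.strip() == ""


def wait_for_jobs(username, poll_seconds=60, run_qstat=None, sleep=time.sleep):
    """Poll `qstat -u <username>` until it prints nothing.

    run_qstat is a hypothetical hook (an assumption of this sketch) so the
    loop can be exercised without a batch system; by default it calls the
    real qstat command. Returns the number of polls that found jobs.
    """
    if run_qstat is None:
        def run_qstat(user):
            result = subprocess.run(
                ["qstat", "-u", user],
                capture_output=True, text=True, check=True,
            )
            return result.stdout

    polls = 0
    while not queue_is_empty(run_qstat(username)):
        polls += 1
        sleep(poll_seconds)
    return polls
```

On an ARCHER login node, calling wait_for_jobs with your own username would return once both the serial and the parallel job have left the queues. It does not tell you whether the jobs succeeded, so the output files still need to be checked.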

  • Check the output files (details of which were given when you submitted the job) to ensure that the run completed successfully.

What might fail:

  • You have not specified the correct NCAS sub-project allocation in the UMUI as a TIC/Account code. Use the command groups on ARCHER to see what sub-projects you have access to.
  • The job is trying to use disk space on ARCHER where you have no disk space allocation, or not enough of it. Read the information about NCAS ARCHER disk organisation before contacting g.m.s.lister@…
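
The first check can be scripted. The sketch below is a small Python illustration that assumes the TIC/Account code entered in the UMUI appears verbatim as one of the names printed by the groups command. That is an assumption based on the advice above, not a documented guarantee. The function names and the group name n02-demo used in the usage note are hypothetical.

```python
def parse_groups(groups_output):
    """Parse the output of `groups` into a set of group names.

    Handles both the plain "g1 g2" form and the "user : g1 g2" form that
    some implementations print when a username is given.
    """
    names = groups_output.split(":", 1)[-1]
    return set(names.split())


def has_subproject(account_code, groups_output):
    """Return True if account_code is one of the listed group names."""
    return account_code in parse_groups(groups_output)
```

For example, has_subproject("n02-demo", "n02 n02-demo") is true, while has_subproject("n01", "n02 n02-demo") is false. On ARCHER the second argument would be the text printed by groups.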

Running your own Job

The sample UM jobs are only meant to be examples. When setting up your own UM experiments, it may be more appropriate to copy a UM job from a colleague if that gives a closer starting point. UM jobs can be copied from one UMUI installation to another: use the download or export button in the UMUI to create a basis file, email that file to the other UMUI site, and then use the upload or import button there to transfer it into the new UMUI.

When adapting another user's UM experiment to your own you should check:

  • What disk areas does the job use?
  • Do you have permission to access all the files (mods, dumps, ancillary files) on ARCHER?
  • Are the resources (CPU time, number of processors, memory, disk space) sufficient?
  • Are the science settings what you need or want?
  • Are the diagnostics (STASH) set up as you need them to analyse the experiment results?
  • If you are copying a job that was run on a platform other than ARCHER, are all the input files present on ARCHER?
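
The file-related checks in this list (presence and read permission) lend themselves to a quick script. The following Python sketch assumes you have already collected the input file paths used by the job (mods, dumps, ancillary files) into a list by hand. The function name and the returned labels are hypothetical, and nothing here reads the UMUI job itself.

```python
import os


def check_input_files(paths):
    """Return a dict mapping each problem path to 'missing' or 'unreadable'.

    Paths that exist and are readable by the current user are omitted,
    so an empty dict means no problems were found.
    """
    problems = {}
    for path in paths:
        if not os.path.exists(path):
            problems[path] = "missing"
        elif not os.access(path, os.R_OK):
            problems[path] = "unreadable"
    return problems
```

Run on ARCHER under the account that will submit the job, this reports which inputs still need to be copied across or have their permissions changed. It checks only existence and read access, not whether the files are the right ones for the experiment.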

What might fail:

  • You could run out of ARCHER resources, such as disk space or CPU time.
  • The model may crash in the middle of a run. The reason for this is not always easy to determine. Read the UM User Guide and look at the NCAS training material and explore the help buttons in the UMUI to see if there are restrictions or limitations associated with what you are trying to run.

If you are really perplexed then contact the CMS helpdesk!

Useful Tips

Please don't run the UM if there are problems revealed by the UMUI's check setup facility. Most of the problems reported by check setup are easy to solve and should be solved. Simply visiting a page and doing a "close" may be all that is required. In one or two instances, "check setup" speaks a language all of its own and it is difficult to find where the problem lies. In this case, drop a line to the CMS helpdesk and we will try to locate the problem.

Carry out the following checks prior to a run:

  • In Sub Model Independent > Sub Model Independent Options > Misc. Sections, check that option 1A (general MPP) is selected for section 94
  • Check that the number of boundary layer levels in the vertical resolution page agrees with that specified in the scientific section
  • On the STASH diagnostics page, do a verify (Ctrl-V) and correct any reported problems
  • Ensure that PP packing is not used (see below)
  • Do a final check setup and correct any reported problems.

Do NOT use PP Packing on ARCHER

In the UMUI, go to Sub Model Independent > Post processing > Initialization and Post Processing of mean and standard PP files. Ensure that the "Unpacked, profile 0" button is highlighted and that each entry in the table has packing profile zero. Failure to do this will result in illegal numbers ("Not a Number" or NaN) in the diagnostic STASH output.

Last modified on 21/12/16 18:53:45