Running a UMUI job on ARCHER2

To run a UMUI experiment on ARCHER2 you need:

  1. to have an account on PUMA in order to access the UMUI (UM User Interface) (see PUMA service to request an account)
  2. to have an account on ARCHER2 (see the FAQ How do I get an ARCHER2 user account?)
  3. to have completed the instructions for setting up your ARCHER2 environment

Running an Example UMUI Job

The only supported versions of the UMUI on ARCHER2 are 7.3 and 8.4.

There are example ARCHER2 jobs for both these versions of the UM on the PUMA UMUI under the userid umui. Use these example jobs to check that a standard job runs before running your own UM experiments.

  • Copy an example job on the PUMA UMUI
  • Carry out the minimum changes needed to the job:
    • Model Selection→User Information and Target Machine→General Details:
      • User id
      • Mail id
      • Account name/Tic code
    • Model Selection→FCM Configuration→FCM Extract and Build directories and Output levels:
      • Target machine root extract directory: Set this to a location in your work directory on ARCHER2, e.g. /work/n02/n02/ros/um
      • Local machine root extract directory: Check that this is set to $HOME/um/um_extracts
    • Model Selection→Input/Output Control and Resources→Time Convention and SCRIPT Environment Variables:
      • Set DATADIR in the Defined Environment Variables table. This must be on /work (e.g. /work/n02/n02/ros)
  • Click Save, then process the job in the UMUI, which creates the UM scripts locally.
  • Click the submit button to submit the job to ARCHER2.
    A window will pop up detailing progress of the job submission and should always be checked for errors before closing. It also gives details of where your compilation and model output will be sent to on ARCHER2.
  • Log onto ARCHER2 and run the command squeue -u <username> to follow the job in the queues.
    • the compile stage will run in the ARCHER2 serial queue.
    • the reconfiguration (if requested) and the run stage will be automatically submitted after the compilation stage and will run in the parallel queues.
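
As a quick sanity check on the two /work settings above, a minimal Python sketch along these lines could confirm that a path sits under /work before you enter it in the UMUI. The function name is_under_work is hypothetical, and the check is purely textual: it does not confirm that the directory exists or that you have an allocation there.

```python
from pathlib import PurePosixPath


def is_under_work(path: str) -> bool:
    """Return True if path is an absolute path whose first component is 'work'."""
    parts = PurePosixPath(path).parts
    # For "/work/n02/n02/ros", parts is ('/', 'work', 'n02', 'n02', 'ros').
    return len(parts) >= 2 and parts[0] == "/" and parts[1] == "work"
```

For example, is_under_work("/work/n02/n02/ros") returns True, while a path under /home, or a relative path, returns False.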

Wait until both your serial job (the compilation and build phase) and your parallel job have finished. If nothing appears when you run squeue -u <username>, then ARCHER2 has finished doing what you asked it to do.
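
The wait described above can be scripted. The following is a minimal sketch, assuming Python 3.7 or later is available where you run squeue. The names count_jobs and wait_for_jobs are hypothetical helpers, and the -h option asks squeue to omit its header line so that every non-blank line of output corresponds to one job.

```python
import subprocess
import time


def count_jobs(squeue_output: str) -> int:
    """Count job lines in the output of `squeue -h` (header suppressed)."""
    return sum(1 for line in squeue_output.splitlines() if line.strip())


def wait_for_jobs(user: str, poll_seconds: int = 60) -> None:
    """Poll `squeue -h -u <user>` until it lists no jobs."""
    while True:
        result = subprocess.run(
            ["squeue", "-h", "-u", user],
            capture_output=True,
            text=True,
            check=True,  # raise if squeue itself fails
        )
        remaining = count_jobs(result.stdout)
        if remaining == 0:
            print("No jobs left in the queue.")
            return
        print(f"{remaining} job(s) still queued or running; "
              f"checking again in {poll_seconds} s")
        time.sleep(poll_seconds)
```

Calling wait_for_jobs("<username>") on an ARCHER2 login node blocks until the queue is empty for that user. Because the run stage is submitted only after compilation, the queue may be briefly empty between the two stages, so an empty queue is a prompt to check the output files rather than proof of success.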

  • Check the output files (details of which were given when you submitted the job) to ensure that the run completed successfully.

What might fail:

  • You have not specified the correct NCAS sub-project allocation in the UMUI as a TIC/Account code. Use the command groups on ARCHER2 to see what sub-projects you have access to.
  • The job is trying to use disk space on ARCHER2 where you have no allocation, or an insufficient one. Read the information about NCAS ARCHER2 disk organisation before contacting the CMS helpdesk.
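
The account-code check in the first point can be sketched in Python as follows. This is an illustration only: has_subproject and my_groups are hypothetical helper names, the group names in the example are made up, and the sketch assumes that the code entered in the UMUI must match one of the names printed by groups exactly.

```python
import subprocess


def has_subproject(groups_output: str, account_code: str) -> bool:
    """Return True if account_code is one of the groups in groups_output."""
    # `groups <user>` prints "user : g1 g2"; plain `groups` prints "g1 g2".
    # Keep only the text after the last colon, if there is one.
    _, _, group_list = groups_output.rpartition(":")
    return account_code in group_list.split()


def my_groups() -> str:
    """Run `groups` for the current user and return its raw output."""
    return subprocess.run(
        ["groups"], capture_output=True, text=True, check=True
    ).stdout
```

On ARCHER2, has_subproject(my_groups(), "<account code>") would then report whether the code you intend to enter in the UMUI appears in your group list.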

Running your own Job

The sample UM jobs are only examples. When setting up your own UM experiments, it may be more appropriate to copy a UM job from a colleague, which may be a closer starting point. UM jobs can be copied from one UMUI installation to another: use the download or export button in the UMUI to create a basis file, email the file to the other UMUI site, and then use the upload or import button there to transfer it into the new UMUI.

When adapting another user's UM experiment to your own you should check:

  • Which disk areas does the job use?
  • Do you have permission to access all the files (mods, dumps, ancillary files) on ARCHER2?
  • Are the resources sufficient (CPU time, number of processors, memory, disk space)?
  • Are the science settings what you need or want?
  • Are the diagnostics (STASH) set up as you need them to analyse the experiment results?
  • If you are copying a job that was run on a platform other than ARCHER2, are all the input files present on ARCHER2?
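
The file-access checks in this list can be partly automated. The sketch below, in Python, reports which of a set of paths are missing or unreadable for the current user. The function name unreadable_files is hypothetical, and the list of paths has to be gathered by hand from the job's UMUI panels, since nothing here reads the job definition itself.

```python
import os
from typing import Iterable, List


def unreadable_files(paths: Iterable[str]) -> List[str]:
    """Return the paths that do not exist or cannot be read by this user."""
    # os.access returns False for a missing path as well as for one
    # that exists but lacks read permission. Directories are not rejected.
    return [p for p in paths if not os.access(p, os.R_OK)]
```

Run on ARCHER2 with the dump, ancillary and mod paths used by the copied job, an empty result means every path is readable. Anything returned needs to be copied to ARCHER2 or have its permissions changed by its owner.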

What might fail:

  • You could run out of ARCHER2 resources, such as disk space or CPU time.
  • The model may crash in the middle of a run. The reason for this is not always easy to determine. Read the UM User Guide, look at the NCAS training material, and explore the help buttons in the UMUI to see if there are restrictions or limitations associated with what you are trying to run.

If you are really perplexed then contact the CMS helpdesk!

Useful Tips

Please don't run the UM if there are problems revealed by the UMUI's check setup facility. Most of the problems reported by check setup are easy to solve and should be solved. Simply visiting a page and doing a "close" may be all that is required. In one or two instances, "check setup" speaks a language all of its own and it is difficult to find where the problem lies. In this case, drop a line to the CMS helpdesk and we will try to locate the problem.

Carry out the following checks prior to a run:

  • In Sub Model Independent > Sub model independent Options > Misc. Sections 94: check that option 1A (general MPP) is selected
  • Check that the number of boundary layer levels in the vertical resolution page agrees with that specified in the scientific section
  • On the STASH diagnostics page, do a verify (Ctrl-V) and correct any reported problems
  • Ensure that PP packing is not used (see below)
  • Do a final check setup and correct any reported problems.

Do NOT use PP Packing on ARCHER2

In the UMUI, go to Sub Model Independent > Post processing > Initialization and Post Processing of mean and standard PP files. Ensure that the "Unpacked, profile 0" button is highlighted and that each entry in the table has packing profile zero. Failure to do this will result in illegal numbers ("Not a Number" or NaN) in the diagnostic STASH output.

Last modified on 27/01/21 18:49:06