The ARCHER2 Service is a world-class advanced computing resource for UK researchers. ARCHER2 is provided by UKRI, EPCC, Cray (an HPE company) and the University of Edinburgh.

ARCHER2 is due to commence operation in 2020, replacing the current service ARCHER. Please visit the ARCHER2 website.

Pilot System

Prior to installation of the complete ARCHER2, we have access to a 4-cabinet pilot machine that will run in parallel with ARCHER. ARCHER users will find the new machine very familiar in many respects but with some important differences - see for a comprehensive array of presentations, in particular the one titled Differences between ARCHER and ARCHER2.

CMS has installed and undertaken limited testing of several versions of the Unified Model and its auxiliary software. The process is ongoing - we encourage users where possible to migrate their workflows to use the latest versions of the UM.

Limitations of the pilot system may result in some constraint on the nature of workflows that it can accommodate.


ARCHER2 uses SLURM (ARCHER used PBS), so all ARCHER batch scripts need to be rewritten for use on ARCHER2.
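As a rough illustration of the kind of rewriting involved, the sketch below translates a few common PBS header lines into SLURM equivalents. The function name translate_pbs_header and the small option mapping are illustrative only. They cover just the job name, wallclock limit and account options, and real batch scripts will need further changes (for example, replacing aprun with srun).

```python
import re

# Illustrative subset only: PBS short option -> SLURM long option.
PBS_TO_SLURM = {
    "-N": "--job-name",
    "-A": "--account",
}


def translate_pbs_header(line):
    """Translate one '#PBS' header line; return any other line unchanged."""
    line = line.rstrip("\n")
    # '#PBS -l walltime=HH:MM:SS' becomes '#SBATCH --time=HH:MM:SS'.
    match = re.match(r"#PBS\s+-l\s+walltime=(\S+)$", line)
    if match:
        return "#SBATCH --time=" + match.group(1)
    # Simple '-X value' options listed in the mapping above.
    match = re.match(r"#PBS\s+(-\w)\s+(\S+)$", line)
    if match and match.group(1) in PBS_TO_SLURM:
        return "#SBATCH {}={}".format(PBS_TO_SLURM[match.group(1)], match.group(2))
    # Anything else is left for manual conversion.
    return line
```

Lines that the sketch does not recognise are returned untouched, so they can be reviewed by hand.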

login nodes

The login nodes do support persistent ssh agents, so data transfer to JASMIN through Rose/Cylc workflows is possible.

compute nodes

Compute nodes cannot see /home. Unlike on ARCHER, batch scripts run on the compute nodes, so batch scripts must not contain references to /home.
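One way to check a batch script before submission is to scan it for anything that resolves to /home. The following is a minimal sketch, not a CMS-provided tool: the function name find_home_references and the set of patterns searched for (/home, ~/, $HOME and ${HOME}) are assumptions about what a typical script might contain.

```python
import re

# Matches /home paths, ~/ shortcuts, and $HOME or ${HOME} expansions.
# The lookbehind stops paths such as /work/home from being reported.
HOME_PATTERN = re.compile(r"(?<![\w/.])/home\b|~/|\$HOME\b|\$\{HOME\}")


def find_home_references(script_text):
    """Return (line_number, stripped_line) pairs that appear to use /home."""
    hits = []
    for number, line in enumerate(script_text.splitlines(), start=1):
        if HOME_PATTERN.search(line):
            hits.append((number, line.strip()))
    return hits
```

Any line reported by the scan should be changed to use a location on /work.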

serial nodes

The pilot system does not have serial nodes. The full system will have serial nodes.


Request access through the ARCHER2 SAFE.

File Systems

ARCHER2 has /home and /work file systems with the same structure as those on ARCHER. The pilot system will have only 325TB on /work and 1.7TB on /home; the full system will have substantially more.


The ARCHER budget structure and membership will carry over to ARCHER2.


Currently installed versions: 7.3, 8.4, and 11.1 up to 11.8.

Standard Jobs

Example jobs
UM version | job/suite id | Branches | Description | Notes
7.3 | xoxtb | | CCMI |
8.4 | xoxta | | GLOMAP; + CLASSIC: RJ4.0 ARCHER GA4.0 |
10.3 | u-cb431 | fcm:um.xm-br/dev/annetteosprey/vn10.3_metum-goml_archer2 | MetUM-GOML3 N216 global atmos-ocean | This starts from the end of u-bd818, as couldn't find initial start files
10.7 | u-as037/archer2 | Replace the nemo_sources branch dev_r5518_GO6_package with working copy /home/ros/nemo/branches/dev_r5518_GO6_package to fix nemo_alloc problem | HadGEM3-GC3.1 N96ORCA1 PI Control for CMIP6 | set ocean ldflags_overrides_suffix to -lstdc++
11.1 | u-be303/archer2 | fcm:um.x_br/dev/jeffcole/vn11.1_archer2_fixes | UKESM AMIP |
11.1 | u-ca103 | fcm:um.x_br/dev/jeffcole/vn11.1_archer2_fixes | Nesting suite |
11.1 | u-bu108 | fcm:um.x_br/dev/jeffcole/vn11.1_archer2_fixes | KPP coupled India domain |
11.2 | u-be463/archer2 | fcm:um.x_br/dev/jeffcole/vn11.2_archer2_fixes | GA7.0 N96 UM11.2 AMIP Climate Development (1988 - 2008) |
11.2 | u-bc964/archer2 | fcm:um.x_br/dev/jeffcole/vn11.2_archer2_fixes | UKESM coupled PI control | cpmip analysis not currently working; set ocean ldflags_overrides_suffix to -lstdc++
11.2 | will be u-bc613/archer2 | fcm:um.x_br/dev/jeffcole/vn11.2_archer2_fixes | UKESM coupled Historical | cpmip analysis not currently working; set ocean ldflags_overrides_suffix to -lstdc++
11.4 | u-ca369 | | GA7.0 N96 AMIP |
11.5 | u-ca370 | | GA7.1 N1280 UM11.5 AMIP |
11.6 | u-bs251/archer2 | | GA7.0 N96 UM11.6 AMIP Climate Development (1988 - 2008) |
11.6 | u-cb151 | | N96 ORCA025 GC4.0 main assessment run (CMIP6 forcing) | NEMO_LAND_SUPPRESS=false
11.7 | u-ca634 | | N96 GA8.0 AMIP Climate Development (1988 - 2008) |
11.7 | u-by395/archer2 | | Nesting Suite for RA3+ | no ANTS currently

Table 1. Note, for example, u-be303/archer2 indicates the archer2 branch of u-be303.

Initial setup for running UM Rose/Cylc Suites on ARCHER2

If submitting from puma, add

. /work/y07/shared/umshared/bin/rose-um-env

to your ~/.bash_profile on ARCHER2.

If submitting from pumatest, add

. /work/y07/shared/umshared/bin/rose-um-env
export FCM_VERSION=pumatest
export CYLC_VERSION=pumatest
export ROSE_VERSION=pumatest

to your ~/.bash_profile on ARCHER2.

Quick Start

The following should be appropriate for most UM vn11.x versions. Unfortunately, because UM suites can be configured in many ways and there is no "standard" UM job configuration, it cannot be guaranteed to work. If problems remain, please refer to the more detailed instructions following this section.

There are 4 stages in converting an ARCHER job into an ARCHER2 job.

  • Copy a standard archer2.rc site file.
  • Edit the suite.rc to use this file.
  • Edit the metadata file to allow access to the ARCHER2 configuration.
  • Update the GUI to use ARCHER2 and change the processor decomposition.

First, copy the working suite you wish to run on ARCHER2, check out the copy, and cd into its rose directory. Then take each of the above stages in turn:


Atmosphere only jobs

Firstly look at your suite.rc file.

If it has lines of the form UM_ATM_NPROCX = {{MAIN_ATM_PROCX}}, with MAIN_ prepended to the variable name, then copy /home/simon/archer2/archer2.rc_main to site/archer2.rc.

If it has lines of the form UM_ATM_NPROCX = {{ATM_PROCX}} then copy /home/simon/archer2/archer2.rc to site/archer2.rc.

If the UM_ATM_NPROCX = type lines use some other format, copy /home/simon/archer2/archer2.rc as a base and then edit it to change each line in

{% set APPN = ATM_PPN if ATM_PPN is defined else PPN %}

so that the variables match the equivalents in the suite.rc file.

Edit the line


to use your ARCHER2 username in the copied archer2.rc.

Users of vn11.7 may have to change EXPT_HORIZ to EXPT_HORIZ_ATM in archer2.rc.

Coupled and UKESM jobs

Copy /home/simon/archer2/archer2.rc_ukesm to site/archer2.rc.

Edit the line


to use your ARCHER2 username in the copied archer2.rc.

Note: because the coupled configuration is more complicated than the atmosphere-only configuration, there is a greater chance that the archer2.rc file will need further modifications. Please see the more detailed documentation below.


Search the suite.rc for any reference to archer. If there are none, go on to the next step. Otherwise, replace each reference to archer with archer2.

For example, change

{% set KNOWN_SITE_CFGS = ['archer', 'meto_cray', 'monsoon', 'nci_raijin', 'niwa_cray'] %}

to

{% set KNOWN_SITE_CFGS = ['archer2', 'meto_cray', 'monsoon', 'nci_raijin', 'niwa_cray'] %}

Note: If the line

%include site/archer-tests.rc

is present, do not change it.
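For suites with many references, the substitution can be scripted. The sketch below is one possible approach, assuming the suite.rc text has been read into a string. The function name update_site_references is hypothetical, and the result should still be reviewed by hand before committing.

```python
import re


def update_site_references(suite_rc_text):
    """Replace 'archer' with 'archer2', leaving archer-tests includes alone."""
    updated = []
    for line in suite_rc_text.splitlines(keepends=True):
        # The archer-tests include must not be changed (see the note above).
        if "site/archer-tests.rc" in line:
            updated.append(line)
            continue
        # The negative lookahead avoids turning 'archer2' into 'archer22'.
        updated.append(re.sub(r"archer(?!2)", "archer2", line))
    return "".join(updated)
```

Running the function twice gives the same result as running it once, so it is safe to apply to a partly converted file.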


In meta/rose-meta.conf, locate and change all instances of archer to archer2, ARCHER to ARCHER2, Archer to Archer2, etc.

For example:

help=Account code under which to run HPC tasks (e.g. n02-ncas)
title=Account group for HPC tasks

should be updated to

help=Account code under which to run HPC tasks (e.g. n02-ncas)
title=Account group for HPC tasks

Also, in [jinja2:suite.rc=MAIN_ATM_PPN] or [jinja2:suite.rc=ATM_PPN] change range=1:36 to range=1:128


Now start the GUI with rose edit. There will be a warning triangle next to the suite conf section. In suite conf→Host Machine select Archer2, then set the queue to the standard queue and set the account group: click the plus sign to add each to the configuration, and then change its value. These settings may be in a subsection, opened by clicking the arrow. Remember to put the account group in single quotes. The aim is to remove all of the warning triangles.

In suite conf→Domain Decomposition set the Max number of processors/node to be 128. If you plan to depopulate the node, set this to some multiple of 16. If not using IO servers, it is also a good idea to change the Atmosphere decomposition so that the NSxEWx(OpenMP threads) is some multiple of 128 for most efficient running. Keep the total number of cores roughly the same as for ARCHER.
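A quick arithmetic check of a candidate decomposition can be sketched as follows. The function names are illustrative, and the only figure taken from this page is 128 cores per node. For example, a 32x18 decomposition with 2 OpenMP threads uses 1152 cores, which is exactly 9 nodes.

```python
CORES_PER_NODE = 128  # ARCHER2 compute node size, as stated above


def cores_used(ew, ns, omp_threads):
    """Total cores for an EW x NS decomposition with OpenMP threads."""
    return ew * ns * omp_threads


def fills_whole_nodes(ew, ns, omp_threads, cores_per_node=CORES_PER_NODE):
    """True if the decomposition uses an exact number of fully populated nodes."""
    return cores_used(ew, ns, omp_threads) % cores_per_node == 0
```

A decomposition for which fills_whole_nodes returns False leaves part of the last node idle, unless the spare cores are used for IO servers.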

In fcm_um_make→env→Configuration file set config_root_path to be


where x is the UM version of the suite. Remove the revision number from config_revision so that the field is clear.

For vn11.1 and vn11.2 jobs, in fcm_um_make→env→Sources add


where x is the UM version of the suite.

Ensure that Run Development Tests under suite conf→Tasks is set to false as these don't currently work on ARCHER2.

Save the new config.

Note: If you have any bespoke ARCHER app conf files (possibly set in Model Configuration), these may have to be renamed and updated for ARCHER2

You should now be able to submit the job to ARCHER2.

End of quick start.



The multiplicity and diversity of Rose/Cylc suites prevents us from providing a simple comprehensive guide to the suite modifications necessary for running on ARCHER2. However, the suites referred to in Table 1 should give hints on how to upgrade your suite. The suite changes required stem from the following differences between ARCHER and ARCHER2:

  • scheduler: ARCHER uses PBS, ARCHER2 uses SLURM
  • architecture: ARCHER has 24 cores per node, ARCHER2 has 128 cores per node

Changes to account for SLURM will typically be in the [[directives]] section of tasks in the suite.rc file or in an appropriate site/archer2.rc file (you may need to create one of these). The example below illustrates common SLURM features. Note: the SLURM directives --partition, --qos, and --reservation combine to provide a more flexible replacement for the PBS directive --queue. Additional partitions will become available with the full ARCHER2 system.

        pre-script = """
                     ulimit -s unlimited
                     module restore $UMDIR/modulefiles/um/2020.12.14  <====== to load the environment
                     module list 2>&1
                     export OMP_NUM_THREADS=$TOMP_NUM_THREADS
                     """

            --chdir=/work/n02/n02/<your ARCHER2 user name>   <===== you must set this 
{% if ARCHER2_QUEUE == 'short' %}
{% endif %}
            PLATFORM = cce
            UMDIR = /work/y07/shared/umshared
            batch system = slurm                             <===== specify use of SLURM
            host =                       <====== use ARCHER2
{% if HPC_USER is defined %}
            owner = {{HPC_USER}}
{% endif %}

        inherit = HPC
            ROSE_TASK_N_JOBS = 32

            CONFIG = ncas-ex-cce                             <====== note name of config for ARCHER2

Setting SLURM options that specify the number of processors requires assigning values to --nodes, --ntasks, --tasks-per-node, and --cpus-per-task. These should be familiar from ARCHER, apart from the precise names of the attributes. Your suite may use different names for the various parameters, such as TASKS_RCF, but there should be a simple correspondence.

        inherit = UM_PARALLEL
            --ntasks= {{TASKS_RCF}}
            execution time limit = PT20M

            --ntasks= {{TASKS_ATM}}
            execution time limit = {{MAIN_CLOCK}}

Your suite should include a section to specify the flags that will be passed to the command that launches the job (aprun on ARCHER, srun on ARCHER2). The flags differ for jobs running with and without OpenMP. Most suites will need some Jinja2 like this:

{# set up slurm flags for OpenMP/non-OpenMP #}
{% if MAIN_OMPTHR_RCF > 1 %}
 {% set RCF_SLURM_FLAGS= "--hint=nomultithread --distribution=block:block" %}
{% else %}
 {% set RCF_SLURM_FLAGS = "--cpu-bind=cores" %}
{% endif %}
{% if MAIN_OMPTHR_ATM > 1 %}
 {% set ATM_SLURM_FLAGS= "--hint=nomultithread --distribution=block:block" %}
{% else %}
 {% set ATM_SLURM_FLAGS = "--cpu-bind=cores" %}
{% endif %}

Suites frequently contain macros to calculate the number of nodes and cores required; the only change needed is to set the number of NUMA regions per node to 8.
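The arithmetic behind such macros can be sketched in a few lines. This is not the macro from any particular suite: the function names are hypothetical, and the only values taken from this page are 128 cores per node and 8 NUMA regions per node (giving 16 cores per NUMA region).

```python
CORES_PER_NODE = 128  # ARCHER2 cores per node
NUMA_PER_NODE = 8     # ARCHER2 NUMA regions per node


def nodes_required(mpi_tasks, omp_threads, cores_per_node=CORES_PER_NODE):
    """Number of nodes needed, rounded up to a whole node."""
    cores = mpi_tasks * omp_threads
    return -(-cores // cores_per_node)  # integer ceiling division


def tasks_per_numa_region(omp_threads, cores_per_node=CORES_PER_NODE,
                          numa_per_node=NUMA_PER_NODE):
    """MPI tasks that fit in one NUMA region when each task runs omp_threads."""
    cores_per_numa = cores_per_node // numa_per_node
    return cores_per_numa // omp_threads
```

With 2 OpenMP threads, a 32x18 atmosphere decomposition needs 9 nodes and places 8 MPI tasks in each NUMA region.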

Coupled suites

Review one or more of the coupled suites listed in the Standard Jobs table above for a detailed view of the suite changes needed to run under SLURM.

We have adopted the SLURM heterogeneous jobs method of handling coupled suites where the atmosphere, NEMO, and XIOS are separate executables running under a common communicator. The basic SLURM ideas above carry over to heterogeneous jobs but rather than making an overarching job resource request (as is the case for PBS), each component of the coupled job specifies its own requirements.

For the coupled task (or in its inherited resources)

            hetjob_1_--ntasks= {{OCEAN_TASKS}}
            hetjob_2_--ntasks= {{XIOS_TASKS}}

where hetjob_0_ is associated with the atmosphere, hetjob_1_ with the ocean, and hetjob_2_ with the XIOS I/O servers.

The variables ROSE_LAUNCHER_PREOPTS_UM, ROSE_LAUNCHER_PREOPTS_NEMO, and ROSE_LAUNCHER_PREOPTS_XIOS also need modification to link the resource request to the job launcher command, for example:

            {% if OMPTHR_ATM > 1 %}
              ROSE_LAUNCHER_PREOPTS_UM  = --het-group=0 --hint=nomultithread --distribution=block:block --export=all,OMP_NUM_THREADS={{OMPTHR_ATM}},HYPERTHREADS={{HYPERTHREADS}},OMP_PLACES=cores
            {% else %}
              ROSE_LAUNCHER_PREOPTS_UM  = --het-group=0 --cpu-bind=cores --export=all,OMP_NUM_THREADS={{OMPTHR_ATM}},HYPERTHREADS={{HYPERTHREADS}}
            {% endif %}

where the flag --het-group=0 makes the connection to hetjob_0_.
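The pattern in the UM example can be expressed as a small helper, which may help when working out the equivalent settings for the other components. This is a sketch only: the function name launcher_preopts is hypothetical, and applying the same flag pattern to NEMO (group 1) and XIOS (group 2) is an assumption that should be checked against one of the coupled suites in the Standard Jobs table.

```python
def launcher_preopts(het_group, omp_threads, hyperthreads=1):
    """Build an srun option string following the UM example above."""
    exports = "--export=all,OMP_NUM_THREADS={},HYPERTHREADS={}".format(
        omp_threads, hyperthreads)
    if omp_threads > 1:
        # Threaded case: one thread per physical core, block placement.
        placement = "--hint=nomultithread --distribution=block:block"
        exports += ",OMP_PLACES=cores"
    else:
        # Unthreaded case: bind each task to a core.
        placement = "--cpu-bind=cores"
    return "--het-group={} {} {}".format(het_group, placement, exports)
```

Calling launcher_preopts(0, 2) reproduces the threaded ROSE_LAUNCHER_PREOPTS_UM value shown above, with the Jinja2 variables replaced by their values.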


KPP coupled suites generally follow the structure of NEMO coupled suites (as above).

Build flags for KPP are set in the suite, in app/fcm_make_kpp/file/fcm-make.cfg. KPP also requires a code branch for ARCHER2 that allows flexibility in the number of OpenMP threads. Previously this was hard-coded to 24 for ARCHER and 36 for NEXCS. Now the number of threads is picked up from the OMP_NUM_THREADS environment variable, as is standard. (Note that KPP is OpenMP-parallel only.)

See the standard jobs table for examples of a global coupled GC3 MetUM-GOML and a regional nesting-suite-like configuration.

Post Processing and Data Transfer to JASMIN

In ~/roses/<SUITEID>/site/archer2.rc ensure that [[POSTPROC_RESOURCE]] loads the correct module and sets the stack limit, thus:

        inherit = HPC_SERIAL
        pre-script = """module restore /work/y07/shared/umshared/modulefiles/postproc/2020.12.11
                        module list 2>&1
                        ulimit -s unlimited
                     """

For coupled jobs make the following suite changes:

  • Ensure the suite is using postproc_2.3.
  • In fcm_make_pp → Configuration:
    • Set config_base to fcm:moci.xm-br/dev/annetteosprey/postproc_2.3_archer2
    • Remove contents of config_rev so it is blank.
    • In pp_sources ensure the branches fcm:moci.xm-br/dev/annetteosprey/postproc_2.3_archer2@3910 & fcm:moci.xm-br/dev/rosalynhatcher/postproc_2.3_pptransfer_gridftp_nopw@3202 are listed.
  • In postproc → CICE → Diagnostics → Meaning
    • Set means_cmd to ncra --64bit_offset -O
    • Note: Some suites have an optional configuration override file so you may find you need to change this in the override file: app/postproc/opt/rose-app-archer2.conf

For guidance on configuring the data transfer app, see wiki:Archer2/PPTransfer.


NCAS CMS will support only UM versions 7.3 and 8.4 on ARCHER2. For each version, currently only cumf and pumf have been built to run on ARCHER2.

Initial setup for running UMUI jobs on ARCHER2

1) Add the following snippet to your ARCHER2 ~/.bash_profile:

# Set up UM variables
VN=7.3 ## or 8.4 as appropriate
if test -f $HOME/.umsetvars_$VN; then
  . $HOME/.umsetvars_$VN
else
  . /work/y07/shared/umshared/vn$VN/cce/scripts/.umsetvars_$VN
fi

2) Set up the umui_runs directory:

archer2$ mkdir /work/n02/n02/<archer2_username>/umui_runs
archer2$ ln -s /work/n02/n02/<archer2_username>/umui_runs ~/umui_runs

Very few changes are required in order to run these jobs:

UM 8.4

  • In Model Selection → User Information and Submit method → Job submission method
    • Select submission method: SLURM Cray EX (ARCHER2)
    • Set Host name to
    • Set the number of processors to be a multiple of 128
    • Click the Slurm button to specify the Job time limit
  • In Model Selection → FCM Configuration → FCM Extract directories and Output levels
    • Set Target machine root directory (UM_ROUTDIR) to a location on /work (e.g. /work/n02/n02/$USERID/um)
  • In Model Selection → Input/Output Control and Resources → Time Convention and SCRIPT Environment Variables
    • Set DATADIR in the Defined Environment Variables table. This must be on /work (e.g. /work/n02/n02/<username>)
    • Ensure DATAM and DATAW are set to a location on /work (e.g. $DATADIR/um/$RUNID)

UM 7.3

  • In Model Selection → User Information and Target Machine → Target Machine
    • Set Machine name to
    • Set the number of processors to be a multiple of 128
  • In Model Selection → Input/Output Control and Resources → Time Convention and SCRIPT Environment Variables
    • Set DATADIR in the Defined Environment Variables table. This must be on /work (e.g. /work/n02/n02/<username>)
    • Ensure DATAM and DATAW are set to a location on /work (e.g. $DATADIR/um/$RUNID)
  • In Model Selection → Input/Output Control and Resources → Job submission, resources and re-submission pattern
    • Select submission method: SLURM Cray EX (ARCHER2)
    • Note - the model will not recognize a change to the default number of cores/node
  • In Model Selection → FCM Extract and Build directories and Output levels
    • Set Target machine root directory (UM_ROUTDIR) to a location on /work (e.g. /work/n02/n02/ros/um)
  • In Model Selection → Compilation and Modifications → UM User Override Files
    • The User machine overrides must use ~umui/overrides/archer2_cce_7.3_machine
    • The User file overrides must use ~umui/overrides/archer2_cce_7.3_file

Post Processing has not been tested.

umshared (UMDIR)

All n02 users will be granted read access to the umshared package account (as was the case on ARCHER).

UM data and software is installed under /work/y07/shared/umshared.

You may set UMDIR in your ~/.bash_profile, but note, batch jobs cannot see /home and will not source scripts that reside there.

UKCA input files that used to be under /work/n02/n02/ukca are now located at $UMDIR/ukca.


  • The ssh agent on PUMA/pumatest will need to be restarted from time to time. You will need to add your ~/.ssh/id_rsa_archerum key for job submission. We recommend that you attempt to log in to ARCHER2 to address a possible issue with the known_hosts file, thus:
    $ ssh
    Warning: the RSA host key for '' differs from the key for the IP address ''
    Offending key for IP in /home/grenville/.ssh/known_hosts:53
    Matching host key in /home/grenville/.ssh/known_hosts:87

Deletion of the offending key should prevent problems with UM jobs - in the case above deletion of the entry at line 53.

  • Coupled jobs that run with NEMO_LAND_SUPPRESS=true may break the slurm node calculation - check that NEMO_IPROC and NEMO_JPROC are consistent with NEMO_NPROC when running in this configuration.
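A simple consistency check can be sketched as below. It rests on an assumption that is not stated on this page: with land suppression enabled, NEMO_NPROC counts only the subdomains that contain ocean, so it may be smaller than NEMO_IPROC x NEMO_JPROC but never larger, whereas without land suppression the two must be equal. The function name is hypothetical.

```python
def nemo_decomposition_consistent(iproc, jproc, nproc, land_suppress):
    """Check NEMO_NPROC against the NEMO_IPROC x NEMO_JPROC grid.

    Assumption: with land suppression, land-only subdomains are dropped,
    so nproc may be less than iproc * jproc but cannot exceed it.
    """
    grid = iproc * jproc
    if land_suppress:
        return 0 < nproc <= grid
    return nproc == grid
```

If the check fails, revisit the three values before submitting, since the SLURM node request is derived from them.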


Suite id | Description | UM Version | Date Run | Decomposition* | OpenMP Threads | Length of run (days) | Dump Frequency (days) | Wallclock (h:m) | Data output vol (GB) | Comment | Cost/model-yr | SYPD | CHSY | maxRSS | SYPD (Archer) | CHSY (A)
u-bz764 | UKESM | 11.2 | 11.12.20 | 32x18x2:12x9:8 (9:1:1) | 2(atm) | 90 | 90 | 1:41 | 103 | | 73.3 CU (135kAU) | 3.75 | 9349 | | |
u-be303-archer2 | UKESM AMIP | 11.1 | 14.12.20 | 16x16x2 (4) | 2 | 90 | 90 | 2:42 | 80 | OOMs on 1 thread | 43.2 CU (82.9kAU) | 2.3 | | | |
u-be303-archer2 | UKESM AMIP | 11.1 | 15.12.20 | 32x18x2 (9) | 2 | 90 | 90 | 1:32 | 80 | | 54 CU (103kAU) | 4 | | | |
u-bs251-archer2 | GA7 N96 AMIP | 11.6 | 16.01.21 | 16x12x2 (3) | 2 | 30 | 30 | 0:30 | 19 | | 18 CU | 4 | | | |
u-cb151 | GC4 N96 ORCA025 | 11.6 | 15.01.21 | 32x20x1:32x20x1:6 (5:5:1) | 1 | 30 | 30 | 1:10 | 43 | | 151 CU | 1.6 | | | |
u-bo026 | N1512 | 11.4 | --.02.21 | 32x32x2 (80ppn) (26) | 2 | 1 | 1 | 00:25 | | | 3600 CU | --- | | | |
u-bo026 | N1512 | 11.4 | --.02.21 | 32x32x2 (80ppn) (26) | 2 | 1 | 1 | 00:13 | | | 1800 CU (2.7MAU) | 0.32 | 499200 | | |
u-cd936 | N1280 | 11.6 | --.05.21 | 96x64x2 + 2 IOS (96) | 2 | 30 | 30 | 07:54 | 524 | Averaged over 1 yr run | 9,424 CU | 0.25 | 1,206,736 | ? | Not run on A | -
u-cb431 | MetUM-GOML3 N216 | 10.3 | 29.09.21 | 32x24x2:64 (12:1) | 2 : 64 | 30 | 10 | 1:10 | 94 | | 182 CU | 1.67 | | | |

*(atm:ocean:xios (nodes))

**initialization and dump time removed

Note: SYPD is extrapolated from shorter runs.

Notes for N1280: CU includes model failures.
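For readers reproducing the throughput columns, the sketch below shows how such figures are commonly derived. It assumes that SYPD means simulated years per wallclock day and CHSY means core hours per simulated year, and that the model uses a 360-day calendar. None of these definitions is stated on this page, so treat them as assumptions. The function names are illustrative.

```python
def sypd(run_days, wallclock_hours, days_per_year=360):
    """Simulated years per wallclock day (360-day calendar assumed)."""
    simulated_years = run_days / days_per_year
    wallclock_days = wallclock_hours / 24
    return simulated_years / wallclock_days


def chsy(cores, sypd_value):
    """Core hours consumed per simulated year."""
    return cores * 24 / sypd_value
```

Under these assumptions, a 30-day run completing in 30 minutes of wallclock time gives an SYPD of 4, which agrees with the u-bs251-archer2 entry. Values in the table may differ slightly from this calculation where initialization and dump time have been removed.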

Last modified on 29/09/21 08:36:21