Changes between Version 3 and Version 4 of UM/Configurations/UKESM/RelNotes1.0/AMIP


Timestamp:
28/03/19 13:50:39 (5 months ago)
Author:
jwalton
  • UM/Configurations/UKESM/RelNotes1.0/AMIP

    v3 v4  
    5252Output files created by the suite running on NEXCS may be archived to disk.  The options for requesting this can be found under the `postproc -> Post Processing - common settings` control panel.  Set `archive_command` to `NEXCS` and provide values for `archive_root_path` and `archive_name` in the subpanel `Archer Archiving` (sic) to specify the location of the archived files on NEXCS.   
    5353 
    54 Following archiving, the files may be optionally transferred to a remote machine such as JASMIN.  Provide values for `remote_host` (the address of the remote machine) and `transfer_dir` (the location of the archived files on the remote machine) in the subpanel `JASMIN Transfer`.  In addition, transferring must be turned on by setting `suite conf -> Build and Run -> PP Transfer` to `true`. 
     54Following archiving, the files may be optionally transferred to a remote machine such as JASMIN.  Provide values for `remote_host` (the address of the remote machine) and `transfer_dir` (the location of the archived files on the remote machine) in the subpanel `JASMIN Transfer`.  In addition, transferring must be turned on by setting `suite conf -> Tasks -> PP Transfer` to `true`. 
    5555 
    5656Note that, before transfer from NEXCS to JASMIN can work, some setup of communications is required - see [wiki:Docs/PostProcessingAppNexcsSetup#sshjasmin here] for more details. 
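For orientation, the archiving and transfer options named above can be sketched as a single set of key/value settings. The option names (`archive_command`, `archive_root_path`, `archive_name`, `remote_host`, `transfer_dir`) are those quoted in the text; all values and paths below are hypothetical placeholders, and the settings should be made through the `postproc` control panels rather than by editing files directly.

```ini
# Sketch only -- values are hypothetical; set these via the postproc panels.
archive_command=NEXCS
archive_root_path=/projects/ukesm/myuser/archive    # hypothetical path on NEXCS
archive_name=u-aa000                                # hypothetical archive name

# JASMIN Transfer subpanel (transfer must also be enabled via PP Transfer):
remote_host=xfer1.jasmin.ac.uk                      # hypothetical JASMIN host
transfer_dir=/gws/nopw/ukesm/myuser                 # hypothetical remote path
```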
     
    5858==== Archer  
    5959 
    60 To run on Archer, the NERC platform, set `suite conf -> Host Machine -> Site at which model is being run` to  `Archer` and set these other `Machine Options`: 
     60To run on Archer, the NERC platform, set `suite conf -> Host Machine -> Site at which model is being run` to  `Archer`. 
    6161 
    62   * `Use Environment Modules` to `Custom module files`  
    63   * `Science Configuration Module Name` to `GC3-PrgEnv/2.0/90386` 
    64   * `Module file location` to `/work/y07/y07/umshared/moci/modules/modules` 
    65  
    66 In addition, the following tests (see [#Testsinthesuite below]) ''must'' be turned off when running on Archer: 
    67    
    68   * `Test restartability` 
    69   * `Test rigorous compiler option` 
    70   * `Test PE decomposition change`  
    71   * `Archive integrity` 
    72   * `CPMIP Analysis -> CPMIP load balancing analysis` 
     62In addition, the maximum number of processes per node (see `suite conf -> Domain Decomposition -> Atmosphere -> Max number of processes/node`) must be set to `24`.  This value must also be set for the `Max number of process/node` parameter in `suite conf -> Testing -> Processor Decomposition` and `suite conf -> Testing -> OpenMP` if the respective tests have been turned on (see [#Testsinthesuite below]). 
    7363 
    7464Output files created by the suite running on Archer may be archived to disk.  The options for requesting this can be found under the `postproc -> Post Processing - common settings` control panel.  Set `archive_command` to `Archer` and provide values for `archive_root_path` and `archive_name` in the subpanel `Archer Archiving` to specify the location of the archived files on Archer.   
    7565 
    76 Following archiving, the files may be optionally transferred to a remote machine such as JASMIN.  Provide values for `remote_host` (the address of the remote machine) and `transfer_dir` (the location of the archived files on the remote machine) in the subpanel `JASMIN Transfer`.  In addition, transferring must be turned on by setting `suite conf -> Build and Run -> PP Transfer` to `true`. 
     66Following archiving, the files may be optionally transferred to a remote machine such as JASMIN.  Provide values for `remote_host` (the address of the remote machine) and `transfer_dir` (the location of the archived files on the remote machine) in the subpanel `JASMIN Transfer`.  In addition, transferring must be turned on by setting `suite conf -> Tasks -> PP Transfer` to `true`. 
    7767 
    7868Note that, before transfer from Archer to JASMIN can work, some setup of communications (specifically, ''both'' [wiki:Docs/PostProcessingAppArcherSetup#sshpumatodtn between PUMA and Archer data transfer node], ''and'' [wiki:Docs/PostProcessingAppArcherSetup#sshdtntojasmin between Archer data transfer node and JASMIN]) is required. 
     
    8878When running on Met Office machines (including Monsoon), the suite will, by default, archive ''a single copy'' of its data to MOOSE.  For critical model runs, this setting may be changed to archive two copies of the data (i.e. duplex) by switching `non_duplexed_set` in `postproc -> Post Processing-common settings -> Moose Archiving` to `false`.  Further guidance on when to choose this option is available at http://www-twiki/Main/MassNonDuplexPolicy (note that this link only works from within the Met Office). 
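As a sketch, the duplexing behaviour comes down to the single `non_duplexed_set` option named above; the fragment below shows the switched value out of context, and should be changed via the `postproc -> Post Processing-common settings -> Moose Archiving` panel rather than by assuming this exact file layout.

```ini
# Moose Archiving subpanel: archive two copies (duplex) instead of one.
non_duplexed_set=false
```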
    8979 
    90 == Compute resource usage 
    91  
    92 The compute resources used by the suite can be set via parameters on the `Machine Options` and `Domain Decomposition` control panels under `suite conf`.  The following discussion is specific to the Met Office HPC for the most part, but may still be helpful for users of other machines. 
    93  
    94 The type of compute node can be set via `suite conf -> Machine Options -> XC40 core type`: a `Haswell` node has 32 cores, while `Broadwell` has 36.   
    95  
    96 The suite is currently set up in `suite conf -> Domain Decomposition` to use 36 nodes (see [#Calculationofnodecount below] for more details on how this is calculated).  An alternative setup uses 19 nodes.  Parameter settings for both setups are: 
    97  
    98 ||=''Parameter''=||=''36 node suite''=||=''19 node suite''=|| 
    99 ||=Atmosphere: Processes East-West=||=32=||=32=|| 
    100 ||=Atmosphere: Processes North-South=||=18=||=18=|| 
    101 ||=IO Server Processes=||=0=||=0=|| 
    102 ||=OpenMP threads for the atmosphere=||=2=||=1=|| 
    103 ||=NEMO: Number of processes East-West=||=12=||=9=|| 
    104 ||=NEMO: Number of processes North-South=||=9=||=8=|| 
    105 ||=NEMO: Number of processes in XIOS server=||=6=||=6=|| 
    106 ||=OpenMP threads for the ocean=||=1=||=1=|| 
    107  
    108 Note that the ocean must be rebuilt (by setting `suite conf -> Build and Run -> Build Ocean` to `true`) whenever the NEMO parameters in the table are changed during a run.  
    109  
    110 Setting these parameters to other values may require load balancing to ensure that HPC resources are being used in the most efficient fashion.  
    111  
    112 === Calculation of node count 
    113  
    114 On `Domain Decomposition -> Atmosphere`, the number of processes used by the UM can be set via `Atmosphere: Processes East-West` and `Atmosphere: Processes North-South`; additional processes for the IO Server may be requested using `IO Server Processes`.  Finally, `OpenMP threads for the atmosphere` sets the number of threads for each process; multiplying this by the number of processes gives the number of compute tasks.   
    115  
    116 Using the parameter values for the ''36 node suite'', the number of tasks used by the UM is `(32 * 18 + 0) * 2 = 1152`. Dividing by the number of cores per node (in this case `36`) and rounding up (because different executables cannot run on the same node) gives `32` compute nodes used by the atmosphere. 
    117  
    118 A similar calculation may be performed for the settings on `Domain Decomposition -> Ocean` using `NEMO: Number of processes East-West`, `NEMO: Number of processes North-South` and `OpenMP threads for the ocean` to give `12 * 9 * 1 = 108` tasks, or `3` compute nodes used by the ocean. 
    119  
    120 Finally, on the same control panel, `NEMO: Number of processes in XIOS server` is set to `6`, which equates to `1` compute node used by XIOS.   
    121  
    122 Thus, the total number of nodes used by the suite is `32 + 3 + 1 = 36`. 
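The arithmetic above can be checked with a short Python sketch. The decomposition values and the 36-core (Broadwell) node size are those quoted in the text; the function name is illustrative.

```python
import math

def nodes(tasks, cores_per_node):
    # Different executables cannot share a node, so round up.
    return math.ceil(tasks / cores_per_node)

CORES_PER_NODE = 36  # Broadwell node

# Atmosphere: (EW procs * NS procs + IO server procs) * OpenMP threads
atmos_tasks = (32 * 18 + 0) * 2   # 1152 tasks -> 32 nodes
# Ocean: NEMO EW procs * NS procs * OpenMP threads
ocean_tasks = 12 * 9 * 1          # 108 tasks -> 3 nodes
# XIOS server processes
xios_tasks = 6                    # -> 1 node

total = (nodes(atmos_tasks, CORES_PER_NODE)
         + nodes(ocean_tasks, CORES_PER_NODE)
         + nodes(xios_tasks, CORES_PER_NODE))
print(total)  # 36
```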
    123  
    12480== Tests in the suite 
    12581 
    12682The suite contains options for testing different aspects of the model, including reproducible restarting, changes in processor decomposition, comparison to known good output, and integrity of archived files.  Some of these tests may be of more interest to developers than general users of the model; they can be turned on or off via the `suite conf -> Testing` control panel. 
    12783 
    128 === Testing for PE decomposition change 
    129  
    130 Changing the PE decomposition will change the results of the model because of the behaviour of the chemistry solver within the UM. Thus, by default, the `Test PE decomposition change` test (see `suite conf -> Testing`) will fail, and so this test has been turned off. There is a version of the chemistry solver which does not change results; this can be selected by setting `l_ukca_asad_columns=.true.` in `app/um/rose-app.conf`. With this option selected, the PE decomposition change test should pass.  
    131  
    132 Note that we do not select this version of the chemistry solver by default because it has a performance overhead; specifically, it causes an atmosphere-only job to run about 10% slower than when running with `l_ukca_asad_columns=.false.`. 
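For illustration, the switch is a Fortran namelist logical set in `app/um/rose-app.conf`; the namelist section name below is an assumption based on UM conventions, so check the actual file for the correct section.

```ini
# Assumed namelist section -- verify against app/um/rose-app.conf.
[namelist:run_ukca]
l_ukca_asad_columns=.true.
```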
    133  
    134 It should be noted that changing the PE decomposition for the ocean in UKESM will also change results because this changes results for both the iceberg code in NEMO and for the CICE code. This behaviour cannot be rectified by setting a single variable (that is, there is no analogue of `l_ukca_asad_columns` for NEMO and CICE).  
    135  
    13684== Science notes 
    13785 
    138 The historical release job differs from the first member of the CMIP6 historical ensemble (`u-bb075`) in the following ways: 
    139 * The anthropogenic SO2 emission ancillaries were produced using different methodologies. The resulting SO2 emission in the model is nearly identical but has differences at the bit level. 
    140 * It includes a fix for the aerosol plume scavenging diagnostics 38900-38932 to ensure that they bit-compare across differing processor decompositions. The difference appears at only a small number of points where the value is very close to zero. 
    141  
    142 The piControl release job differs from the CMIP6 piControl run (`u-aw310`) in the following ways: 
    143 * The anthropogenic emission height is different. In the release job all anthropogenic SO2 is emitted at the surface, consistent with the model's treatment of other anthropogenic emissions (except NOx emissions from aircraft, which are supplied with a vertical emission profile). In the CMIP6 piControl, around 30% of anthropogenic emissions were released at 500m. Because the 1850 anthropogenic emissions are tiny relative to natural emissions from volcanoes and the ocean, this difference in emission height makes no meaningful difference to the SO2 burden or the aerosol simulation. 
    144 * The release job includes several diagnostic fixes which were not included in the CMIP6 piControl run. These include: 
    145    a. The fix to plume scavenging diagnostics described above 
    146    b. Corrections to diagnostics 30312, 30313, 30298 which were corrupted in the CMIP6 piControl 
    147    c. Addition of some other diagnostics to CMIP6 PP streams 
     86(none) 
    14887 
    14988== Known issues