Version 1 (modified by annette, 6 years ago)

HadGEM3-AO r4.0 (xfuzb)

Job overview

This job is the coupled HadGEM3-AO release 4.0 from the Met Office. It uses vn7.6 of the UM with vn3.0 of NEMO and vn4.1 of CICE. The atmosphere resolution is N96 L85 and the NEMO resolution is ORCA1 L75.

This job is based on Met Office job id akapa (equivalent to job ajtzd). More information on the HadGEM3 coupled model development can be found on the collaboration wiki.

This version is set up for the Cray XE6 (HECToR phase 3).

Getting started

You will need to be registered for puma and have access to the UM and NEMO-CICE code repositories.

Using the UMUI, take a copy of the job configuration xfuzb. Before submitting the job it is essential that you change some basic settings, by going to the following UMUI windows:

  1. User Information and Submit Method → General Details

Set your Userid, Email Address and Tic Code.

  2. FCM Configuration → FCM Extract directories and Output levels

Set an appropriate value for "Target machine root extract dir (UM_ROUTDIR)".
This is a directory on HECToR, typically /home/n02/n02/<userid>

  3. Input/Output Control and Resources → Time Convention and SCRIPT Environment Variables

Check that the settings for $DATAW and $DATAM are as you require.

Save, Process and then Submit your job.

The job is set up to run on a total of 160 cores (5 nodes) with 96 cores for the atmosphere (8x12), 32 cores for the ocean (4x8 for NEMO and 32x1 for CICE), and a full node (32 cores) for OASIS.
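The core count above can be reproduced with a little arithmetic. The following is a minimal sketch, not part of the job itself: the function names (nodes_for, job_layout) are hypothetical, and the check that NEMO and CICE use the same number of cores is an assumption based on the two decompositions quoted above both multiplying to 32.

```python
from math import prod

CORES_PER_NODE = 32  # default cores per node quoted on this page


def nodes_for(cores, cores_per_node=CORES_PER_NODE):
    """Whole nodes needed for one submodel (ceiling division)."""
    return -(-cores // cores_per_node)


def job_layout(atmos, nemo, cice, oasis_cores=CORES_PER_NODE):
    """Return (nodes, cores) for a coupled job (hypothetical helper).

    Assumption: NEMO and CICE share the ocean cores, so their
    decompositions must multiply to the same count, charged once.
    """
    ocean_cores = prod(nemo)
    if prod(cice) != ocean_cores:
        raise ValueError("NEMO and CICE decompositions must use the same core count")
    # Each submodel (atmosphere, ocean, coupler) gets its own whole nodes.
    nodes = sum(nodes_for(c) for c in (prod(atmos), ocean_cores, oasis_cores))
    return nodes, nodes * CORES_PER_NODE


# Standard job: 8x12 atmosphere, 4x8 NEMO, 32x1 CICE, one node for OASIS.
print(job_layout(atmos=(8, 12), nemo=(4, 8), cice=(32, 1)))
```

Because each submodel occupies whole nodes, an atmosphere decomposition that does not fill its last node still pays for it in this sketch.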

Known issues

Recent fix included in the standard job

  1. NetCDF error when writing CICE history files:
    NetCDF: Numeric conversion not representable
    

This is an issue with the NetCDF libraries for CCE/8 and is fixed at NetCDF/4.2.0, linked with the hand-edit:

~umui/hadgem3/vn7.6/HG3AO40/hand_edits/load_netcdf4.2.0.ed

Editing jobs

The NEMO and CICE submodels are controlled separately from the atmosphere model.

A small number of changes can be made through the UMUI. In particular, the model start date, run length and resubmission length are applied to all submodels. The number of processors and domain decompositions for each submodel are also controlled through the UMUI under User Information and Submit Method → General Details. Note that, unlike the UM atmosphere model, the ocean model must be recompiled to alter the decomposition of NEMO or CICE.

All other changes, including dump and diagnostic output frequency, need to be made in the submodel control files.

  • FPP keys configuration file: FCM Configuration → FCM Options for NEMO / CICE

This specifies the code sections to include. See the submodel documentation for more information.

  • Control namelist file: NEMO / CICE → Scientific Parameters and Sections → Links to NEMO / CICE model

This includes the output file frequency, control of diagnostics and the values of other scientific parameters.

More information is available on the NEMO-CICE trac wiki.

Limitations

On the Cray hardware each submodel (atmos, ocean and coupler) needs to be run on a separate set of nodes, so a minimum of 3 nodes is required.

Currently each submodel needs to run with the same number of cores per node (default 32).

Archiving of NEMO and CICE files is not activated in the standard job.

Contact the helpdesk with any other difficulties to do with running the coupled model.

Timing runs

Several atmosphere and ocean decompositions were tested. The times shown are the "Maximum elapsed wallclock time" reported by the UM timer. The model was run for 3 days with daily atmosphere, ocean and sea-ice dumps.

Nodes  Total pes   = Atmos cfg   + NEMO CICE cfg   + OASIS -> Time (s)
 
4      128           64 (8x8)      32 (4x8 - 32x1)   32       1112.834
5      160           96 (8x12)     32 (4x8 - 32x1)   32        871.852
6      192           128 (8x16)    32 (4x8 - 32x1)   32        777.669

The dumps after 3 days were found to be identical in all cases.
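To compare the runs in the timing table above, speedup and parallel efficiency relative to the 4-node run can be computed as below. This is an illustrative sketch only: the function name scaling is hypothetical, and efficiency is defined here (as an assumption) as speedup divided by the ratio of total pes, even though only the atmosphere core count changes between rows.

```python
# (nodes, total pes, time in seconds) copied from the timing table.
TIMINGS = [
    (4, 128, 1112.834),
    (5, 160, 871.852),
    (6, 192, 777.669),
]


def scaling(timings):
    """Return (nodes, speedup, efficiency) rows relative to the first entry."""
    _, base_pes, base_time = timings[0]
    rows = []
    for nodes, pes, seconds in timings:
        speedup = base_time / seconds
        efficiency = speedup / (pes / base_pes)
        rows.append((nodes, speedup, efficiency))
    return rows


for nodes, speedup, efficiency in scaling(TIMINGS):
    print(f"{nodes} nodes: speedup {speedup:.3f}, efficiency {efficiency:.3f}")
```

A 3-day run is short, so these figures are best read as a rough guide rather than a scaling study.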

When altering the ocean decomposition, however, the dumps differed even with the bit-reproducible options set in the NEMO namelist file (nbit_cmp=1 and nsolv=2). Under the previous hardware (Phase 2b) and the PathScale compiler the ocean model did bit-reproduce.
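One simple way to test whether dumps from different decompositions are identical is to compare file checksums. The sketch below assumes a byte-for-byte comparison is acceptable; that is stricter than comparing field data, since any header difference (for example a timestamp) would also show up as a mismatch. The function names are hypothetical and this is not necessarily how the comparison reported here was made.

```python
import hashlib


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large dumps fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def dumps_identical(paths):
    """True if every file in paths has the same contents (False if empty)."""
    return len({file_digest(p) for p in paths}) == 1
```

It would be called with the paths of the day-3 dumps from each run, for example dumps_identical([run_a_dump, run_b_dump]), where both names are placeholders.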

Porting notes

Job changes

The following is a summary of the changes made to the Met Office job to run on the XE6 system. See also the list of branches below.

HECToR details: User and machine details including username, tic-code, machine name and job submission information ('qsub' for PBS Pro).

Atmos to ocean pe decomposition: This was modified to 8x12 for the atmosphere, 4x8 for NEMO and 32x1 for CICE. NEMO and CICE are compiled into a single executable and run sequentially. This adds up to 160 cores (5 nodes) including 1 node (32 cores) for OASIS. This was found to be the best decomposition of those tested.

Job directories: HECToR output directory set to $DATADIR/um/$RUNID and puma extract directory set to /work/n02/n02/username/um.

Path to local umui files: /home/umui/hadgem3/vn7.6/HG3AO40/. This contains hand edits (hand_edits/), compile overrides (overrides/), user STASH master files (preSTASHmaster/), coupling macros (macros/), NEMO configuration files (nemo_cfg/) and CICE configuration files (cice_cfg/).

Extra hand-edits for puma: Currently a hand-edit is required for submission of the coupled model to HECToR (vn7.5_oasis_nproc.ed) and for archiving (archiving_7.6).

Location of input data files on HECToR: $UMDIR/vn7.6/HG3AO40. This contains start dumps, ancillary and forcing files plus NEMO and CICE control namelists (nemo_ctl/ and cice_ctl/).

Byte swap CICE binary restart file: iced_start_abwORCA1_sep_swapped.bin

Location of the OASIS build on HECToR: /work/n02/n02/hum/oasis/oasis3_2-5/prism/crayxe6_cce.

Coupling macro: Point to version for HECToR (uses &END rather than / in namelists): cpl_macro_hadgem3_3hr.

FCM settings for puma: Fill in "container file name and location", "bindings location" and "subversion URL" for puma.

Puma versions of branches: Replace Met Office versions of branches with puma equivalents (see branches on puma below).

Include extra atmos branches: For running on HECToR. fcm:um_br/pkg/Config/VN7.6_ncas/src and fcm:um_br/dev/jeff/VN7.6_hector_monsoon_archiving/src.

Include extra NEMO branches: These contain fixes for HECToR. fcm:nemo_br/dev/annette/VN3.0_fixes_to_nemo_trunk/NEMO and fcm:nemo_br/dev/annette/VN3.0_coupled_fixes/NEMO.

Include extra CICE branches: Fix for HECToR. fcm:cice_br/dev/annette/VN4.1_dbl_notation_fix/cice.

Compile options for NEMO and CICE: Configuration files to extract NEMO vn3.0 and CICE vn4.1 from the puma repository and set compiler flags for the Cray compiler (CCE) on HECToR: nemo_XE6_cce_3.0_base.cfg (NEMO) and cice4.1_base_XE6_cce.cfg (CICE).

Linking to OASIS libraries on HECToR: Update NEMO library flags set in the UMUI for HECToR (remove netcdf and IBM-specific options). Atmosphere options are specified in two compile override files: oasis_file_hector_cce_7.6 and oasis_mach_hector_cce_7.6.

Tidying up: IBM-specific environment variables were removed, FCM output was reduced from '3' to '1', output prints were reduced from "operational" to "normal", outputting of basis files was turned off, and pe output files were set to be deleted on successful completion.

Switch off user-script releases
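The coupling macro item above notes that the HECToR version ends namelist groups with &END rather than /. As a rough illustration of that difference, the sketch below rewrites a namelist so that a terminator written as / on a line of its own becomes &END. The function name and the example group name are hypothetical, and a real macro may place the terminator differently (for example at the end of a data line), which this sketch does not handle.

```python
def use_end_terminator(text):
    """Replace namelist terminators written as a lone '/' with '&END'."""
    lines = []
    for line in text.splitlines():
        # Only a line containing nothing but '/' is treated as a terminator.
        lines.append("&END" if line.strip() == "/" else line)
    result = "\n".join(lines)
    # Keep the trailing newline if the input had one.
    return result + "\n" if text.endswith("\n") else result


example = "&NAMEXAMPLE\n  flag = .true.\n/\n"  # hypothetical group name
print(use_end_terminator(example))
```

Paths such as /work/n02 inside a value are left alone because only lines consisting solely of / are changed.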

Branches on puma

The branches included in this job and their equivalents on puma are listed below.

UM

Met Office branch                                      Rev    Puma branch                                                       Rev

fcm:um_br/dev/frme/VN7.6_rhcrit_para_bugfix/src        22337  fcm:um_br/dev/matthew_miz/VN7.6_rhcrit_para_bugfix_ukmo/src       4036
fcm:um_br/dev/frrh/VN7.6_coupling_comp_opts/src        22218  fcm:um_br/dev/annette/VN7.6_coupling_comp_opts_ukmo/src           4824
fcm:um_br/dev/hadci/VN7.6_hadgem3_specials/src         22391  fcm:um_br/dev/annette/VN7.6_hadgem3_specials_ukmo/src             4826
fcm:um_br/dev/hadci/VN7.6_restart_fix/src              22275  fcm:um_br/dev/annette/VN7.6_restart_fix_ukmo/src                  4828
fcm:um_br/dev/hadco/VN7.6_incrCLO/src                  22233  fcm:um_br/dev/matthew_miz/VN7.6_incrCLO_ukmo/src                  4032
fcm:um_br/dev/hadco/VN7.6_reinstate_ISCCP/src          22237  fcm:um_br/dev/matthew_miz/VN7.6_reinstate_ISCCP_ukmo/src          4030
fcm:um_br/dev/hadke/VN7.6_rcf_polaravg_landfields/src  22292  fcm:um_br/dev/matthew_miz/VN7.6_rcf_polaravg_landfields_ukmo/src  4034
fcm:um_br/dev/frid/VN7.6_baresoilbugfix/src            22455  fcm:um_br/dev/matthew_miz/VN7.6_baresoilbugfix_ukmo/src           4038
fcm:um_br/dev/hadci/VN7.6_topmelt_stash_fix/src        22730  fcm:um_br/dev/annette/VN7.6_topmelt_stash_fix_ukmo/src            4831
fcm:um_br/dev/hadco/VN7.6_dust_tuning/src              23270  fcm:um_br/dev/matthew_miz/VN7.6_dust_tuning_ukmo/src              4040
fcm:um_br/dev/hadaw/VN7.6_inland_basins/src            22972  fcm:um_br/dev/matthew_miz/VN7.6_inland_basins_ukmo/src            4042
-                                                      -      fcm:um_br/pkg/Config/VN7.6_ncas/src                               -
-                                                      -      fcm:um_br/dev/jeff/VN7.6_hector_monsoon_archiving/src             -

NEMO

Met Office branch                                      Rev    Puma branch                                                       Rev

fcm:ioipsl_br/dev/hadci/VN3.0_CF_comp                  2213   fcm:ioipsl_br/dev/Share/VN3.0_CF_comp_ukmo                        532
fcm:ioipsl_br/dev/hadci/VN3.0_defprec                  2060   fcm:ioipsl_br/dev/Share/VN3.0_defprec                             546
fcm:nemo_br/dev/hadci/VN3.0_18Cisotherm/NEMO           2214   fcm:nemo_br/dev/Share/VN3.0_18Cisotherm_ukmo/NEMO                 559
fcm:nemo_br/dev/hadci/VN3.0_CF_comp/NEMO               3316   fcm:nemo_br/dev/Share/VN3.0_CF_comp_ukmo/NEMO                     1638
fcm:nemo_br/dev/hadci/VN3.0_PEchange/NEMO              2057   -                                                                 -
fcm:nemo_br/dev/hadci/VN3.0_diaptr_new/NEMO            3174   fcm:nemo_br/dev/annette/VN3.0_diaptr_new_ukmo/NEMO                973
fcm:nemo_br/dev/hadci/VN3.0_hadgem3/NEMO               3716   fcm:nemo_br/dev/Share/VN3.0_hadgem3_ukmo/NEMO                     1222
fcm:nemo_br/dev/hadci/VN3.0_karamld/NEMO               2056   fcm:nemo_br/dev/Share/VN3.0_karamld_ukmo/NEMO                     556
fcm:nemo_br/dev/hadci/VN3.0_restart_date/NEMO          2624   fcm:nemo_br/dev/annette/VN3.0_restart_date_ukmo/NEMO              733
fcm:nemo_br/dev/hadom/VN3.0_ORCA1_L75/NEMO             3732   fcm:nemo_br/dev/annette/VN3.0_ORCA1_L75_ukmo/NEMO                 1646
fcm:nemo_br/dev/hadom/VN3.0_ORCAL75_10m_mindepth/NEMO  3077   fcm:nemo_br/dev/matthew_miz/VN3.0_ORCAL75_10m_mindepth_ukmo/NEMO  1059
fcm:nemo_br/dev/hadci/VN3.0_avt_rnf_fix/NEMO           3347   fcm:nemo_br/dev/malcolm/VN3.0_avt_rnf_fix_ukmo/NEMO               1126
fcm:nemo_br/dev/hadci/VN3.0_tvd_diaptr_fix/NEMO        3285   fcm:nemo_br/dev/malcolm/VN3.0_tvd_diaptr_fix_ukmo/NEMO            1124
-                                                      -      fcm:nemo_br/dev/annette/VN3.0_fixes_to_nemo_trunk/NEMO            547
-                                                      -      fcm:nemo_br/dev/annette/VN3.0_coupled_fixes/NEMO                  957

(Note: VN3.0_PEchange is included in VN3.0_fixes_to_nemo_trunk)

CICE

Met Office branch                                 Rev  Puma branch                                              Rev

fcm:cice_br/dev/Share/VN4.1_HadCICERun/cice       329  fcm:cice_br/dev/charris/VN4.1_HadCICERun_ukmo/cice       1306
fcm:cice_br/dev/hadci/VN4.1_no_hbnew_errors/cice  318  fcm:cice_br/dev/charris/VN4.1_no_hbnew_errors_ukmo/cice  1312
fcm:cice_br/dev/hadci/VN4.1_no_vert_check/cice    317  fcm:cice_br/dev/charris/VN4.1_no_vert_check_ukmo/cice    1314
-                                                 -    fcm:cice_br/dev/annette/VN4.1_dbl_notation_fix/cice      1803