Changes between Initial Version and Version 1 of UM/Configurations/HectorHadgem3aoR40


Timestamp:
26/04/13 17:05:21 (7 years ago)
Author:
annette
Comment:

  • UM/Configurations/HectorHadgem3aoR40

    v1 v1  
     1= HadGEM3-AO r4.0 (xfuzb) =  
     2 
     3== Job overview ==  
     4 
     5This job is the coupled HadGEM3-AO release 4.0 from the Met Office. It uses vn7.6 of the UM with vn3.0 NEMO and vn4.1 CICE. The atmosphere resolution is N96 L85 and the NEMO resolution is ORCA1 L75. 
     6 
     7This job is based on Met Office job id akapa (equivalent to job '''ajtzd'''). More information on the HadGEM3 coupled model development can be found on the collaboration wiki: 
     8* [http://collab.metoffice.gov.uk/twiki/bin/view/Project/CAPTIVATE/HadGEM3Evolution HadGEM3 model evolution]  
     9* [http://collab.metoffice.gov.uk/twiki/bin/view/Project/CAPTIVATE/HadGEM3ajtzd ajtzd] 
     10 
     11This version is set up for the Cray XE6 (HECToR phase 3). 
     12 
     13== Getting started ==  
     14 
     15You will need to be [http://ncas-cms.nerc.ac.uk/index.php/puma registered for puma] and have access to the UM and NEMO-CICE code repositories. 
     16 
     17Using the UMUI, take a copy of the job configuration '''xfuzb'''. Before submitting the job it is essential that you change some basic settings, by going to the following UMUI windows: 
     18 
     191. ''User Information and Submit Method -> General Details'' [[br]] 
     20 
     21    Set your Userid, Email Address and Tic Code 
     22 
     232. ''FCM Configuration -> FCM Extract directories and Output levels'' [[br]] 
     24 
     25   Set an appropriate value for "Target machine root extract dir (UM_ROUTDIR)". [[br]] 
     26   This is a directory on HECToR, typically /home/n02/n02/<userid> 
     27 
     283. ''Input/Output Control and Resources -> Time Convention and SCRIPT Environment Variables'' [[br]] 
     29 
     30   Check that the settings for $DATAW and $DATAM are as you require. 
     31 
     32Save, Process and then Submit your job. 
     33 
     34The job is set up to run on a total of 160 cores (5 nodes) with 96 cores for the atmosphere (8x12), 32 cores for the ocean (4x8 for NEMO and 32x1 for CICE), and a full node (32 cores) for OASIS. 
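
The core count above can be reproduced with a small calculation. The following is a minimal sketch, not part of the job itself: the function and variable names are hypothetical, and it assumes 32 cores per node, that NEMO and CICE share the same 32 cores (they are described in the porting notes as a single executable run sequentially), and that each submodel occupies whole nodes of its own.

```python
import math

CORES_PER_NODE = 32  # assumption: default cores per node for this job


def cores(decomposition):
    """Number of cores used by an (east-west, north-south) decomposition."""
    ew, ns = decomposition
    return ew * ns


def nodes_needed(n_cores, cores_per_node=CORES_PER_NODE):
    """Whole nodes required, since submodels cannot share a node."""
    return math.ceil(n_cores / cores_per_node)


atmos = cores((8, 12))  # UM atmosphere
nemo = cores((4, 8))    # NEMO
cice = cores((32, 1))   # CICE
oasis = 32              # a full node reserved for OASIS

# Assumption: NEMO and CICE run sequentially in one executable,
# so the ocean allocation is the larger of the two, not their sum.
ocean = max(nemo, cice)

total_nodes = sum(nodes_needed(c) for c in (atmos, ocean, oasis))
total_cores = total_nodes * CORES_PER_NODE

print(f"atmos={atmos} ocean={ocean} oasis={oasis}")
print(f"nodes={total_nodes} cores={total_cores}")
```

Changing the atmosphere decomposition to 8x8 or 8x16 in this sketch gives the 4-node and 6-node layouts listed under Timing runs.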
     35 
     36== Known issues ==  
     37 
     38The following fix has recently been included in the standard job: 
     39 
     401. NetCDF error when writing CICE history files: [[br]] 
     41{{{ 
     42NetCDF: Numeric conversion not representable 
     43}}} 
     44      
     45   This is an issue with the NetCDF libraries for CCE/8 and is fixed at NetCDF/4.2.0, linked with the hand-edit: [[br]] 
     46{{{ 
     47~umui/hadgem3/vn7.6/HG3AO40/hand_edits/load_netcdf4.2.0.ed 
     48}}} 
     49  
     50== Editing jobs == 
     51 
     52The NEMO and CICE submodels are controlled separately to the atmosphere model. 
     53 
     54A small number of changes can be made through the UMUI. In particular, the model start date, run length and resubmission length are applied to all submodels. The number of processors and domain decompositions for each submodel are also controlled through the UMUI under ''User Information and Submit Method -> General Details''. 
     55 
     56Note that, unlike the UM atmosphere model, the ocean model '''must be recompiled''' to alter the decomposition of NEMO or CICE. 
     57 
     58All other changes including '''dump and diagnostic output frequency''' need to be made in the submodel control files. 
     59 
     60* FPP keys configuration file: ''FCM Configuration -> FCM Options for NEMO / CICE'' [[br]] 
     61 
     62  This specifies the code sections to include. See the submodel documentation for more information. 
     63 
     64* Control namelist file:   ''NEMO / CICE -> Scientific Parameters and Sections -> Links to NEMO / CICE model'' [[br]] 
     65 
     66  This includes the output file frequency, control of diagnostics and the values of other scientific parameters. 
     67 
     68More information is available on the [http://puma.nerc.ac.uk/trac/NEMOCICE NEMO-CICE trac wiki]. 
     69 
     70== Limitations ==  
     71 
     72On the Cray hardware each submodel (atmos, ocean and coupler) needs to be run on a separate set of nodes, so a minimum of 3 nodes is required. 
     73 
     74Currently each submodel needs to run with the same number of cores per node (default 32). 
     75 
     76Archiving of NEMO and CICE files is not activated in the standard job. 
     77 
     78Contact the helpdesk with any other difficulties to do with running the coupled model. 
     79  
     80== Timing runs == 
     81 
     82Several atmosphere decompositions were tested with the ocean decomposition held fixed. The times shown are the "Maximum elapsed wallclock time" values reported by the UM timer. The model was run for 3 days with daily atmosphere, ocean and sea-ice dumps. 
     83{{{ 
     84Nodes  Total pes   = Atmos cfg   + NEMO CICE cfg   + OASIS -> Time (s) 
     85  
     864      128           64 (8x8)      32 (4x8 - 32x1)   32       1112.834 
     875      160           96 (8x12)     32 (4x8 - 32x1)   32        871.852 
     886      192           128 (8x16)    32 (4x8 - 32x1)   32        777.669 
     89}}} 
     90 
     91The dumps after 3 days were found to be identical in all cases. 
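
To compare the three timings, the speedup relative to the 4-node run and the cost in node-seconds can be worked out from the table. This is an illustrative sketch only: the variable names are hypothetical, and node-seconds is used here as one possible cost measure, not necessarily the criterion used when choosing the default decomposition.

```python
# (nodes, atmosphere cores, elapsed seconds) copied from the timing table
runs = [
    (4, 64, 1112.834),
    (5, 96, 871.852),
    (6, 128, 777.669),
]

base_time = runs[0][2]  # the 4-node run is the reference

results = []
for nodes, atmos_cores, seconds in runs:
    speedup = base_time / seconds   # how much faster than the 4-node run
    node_seconds = nodes * seconds  # resource cost of the 3-day run
    results.append((nodes, speedup, node_seconds))
    print(f"{nodes} nodes: speedup {speedup:.2f}, {node_seconds:.0f} node-seconds")

# Configuration with the lowest cost in node-seconds
cheapest = min(results, key=lambda r: r[2])
print(f"lowest node-seconds: {cheapest[0]} nodes")
```

With these figures the 5-node run has the lowest node-seconds cost of the three, while the 6-node run has the shortest elapsed time.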
     92 
     93When altering the ocean decomposition, however, the dumps differed even with the bit-reproducible options set in the NEMO namelist file (nbit_cmp=1 and nsolv=2). Under the previous hardware (Phase 2b) and pathscale compiler the ocean model did bit-reproduce. 
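
One simple way to check whether two dump files are identical is to compare checksums of their raw bytes. The sketch below is an assumption about how such a check could be done and is not the method used for the tests above; the function names are hypothetical. Byte identity is a stricter test than field identity, so two files that differ only in header metadata would be reported as different.

```python
import hashlib
import os
import tempfile


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def dumps_identical(path_a, path_b):
    """True if the two files have exactly the same bytes."""
    return file_digest(path_a) == file_digest(path_b)


def demo():
    """Self-check using small temporary files in place of real dumps."""
    with tempfile.TemporaryDirectory() as tmp:
        paths = {}
        for name, payload in (("a", b"\x00\x01\x02"),
                              ("b", b"\x00\x01\x02"),
                              ("c", b"\x00\x01\x03")):
            paths[name] = os.path.join(tmp, name)
            with open(paths[name], "wb") as handle:
                handle.write(payload)
        same = dumps_identical(paths["a"], paths["b"])
        different = dumps_identical(paths["a"], paths["c"])
        return same, different


if __name__ == "__main__":
    print(demo())
```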
     94  
     95== Porting notes ==  
     96 
     97=== Job changes ===  
     98 
     99A summary of the changes made to the Met Office job to run on the XE6 system. See also the list of branches below. 
     100 
     101    HECToR details: 
     102    User and machine details including username, tic-code, machine name and job submission information ('qsub' for PBS pro) 
     103    Atmos to ocean pe decomposition: 
     104    This was modified to 8x12 for the atmosphere, 4x8 for NEMO and 32x1 for CICE. NEMO and CICE are compiled into a single executable and run sequentially. This adds up to 160 cores (5 nodes) including 1 node (32 cores) for OASIS. This was found to be the best decomposition of those tested. 
     105    Job directories: 
     106    HECToR output directory set to $DATADIR/um/$RUNID and puma extract directory set to /work/n02/n02/username/um. 
     107    Path to local umui files: 
     108    /home/umui/hadgem3/vn7.6/HG3AO40/. This contains hand edits (hand_edits/), compile overrides (overrides/), user STASH master files (preSTASHmaster/), coupling macros (macros/), NEMO configuration files (nemo_cfg/) and CICE configuration files (cice_cfg/). 
     109    Extra hand-edits for puma: 
     110    Currently a hand-edit is required for submission of the coupled model to HECToR (vn7.5_oasis_nproc.ed) and for archiving (archiving_7.6). 
     111    Location of input data files on HECToR: 
     112    $UMDIR/vn7.6/HG3AO40. This contains start dumps, ancillary and forcing files plus NEMO and CICE control namelists (nemo_ctl/ and cice_ctl/). 
     113    Byte swap CICE binary restart file: 
     114    iced_start_abwORCA1_sep_swapped.bin 
     115    Location of the OASIS build on HECToR: 
     116    /work/n02/n02/hum/oasis/oasis3_2-5/prism/crayxe6_cce. 
     117    Coupling macro: 
     118    Point to version for HECToR (uses &END rather than / in namelists): cpl_macro_hadgem3_3hr. 
     119    FCM settings for puma: 
     120    Fill in "container file name and location", "bindings location" and "subversion URL" for puma. 
     121    Puma versions of branches: 
     122    Replace Met Office versions of branches with puma equivalents (see Branches on puma below). 
     123    Include extra atmos branches: 
     124    For running on HECToR fcm:um_br/pkg/Config/VN7.6_ncas/src and fcm:um_br/dev/jeff/VN7.6_hector_monsoon_archiving/src. 
     125    Include extra NEMO branches: 
     126    These contain fixes for HECToR. fcm:nemo_br/dev/annette/VN3.0_fixes_to_nemo_trunk/NEMO and fcm:nemo_br/dev/annette/VN3.0_coupled_fixes/NEMO 
     127    Include extra CICE branches: 
     128    Fix for HECToR. fcm:cice_br/dev/annette/VN4.1_dbl_notation_fix/cice 
     129 
     130    Compile options for NEMO and CICE: Configuration files to extract NEMO v3.0 and CICE v4.1 from the puma repository and set compiler flags for the Cray compiler (CCE) on HECToR: nemo_XE6_cce_3.0_base.cfg (NEMO) and cice4.1_base_XE6_cce.cfg (CICE). 
     131 
     132    Linking to OASIS libraries on HECToR: Update NEMO library flags set in the UMUI for HECToR (remove netcdf and IBM-specific options). Atmosphere options are specified in two compile override files: oasis_file_hector_cce_7.6 and oasis_mach_hector_cce_7.6. 
     133 
     134    Tidying up: IBM-specific environment variables were removed, FCM output was reduced from '3' to '1', output prints were reduced from "operational" to "normal", outputting of basis files was turned off, and pe output files were set to be deleted on successful completion. 
     135 
     136    Switch off user-script releases 
     137 
     138  
     139=== Branches on puma === 
     140 
     141A list of the branches included in this job and the equivalents on puma. 
     142 
     143UM  
     144Met Office branch                                      Rev     Puma branch                                                       Rev 
     145 
     146fcm:um_br/dev/frme/VN7.6_rhcrit_para_bugfix/src        22337   fcm:um_br/dev/matthew_miz/VN7.6_rhcrit_para_bugfix_ukmo/src       4036 
     147fcm:um_br/dev/frrh/VN7.6_coupling_comp_opts/src        22218   fcm:um_br/dev/annette/VN7.6_coupling_comp_opts_ukmo/src           4824 
     148fcm:um_br/dev/hadci/VN7.6_hadgem3_specials/src         22391   fcm:um_br/dev/annette/VN7.6_hadgem3_specials_ukmo/src             4826 
     149fcm:um_br/dev/hadci/VN7.6_restart_fix/src              22275   fcm:um_br/dev/annette/VN7.6_restart_fix_ukmo/src                  4828 
     150fcm:um_br/dev/hadco/VN7.6_incrCLO/src                  22233   fcm:um_br/dev/matthew_miz/VN7.6_incrCLO_ukmo/src                  4032 
     151fcm:um_br/dev/hadco/VN7.6_reinstate_ISCCP/src          22237   fcm:um_br/dev/matthew_miz/VN7.6_reinstate_ISCCP_ukmo/src          4030 
     152fcm:um_br/dev/hadke/VN7.6_rcf_polaravg_landfields/src  22292   fcm:um_br/dev/matthew_miz/VN7.6_rcf_polaravg_landfields_ukmo/src  4034 
     153fcm:um_br/dev/frid/VN7.6_baresoilbugfix/src            22455   fcm:um_br/dev/matthew_miz/VN7.6_baresoilbugfix_ukmo/src           4038 
     154fcm:um_br/dev/hadci/VN7.6_topmelt_stash_fix/src        22730   fcm:um_br/dev/annette/VN7.6_topmelt_stash_fix_ukmo/src            4831 
     155fcm:um_br/dev/hadco/VN7.6_dust_tuning/src              23270   fcm:um_br/dev/matthew_miz/VN7.6_dust_tuning_ukmo/src              4040 
     156fcm:um_br/dev/hadaw/VN7.6_inland_basins/src            22972   fcm:um_br/dev/matthew_miz/VN7.6_inland_basins_ukmo/src            4042 
     157                                                               fcm:um_br/pkg/Config/VN7.6_ncas/src 
     158                                                               fcm:um_br/dev/jeff/VN7.6_hector_monsoon_archiving/src 
     159 
     160NEMO  
     161Met Office branch                                      Rev     Puma branch                                                       Rev 
     162 
     163fcm:ioipsl_br/dev/hadci/VN3.0_CF_comp                  2213    fcm:ioipsl_br/dev/Share/VN3.0_CF_comp_ukmo                        532 
     164fcm:ioipsl_br/dev/hadci/VN3.0_defprec                  2060    fcm:ioipsl_br/dev/Share/VN3.0_defprec                             546 
     165fcm:nemo_br/dev/hadci/VN3.0_18Cisotherm/NEMO           2214    fcm:nemo_br/dev/Share/VN3.0_18Cisotherm_ukmo/NEMO                 559 
     166fcm:nemo_br/dev/hadci/VN3.0_CF_comp/NEMO               3316    fcm:nemo_br/dev/Share/VN3.0_CF_comp_ukmo/NEMO                     1638 
     167fcm:nemo_br/dev/hadci/VN3.0_PEchange/NEMO              2057     
     168fcm:nemo_br/dev/hadci/VN3.0_diaptr_new/NEMO            3174    fcm:nemo_br/dev/annette/VN3.0_diaptr_new_ukmo/NEMO                973 
     169fcm:nemo_br/dev/hadci/VN3.0_hadgem3/NEMO               3716    fcm:nemo_br/dev/Share/VN3.0_hadgem3_ukmo/NEMO                     1222 
     170fcm:nemo_br/dev/hadci/VN3.0_karamld/NEMO               2056    fcm:nemo_br/dev/Share/VN3.0_karamld_ukmo/NEMO                     556 
     171fcm:nemo_br/dev/hadci/VN3.0_restart_date/NEMO          2624    fcm:nemo_br/dev/annette/VN3.0_restart_date_ukmo/NEMO              733 
     172fcm:nemo_br/dev/hadom/VN3.0_ORCA1_L75/NEMO             3732    fcm:nemo_br/dev/annette/VN3.0_ORCA1_L75_ukmo/NEMO                 1646 
     173fcm:nemo_br/dev/hadom/VN3.0_ORCAL75_10m_mindepth/NEMO  3077    fcm:nemo_br/dev/matthew_miz/VN3.0_ORCAL75_10m_mindepth_ukmo/NEMO  1059 
     174fcm:nemo_br/dev/hadci/VN3.0_avt_rnf_fix/NEMO           3347    fcm:nemo_br/dev/malcolm/VN3.0_avt_rnf_fix_ukmo/NEMO               1126 
     175fcm:nemo_br/dev/hadci/VN3.0_tvd_diaptr_fix/NEMO        3285    fcm:nemo_br/dev/malcolm/VN3.0_tvd_diaptr_fix_ukmo/NEMO            1124 
     176                                                               fcm:nemo_br/dev/annette/VN3.0_fixes_to_nemo_trunk/NEMO            547 
     177                                                               fcm:nemo_br/dev/annette/VN3.0_coupled_fixes/NEMO                  957 
     178(Note: VN3.0_PEchange is included in VN3.0_fixes_to_nemo_trunk) 
     179 
     180CICE  
     181Met Office branch                                      Rev     Puma branch                                                       Rev 
     182 
     183fcm:cice_br/dev/Share/VN4.1_HadCICERun/cice            329     fcm:cice_br/dev/charris/VN4.1_HadCICERun_ukmo/cice                1306 
     184fcm:cice_br/dev/hadci/VN4.1_no_hbnew_errors/cice       318     fcm:cice_br/dev/charris/VN4.1_no_hbnew_errors_ukmo/cice           1312 
     185fcm:cice_br/dev/hadci/VN4.1_no_vert_check/cice         317     fcm:cice_br/dev/charris/VN4.1_no_vert_check_ukmo/cice             1314 
     186                                                               fcm:cice_br/dev/annette/VN4.1_dbl_notation_fix/cice               1803 
     187