The OpenIFS project is an initiative from ECMWF that will deliver a portable version of its Integrated Forecasting System to the academic community.
NCAS-CMS has agreed to host an OpenIFS repository on PUMA. This will allow those in the academic community with OpenIFS licences to access OpenIFS and run it on HECToR. The initial installation of OpenIFS has been committed to the repository.
OpenIFS currently serializes its output through a single processor. This can be a major performance bottleneck when large volumes of data are output, as is likely for high-resolution OpenIFS integrations. NCAS-CMS is leading a 6-month HECToR dCSE project (NAG), in collaboration with ECMWF, to implement a more sophisticated I/O model in OpenIFS on HECToR, in which I/O is performed in parallel and asynchronously with computation. The method is currently used by the operational IFS and is based on the so-called Fields Database (FDB). OpenIFS writes data in GRIB format and, for ease of use, the project will also see the installation of the MARS client on HECToR; we are investigating the installation of Metview too. The FDB maintains a metadata index which stores the location of the model fields in output files for later retrieval by the MARS client. FDB has been shown to scale well on the IBM P6/GPFS architecture; we are hopeful that the same will be true on Cray/Lustre, and the project will determine whether this is the case.
The project has five work packages covering software installation and testing, model verification, development of metrics for verification, optimization and performance, and product delivery. Mark Richardson (NAG) and Glenn Carver (ECMWF) are our partners and have been working on installing the FDB software (and its dependencies) on HECToR for use as modules by the community as part of work package 1. That work is nearly complete.
We are beginning to write the scripts needed to extract the source code, mirror it to HECToR, and build the model executable on the supercomputer; these scripts will form the basis of the job submission system.
The project timeline is presented in the attached file (openifs_fdb_hector.pdf).
Considerable progress has been made with this project. The OpenIFS repository has been created on PUMA and the OpenIFS code installed. Mark has installed the necessary GRIB libraries and the FDB libraries on HECToR under a package account for maintainability, and has set up environment modules to help manage the software installation. The FDB calls have been activated in the OpenIFS code and we have successfully extracted and built the model on HECToR. Glenn has made available several models at differing resolutions for testing; we have concentrated on the t1279 model (about 25 km global resolution). Several scenarios have been investigated - the results are summarized in the figure below. A baseline performance with minimal IO was established for 1 and 8 OMP threads; the model scales well, especially when run with 8 OMP threads (see figure below right). OpenIFS running with its 'single-writer' output scheme shows the characteristic performance slow-down as a single processor throttles the rest in order to gather and output data. Our test generates hourly output totalling 58 GB per model day; time spent performing output is a very significant portion of the total (~25% of wallclock for 1024 processors, rising to ~40% for 4096 processors). Running with the FDB scheme shows an impressive performance improvement: the time taken for asynchronous multiprocessor output is virtually hidden and amounts to no more than 5% of the wallclock time for the run.
How to get an OIFS job running on HECToR
There are a couple of preliminary administrative tasks which need to be taken care of first.
- Your institution will need an OpenIFS license - check with Glenn Carver (ECMWF)
- You will need an account on HECToR
- You will need to be a member of the oifs package account on HECToR. Contact CMS to arrange to be added to this group.
- You will need to have access to the OpenIFS code repository on PUMA - contact CMS
It will be helpful if you have some knowledge of the Flexible Configuration Management (FCM) system. We use FCM to manage both the OpenIFS code and the OpenIFS build. Much of the work needed to get you started on HECToR has already been done, but you will work more efficiently if you become familiar with some simple FCM operations, such as checkout, commit, add, revert and branch-create. FCM is described in depth in the online documentation at http://metomi.github.io/fcm/doc/ and an FCM tutorial is available at http://cms.ncas.ac.uk/wiki/Fcm.
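As an illustrative sketch of these operations (the branch, user and file names below are placeholders, not real entries in the OpenIFS repository), a typical FCM session might look like:

fcm branch-create my_fix fcm:oifs/trunk
fcm checkout fcm:oifs/branches/dev/myuser/38r1v02_my_fix
cd 38r1v02_my_fix
fcm add new_routine.F90
fcm commit
fcm revert changed_routine.F90

Here branch-create makes a development branch from the trunk, checkout gives you a local working copy, add schedules a new file for addition, commit sends your local changes back to the repository, and revert discards local changes to a file.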
Setting the HECToR environment
You will need to set up the correct environment to build and run OIFS jobs. Since HECToR has different hardware on its compute and service nodes, there is an extra consideration when using some OIFS-related utilities on the service nodes (more later). Setting the environment simply involves loading the appropriate environment module; at the HECToR command line, type
module use ~oifs/modules
module load openifs_cce/0.0.2
Loading the openifs module sets the paths required for the build and run, and performs some checks to ensure that there is no conflict with currently loaded modules. Typing
module show openifs_cce/0.0.2
gives details of what the module does, namely
module-whatis CCE versions of support libraries to build OpenIFS
conflict PrgEnv-gnu
conflict PrgEnv-pgi
prereq cce/8.1.8
module load grib_api_cce/0.0.2
module load fdb_cce/0.0.2
prepend-path PATH /work/n02/n02/hum/fcm/bin
setenv OIFS_IFSDATA /work/y07/y07/oifs/data/ecmwf/ifsdata
setenv OIFS_ARCH x86_64
setenv OIFS_COMP cce_fdb
setenv OIFS_BUILD opt
If your normal environment is not set up with the Cray programming environment, an attempt to load openifs_cce/0.0.2 will fail; it will also fail unless you have the Cray cce/8.1.8 compiler loaded. Use module swap to ensure an appropriate starting module configuration.
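For example, if your account defaults to the GNU programming environment, a sequence like the following will get you to a suitable starting point (module names as on HECToR Phase 3; check module avail for the versions actually installed):

module swap PrgEnv-gnu PrgEnv-cray
module swap cce cce/8.1.8
module use ~oifs/modules
module load openifs_cce/0.0.2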
We have created two other modules required for OpenIFS, shown below:
module show grib_api_cce/0.0.2

-------------------------------------------------------------------
/usr/local/packages/oifs/modules/grib_api_cce/0.0.2:

conflict PrgEnv-gnu
conflict PrgEnv-pgi
conflict PrgEnv-pathscale
prereq cce/8.1.8
ECMWF library GRIB_API on HECToR Phase 3
Open IFS support libraries installed at /work/y07/y07/oifs/install
Platform is cce_8.1.8-il
Version of library GRIB_API is 1.9.18
setenv GRIB_API_PATH /work/y07/y07/oifs/install/grib_api/1.9.18/cce_8.1.8-il
prepend-path PATH /work/y07/y07/oifs/install/grib_api/1.9.18/cce_8.1.8-il/bin
prepend-path LD_LIBRARY_PATH /work/y07/y07/oifs/install/grib_api/1.9.18/cce_8.1.8-il/lib
setenv GRIB_DEFINITION_PATH /work/y07/y07/oifs/install/grib_api/1.9.18/cce_8.1.8-il/share/grib_api/definitions
setenv GRIB_SAMPLES_PATH /work/y07/y07/oifs/install/grib_api/1.9.18/cce_8.1.8-il/share/grib_api/ifs_samples/grib1_mlgrib2
module-whatis Support library for ECMWF software : GRIB_API on HECToR Phase 3
-------------------------------------------------------------------
module show fdb_cce/0.0.2

-------------------------------------------------------------------
/usr/local/packages/oifs/modules/fdb_cce/0.0.2:

conflict PrgEnv-gnu
conflict PrgEnv-pgi
conflict PrgEnv-pathscale
prereq cce/8.1.8
prereq grib_api_cce/0.0.2
ECMWF library FDB on HECToR Phase 3
Open IFS support libraries installed at /work/y07/y07/oifs/install
Platform is cce_8.1.8-il
Version of library FDB is 5.0.0
Version of library ECKIT is 0.3.0
setenv ECKIT_PATH /work/y07/y07/oifs/install/eckit/0.3.0/cce_8.1.8-il
prepend-path LD_LIBRARY_PATH /work/y07/y07/oifs/install/eckit/0.3.0/cce_8.1.8-il/lib
setenv FDB_PATH /work/y07/y07/oifs/install/fdb/5.0.0/cce_8.1.8-il
prepend-path PATH /work/y07/y07/oifs/install/fdb/5.0.0/cce_8.1.8-il/bin
prepend-path LD_LIBRARY_PATH /work/y07/y07/oifs/install/fdb/5.0.0/cce_8.1.8-il/lib
module-whatis ECMWF library FDB on HECToR Phase 3
-------------------------------------------------------------------
Check out the OIFS code
On HECToR, create a suitably named directory in your /home space. We recommend building in /home, which handles small files more efficiently than /work and so gives a quicker build. Now check out the OIFS code tree with fcm co. Here is an example:
fcm co fcm:oifs/branches/dev/grenville/38r1v02_fdb_updates@16
This cumbersome command can be made simpler by using FCM keywords, and of course you can use aliasing to reduce the amount of typing required. However, it is instructive to see the command in its entirety (in fact we have already defined the keyword oifs to save some typing). The text after fcm co is the URL for the code, which exposes some of the structure of the underlying code repository. In this case the code is in the branches section of the repository, owned by developer grenville; the branch is called 38r1v02_fdb_updates and the code is at (@) revision number 16. This is just standard FCM (svn) terminology.
This is the simplest way to get OIFS code - it will often be the case that you wish to extract from several branches and possibly a working copy - see later.
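For reference, an FCM location keyword such as oifs is defined in a keyword configuration file; for FCM 2 this is typically $HOME/.metomi/fcm/keyword.cfg (site-wide keywords may already be set up centrally, as the oifs keyword is here). The repository URL below is a placeholder, not the real PUMA address:

# $HOME/.metomi/fcm/keyword.cfg
location{primary}[oifs] = svn://puma.example.ac.uk/OIFS_svn/OIFS

With this in place, fcm co fcm:oifs/trunk expands to the full repository URL.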
Build the OIFS executable
After executing this command on HECToR, you will have a complete copy of the code in your /home space. A set of configuration files is provided in the oifs/make sub-directory for several possible compilers and options. When you loaded the openifs_cce/0.0.2 module, you set several environment variables (namely, OIFS_ARCH, OIFS_COMP, and OIFS_BUILD) which determine the particular configuration files to choose. In the make directory type
fcm make -j 4 -vv -f cfg/oifs.cfg
this runs the compile and link phases of the build, using 4 processes (-j 4), with verbose output (-vv), based on the configuration file cfg/oifs.cfg (the argument of the -f option); see the FCM documentation for more fcm make options. This command may take some time to complete.
The executable, currently called master.exe, will be in make/opt/oifs/bin.
Getting data to run the model
Now that you have an executable, you will need some data. The source for data is Glenn Carver at ECMWF; Glenn can provide start data for configurations at several resolutions and for various times. We have several jobs available for model testing, at t159, t511 and t1279 resolution. Each job comes with the data required to start the model, a namelist defining a set of model parameters, a PBS script (called oifs.job) set up for a "standard" run, and an example trace output file. Below we describe the important parts of the PBS script used to submit jobs to the compute nodes.
Running the model
The file has the usual PBS directives for reserving HPC resources; in this case we have requested 1024 processors on fully populated nodes, with a time limit of 25 minutes. You should edit this to specify your HECToR account and email details, and change the HPC resource request as appropriate.
#!/bin/bash --login
#
# Jobname
#PBS -N openifs
# Total no. of tasks (MPI x OpenMP) for job
#PBS -l mppwidth=1024
# No. of tasks per node (1-32)
#PBS -l mppnppn=32
# Time for the job
#PBS -l walltime=00:25:00
# Account
#PBS -A n02-cms
# Squash output & error output together
#PBS -j oe
# Mail options
#PBS -M email@example.com
#PBS -m ae
The remainder of the PBS script refers more specifically to your job.
set -x

# Make sure any symbolic links are resolved to absolute path
export PBS_O_WORKDIR=$(readlink -f $PBS_O_WORKDIR)

# Change to the directory the job was submitted from
cd $PBS_O_WORKDIR
Currently, you will need to set OIFS_HOME to point to your oifs code tree:
# Root directory of OpenIFS
export OIFS_HOME=$HOME/oifs_extract/extract/oifs

# use assign statement to read f77 unformatted files as big-endian
# as model was not compiled with -h byteswapio
export FILENV=$PBS_O_WORKDIR/.myassign
assign -N swap_endian g:su
The following section allows you to specify the compute decomposition - this example will use 128 MPI tasks (NPROC), each with 8 OMP threads (NTHREADS). Note that NPROC x NTHREADS = 1024, which is the number specified for mppwidth in the header section of the file. EXPID refers to the ECMWF-supplied experiment id. RUN is simply a means of distinguishing output from different runs (the names of output directories incorporate RUN). TSTEP is the model time step in seconds (in general the best value has been determined by ECMWF). You probably won't change MASTER or NAMELIST.
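The constraint that the decomposition must match the PBS request can be checked with a few lines of shell before launching the model. This is an optional sketch, using the values from this example; MPPWIDTH here is set by hand to mirror the mppwidth directive in the job header:

```shell
# Sanity-check: MPI tasks x OMP threads must equal the PBS mppwidth request
NPROC=128
NTHREADS=8
MPPWIDTH=1024    # keep in step with '#PBS -l mppwidth=...' above

if [ $((NPROC * NTHREADS)) -ne "$MPPWIDTH" ]; then
    echo "Decomposition error: ${NPROC} x ${NTHREADS} != ${MPPWIDTH}" >&2
    exit 1
fi
echo "Decomposition OK: ${NPROC} MPI tasks x ${NTHREADS} OMP threads = ${MPPWIDTH} PEs"
```

A check like this fails the job immediately with a clear message, rather than leaving aprun to abort with a less obvious error.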
The script ifs_run.sh sets up the aprun command based on its arguments. The model resolution is specified through -r (t511 in this case) and the length of the model run (in model days; one day in this case) is specified through the -f option.
# Run model
EXPID=fw13
NPROC=128
NTHREADS=8
RUN=9
TSTEP=600
MASTER=$OIFS_HOME/make/opt/oifs/bin/master.exe
NAMELIST=ecmwf/$EXPID.namelistfc

$OIFS_HOME/make/cfg/ifs_run.sh -m $MASTER -r 511 -e $EXPID -s $TSTEP -n $NPROC \
    -t $NTHREADS -f d1 -x $RUN -l $NAMELIST

exit 0
Once you have made the edits appropriate for your job, submit it from the HECToR command line with

qsub oifs.job
You can run the model in one of two IO modes. A namelist setting (more later) is all that is required to switch between the model generating its output synchronously through a single processor or asynchronously through multiple processors. Jobs which output large volumes of data benefit significantly, in terms of reduced wall-clock time, from asynchronous IO (the FDB IO model). The model writes its output into two sub-directories of the directory from which the job was submitted. For the above example the following files appear:
-rwxr-xr-x 1 grenvill n02 159249451 Sep 27 10:18 master.exe
-rw-r--r-- 1 grenvill n02      4587 Sep 27 10:18 fort.4
drwxr-sr-x 3 grenvill n02      4096 Sep 27 10:24 fdb9
-rw------- 1 grenvill n02    784401 Sep 27 10:26 openifs.o1706747
drwxr-sr-x 2 grenvill n02      4096 Sep 27 10:28 output9
If run in single-writer mode, the grib output is in output9 (note we set RUN=9 in the job submission script which is where the 9 comes from in the output directory names) and in FDB mode the output is in fdb9. In either case the output is in grib format - the organization of data in the output files differs in FDB and non-FDB runs, but the field data is identical (bit compares) in each case.
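The grib_api command-line tools (on PATH once grib_api_cce/0.0.2 is loaded) are convenient for inspecting and comparing the output. The file names below are placeholders for your actual output files:

grib_ls output9/my_fields.grb
grib_count output9/my_fields.grb
grib_compare single_writer.grb fdb_output.grb

grib_ls lists the fields in a GRIB file, grib_count reports the number of messages it contains, and grib_compare reports any fields that differ between two files - a quick way to confirm that single-writer and FDB runs produce identical field data.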