Version 7 (modified by grenville, 7 years ago)

OpenIFS IO

The OpenIFS project is an initiative from ECMWF that will deliver a portable version of its Integrated Forecasting System to the academic community.

NCAS-CMS has agreed to host an OpenIFS repository on PUMA. This will allow those in the academic community with OpenIFS licences to access OpenIFS and run it on HECToR. The initial version of OpenIFS has been installed in the repository.

OpenIFS currently serializes its output through a single processor. This can become a major performance bottleneck when large volumes of data are written, which will likely be the case for high-resolution OpenIFS integrations. NCAS-CMS is leading a 6-month HECToR dCSE project (NAG), in collaboration with ECMWF, which aims to implement a more sophisticated I/O model in OpenIFS on HECToR, in which I/O is performed in parallel and asynchronously with computation. The method is already used by the operational IFS and relies on the so-called Fields Database (FDB). The FDB maintains a metadata index which stores the location of the model fields in the output files for later retrieval by the MARS client. OpenIFS writes data in GRIB format and, for ease of use, the project will also implement a MARS client on HECToR. We are also investigating the installation of Metview. FDB has been shown to scale well on the IBM P6/GPFS architecture; we are hopeful that the same will hold on Cray/Lustre, and the project will determine whether it does.

The project has five work packages covering software installation and testing, model verification, development of metrics for verification, optimization and performance, and product delivery. Our partners, Mark Richardson (NAG) and Glenn Carver (ECMWF), have been installing the FDB software (and its dependencies) on HECToR as part of work package 1, so that the community can use it through modules. That work is nearly complete.

We are beginning to write the scripts needed to extract the source code, mirror it to HECToR, and build the model executable on the supercomputer. These scripts will be the basis of the job submission system.

The project timeline is presented in the attached file (openifs_fdb_hector.pdf).

Aug. 8th

Considerable progress has been made with this project. The OpenIFS repository has been created on PUMA and the OpenIFS code installed. Mark has installed the necessary GRIB libraries and the FDB libraries on HECToR under a package account for maintainability, and has set up environment modules to help manage the software installation. The FDB calls have been activated in the OpenIFS code and we have successfully extracted and built the model on HECToR. Glenn has made available several models at differing resolutions for testing. We have concentrated on the t1279 model (about 25km global resolution).

Several scenarios have been investigated; the results are summarized in the figure below. A baseline performance with minimal I/O was established for 1 and 8 OMP threads: the model scales well, especially when run with 8 OMP threads (see figure below right). OpenIFS running with its 'single-writer' output scheme shows the characteristic performance slowdown as a single processor throttles the rest in order to gather and output data. Our test generates hourly output totalling 58GB per model day, and the time spent performing output is a very significant portion of the total (~25% of wallclock for 1024 processors, rising to ~40% of wallclock for 4096 processors). Running with the FDB scheme shows an impressive performance improvement: the time taken for asynchronous multiprocessor output is virtually hidden and amounts to no more than 5% of the wallclock time for the run.

How to get an OIFS job running on HECToR

There are a couple of preliminary administrative tasks which need to be taken care of first.

  • Your institution will need an OpenIFS license - check with Glenn Carver (ECMWF)
  • You will need an account on HECToR
  • You will need to be a member of the oifs package account on HECToR. Contact CMS to arrange to be added to this group.
  • You will need to have access to the OpenIFS code repository on PUMA - contact CMS

It will be helpful if you have some knowledge of the Flexible Configuration Management (FCM) system. We use FCM to manage both the OpenIFS code and the OpenIFS build. Much of the work needed to get you started on HECToR has already been done, but your work will be more efficient if you become familiar with some simple FCM operations, such as commit, check-out, add, revert and branch-create. FCM is described in depth in the online documentation at http://metomi.github.io/fcm/doc/ and an FCM tutorial is available at http://cms.ncas.ac.uk/wiki/Fcm.

Check out the OIFS code

You will need to set up the correct environment to build and run OIFS jobs. Since HECToR has different hardware on its compute and service nodes, there is an extra consideration when using some OIFS-related utilities on the service nodes (more later). Let's assume that you are interested in building and running the model only for now. At the HECToR command line, type

module use ~oifs/modules
module load openifs_cce/0.0.2

Loading the openifs module sets some paths required for the build and run and has some checking to ensure that there is no conflict with currently loaded modules. Typing

module show openifs_cce/0.0.2

gives details of what the module does, namely

module-whatis    CCE versions of support libraires to build OpenIFS 
conflict         PrgEnv-gnu 
conflict         PrgEnv-pgi 
prereq   cce/8.1.8 
module           load grib_api_cce/0.0.2 
module           load fdb_cce/0.0.2 
prepend-path     PATH /work/n02/n02/hum/fcm/bin 
setenv           OIFS_IFSDATA /work/y07/y07/oifs/data/ecmwf/ifsdata 
setenv           OIFS_ARCH x86_64 
setenv           OIFS_COMP cce_fdb 
setenv           OIFS_BUILD opt 

If your normal module setup uses the gnu or pgi programming environment, an attempt to load openifs_cce/0.0.2 will fail. It will also fail unless you have the Cray cce/8.1.8 compiler loaded. Use module swap to ensure an appropriate starting module configuration.
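The conflict and prereq rules listed by module show can be checked before attempting the load. The following is a minimal sketch, not part of the oifs modules: check_oifs_modules is a hypothetical helper, and it assumes that the colon-separated list of loaded modules is available in the LOADEDMODULES variable, as maintained by the Environment Modules package.

```shell
# Sketch only: test a colon-separated module list against the
# conflict/prereq rules reported by "module show openifs_cce/0.0.2".
# check_oifs_modules is a hypothetical helper name.
check_oifs_modules() {
    oifs_loaded=":$1:"
    oifs_rc=0
    # conflict: PrgEnv-gnu or PrgEnv-pgi (with or without a version suffix)
    case "$oifs_loaded" in
        *:PrgEnv-gnu*|*:PrgEnv-pgi*)
            echo "conflict: a gnu or pgi programming environment is loaded"
            oifs_rc=1 ;;
    esac
    # prereq: exactly cce/8.1.8
    case "$oifs_loaded" in
        *:cce/8.1.8:*) ;;
        *) echo "missing prereq: cce/8.1.8"; oifs_rc=1 ;;
    esac
    [ "$oifs_rc" -eq 0 ] && echo "ok"
    return "$oifs_rc"
}

# Check the current session; "|| true" keeps the script going either way.
check_oifs_modules "${LOADEDMODULES:-}" || true
```

If a conflict is reported, a swap such as module swap PrgEnv-gnu PrgEnv-cray is the usual remedy on Cray systems. Whether that swap also brings in cce/8.1.8 depends on the site defaults, so it may be necessary to swap the cce version as well.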

On HECToR create a suitably named directory in your /home space. We recommend building in /home, which handles small files more efficiently than /work and so gives a quicker build. Now check out the OIFS code tree with fcm co. Here is an example:

fcm co fcm:oifs/branches/dev/grenville/38r1v02_fdb_updates@16

This cumbersome command can be made simpler by use of FCM keywords, and of course you can use aliasing to reduce the amount of typing required. However, it is instructive to see the command in its entirety (in fact we have already defined the keyword oifs to save some typing). The text after fcm co is the URL for the code, which exposes some of the structure of the underlying code repository. In this case the code is in a branch owned by developer grenville; the branch is called 38r1v02_fdb_updates and the code is at (@) revision number 16. This is just standard FCM (svn) terminology.
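To make the structure of the URL concrete, here is a small illustrative sketch that splits a location of the form fcm:KEYWORD/branches/dev/OWNER/BRANCH@REVISION into its parts using plain shell parameter expansion. parse_fcm_branch is a hypothetical helper, not an FCM command, and it assumes the URL has exactly this layout, including the @REVISION suffix.

```shell
# Sketch only: split fcm:<keyword>/branches/dev/<owner>/<branch>@<revision>.
# parse_fcm_branch is a hypothetical helper, not part of FCM.
parse_fcm_branch() {
    fcm_url=$1
    fcm_rev=${fcm_url##*@}        # text after the last '@'
    fcm_path=${fcm_url%@*}        # everything before the '@'
    fcm_branch=${fcm_path##*/}    # last path component
    fcm_rest=${fcm_path%/*}       # path without the branch name
    fcm_owner=${fcm_rest##*/}     # component just before the branch name
    fcm_key=${fcm_path%%/*}       # leading "fcm:<keyword>"
    echo "keyword=${fcm_key#fcm:} owner=$fcm_owner branch=$fcm_branch revision=$fcm_rev"
}

parse_fcm_branch "fcm:oifs/branches/dev/grenville/38r1v02_fdb_updates@16"
# prints: keyword=oifs owner=grenville branch=38r1v02_fdb_updates revision=16
```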

This is the simplest way to get OIFS code - it will often be the case that you wish to extract from several branches and possibly a working copy - see later.

After executing this command on HECToR, you will have a complete copy of the code in your /home space. A set of configuration files is provided in the oifs/make sub-directory for several possible compilers and options. When you loaded the openifs_cce/0.0.2 module, you set several environment variables which determine the particular configuration files to choose. In the make directory type

fcm make -j 4 -vv -f cfg/oifs.cfg

This runs the compile and link phases of the build using 4 processors (-j 4), with verbose output (-vv), based on the configuration file cfg/oifs.cfg (the -f option). See the FCM documentation for more options for the fcm make command.
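Because the build depends on the environment variables set by the openifs module, it can be useful to confirm they are present before starting. The sketch below is one possible wrapper, under stated assumptions: oifs_env_missing is a hypothetical helper, and the choice of OIFS_ARCH, OIFS_COMP and OIFS_BUILD as the variables that select the configuration is inferred from the module show output rather than confirmed.

```shell
# Sketch only: report which of the build-related variables are unset or empty.
# oifs_env_missing is a hypothetical helper; the variable list is an assumption
# based on the "module show openifs_cce/0.0.2" output.
oifs_env_missing() {
    oifs_missing=""
    for oifs_var in OIFS_ARCH OIFS_COMP OIFS_BUILD; do
        eval "oifs_value=\${$oifs_var:-}"
        [ -n "$oifs_value" ] || oifs_missing="$oifs_missing $oifs_var"
    done
    echo "${oifs_missing# }"    # strip the leading space
}

oifs_unset=$(oifs_env_missing)
if [ -n "$oifs_unset" ]; then
    echo "not set: $oifs_unset (load the openifs module first)"
elif command -v fcm >/dev/null 2>&1 && [ -f cfg/oifs.cfg ]; then
    # Same build command as given in the text.
    fcm make -j 4 -vv -f cfg/oifs.cfg
else
    echo "fcm or cfg/oifs.cfg not found; run this from the make directory"
fi
```

Run from the make directory after loading the module, this behaves like the plain fcm make command; elsewhere it only prints a diagnostic.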
