
Version 71 (modified by cthomas, 4 years ago)

Notes on NEMO/NEMOVAR cycling suite for assimilation

Introduction

Operation of the NEMO/NEMOVAR assimilation code is controlled by the Rose suite labelled puma-aa184, which is based on puma-aa164 (itself a copy of puma-aa145). The purpose of this suite is to assimilate data into a high-resolution ocean model using a cycling framework.

The setup of the suite is described in Suite setup and its operation is described in Suite operation. More detail on the NEMO/NEMOVAR operation can be found in the NEMO/NEMOVAR section.

Suite setup

Structure

The structure of the suite is the standard one:

  • suite.rc: suite structure and operation
  • rose-suite.info: basic suite info
  • rose-suite.conf: configuration of variables used in the suite (including paths), picked up in suite.rc
  • meta/rose-meta.conf: information about the variables defined in rose-suite.conf

Information on Rose can be found here and a guide to cylc can be found here.

The suite itself is quite simple and acts as a springboard for many other processes to run.

Variables

These are the variables that are defined in rose-suite.conf and used in suite.rc.

  • START_POINT: Start point of run
  • RUN_LENGTH: Length of run
  • CYCLE_LENGTH: Length of assimilation cycle
  • WALL_CLOCK_LIMIT: Wall clock limit
  • CALENDAR: Calendar used (default '365day')
  • CICE_COL: Number of CICE columns
  • CICE_MAXBK: Maximum number of blocks per processor for CICE
  • CICE_ROW: Number of CICE rows
  • COMPUTE_HOST: Host to use for compilation and/or model run
  • EXTRACT_HOST: Host to use for code extraction
  • NEMO_IPROC: Number of NEMO processes E-W
  • NEMO_JPROC: Number of NEMO processes N-S
  • RUNID: Prefix run ID for output files
  • BUILD: Build model? (Y/N)
  • RUN: Run model? (Y/N)
  • GROUP: User group

The variables which change most frequently are START_POINT, RUN_LENGTH, CYCLE_LENGTH and WALL_CLOCK_LIMIT (which needs to be larger for a longer assimilation cycle).

When assimilating large files it may also be necessary to increase the memory limit. This can be done by altering ConsumableMemory in the relevant place in suite.rc.

Apps included

The suite contains several apps which are used to perform auxiliary tasks such as retrieving satellite data for a particular day. The apps are listed in rose-suite.conf:

  • file:app/daily_fluxes
  • file:app/daily_observations
  • file:app/fixed_ancillary
  • file:app/fixed_restarts
  • file:app/nemovar
  • file:app/nemo_cice
  • file:app/fcm_make_nemo

By default each app points towards a location on SVN. The SVN locations are all subdirectories of the following directory:

svn://puma.nerc.ac.uk/NEMOCICE_svn/UKMO/branches/dev/annette/r5062_standalone_apps_MONSooN/config/Rose/apps/

Each subdirectory can be checked out locally and modified, in which case the path specified in rose-suite.conf must be changed to point to the local version.

The subdirectories all contain a rose-app.conf file along with other files/subdirectories. Each app is discussed in more detail below.

Daily fluxes

The app is called daily_fluxes and corresponds to the subdirectory operational_atmospheric_forcing. This app copies the daily flux files. For day D and cycle length C, flux files from days (D - 1) to (D + C) must be present in the directory INPUT_DIR.

  • app-interface.conf: defines IN_ADDITIONAL_DAYS and OUTDIR_FLUXES
  • rose-app.conf: sets the default command to prepare_fluxes, defines INPUT_DIR
  • bin/prepare_fluxes: Script to copy flux files from INPUT_DIR to OUTDIR_FLUXES and unzip them
  • opt/rose-app-FOAM_v13.conf: defines INPUT_DIR (this is not used)

The files in INPUT_DIR must be listed in prepare_fluxes:

set -A FILESTEMS <filestem1> <filestem2> ...

where each filestem is the part of the filename that precedes the date. For example, if the stem is flux then the full filename for 1 January 2011 would be flux_y2011m01d01.nc.
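
The date window described above (days D - 1 to D + C) can be sketched in shell as follows. This is only an illustration: required_flux_files is a hypothetical helper that does not appear in prepare_fluxes, it assumes GNU date is available and that the cycle length is a whole number of days, and it uses the Gregorian calendar, whereas the suite default CALENDAR is '365day'.

```shell
# Sketch only: list the flux files needed for one cycle.
# required_flux_files is a hypothetical helper, not part of prepare_fluxes.
# Assumes GNU date and a Gregorian calendar (the suite default is '365day').
required_flux_files() {
    local stem="$1" day="$2" cycle_days="$3"
    local start offset
    # Seconds since the epoch at 00:00 UTC on day D
    start=$(date -u -d "$day" +%s) || return 1
    offset=-1
    # Walk from D - 1 to D + C inclusive
    while [ "$offset" -le "$cycle_days" ]; do
        # Same naming pattern as flux_y2011m01d01.nc
        date -u -d "@$(( start + offset * 86400 ))" "+${stem}_y%Ym%md%d.nc"
        offset=$(( offset + 1 ))
    done
}

required_flux_files flux 2011-01-01 1
```

For D = 1 January 2011 and C = 1 this lists three files, from flux_y2010m12d31.nc to flux_y2011m01d02.nc, which can be compared against the contents of INPUT_DIR before starting a run.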

prepare_fluxes then copies and unzips the relevant flux file.

It is important to ensure the namelist reflects the contents of these files. This part of the namelist is modified in the NEMO-CICE rose-app.conf (see later).

Daily observations

The app is called daily_observations and corresponds to the subdirectory hindcast_observations. This app copies the daily observation files for assimilation into the particular cycle. Each file in the opt directory controls a different set of observations.

  • app-interface.conf: defines OUTDIR_OBSERVATIONS
  • rose-app.conf: defines INBASEDIR, OUTDIR and EXEC_DIR and sets a large number of namelist parameters
  • bin/NemoQcProg_ExtractAndProcess: Script to copy observation files into place and run the NEMO QC to convert them to 'feedback' format
  • bin/NemoScr_ObsPreProc: Script to run the NEMOQC code on operational observations (not used in this case)
  • opt/rose-app-{altimeter,ghrsst_avhrr,ghrsst_metop,profile,seaice1,seaice2,surface}.conf: (re)defines INSUBDIR, INOBSFILE and OUTOBSFILE and sets a namelist parameter

The observations are as follows:

  • Altimeter: satellite altimetry; AVISO and/or "realtime" measurements
  • GHRSST: sea surface temperature (SST) from two satellites (AVHRR, METOP)
  • Profile: profile measurements
  • Sea ice: sea ice
  • Surface: in-situ SST

The location of these files is controlled in the relevant rose-app-*.conf file; both an input directory (INBASEDIR) and file stem (INOBSFILE) are required. Similarly to the fluxes, the filenames must contain the date. Note that profile data may be monthly instead of daily.

The script NemoQcProg_ExtractAndProcess prepares the files by copying them, unzipping them and running NemoQcProg_ExtractAndProcess.exe, which applies some quality control.

Ancillary information

The app is called fixed_ancillary and corresponds to the subdirectory GO5_orca025_ancillary. This app copies ancillary files (usually one-offs that don't change during the suite's operation) into the output directory.

  • rose-app.conf: defines a variety of environment variables (pointing to NEMO data directories) and creates softlinks. The environment variables are: NEMO_ANCIL, NEMO_FORCE, NEMO_GRIDS, NEMO_INPUTS, NEMO_EXECS and NEMO_IODEF
  • bin/copy_ancilliary: empty
  • bin/operational_ancilliary: creates a large number of softlinks

Example ancillary files include bathymetry, x/y positions, rivers, and XML configuration of NEMO.

Initial restarts

The app is called fixed_restarts and corresponds to the subdirectory initial_restarts. This app copies the initial restarts which are used as the background state on the first assimilation. If the cycling starts on date D, the restarts must be present for that date. In further iterations of the cycle, restarts are taken to be the analysis from the previous iteration; see Suite operation below.

  • rose-app.conf: defines RESTART_DIR
  • bin/copy_restarts: creates a link from RESTART_DIR to OUTDIR_RESTARTS

NEMOVAR

The app is called nemovar and corresponds to the subdirectory nemovar. This app configures NEMOVAR.

  • rose-app.conf: defines a large number of namelist parameters
  • app-interface.conf: defines INDIR_ANCILLARY, INDIR_ALTBIAS, INDIR_INNOVATIONS, OUTDIR_ALTBIAS and OUTDIR_INCREMENTS
  • bin/helper_functions.sh: two helper functions
  • bin/init_nemovar.sh: initialise NEMOVAR
  • bin/run_nemovar.sh: run NEMOVAR
  • bin/run_sstbias.sh: calculate SST Bias
  • file/iodef.xml: standard NEMO file (controls fields and outputs)
  • file/xmlio_server.def: standard NEMO file
  • opt/rose-app-orca025.conf: defines a large number of namelist parameters
  • opt/rose-app-sstbias-orca025.conf: defines a large number of namelist parameters

The two scripts init_nemovar.sh and run_nemovar.sh are very important and are discussed in more detail in the NEMO/NEMOVAR section.

NEMO-CICE

The app is called nemo_cice and corresponds to the subdirectory GO5_nemo_GSI6_cice. This app configures NEMO-CICE. There are two stages: obsoper and IAU. A lot more detail can be found in the Suite operation and NEMO/NEMOVAR sections.

  • rose-app.conf: defines a large number of namelist parameters, including how to read in flux files
  • app-interface.conf: defines INDIR_EXEC, INDIR_ANCILLARY, INDIR_FLUXES, INDIR_BACKGROUND_RESTARTS, obsoper variables (INDIR_OBSERVATIONS, INDIR_ALTBIAS, OUTDIR_INNOVATIONS, OUTDIR_ASSIM_BACKGROUND), iau variables (INDIR_INCREMENTS, OUTDIR_ANALYSIS_RESTARTS)
  • bin/helper_functions.sh: two helper functions
  • bin/run_nemo_cice: patched version removes handle_observations and processing profb
  • bin/update_nemo_nl: script to update namelists
  • meta/rose-meta.conf: large number of definitions including of namelist variables
  • opt/rose-app-iau.conf: namelist definitions for iau
  • opt/rose-app-obsoper.conf: namelist definitions for obsoper

The run_nemo_cice script is particularly important here and is described more in the later sections.

Build

The app is called fcm_make_nemo and corresponds to the subdirectory fcm_make_puma_GO5_nemo_GSI6_cice. This app builds NEMO and related programs.

  • rose-app.conf: empty
  • file/fcm-make.cfg: links to other .cfg files
  • file/fcm-make-GO5.cfg: for compiling NEMO
  • file/fcm-make-GSI5.cfg: for compiling CICE
  • file/pwr6-xlf-opt.cfg: similar to a Makefile

Suite operation

This is a rough description of what happens when the suite runs. An example cylc dependency tree can be seen on this page. The suite operates in two stages: the first stage involves compilation and setup and the second stage is where the cycling occurs.

First stage:

  • build nemo
  • run fixed_ancillary (pulls in ancillary information)
  • run fixed_restarts (the background)
  • if all successful, run nemo_cice_obsoper (the observation operator step)

Second stage (cycling):

  • run daily_fluxes
  • run DAILY_OBSERVATIONS
  • if all successful, run nemo_cice_obsoper (produces innovations & forecast background)
  • DAILY_OBSERVATIONS 'suicides' if nemo_cice_obsoper succeeds
  • if nemo_cice_obsoper succeeds, run nemovar (produces increments)
  • if nemovar succeeds, run nemo_cice_iau at a restart 'P1D' (incremental analysis update - produces analysis restarts)
  • if nemo_cice_iau succeeds, run nemo_cice_obsoper again

In the first cycle the following three stages occur:

Stage 1:

Required input          Program run         Output
Ancillary               NEMO-CICE obsoper   Innovations
Background (restarts)                       Forecast background
Observations
Fluxes

Stage 2:

Required input        Program run   Output
Innovations           NEMOVAR       Increments
Forecast background

Stage 3:

Required input          Program run     Output
Ancillary               NEMO-CICE IAU   Analysis (restarts)
Background (restarts)
Increments
Fluxes

In subsequent cycles the required input in the first stage is

Required input
Ancillary
Analysis (restarts)
Observations
Fluxes

Runtime

When rose suite-run is executed a cylc task is started. This populates the $HOME/cylc-run/puma-aa164 directory with the following:

  • The app directory contains subdirectories which are listed in the original rose-suite.conf: daily_fluxes, daily_observations, fcm_make_nemo, fixed_ancillary, fixed_restarts, nemo_cice, nemovar
  • suite.rc is copied from the suite directory, adding in the definitions from rose-suite.conf. This file is then processed by substituting all of the variables to make suite.rc.processed
  • Two softlinks are also created to the $DATADIR/cylc-run/puma-aa164/{work,share} directories
  • Log files appear as always, the latest of which is linked to by log
  • cylc-suite.db, cylc-suite-env, rose-suite.info
  • state directory with current states at different times
  • meta/rose-meta.conf is the same as in the suite directory

Input data

Information on the input data can be found here.

Output data

Output is placed in $DATADIR/cylc-run/puma-aa164/work and $DATADIR/cylc-run/puma-aa164/share.

As stated on the Instructions page, the data currently take up:

  • work: 90 GB per day
  • share: 4 GB initial (fcm_make_nemo and data/ancillary) + 5.5 GB per run day
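
As a rough planning aid, the figures above can be turned into an estimate for a run of a given length. The sketch below is only arithmetic on the quoted numbers; estimate_disk_gb is a hypothetical helper and the figures themselves may be out of date.

```shell
# Sketch only: estimate_disk_gb is a hypothetical helper.
# Figures are the ones quoted above: work 90 GB per day,
# share 4 GB initial plus 5.5 GB per run day.
estimate_disk_gb() {
    LC_ALL=C awk -v days="$1" 'BEGIN {
        printf "work: %.1f GB\n", 90 * days
        printf "share: %.1f GB\n", 4 + 5.5 * days
    }'
}

estimate_disk_gb 2
```

For the default two-day run this gives 180.0 GB in work and 15.0 GB in share.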

The contents of each directory are described in the next two sections.

share

The subdirectories are:

  • cycle: output from each cycle (subdirectories analysis, assim_background, background, fluxes, increments, innovations, observations, updated_altbias)
  • data: ancillary data (various .nc)
  • fcm_make_nemo: output of the build procedure (e.g. .o, .mod, .f90 files)

work

There is one subdirectory for each day of the run. Each subdirectory contains the rose-app-run.conf file that was present in the original directory. The additional files produced in the default setup are as follows. NNNN indicates the particular subjob number (from 0000 to 0192 in the default setup) and DDDD indicates a date.

  • daily_fluxes: nothing else in here
  • daily_observations_{altimeter,ghrsst_avhrr,ghrsst_metop,profile,seaice1,seaice2,surface}: various namelist and .nc files
  • fcm_make_nemo: the .cfg files listed above
  • fixed_ancillary: nothing else in here
  • fixed_restarts: nothing else in here
  • nemo_cice_iau: some NEMO input files, altbias_NNNN.nc, coordinates_NNNN.nc, geothermal_heating_NNNN.nc, nemo_nemovari.DDDD.nc, nemo_nemovaro_00000072_analysis_restart_NNNN.nc, nemo_nemovaro_00000072_pcbias_NNNN.nc, pcbias_NNNN.nc, restart_NNNN.nc
  • nemo_cice_obsoper: some NEMO input files, altbias_NNNN.nc, assim_background_state_DI_NNNN.nc, assim_background_state_Jb_NNNN.nc, coordinates_NNNN.nc, geothermal_heating_NNNN.nc, nemo_nemovari.DDDD.nc, nemo_nemovaro_00000072_analysis_restart_NNNN.nc, nemo_nemovaro_00000072_pcbias_NNNN.nc, pcbias_NNNN.nc, restart_NNNN.nc, seaicefb_01_fdbk_NNNN.nc, seaicefb_02_fdbk_NNNN.nc, slafb_01_fdbk_NNNN.nc, sstfb_01_fdbk_NNNN.nc, sstfb_02_fdbk_NNNN.nc, sstfb_03_fdbk_NNNN.nc
  • nemovar: some NEMO input files, altbiasout_NNNN.nc, assim_background_state_Jb_NNNN.nc, background.mld.T_NNNN.nc, normalization.lookup_NNNN.nc, orca025l75_00000001_restart_NNNN.nc, orca025l75_00000001_restart_NNNN_NNNN.nc, ratio.out_NNNN.nc

NEMO/NEMOVAR

nemo_cice

There are two parts to this: nemo_cice_obsoper and nemo_cice_iau. They inherit some common features which are described below.

suite.rc

Common variables in suite.rc:

            ROSE_TASK_APP = nemo_cice
            CYCLE_LENGTH = {{CYCLE_LENGTH}} #'P1D'
            CICE_NPROC    = $NEMO_NPROC

            INDIR_EXEC=$EXEC_DIR
            INDIR_ANCILLARY=$ANCILLARY_DIR
            INDIR_FLUXES=$FLUX_DIR
            INDIR_BACKGROUND_RESTARTS=$BACKGROUND_DIR

App

app/nemo_cice - pulled in from SVN

Some extracts from rose-app.conf:

[command]
default=run_nemo_cice

[env]
EXEC_UTIL=/projects/jomp/danlea/NEMO/NEMO/bin
EXEC_NEMOVAR=/projects/jomp/danlea/NEMO/NEMOVAR/bin

[namelist:namrun]
cn_exp='orca025l75'
cn_ocerst_in='restart'
cn_ocerst_out='analysis_restart'
ln_clobber=.true.
ln_dimgnnn=.false.
ln_mskland=.false.
ln_rstart=.true.
nn_chunksz=0
nn_date0=20100715
nn_istate=0
nn_it000=1
nn_itend=72
nn_leapy=1
nn_no=0
nn_rstctl=0
nn_stock=600
nn_stocklist=0
nn_write=300
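
The value nn_itend=72 in this extract, together with the 'P1D' comment next to CYCLE_LENGTH in suite.rc, suggests 72 model time steps per one-day cycle (a 20-minute time step, if that reading is right). Under that assumption, the number of steps for a longer cycle could be worked out as sketched below. steps_per_cycle is a hypothetical helper, not taken from run_nemo_cice, and it only understands whole-day durations of the form PnD.

```shell
# Sketch only: steps_per_cycle is a hypothetical helper, not taken from run_nemo_cice.
# It handles whole-day ISO 8601 durations of the form PnD and nothing else.
steps_per_cycle() {
    local duration="$1" steps_per_day="$2" days
    case "$duration" in
        P*D) days="${duration#P}"; days="${days%D}" ;;
        *)   echo "unsupported duration: $duration" >&2; return 1 ;;
    esac
    # Reject an empty or non-numeric day count
    case "$days" in
        ''|*[!0-9]*) echo "unsupported duration: $duration" >&2; return 1 ;;
    esac
    echo $(( days * steps_per_day ))
}

# 72 steps per day is an assumption read off nn_itend=72 for a P1D cycle
steps_per_cycle P2D 72
```

With these inputs a two-day cycle comes out at 144 steps.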

Contents of app-interface.conf:

[default]
INDIR_EXEC=
INDIR_ANCILLARY=svn://fcm3/NEMO_svn/UKMO/branches/dev/chughes/r11231_standalone_apps/config/Rose/interface_types/nemo_cice_nemovar_ancillary.txt@11454
INDIR_FLUXES=svn://fcm3/NEMO_svn/UKMO/branches/dev/chughes/r11231_standalone_apps/config/Rose/interface_types/sbc_core_forcing.txt@11454
INDIR_BACKGROUND_RESTARTS=svn://fcm3/NEMO_svn/UKMO/branches/dev/chughes/r11231_standalone_apps/config/Rose/interface_types/nemo_cice_restarts.txt@11454

[obsoper]
INDIR_OBSERVATIONS=
INDIR_ALTBIAS=svn://fcm3/NEMO_svn/UKMO/branches/dev/chughes/r11231_standalone_apps/config/Rose/interface_types/altbias.txt@11458
OUTDIR_INNOVATIONS=
OUTDIR_ASSIM_BACKGROUND=

[iau]
INDIR_INCREMENTS=
OUTDIR_ANALYSIS_RESTARTS=svn://fcm3/NEMO_svn/UKMO/branches/dev/chughes/r11231_standalone_apps/config/Rose/interface_types/nemo_cice_restarts.txt@11454

run_nemo_cice

The script that steers the job is called run_nemo_cice. It has different sections for nemo_cice_obsoper and nemo_cice_iau.

The common part governs some setup and the running of the model. Extracts from the common part are:

# MODE - ['standalone', 'obsoper', 'iau']
# CYCLE_LENGTH - ISO8601 duration specifying the length of this cycle
# RUNID - identifying string
# INDIR_EXEC - path to executable
# INDIR_FLUXES - directory containing fluxes
# INDIR_ANCILLARY - directory containing ancillary files
# INDIR_BACKGROUND_RESTARTS - directory containing background restarts
# OUTDIR_ANALYSIS_RESTARTS - directory in which we put analysis restarts
# NEMO_IPROC - NEMO processors east-west
# NEMO_JPROC - NEMO processors north-south
# NEMO_NPROC - total number of NEMO processors
# CICE_COL - number of columns for CICE
# CICE_ROW - number of rows for CICE

NEMO_NL=namelist
CICE_NL=ice_in

# Link in common inputs
mkdir -p $CYLC_TASK_WORK_DIR/fluxes
ln -sf $INDIR_FLUXES/* $CYLC_TASK_WORK_DIR/fluxes
ln -sf $INDIR_ANCILLARY/* $CYLC_TASK_WORK_DIR
ln -sf $INDIR_BACKGROUND_RESTARTS/* $CYLC_TASK_WORK_DIR

# Re-link the coordinate and geothermal heating files per processor
i=0
while [[ $i -lt $NEMO_NPROC ]]; do
    ln -sf coordinates.nc coordinates_$(printf "%04d" $i).nc
    ln -sf geothermal_heating.nc geothermal_heating_$(printf "%04d" $i).nc
    let i=i+1
done

# Modify main NEMO namelist
update_namelist $NEMO_NL cn_exp "'${RUNID}o'"
update_namelist $NEMO_NL ln_rstart .true.
update_namelist $NEMO_NL nn_rstctl 0
update_namelist $NEMO_NL nn_it000 1
update_namelist $NEMO_NL nn_itend ${TIME_STEPS_PER_CYCLE}
update_namelist $NEMO_NL nn_date0 "$(rose date -c -f %Y%m%d)"
update_namelist $NEMO_NL nn_leapy ${NEMO_LEAP_YEAR_FLAG}
update_namelist $NEMO_NL jpni ${NEMO_IPROC}
update_namelist $NEMO_NL jpnj ${NEMO_JPROC}
update_namelist $NEMO_NL jpnij ${NEMO_NPROC}
update_namelist $NEMO_NL nitiaufin ${TIME_STEPS_PER_DAY}

# Modify main CICE namelist
update_namelist $CICE_NL days_per_year ${DAYS_IN_YEAR}
update_namelist $CICE_NL history_file "'${RUNID}i.${CICE_HISTFREQ_N}${CICE_HISTFREQ}'" #CT - could remove?
update_namelist $CICE_NL ice_ic "'cice_restart.dat'"
update_namelist $CICE_NL incond_file "'${RUNID}i_ic'"
update_namelist $CICE_NL istep0 ${TIME_STEPS_TO_START}
update_namelist $CICE_NL npt ${TIME_STEPS_PER_CYCLE}
update_namelist $CICE_NL restart .true.
update_namelist $CICE_NL restart_file "'cice_restart.dat'"
update_namelist $CICE_NL use_leap_years ${CICE_LEAP_YEAR_FLAG}
update_namelist $CICE_NL year_init $(rose date -c -f %Y)

echo ${INDIR_EXEC}/nemo.exe:${NEMO_NPROC} > Ocean.conf
rose mpi-launch -f Ocean.conf
NEMO_RC=$?
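
The extract relies on an update_namelist helper whose definition is not shown on this page. To make the calls easier to follow, here is a minimal stand-in that rewrites a "key=value" line in place. It is an assumption about what such a helper might do, not the suite's actual implementation, and it only handles one assignment per line.

```shell
# Hypothetical stand-in for the suite's update_namelist helper.
# Usage: update_namelist <namelist file> <key> <new value>
update_namelist() {
    local file="$1" key="$2" value="$3" escaped
    # Escape characters that are special in a sed replacement (\, & and the | delimiter)
    escaped=$(printf '%s' "$value" | sed -e 's/[\\&|]/\\&/g')
    # Replace the whole assignment, keeping any leading indentation
    sed -e "s|^\([[:space:]]*\)${key}[[:space:]]*=.*|\1${key}=${escaped}|" \
        "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}
```

Called as update_namelist namelist nn_itend 144, this sketch would turn a line reading nn_itend=72 into nn_itend=144 and leave every other line untouched. Quoted string values are passed with their quotes, as in the cn_exp call above.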

nemo_cice_obsoper

Prerequisites

nemo_cice_obsoper depends on fcm_make_nemo, fixed_ancillary, fixed_restarts, daily_fluxes, DAILY_OBSERVATIONS. During the cycling it also depends on the nemo_cice_iau stage.

suite.rc

In suite.rc, all of the nemo_cice variables listed above are inherited. In addition the following environment variables are defined:

            ROSE_APP_OPT_CONF_KEYS = "obsoper"
            INDIR_ALTBIAS=$ALTBIAS_DIR
            INDIR_OBSERVATIONS=$OBSERVATIONS_DIR
            OUTDIR_INNOVATIONS=$INNOVATIONS_DIR
            OUTDIR_ASSIM_BACKGROUND=$ASSIM_BACKGROUND_DIR

From the rose documentation on ROSE_APP_OPT_CONF_KEYS: Each KEY in this space delimited list switches on an optional configuration in an application. The configurations are applied in first-to-last order.
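
In practice each key maps onto a file named opt/rose-app-KEY.conf inside the app, which is how the value "obsoper" above picks up opt/rose-app-obsoper.conf. The sketch below just prints that mapping for a list of keys; opt_conf_files is a hypothetical name used for illustration and is not a Rose command.

```shell
# Sketch only: opt_conf_files is a hypothetical helper, not a Rose command.
# Prints the optional configuration file for each key, in first-to-last order.
opt_conf_files() {
    local key
    # Intentionally unquoted so the space-delimited list is split into keys
    for key in $1; do
        printf 'opt/rose-app-%s.conf\n' "$key"
    done
}

opt_conf_files "obsoper"
```

The same mapping would apply to the daily_observations keys (altimeter, profile and so on) listed earlier.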

App

Extracts from opt/rose-app-obsoper.conf:

[env]
MODE=obsoper

run_nemo_cice

Parts of run_nemo_cice that are specific to the obsoper stage.

Before the model has run:

    echo 'obsoper linking'
    # Altimeter bias
    ln -sf $INDIR_ALTBIAS/* $CYLC_TASK_WORK_DIR
    # Observations
    ln -sf $INDIR_OBSERVATIONS/* $CYLC_TASK_WORK_DIR

    function handle_observations {
        # Usage: handle_observations prefix namelist_file_list namelist_flags...
        PREFIX="$1"; shift;
        NL_FILE_LIST="$1"; shift;

        # Gather the list of observation files
        FILES=""
        for file in "${PREFIX}"_0*.nc ; do
            FILES="${FILES} '${file}'"
        done

        # If we find any observations
        if [ ! -z "${FILES}" ] ; then
            echo "Found ${PREFIX} observations ${FILES}"

            # Write the list of files into the namelist
            echo "Updating ${NL_FILE_LIST} with ${FILES}"
            update_namelist $NEMO_NL "${NL_FILE_LIST}" "${FILES}"

            # Set all the specified namelist flags (the remainder of the arguments) to true
            for FLAG in "${@}" ; do
                update_namelist $NEMO_NL ${FLAG} ".true."
            done
        fi
    }

    handle_observations seaice seaicefbfiles ln_seaicefb ln_seaice
    handle_observations sla slafbfiles ln_slafb ln_sla
    handle_observations sst sstfbfiles ln_sstfb ln_sst
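
One thing to be aware of with a loop of this shape: when no file matches the pattern, the shell (by default) leaves the pattern itself in place, so the list of files is never empty and the "found observations" branch is always taken. A minimal sketch of a guard against that is shown below. gather_files is a hypothetical name and this is not how run_nemo_cice is written; it only illustrates the existence check.

```shell
# Sketch only: gather_files is a hypothetical name, not part of run_nemo_cice.
# Prints a space-separated list of quoted file names, or an empty line if none exist.
gather_files() {
    local prefix="$1" files="" file
    for file in "${prefix}"_0*.nc; do
        # With no match the loop sees the literal pattern, so test for existence
        [ -e "${file}" ] || continue
        files="${files} '${file}'"
    done
    # Drop the leading space added by the first append
    printf '%s\n' "${files# }"
}
```

With sst_01.nc and sst_02.nc in the working directory, gather_files sst prints 'sst_01.nc' 'sst_02.nc', while a prefix with no matching files gives an empty result that a test such as [ -n "..." ] can detect.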

After the model has run:

        echo copying innovations
        # copy innovations into the right place
        mkdir -p $OUTDIR_INNOVATIONS
        # rebuild the innovations into the output directory
        cd ${CYLC_TASK_WORK_DIR}
        for OBSFILE in sstfb slafb seaicefb; do
            echo rebuilding and copying ${OBSFILE}
            ${EXEC_NEMOVAR}/fbcomb.exe ${OUTDIR_INNOVATIONS}/${OBSFILE}_01.nc ./${OBSFILE}*fdbk_*.nc
        done
        cd -
        # copy 'assim background' file into the right place
        echo copying assim background
        mkdir -p $OUTDIR_ASSIM_BACKGROUND
        ln -sf ${CYLC_TASK_WORK_DIR}/assim_background_state_Jb*.nc ${OUTDIR_ASSIM_BACKGROUND}

Outputs

Once nemo_cice_obsoper succeeds, nemovar runs.

Example runs

Default setup

The default setup is to run for two days with one day of cycling, starting on 1 January 2011. Much more information can be found here:

Note that for this particular date, no profile data are available, so the "daily_observations_profile" app will fail. The NEMO observation operator can still run without this data, however, and the suite should continue to run.

Different start date and run length

From the Instructions page:

To change the start date and run length, edit the START_POINT and RUN_LENGTH variables in the rose-suite.conf file. You will also need to edit the "daily_fluxes", "daily_observations" and "fixed_restarts" apps to point to the locations of the input data.

Adding other files

To do

  • Where is OUTDIR_RESTARTS set?
  • Check output in share and work directories