
Getting Initial Data

Start dumps, LBC files and ensemble data can be obtained from the Met Office archive system (MASS), which is accessed using the MOOSE client and holds roughly the latest five years of data. Start dumps for earlier periods can be constructed as GRIB files from ECMWF data.

To get the data, create a help desk ticket providing the following information,

  • dates and times required,
  • project name,
  • estimate of the data volume in GB,
  • where in MASS the data is located if it is not one of the standard sets below.

Getting data from the Met Office

The start dumps are now stored in MASS, which can be accessed via MONSooN or JASMIN using the MOOSE interface.

Instructions for setting up external access to MASS from JASMIN are available on the Collaboration Twiki.

To get data for 28/Sep/2011 we would type,

moo get moose:/opfc/atm/global/rerun/201109.file/20110928_qwqg00.T+0 .

(Note the trailing dot, which tells MOOSE to put the file in the current directory.) The transfer takes about 15 to 20 minutes.
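
To fetch dumps for several consecutive days it can be easier to loop over the dates. A minimal sketch, assuming the 00Z global run is wanted for each date (the date list is a placeholder):

#!/bin/ksh
# Sketch: fetch the 00Z global start dump for a list of dates.
for d in 20110926 20110927 20110928
do
    month=${d%??}         # drop the day, e.g. 201109, to get the month directory
    moo get moose:/opfc/atm/global/rerun/${month}.file/${d}_qwqg00.T+0 .
done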

To find out whether the data exists,

moo ls moose:/opfc/atm/global/rerun/201109.file

Generally we prefer to get the output from the global runs, which have "qwqg" in the file name; otherwise we get the global update, which has "qwqu" in the file name.
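
A minimal sketch of that preference, assuming the update files follow the same naming pattern (the grep test is illustrative, not part of MOOSE):

#!/bin/ksh
# Sketch: prefer the global run (qwqg00), fall back to the global update (qwqu00).
dir=moose:/opfc/atm/global/rerun/201109.file
if moo ls $dir | grep -q 20110928_qwqg00.T+0
then
    moo get $dir/20110928_qwqg00.T+0 .
else
    moo get $dir/20110928_qwqu00.T+0 .
fi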

For further information on using MOOSE on MONSooN and JASMIN see: Getting started with MOOSE on MONSooN and JASMIN.

Older files

Older files are stored in a packed archive format - these have the extension ".pax". These usually contain several dumps and have names like,

coprr.udQU06.20060721.pax

where the "ud" denotes "unified model dump" and QU06 denotes global update output at 06Z. These have to be unpacked and the files relabelled. There is a script to do this in

 
#!/bin/ksh
# Unpack each pax archive, relabel the dumps with the archive date
# and store the results in $TMPDIR.
for i in *.pax                    # i will be like coprr.udQU06.20060725.pax
do
    print "$i"
    dat=${i%.*}                   # take .pax off the end
    datestring=${dat##*.}         # keep only the date after the rightmost dot
    pax -r -f "$i"                # unpacks into ./opdaily/datawgl
    cd opdaily/datawgl
    for j in *
    do
        mv "$j" "$TMPDIR/${datestring}_$j"
    done
    cd -
done

This loops over the pax files in the current directory, unpacks each one, relabels the dumps with the archive date and stores the results in $TMPDIR.
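
For example, saving the script as unpack_pax.ksh (a file name chosen here for illustration) and running it in the directory holding the archives:

cd /path/to/pax/files
ksh unpack_pax.ksh
ls $TMPDIR        # the relabelled dumps, e.g. 20060725_<dump name>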

UKV start dumps

The UKV start dumps are in MASS with path names of the type:

moose:/opfc/atm/ukv/rerun/201108.file/20110810_qwqv03.T+1

moose:/opfc/atm/ukv/rerun/201108.file/20110810_qwqv06.T+1 etc…
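
These are fetched in the same way as the global dumps, e.g. (note the trailing dot again):

moo get moose:/opfc/atm/ukv/rerun/201108.file/20110810_qwqv03.T+1 .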

UKV forecasts and analyses

The UKV forecasts and analyses are in MASS with path names of the type

moose:/opfc/atm/ukv/prodm/YYYY.pp/prodm_op_ukv_YYYYMMDD_AA_NNN.pp for the multilevel fields and

moose:/opfc/atm/ukv/prods/YYYY.pp/prods_op_ukv_YYYYMMDD_AA_NNN.pp

where

  • YYYY = year, e.g. 2011,
  • MM = month, e.g. 05,
  • DD = day, e.g. 21,
  • AA = analysis time: 03, 09, 15 or 21,
  • NNN = the part of the forecast, which goes up in twos from 000 to 036, e.g. 000, 002, 004 … 036.
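
As a sketch, all the multilevel files for the 03Z run on 21/May/2011 could be fetched by looping over the forecast parts (the use of seq here is an assumption about the local environment):

#!/bin/ksh
# Sketch: fetch the multilevel fields for the 03Z UKV run on 21/May/2011.
for n in $(seq -w 0 2 36)         # 00, 02, ... 36
do
    moo get moose:/opfc/atm/ukv/prodm/2011.pp/prodm_op_ukv_20110521_03_0${n}.pp .
done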

MOGREPS-R files

Ensemble fields are in:

moose:/opfc/atm/mogreps-g/rerun/YYYYMM.file

moose:/opfc/atm/mogreps-r/rerun/YYYYMM.file

where YYYY = year, e.g. 2011, and MM = month, e.g. 07.

At minimum, to re-run MOGREPS-R you need the analysis files in the mogreps-r path, e.g.

20110722_qwqy06.T+0.18km
20110722_qwqy18.T+0.18km

and all the perturbation files, e.g. 20110722_perts.qwey06.oper??.pp1, where ?? goes from 00 to 23 (the ensemble members).

Also from the mogreps-g path you need the boundary files, e.g. 20110722_qweg00.oper??.FRAMES.EY.tar.gz.
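
A minimal sketch of pulling that set for 22/Jul/2011, assuming moo get accepts the ?? wildcard (if it does not, list the directory and fetch the member files one by one):

#!/bin/ksh
# Sketch: minimum inputs for re-running MOGREPS-R on 22/Jul/2011.
rpath=moose:/opfc/atm/mogreps-r/rerun/201107.file
gpath=moose:/opfc/atm/mogreps-g/rerun/201107.file

# the analysis files
moo get $rpath/20110722_qwqy06.T+0.18km $rpath/20110722_qwqy18.T+0.18km .

# the perturbation files for members 00 to 23
moo get "$rpath/20110722_perts.qwey06.oper??.pp1" .

# the boundary files from the mogreps-g path
moo get "$gpath/20110722_qweg00.oper??.FRAMES.EY.tar.gz" .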

Getting data from ECMWF

The standard data I provide is from ECMWF's operational data. It is the analysis from the deterministic atmospheric model (version 1); it is not ERA. Data prior to X is in GRIB1 format; after that it is GRIB2. The GRIB start dumps are created by concatenating the following fields (a sketch of the concatenation follows below):

  • skin temperature,
  • surface pressure,
  • land-sea mask,
  • geopotential,
  • temperature,
  • specific humidity,
  • U and V wind velocities.

In contrast, a UM dump typically contains around a hundred fields, so the GRIB start dumps need to be reconfigured before use.
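
Since each GRIB field is a self-contained message, the concatenation is just a matter of joining the files. A minimal sketch, assuming the fields have already been retrieved from ECMWF into the single-field files named below (the names are placeholders):

#!/bin/ksh
# Sketch: build a GRIB start dump by concatenating single-field GRIB files.
cat skt.grib sp.grib lsm.grib z.grib t.grib q.grib u.grib v.grib \
    > ecmwf_startdump.grib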

W. McGinty
