| Guide to the Met Office IDL library |
|
|
Guide to the Met Office IDL libraryJonathan Gregorywith grateful acknowledgements to Amoon Osman, Penny Boorman, Ben Booth, Jon Croome, Denise Cresswell, Roger Milton and AJ Watling for text and examples, and to Simon Tett, Mark Webb, Tremain Knight, Mark Hedley and the Met Office AVD (formerly graphics) team for collaboration on development of the library over many years2nd April 2004, 4th October 2011Introduction
What are IDL and PV-WAVE?IDL is a powerful programming language, supporting concepts familiar from other languages such as Fortran and C (variables of different data types, flow control with if-statements, do-loops, etc.). It was designed for data analysis and visualisation. Unlike Fortran and C, IDL doesn't have to be compiled as a separate operation before being run. That has the advantage that you can use IDL interactively at the command line, as well as writing programs in the language. It's a bit like a Unix shell: the commands that you type interactively at the Unix command line can also be used in a Unix shell script (a program), and you don't have to compile the script before running it. This approach has the drawback that IDL is slower than properly compiled and optimised languages like Fortran and C. PV-WAVE is a similar language to IDL. IDL is the original. The firm which originally produced PV-WAVE bought the rights to develop the language. Since then the two languages have diverged, but they are still very similar. Both languages are proprietary, meaning you have to buy a licence to use them. The Met Office used PV-WAVE throughout the 1990s, and migrated to IDL during the 2000s. In the academic community IDL is much more common than PV-WAVE. What is the Met Office library?The Met Office library is a collection of routines written in IDL at the Met Office, originally written in PV-WAVE in the Hadley Centre in the early 1990s, with development continuing to the present day. Its main purpose is to support analysis and visualisation of PP data produced by the Unified Model (UM). PP is a binary file format peculiar to the Met Office and the UM. It's fairly self-describing but not as easy to use or as flexible as netCDF. Support for climate data stored in netCDF following the CF convention is available in the Met Office library, although we don't expect that the UM itself will be altered soon to write out netCDF data. But being able to read netCDF obviously makes the library more generally useful, because of the large number of model intercomparison projects, such as CMIP, and observational datasets for which CF-netCDF is the data format. The best way to find out what the Met Office library offers is to look through the Appendices containing alphabetical and categorised lists of its contents. It contains routines for the following major functional areas:
The library also includes a fairly large number of general-purpose (not PP-specific) routines which may be useful in a wider range of applications. The library is also called the "UKMO" library. This is a non-brand-compliant acronym for the organisation formerly known as the UK Meteorological Office. The library is organised into a "core" (approximately 520 routines and 66k lines of code), maintained by the Met Office AVD team, and the Hadley Centre "applications" (another 230 routines and 35k lines), which are not maintained in principle. However the distinction is somewhat blurred. The library works in PV-WAVE as well as IDL. To some extent it hides the differences between the two, and hence makes user programs more portable. In the Met Office, IDL is usually run with local modifications that add to and in some cases override the native language, in order that PV-WAVE programs can run without modification. The most important of these is the implementation of lists and associative arrays in IDL versions before 8.0; this extension to the IDL language is needed to support a couple of Met Office library routines and is therefore required outside the Met Office as well. Apart from this, native IDL is always used outside the Met Office. In IDL 8.0, lists and associative arrays (hashes) were introduced as part of the native language, so this Met Office language extension is no longer required. The library can be made available to institutes having collaborations with the Met Office or analysing data from the UM. If you would like to obtain the library for use outside the Met Office, please follow the link to download it at the top of this guide. It is free of charge but there is a licence agreement. What is the purpose of this document?This document is designed to help a new user with little programming experience quickly learn how to use the Met Office library. It is not an introduction to IDL, although the first few sections provide some basic information. To learn the language, read the online manuals, which are comprehensive, consult the NCAS-CMS IDL guide, or seek help from colleagues! The Met Office AVD team provides an introductory course to IDL for Met Office staff. Please send comments about this document to j.m.gregory at reading.ac.uk. Conventions in this documentThroughout this guide, $ indicates the (Unix/Linux) operating system prompt and IDL> the IDL or PV-WAVE prompt. If the languages are different, we use WAVE> instead for PV-WAVE. We use typewriter type for program text to be typed exactly as it appears. When we refer to "IDL" it means both IDL and PV-WAVE unless otherwise stated.
Start and exit IDLTo start IDL or PV-WAVE, you may require special instructions for your place of work.
If you want to access the English translations of stash codes (this is very useful for automatic labelling of plots, etc.) you should define the UM version you want in your Unix environment e.g. $ export UMVERSION=4.5before starting IDL.
At the IDL prompt, use the up-arrow key to recall the previous line entered, the down-arrow the next line in the history, In both IDL and PV-WAVE, when you start up the Met Office environment an extra startup file is executed before your own, if you have one. The UKMO startup file is $UKMO_IDL_STARTUP in the Met Office, and $UKMO_WAVE_STARTUP elsewhere (even for IDL). Your own startup file is named by the Unix environment variable IDL_STARTUP in all cases for IDL (or WAVE_STARTUP if you are using PV-WAVE). The UKMO setup defines various other environment variables. IU_LIB is the location of the core UKMO library. In the Met Office, IDL_PP_PATH, IDL_DATA_PATH and IDL_SHARED_PATH list directories to be searched by various functions. These environment variables are called WAVE_PP_PATH, WAVE_DATA_PATH and WAVE_SHARED_PATH outside the Met Office (even for IDL). They are copied to corresponding Met Office library system variables. The IDL_PATH/WAVE_PATH in IDL and PV-WAVE respectively are used to locate IDL/PV-WAVE commands. At the IDL> prompt you can enter IDL commands line by line at the prompt or you can use a program file. A program file is a file containing a set of IDL commands, which IDL executes line by line. It can be written either directly to a file whose name ends in .pro, or by using journal to record a session. To end a session, type: IDL> exit or WAVE> quit
Keeping a record of a sessionTo record all or part of an IDL sessionUse IDL> journal,'filename' which saves all following commands to file filename or, if you omit the filename, it will save it to a file called wavesave.pro/idlsave.pro in the current directory. Enter IDL> journal to end the recording The result is a file that contains a complete description of the IDL session, which can be rerun later as a @-command file, or as a compiled main program if you put an end statement at the end of it. To save the definitions and values of variablesUse IDL>SAVE, Filename = 'filename',var1,var2,/Verbose This will save the variables into the file filename. If Filename = 'filename' is omitted, the default filename is wavesave.dat/idlsave.dat). To save all current variables, use the keyword /Variables and don't specify any by name. This is useful if you want to stop a session and resume it later. To restore the saved variablesEnter IDL>RESTORE, Filename = 'filename',/Verbose The optional /Verbose keyword makes IDL give useful information about what it is saving or restoring.
Online helpTo access IDL's online manuals enter: IDL> ? For help on Met Office (UKMO) routines e.g. for the routine pp_diff: (note the quotation marks in IDL). If there is no match, you will get a list of all routines whose name or one-line description contains the specified string. This "keyword" matching can be deliberately selected with the /key keyword e.g.$ tidlman pp_diff # in the Met Office $ midlman pp_diff # outside the Met Office IDL>man,'pp_diff' $ midlman -k zonal To get information about the IDL session, use the command INFO in PV-WAVE or HELP in IDL, which have many similar options. Given variable names, INFO/HELP reports their type. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
One other nice use of INFO/HELP is that it will evaluate expressions:IDL> a=2 PRINT would tell you the result, but INFO/HELP gives you the data type of the result as well.IDL> help, 2 * 5 + (10 * COS(45)) If you are in the Met Office, for further help you can contact the AVD Team who are responsible for supporting IDL. They run an introductory course on IDL and maintain documentation on MetNet. There are Met Office netnews groups concerned with graphics.
Running an IDL programPrograms (also called routines) written in IDL and stored in files can be run by several different methods. This section applies to main programs i.e. those which don't start with a pro or function statement. (See Procedures and functions for routines other than main programs.) Specifying which file you want to run
If you give a name with no / in it, IDL will look for a file of that name, optionally with the suffix .pro, in the colon-separated list of directories in your WAVE_PATH/IDL_PATH Unix environment variable. (The path is also stored in the IDL system variable !path.) This is the same idea as searching for Unix commands in the Unix PATH. The first file of the right name is used. In IDL, unlike in Unix, the current directory is always assumed to be at the start of the path. In IDL (not in PV-WAVE), a directory in the path can be prefixed with +. This means all subdirectories of that directory will also be searched. You can modify the path by changing the WAVE_PATH/IDL_PATH environment variable, for instance in your shell startup file (.dtprofile, .bashrc, .profile or whatever). Compiled programsTo improve run-time, IDL does a kind of "compilation" of a program. PV-WAVE (but not IDL) also has an optional facility to store the compiled versions in separate files (suffix .cpr). This saves the (small) compilation time, but doesn't make the routines run any faster. If you give a filename by itself on the command line: IDL> filename IDL will compile the file and run it. For this to work, the file must contain a program concluding with an end statement. Such files can also be invoked with IDL> .run filename In both cases, IDL compiles the file into memory, and then runs it provided it's a main program. You need the .run command to recompile a program if you alter the program file because otherwise invoking the program again just by name will execute the existing version from memory, not the new version in the file. You can use the .com command to compile a program, but not run it. Scripts (not compiled)If you give a filename like this: IDL> @filename IDL will not compile the file or store it in memory. This method is just as if you had typed in the lines of the file one by one at the command line. Files of this kind must not have an end statement. Moreover, they can't contain block constructions (if-blocks, etc.) spanning several lines unless all these lines are joined up with & $ to make them effectively one long line. This is because the same restriction applies to lines entered at the IDL prompt; all block constructions must be completed e.g. IDL> fish=1 & if fish then begin & print,fish & print,'head' & endif How do you decide which to use? The first (compiled) method is necessary if you want to write procedures (subroutines) or functions. It also has the advantage of supporting block constructions properly. The second (not compiled) is convenient for a script of a few lines. In either method, variables defined by the program will still be available at the command line when the program finishes. Invoking a routine from the Unix command lineYet another method is to specify a file to the IDL command in Unix: $ midl filename This will execute the file indicated (no paths are searched) like an @-file.
IDL programming tips
Some programming tips for PV-WAVE and IDL.
Procedures and functions
Procedures and functions are an integral part of the structure of IDL. Almost all the routines in the Met Office library are procedures and functions. They encourage a modularised and flexible approach to programming.
Here is an example of simple IDL procedures and functions: pro doubleanumber,argument
As the procedure example shows, variables which are passed in as arguments (number in the example) can have their values modified in the calling routine (it changes from 46 to 92). This is like Fortran. A difference from Fortran is that when elements of arrays (not entire arrays) are passed in as arguments, a procedure or function can't alter their values in the calling routine. For example, IDL> array=[3,4,5] A single program file can contain more than one routine. If any of the routines in the file has the same name as the file, then invoking that routine will cause the all the routines in the file to be compiled (as for compiled main programs). The file can be (re)compiled using the .run command. Confusingly, .run does not execute any procedure or function; it just compiles them into memory. If an error occurs in a routine, program control stops at the line in the source file where the error occurred by default. This means IDL is "inside" the routine still, and you can examine its local variables. Putting in a stop will also allow you do that. You can only compile a routine when IDL is at the "main level" i.e. not still "inside" any routine. To get back to the main level, enter retall in IDL. You don't have to do this in IDL because .run does retall automatically. The Met Office routines change the default to so that control goes automatically back to the main level in case of a failure. Error-trapping is controlled by the on_error statement, which is included in the comm_error @-file appearing at the start of all the UKMO routines. Invoking a file with @ is like an "include" file in Fortran or C. This construction is used in program files to "include" blocks of lines which are common to many program files. Such files may contain whatever would compile correctly if it was explicitly typed into the program file in which it is included. Unlike when files are invoked from the command line with @, block constructions may span several lines, because the lines are going to be compiled rather than executed individually.
Printing and saving graphical outputIDL has a notion of a "current graphics device", which is where the pictures go. This might be your terminal (the "X" device) or some kind of file for printing (usually the PostScript device "PS"). The IDL command for changing device is set_plot. However, the UKMO library contains more powerful routines pr for changing from X output to file, and prend for changing back again. In addition to changing the device, these also set up the hardcopy page size properly and swap foreground and background colours when necessary (white on black for screen, black on white for paper). If a printer is named, pr sets up the device in an appropriate way. (This depends on a control file having been created at your site naming the various printers.) Examples of pr: IDL> pr,/ps,/land ; print to the default PostScript printer in landscape The default monochrome PostScript printer can be defined through the Unix environment variable PSPRINTER, and the default colour PS printer as CPSPRINTER. After changing to a printer device, you should reload the colour table if you are using one, as the printer may have a different number of colours from the X device. Then execute the command to draw the picture e.g. IDL> plot,indgen(10),0.5*indgen(10) The graphics will be written a file named wave.ps, wave.eps, etc. in the current directory. Examples of prend: IDL> prend ; print Other routines for graphical output include image_grab to capture an X window as a GIF, and gif4web for creating an animated GIF from a set of single-frame GIFs or pp_contour image files. See elsewhere for problems with printing pp_contour plots on colour PostScript printers.
PP examplesHere's a simple example of reading a PP field from a PP file and making a latitude-longitude plot of it. Following sections will explain the commands used in more detail. IDL> eg=pp_assoc('$WU_DATA/example') ; eg is a "handle" to the file
The following version of the program makes a "prettier" plot as a colour PostScript file. IDL> eg=pp_assoc('$WU_DATA/example')
If you enter Y, the file is printed. Here's another example, which makes a zonal-mean precipitation plot. Before starting IDL, enter export UMVERSION=4.5 in Unix, to specify which version of the UM the data comes from. (The example file contains HadCM3 data from UM version 4.5. UMVERSION could be set to a later version than 4.5.) IDL> eg=pp_assoc('$WU_DATA/example')
PP filesThe Unified Model writes out its data in the PP file format. There are two kinds of PP file:
Conversion from the first to the second kind is done automatically by the UM running at the Met Office, before PP files are archived. Hence at the Met Office you are only likely to encounter the second kind of PP file. The NCAS-CMS UM setup usually archives PP files of the first kind. Jeff Cole's (NCAS-CMS) program xconv can read both kinds. Jeff also has a program ff2pp to convert from the first to the second kind (in ~jeff/bin in Reading). To read the first (or only) PP field out of a PP file, you can use the UKMO IDL command ppa:
The variable z (you can call it what you like, of course) now contains the PP field, both its header and its data. There are more versatile methods for reading PP data; this is the simplest. If the filename contains no / e.g. aaxzca.pmj1sep.pp, IDL looks for a file of that name in the directories named by the colon-separated list in the IDL system variable !pp_path. This variable is set from the Unix environment variable WAVE_PP_PATH on startup. Hence you can change it to suit yourself in your Unix setup. At the Met Office, some PP files are archived with packing of the PP field data, to save space in MASS. By default, packed PP fields are automatically unpacked by ppa; this behaviour is controlled by the !ppa_acceptpacked system variable, as documented by man,'ppa'. Alternatively you may create unpacked versions of the files, using the Unix command unpack in the Met Office, installed as ppunpack at Reading (setup cgam to obtain access to it). Read its Unix man page for details. Special facilities have to be enabled for PP input from files which exceed about 2 Gbyte. This sounds a lot for a single file, but it's not unlikely for output from high-resolution models. Support for such "large files" is controlled by the !pp_largefile_extend system variable, as described by an Appendix.
PP fieldsIn the UKMO software, a PP field is an IDL "structure" variable. This is like a Fortran 90 or C structure. The structure variable is composed of many elements, which are mostly (long) integers and floating-point numbers. Each element has a "tag name" to identify it within the structure. IDL> help,z ; info in PV-WAVE instead of help will list all the elements in the structure (PP field) z. The first, for instance, has the tag name LBYR. This element is the year to which the field applies, or if it's a time mean, the year in which the mean starts. IDL> print, z.lbyr, z.lbyrd prints the start and end years of a time-mean. The data of the PP field is in the element called DATA. This is a two-dimensional floating-point array e.g. IDL> help, z.data ; info in PV-WAVE instead of help As with any array, you can subscript the data array e.g. IDL> print, z.data(*,2) to print the 3rd row. In IDL, array subscripts start from 0 (like C, unlike Fortran). The dimensions of the array are in the elements lbnpt and lbrow of the structure. IDL> print,z.lbnpt,z.lbrow PP fields are always two-dimensional in structure. The commonest kind are longitude-latitude fields, but there are also cross-sections e.g. latitude-height, latitude-time (often called Hovmüller), time-depth. When a field is really one-dimensional in intent, such as a single timeseries (a function of time only) or a zonal-mean field (a function of latitude only), it nonetheless has to be stored in a 2D PP field structure, with one of the dimensions of the data array having a size of one, usually the x-dimension with lbnpt=1. The information in the data element comes from the data record of the PP field in the PP file. All the other information comes from the preceding header record in the file (except that if the field has irregularly spaced x- and y-coordinates these come from the "extra data" in the data record, rather than the header). It is a fundamental feature of the UKMO library that the header and data are "carried around" together in PP structures, so that the data is always identified by its header "metadata" (data about data). UKMO routines that produce new fields by combining fields also modify the metadata so that the new field describes itself properly. The UKMO routines often work out how to process fields by looking at their metadata (e.g. to produce the title of a contour plot). It is essential that the metadata data is correct. A summary description of the field can be obtained with IDL> print,pp_guess_title(z) (This is what pp_contour uses for its titles.) You can see all of the PP field metadata conveniently with the command printfdr e.g. IDL> printfdr,z or you can use WAVE> info,z,/struct or IDL> help,z,/struct. The meanings of the header elements are described in detail in an Appendix. They start with LB (integers) and B (reals) because that is how they were named by the Fortran 77 PP package, which is no longer in regular use. Of particular interest are the UM stash code field.lbuser(3) and submodel ID field.lbuser(6). These elements indicate what quantity the field contains (temperature, pressure, etc.). To look up translations for them, you can use IDL> stash,z The "PP field code" (in the column "FC") is an older set of codes. Each stash code implies a particular field code. However, there are field codes for quantities which don't have stash codes assigned, because they aren't generated by the UM. There is a list in an Appendix. Other important routines for getting information from PP fields are pp_coords for working out the x- and y-coordinates (often longitude and latitude) of the grid points, pp_level for the vertical coordinate (hybrid height, hybrid sigma-pressure, ocean depth, etc.), pp_day for the date and time encoded as a Julian day number, pp_period for the length of the meaning period (e.g. a month, a year, a decade), decode_expt(field.lbexp) for the UM runid.
Missing dataMissing data in PP fieldsThe data array of a PP field can indicate points where data is "missing". When the field is plotted with pp_contour, for instance, areas of missing data are left blank, and when statistics are calculated, missing data values are omitted. Missing data in a PP field is indicated by a data value equal to the bmdi element of the field structure. Hence IDL> missing=where(field.data eq field.bmdi,count) ; can use pp_mdi for this gets a vector of indices to the missing points, and then replaces missing data with zero. The count test is to avoid an error if there are no missing data points. Likewise IDL> present=where(field.data ne field.bmdi,count) ; can use pp_mdi for this fetches the non-missing values into the 1D array data. The function fv will do this operation: IDL> data=fv(field) is equivalent. If you write programs that manipulate missing data values, it is essential to refer to bmdi to get the missing data indicator and not assume it has a particular value! This is because it does not always have the same value. The default for fields created by the UKMO library is different from the UM default, for instance. If operating on a vector of fields, note that you can't subscript a 3D data array. If field is a vector, the above example won't work. You have to loop over the individual fields. Missing data toleranceSome UKMO PP statistics routines have a keyword mdtolerance which affects their treatment of missing data. For instance, if fields is a vector of PP fields, then pp_avg(fields) returns a PP field in which each data value is the average of the corresponding points in all the fields. If any of the values at a particular point is missing, missing data is returned in the result field by default. This is completely "intolerant" of missing data. But pp_avg(fields,/mdtolerance) (equivalent to mdtolerance=1) returns a result wherever at least one of the input fields is not missing i.e. as tolerant as possible. Intermediate values such as mdtolerance=0.5 specify the maximum proportion of fields which can have missing data at a point if a non-missing result is to be returned. Non-PP routines supporting missing dataThe UKMO library provides routines avgmdi, stdevmdi and plotmdi, corresponding to avg/sum, stdev and plot, which respectively calculate average/sum and standard deviation and plot ordinary IDL arrays of data in which there may be missing data. avgmdi and stdevmdi support the mdtolerance keyword and plotmdi leaves gaps in plots where there is missing data. They all permit the missing data indicator to be specified, with a default set by the !pp system variable.
PP inputIf a PP file contains many PP fields (as it usually does) you need to specify which field you want. The best way to do this is by using ss to generate a "search expression" for ppa e.g.
extracts the PP field which has atmosphere stash code 3236 (1.5 m air temperature). ss has many keywords to specify the fields in various ways. If more than one keyword is given, all the conditions apply together (logical AND). Some keywords can accept a vector (1D array) of values. These values are accepted as alternatives (logical OR). For instance ss(period=1,fromdt=1991,todt=1998) ; annual means starting in years 1991-1997 If you are accessing the same file for many different variables, it is usual first to obtain a "handle" to it with pp_assoc IDL> sep=pp_assoc('aaxzca.pmj1sep') then get fields from it by using the handle to refer to the file IDL> sat=ppa(sep,ss(at=3236)) In fact the one-step command above is equivalent to sat=ppa(pp_assoc('aaxzca.pmj1sep'),ss(atmos=3236)). The name of the variable which stores the handle is up to you. Its actual contents are also not of interest. Its only function is to be used to refer to the file when you want to get PP fields out of it. You can conveniently see what is in the file using the command stashlist, which will produce a list of all the different stashcode/submodel combinations in the file, with their English descriptions e.g. IDL> stashlist,sep In order for this to work, you must have defined the UM version in the Unix environment variable $UMVERSION before starting IDL. stashlist,/dt will tell you all the distinct dates of fields in the file e.g. IDL> da=pp_assoc('abbda.decades') this file contains fields dated 1 Dec 1859, 1869 and 1879. stashlist,/period lists the meaning periods e.g. IDL> stashlist,da,/pe shows all the fields are decadal means (no other meaning period is shown). stashlist,/level lists the vertical level coordinates used in the file. This is most useful when specifying a particular quantity e.g. IDL> stashlist,da,ss(ocean=101),/lev to list the depth coordinates of the fields of ocean stashcode 101 (potential temperature). The output shows the vertical field code (see under LBVC in the Appendix) for the field as well as its coordinate value. If you have several PP files and a handle for each of them, you can then search them simultaneously with ppa e.g. IDL> salinity=ppa([file1,file2,file3],ss(fc=602)) where file1 etc. are the handles returned by pp_assoc. Data from long UM experiments is often stored (especially in the Hadley Centre) in "metasplit" or "stashsplit" directories, which contain fields from many UM PP files. The way in which these work is described in an Appendix. Data retrieved by query_masscam is delivered in this form. The UKMO routines regard a metasplit directory as a single "pool" of data, and you get fields from it just as though it were a single PP file. First, obtain a handle to the directory e.g.
Like ppa and pp_assoc, ss_assoc searches the !pp_path for the directory if there is no / in the name given. The handle annual can then be used as an input to ppa just as if it pointed to a single PP file. stashlist can be used just as for a single PP file to examine the contents of the directory. In the Hadley Centre, another way to get access to data is by using the Camelot Dataline database (it is intended to replace this with a more reliable MASS cache arrangement) to find it for you e.g.
returns a handle which points to all the annual-mean data from aaxzc online which Dataline knows about. Again, it "looks" like a single PP file to ppa. Apart from handles, another kind of pointer to fields is a PP header variable. This mechanism is used a lot internally in the UKMO routines. For example, IDL> annual=ss_assoc('aaxzc.000100') The headers variable is an array of PP headers, which are structures just like PP fields but with no data element. They know which PP file they refer to, so PP headers instead of a handle to the data source can be given as input to ppa.
CF-NetCDF input and outputNetCDF is becoming increasingly common as a file format for exchange of climate data. Its advantages are that it is portable (i.e. machine-independent) and supported by a freely available library of I/O routines with interfaces in many programming languages. NetCDF is a very adaptable format, and its authors intended that users would adopt conventions for particular applications. One such convention is COARDS, developed in 1995 for "the sharing and distribution of global atmospheric and oceanographic research data sets." A more recent one, which extends COARDS, is CF. CF has been adopted by several climate centres and projects, such as PCMDI and CMIP. "The purpose of the CF conventions is to require conforming datasets to contain sufficient metadata that they are self-describing in the sense that each variable in the file has an associated description of what it represents, including physical units if appropriate, and that each value can be located in space (relative to earth-based coordinates) and time." The Met Office IDL library can read netCDF files conforming to CF. Since the library was designed for PP fields, the interface to netCDF files makes their contents look like PP fields, just as if they were PP files. To access a netCDF file, open it with ncassoc e.g.
The suffix .nc is conventional for netCDF files, but not mandatory, and ncassoc does not assume it. Like pp_assoc and ss_assoc, ncassoc searches the !pp_path for the directory if there is no / in the name given. The handle it returns can be used as an input to ppa, stashlist, and any other routine which will accept a handle to PP fields. This handle is, in fact, a vector of PP headers. The data in CF-netCDF files is stored in variables which are identified by names such as air_temperature and precipitation_flux, stored in standard_name or long_name attributes. In order to identify the data as PP fields, these names have to be translated to stash codes or field codes. Although CF does not assign any significance to the names of netCDF variables, as a last resort ncassoc will attempt to match the variable names to stash codes or field codes, if it can't find an equivalent for the standard name or long name. The translation is done using the routine codename. The translation tables are in files identified by the system variable !codenamefile, and the default setting offers translations for quite a wide range of stash codes. If no translation is available for a particular name in your file, ncassoc will give a warning, such as No field code or stash code translation possible for variable xyz with standard_name="northward_ocean_freshwater_transport" and long_name="MEAD DIAGNOSTICS: SALINITY KG/S". This does not mean that ncassoc has failed, but the fields it constructs will have a stash code and field code of zero. You can remedy this by supplying your own translation table to supplement the standard one. See the online help for codename for instructions on doing this. If ncassoc fails to interpret a netCDF file or misinterprets it, this might be a bug, but it is more likely that the netCDF file is not CF-compliant or contains some error. Diagnosing or fixing these problems may take some familiarity with the conventions. You could first try verifying the file's correctness using the CF checker on the CF home page. If you have to alter the file, this can be done by converting it to plain text with the Unix command ncdump, editing the text, and regenerating the binary netCDF file with ncgen. Alternatively you can perform many useful operations on netCDF files using the NCO utilities. A netCDF variable can have more than two dimensions; hence it is likely that it might have three (longitude, latitude, vertical or time) or four. The library will interpret such a variable as a 1D or 2D array of PP fields. In the situation where both longitude and latitude are single-valued, which would be the case for an area average, it would be more convenient to have a single timeseries PP field than a vector of PP fields with one data value each. The routine pp_serialise will convert a vector of 1x1 PP fields for a series of times into a single PP timeseries field. You can use the Met Office IDL library to write CF-netCDF files containing a single data variable. That means the file can have only one quantity in it (characterised by stash code, field code and LBPROC), but this single data variable can contain many PP fields, for different vertical levels and times, if they logically constitute a 3D or 4D field (provided all the fields have the same grid). To write out data, use pp_cfwrite e.g. creates a file with a single data variable containing one 3D field, where z101 is an vector of ocean potential temperature PP fields. The stash code (101 in this case) is used to choose the CF standard name for the variable (sea_water_potential_temperature). The input to pp_cfwrite can be PP headers, rather than PP fields in memory, e.g.IDL> help,z101 ; info in PV-WAVE instead of help Z101 STRUCT = -> PP2V5403N288R144XY Array[20] IDL> stashlist,z101 Submodel Stash Description OCEAN 101 potential temperature (ocean) deg.c IDL> pp_cfwrite,'$HOME/thetao.nc',z101 will write out all surface air temperature fields from the PP input source to the CF-netCDF file. In order to do this, the PP fields are read into memory only one at a time. Hence, this operation can be performed for arbitrarily large data variables without exhausting IDL memory.IDL> pp_cfwrite,'$DATADIR/tas.nc',ppa(source,ss(at=3236),/all,/header)
3D fieldsA 3D field, such as temperature(longitude,latitude,height), cannot be contained in a single PP field. Instead, it is stored as a 1D vector of PP fields, each being 2D and having a vertical coordinate. You can read in more than one field at a time with ppa. The commonest need for this is to fetch an entire 3D field e.g. IDL> q=ppa(handle,ss(atmos=10,yr=1992),/all,/sort) fetches all the atmospheric specific humidity fields for 1992 into the variable q, which is a 1D array of PP fields. The /sort keyword is optional, but recommended. It makes sure the PP fields are put in the expected order of levels, which might not be the case if they come from several PP files or metasplit directories. IDL> help,q ; info in PV-WAVE When a 3D field is in memory as a 1D array of (2D) PP fields, you can subscript the array to get a single PP field e.g. IDL> pp_contour,q(3) ; to plot model level 4, if they're in order as above To read in a single one of the levels, you can specify the model level number IDL> q=ppa(handle,ss(atmos=10,yr=1992,lblev=4)) ; to get level 4 or you can specify the vertical coordinate. You don't have to be exactly accurate: IDL> q=ppa(handle,ss(atmos=10,yr=1992,lev=0.87)) ; to get sigma of about 0.87 You might also read in fields for many times at once e.g. IDL> t=ppa(handle,ss(atmos=3236),/all) to fetch all the surface air temperature fields. However, beware trying to read in too much data at once! If the ppa command tries to read hundreds of fields, at HadAM3 resolution it will take many Mbyte of memory. Too much memory usage generally has a bad effect on performance. Special routines are provided for processing a lot of data. If you refer to the elements of a vector of PP fields, it has an extra dimension for the number of fields, thus IDL> help,q(0).lbyr ; info in PV-WAVE Be careful to distinguish the number of elements in a vector of PP fields from the number of elements in the data array of a single PP field. In this example, IDL> print,n_elements(q) ; 19 fields Although q.data looks like a 3D array, you can't subscript it like one. For instance, index=where(q.data eq 0) will return a correct list of indices to zero values, but q.data(index)=1 will not succeed in setting these points to 1. This is dangerous because it will do something and not generate an error, but not what you expected. That's because (index) is subscripting just the data, which is a 2D array. To deal with these situations, you have to loop over fields and process each in turn. You can only read many fields into a single variable if they're all on the same grid. This is fine for 3D fields, but (for example), you can't read a temperature field and a velocity field into one variable: IDL> stash,[2,4],sub=1 The CF-netCDF file format is more flexible than PP, and is capable of storing 3- and 4-dimensional data variables. However, when they are interpreted as PP fields by ncassoc, they are made to look like arrays of 2D PP fields, as if they resided in PP files.
Time axes, timeseries and day numbersAll PP fields have time information in their metadata (the PP header), usually the start and end of the meaning period e.g. 00:00 on 1 Jan 2004 to 00:00 on 1 Jan 2005 for a mean of the year 2004 (see Appendix on PP headers). It is also possible for time to be one of the two axes of a PP field. For instance, in a Hovmüller plot, time and latitude are the axes. Several UKMO IDL routines (see collapsing dimensions) produce PP fields with time as one of the axes e.g. pa_hovmuller produces a time-latitude PP field from a series of latitude-longitude fields applying to different times. 2D fields with a time axis can be plotted with pp_contour just like any other 2D field, and one of the axes is drawn and labelled as time. A particularly important kind of field with a time axis is what we refer to as a "PP timeseries field". The field can contain timeseries, for the same set of times, from one or more locations or regions (e.g. daily precipitation amounts for a series of days from a number of observing stations). In a PP timeseries field, y is time and x indexes the individual timeseries in the field. The UM can produce timeseries fields and there are several UKMO IDL routines which make them. For instance,
For a timeseries field, the PP structure contains an element coords which specifies the region to which the timeseries applies. This is a 3D array subscripted as (timeseries number, 0=low coordinate 1=high coordinate, 0=longitude 1=latitude 2=vertical). Thus, in this example, ts.coords has dimensions (1,2,3); the longitude range of this timeseries is ts.coords(0,0,0) to ts.coords(0,1,0). Their values in this case are -1.875 (1.875 degW) and 358.125 (358.125 deg E), which are coincident, because the whole longitude range is covered by the field. These values are actually the longitude limits of the first and last grid box in each row of the latitude-longitude PP field. Timeseries can be plotted with tsplot, which draws and labels the x-axis as time. A histogram of the values in a timeseries can be plotted by tshplot. The routine tsx is used (a) for extracting particular timeseries from a PP field which contains several timeseries (e.g. the timeseries for Exeter alone from a PP field with lbnpt=3 containing timeseries for Oxford, Cambridge and Exeter) and/or (b) for extracting a particular range of dates from a timeseries field. It returns a new timeseries PP field with reduced dimensions. Since PP fields with a time axis are ordinary PP fields in terms of their structure, most routines for processing PP fields are applicable to them e.g. pp_ff for mathematical operations, pp_regrid for changing the time coordinates by linear interpolation. There are also some routines specially for processing timeseries, including tsderive for differentiation in time, tsintegrate for integration in time, tsfit for fitting a polynomial to the time-dependence, tslump for averaging together into longer time-intervals (e.g. converting a daily timeseries to an annual timeseries), tsfold for calculating a multiannual mean time axis from a time axis spanning several years, tsrecycle for producing an annually repeating time axis spanning several years, tscat for concatenating in time, pp_timeshift for shifting the dates of a time axis to a different year. The last is useful if you want to overplot timeseries from different years on the same time axis. The coordinates of a time axis are stored as "day numbers" (sometimes called "Julian day numbers"), which means the number of days (including fractions of a day) since a particular time. Double precision is used. The time used depends on the calendar applying to the PP field, as specified by its lbtim element. There are two options:
You can conveniently test the calendar of field with the function isppfield(field,/modeltime). Day numbers can be converted to "components of time" (year, month, day, hour, minute, second) using jul2dt. This returns variables with a !dt structure, which is native in PV-WAVE and implemented as part of the UKMO library in IDL. IDL> x=jul2dt(1.5) ; i.e. 1.5 days since midnight on 13 Sep 1752 The reverse transformation can be done with dt2jul, while var2jul converts components of time supplied individually (not as !dt structures) directly to day numbers. IDL> print,var2jul(1752,9,14,12) All these routines support both real-world and model calendars. To obtain the time coordinates (as Julian day numbers) of a PP field with a y-axis of time, use pp_coords(/ygen) as usual, or equivalently tsday. To obtain them as !dt structures, use tsdt.
Making up your own PP fieldsSometimes instead of reading PP fields from PP files you may wish to create PP fields from from non-PP data the program calculates or reads in from some other kind of file, such an ASCII file. The function makeppfield makes PP fields. The data is provided as a 2D array to make a single PP field, or a 3D array to make a vector of PP fields. For longitude-latitude fields, the data array should have longitude as the first dimension and latitude as the second. Here are two equivalent ways of making a PP field with 1-degree resolution from a data array dimensioned (360,180). IDL> help,data ; info instead of help in PV-WAVE In this field, the first gridline of longitude is at 0.5E, and the first gridline of latitude at 89.5S. If the data array has missing data in it, the value of the missing data indicator should be specified with the mdi keyword. If the data is a time-mean, it is strongly recommended to provide the beginning and end of the meaning period. For instance, for a mean from 1 Dec 2003 to 1 Mar 2004, using the model calendar, this would be done with makeppfield keywords dt=var2dt([2003,2004],[12,3],/modeltime). If the field is a cross-section, rather than latitude-longitude, use the code keyword to specify its LBCODE. See the Appendix on PP headers for information about LBCODE. If the field is a timeseries, specify keyword /timeseries and supply the time coordinates (as day numbers) through the ycoord keyword. If it's a single timeseries, the data array should still be 2D, with the first dimension having size unity.
PP outputTo write one or more PP fields contained in an IDL variable to a single PP file in one operation, you can use ppw e.g. IDL> ppw,heat,'$HOME/heat.pp' It is advisable to specify the directory, as done here. If it is not specified (heat.pp) the file is created in the first directory of the !pp_path. To write a number of variables to a file in more than one operation, you must first open the file, then write the variables, then close the file. IDL> ppop,lun,'$HOME/various.pp',/write,/get_lun ; chooses a spare unit To write variables to a metasplit directory, use ssw. If it does not exist, it must first be created with ss_mkdir. To create the pph file in the directory (see Appendix), either use the /header option of ssw, or execute makepph afterwards.
Graphics routines, especially pp_contour
The IDL command for "1D" plots (graphs, timeseries, etc.) is plot. The UKMO library offers a command plotmdi, which has all the functionality of plot and in addition
The UKMO library offers a procedure linekey for drawing the legend (line style, colour, symbol, etc.) for line plots. This routine may be more convenient than the IDL standard commands such as legend. The UKMO includes the following routines for drawing graphics from PP fields:
pp_contour is a particularly complex routine with many options. It can draw line contours, colour- or pattern-filled contours and "block" plots (drawing an individual rectangle for each gridbox). By default it constructs a title for the plot from the PP field metadata (the PP header) and draws a coastline on a latitude-longitude fields. For block and filled colour contours, a colour bar "legend" is drawn underneath the plot. Frequently useful pp_contour keywords are /filled, /block, interval, base_longitude, levels, nlevels, /overplot, /noerase, /spline, title, fline, bstyle. The last two of these expect information from two ancillary routines. fline is for specifying pattern-fill for contours (with lines and symbols) and should be given the output of shadestyle e.g. fline=shadestyle(less=0,spacing=0.1,ori=45) for shading values less than 0 with lines at 45 degrees. (This doesn't seem to work in IDL at present.) bstyle is for specifying options to do with choice of colours and the colour bar legend via the function style e.g. bstyle=style(/interpolate) to have colours chosen automatically at even spacing through the colour table so, for example, if you have a rainbow colour table loaded, the first colour will be red, the last violet and the others spread through the intermediate yellow and green. See the section on colour for further discussion. Some colour PostScript printers don't print exactly the right colours in the colour bar of a pp_contour block plot. In Reading, harold and james have this problem, but flash is OK. The problem arises because IDL uses different PostScript commands in the plot and in the colour bar, and the printer interprets them differently even though the colours are specified identically. pp_contour can be used to make a movie from a series of PP fields. It has its own movie file format, which can be played with play_movie or converted into an animated GIF with gif4web. To make a contour plot on some other map projection, use pp_transform e.g. IDL> pp_contour,pp_transform(field,7,pole_coords=[90,0],sf=0.65) for a north-polar stereographic contour plot. The scale factor sf is used to contract the field of view.
Colour
Colour tablesThe UKMO library deals with colour by means of "colour tables". A colour table defines a set of colours, each in terms of red, green, and blue. Routines producing graphics e.g. plotmdi, pp_contour refers to colours in the table by colour indices. IDL has a number of built-in colour tables. The UKMO library provides others, and users can design their own. Colour indices have values in the range 0-255 i.e. the colour table has 256 entries, and colour indices must be specified to UKMO routines as bytes or 2-byte integers. (See below concerning 4-byte RGB colours.) Note that "colour" is spelt color in the names of routines and keywords. The command tek_color loads a built-in colour table whose first few colours are 0=black, 1=white, 2=red, 3=green, 4=blue, 5=cyan, 6=magenta, 7=yellow. For example, the following commands: IDL> tek_color draw lines in various colours. If no colour keyword is given, the colour !p.color is used. This is called the "foreground" colour, and on the X device (output to the terminal) is usually set to the last colour in the table, which is white when IDL is started. The size of the currently loaded table is !d.table_size, so the index of the last colour is !d.table_size-1. The background of the plot is filled in with the background colour !p.background. Each graphics device has its own colour table. Hence you may have to reload the colour table when you change graphics device, for instance after you use pr to begin PostScript output. On returning to X with prend, the colour table you had before pr will still be in effect. Discrete and continuous colour tablestek_color can be called a "discrete" colour table, because adjacent colours are very different. It doesn't make sense to interpolate between colours in such a table. By contrast, restore_colors,'shortbow' loads an UKMO colour table which runs through the rainbow from violet to red. This is a "continuous" colour table, because you could sensibly interpolate between its entries to get intermediate colours. By default, restore_colors interpolates the colour table when it reads it in to use all the available colours on the current graphics device e.g. IDL> restore_colors,'shortbow' restore_colors assumes that the first colour in the table is black and the last white (for background and foreground) so it does not interpolate between those and the adjacent colours. When loading a discrete colour table with restore_colors, specify the /nointerp keyword. Other continuous colour tables are available through the IDL command loadct. These are built in to IDL. For instance IDL> loadct,0 ; Loads a grey colour table which runs from black to white The UKMO routine style allows colours in continuous colour tables to be specified by "fractional" indices for pp_contour. For instance IDL> pp_contour,field,bstyle=style([0.5,1.0],/interp,/fract),/block chooses colours from half-way along the colour table to its end, which is from green to red for the shortbow table. This behaviour is convenient for continuous colour tables because the actual indices to be used will depend on the size of the table as loaded into the current graphics device. Used-defined colour tablesTo modify the current colour table in memory, use get_colors and put_colors. To save a colour table to a file, use save_colors. To reload it, use restore_colors; the file must be in a directory listed in your !data_path.Specifying true colours to IDL and PV-WAVEBy default, the UKMO IDL library is run in true colour. That means colours are specified to IDL graphics routines not as colour indices, but as long integers with values red*256^2 + green*256 + blue, where red, green and blue are numbers in the range 0-255. The same applies to the settings of the !p.color, !p.background and any other system variables that specify a colour. For IDL commands (not UKMO routines) such as plot which have keywords specifying colours, color indices cannot be used in true-colour mode; they have to be translated to RGB true-colour values using the function ukmo_color_convert. Hence, if plot and oplot were used instead of plotmdi in the above example, we would code color=ukmo_color_convert(2) instead of color=2. Note that !p.color, the foreground colour, is a 4-byte true-colour value. Hence a vector of colours such as [!p.color,2,3] would be a vector of 4-byte integers and would not be interpreted as colour indices. The latter must be coded instead as [!p.color,ukmo_color_convert([2,3])]. If IDL is run in 8-bit pseudocolour mode (not available for PV-WAVE on Linux), rather than true-colour mode, the IDL commands expect colour indices. Colour indices can be used even in true-colour modes in IDL (but not PV-WAVE); this behaviour is selected with device,decomposed=0. The Met Office library can be run in pseudocolour mode by starting IDL with midl -p. Colours not looking as you expect them toIf you use pseudocolour rather then true colour, on some systems there can be conflicts between different applications using colours, with the result that IDL may not be able to get as large a colour table as it wants to use. This should not be a problem on Linux. If you have problems of this kind, whose symptom is that the colours don't appear as you expect them to in the IDL windows, first try closing down other colour-hungry applications such as netscape or ghostview and then restart IDL. If that doesn't work, you may be able to use a private colour table, if your X server supports it. That means the IDL windows will appear in the right colours when the focus is on them (when you've clicked in the window, for instance), but while you focus on a window belonging to another application, such as netscape, the IDL windows will look very strange. In Unix do $ export WAVE_MAX_COLORS=256 and then restart IDL. Conflicts between colours can mean also that ghostview/gv can't get the colours it needs to preview a plot when you run prend,/view. Again, closing down netscape or other colourful windows may help. If you can't see expected differences between colours on a PostScript plot it may be because the difference between the colors is too small for ghostview to distinguish. You can confirm this by using a different colour table such as tek_color, in which the colours are so stridently different that the truth must come to light.
Mathematical computations on fieldsTo add together PP fields f1 and f2, it would be convenient if we could do f3=f1+f2, for instance. Unfortunately this is not possible because IDL doesn't support mathematical operations on structures. Instead, we have to use a slightly less convenient but equally flexible method. The function pp_ff is used to do mathematical computation with PP fields. It evaluates a mathematical expression in which the terms can be PP fields or ordinary numbers, and it returns PP fields as the result. The mathematical expression is given as a string. It can contain any IDL operators, functions and system variables. Variables a, b, c, etc. appearing in the expression stand for the first, second, third, etc. argument to pp_ff following the expression. For instance,
If any of the arguments (after the expression) are vectors (of PP fields or numbers), the result is a vector of PP fields. In this case each PP field in the result vector is computed using the corresponding elements of the input vectors. For instance, in the f3 example, if f1 is a single PP field and f2 is a 10-element vector of PP fields, f3 will be a 10-element vector, in which f3(0)=f1+f2(0), f3(1)=f1+f2(1), etc. In the f4 example, if we had number=[7,8], the result f4 would be a 2-element vector of PP fields, with f4(0)=7*sin(f1) and f4(1)=8*sin(f1). When there is more than one PP field input, missing data is returned in the output PP field at any point where any of the corresponding input PP fields has missing data. So, for example, if f1.data(5,6) is missing data, f3(0).data(5,6) will be missing data, regardless of whether f2(0).data(5,6) is missing data. This behaviour of pp_ff is widely used in the UKMO library for imposing masks, like this:
where f1 and mask are PP fields. Since the mathematical expression does not mention b, its non-missing data values are irrelevant, and the non-missing data values of f2 will be equal to those of f1. However, f2 will have missing data wherever mask has missing data, as well as where f1 has missing data. (The /quiet keyword suppresses a warning message that the input fields have missing data at different points.) You can try this out with f1 being a field of surface air temperature (stash code 3236) and mask one of soil moisture content (8208). The result f2 will be a field of surface air temperature over land, and missing data over sea, where the soil moisture field has missing data. There are some rules about possible inputs to pp_ff:
The keyword /math_fix gets rid of illegal numbers that result from the calculation (e.g. when dividing by zero), replacing them with missing data. In the Met Office IDL version of pp_ff, the keyword /fortran causes the computation to be carried out by a Fortran subroutine which pp_ff constructs, compiles and calls. This is quicker for large computations. In that case, the mathematical expression must be valid in Fortran. A few common mathematical operations have special functions for convenience:
Collapsing dimensionsA large group of routines perform operations that can generally be described as "collapsing dimensions". For instance, computing a zonal cross-section of a longitude-latitude-height field (a vector of longitude-latitude fields, of course, since PP fields are 2D) "collapses" the longitude dimension by averaging over it, and produces a 2D field (latitude-height). Averaging over time similarly "collapses" a time dimension: from a longitude-latitude-time field it produces a longitude-latitude field. Computing the area-average of a longitude-latitude field produces a single number; in effect this has collapsed both the longitude and latitude dimensions. The routines whose names begin with pa are efficient for processing large amounts of data (though they can be used for small amounts as well). Many pa routines have region and/or mask keywords. Note that all routines which average or integrate over longitude and latitude support the /box keyword, which implies weighting by grid-box areas. This option is strongly recommended, because it is not the default behaviour for fields with lbcode=1 in the PP header, as is always (although inappropriately) coded by the UM. (See help information on pp_area for details.) Many routines in this table will also work for cross-sections (i.e. where x- and y-dimensions of the PP fields are not longitude and latitude). For example, given a depth-latitude cross-section of ocean temperatures, pp_area_avg can calculate the average on the section. The weights applied along the axes are just the cell-widths in the appropriate coordinate.
StatisticsSome functions which calculate statistics from PP fields:
This list is far from exhaustive. The library also contains routines for statistical distributions (e.g. t and F), EOFs, analysis of variance, power spectra, covariance, skill scores.
Some other PP processing operationsExtracting a regionTo restrict the latitude--longitude range of a field (for instance, before you plot it with pp_contour), use pp_extract. The region is specified as a four-element vector [west,south,east,north]. The longitude limits west and east are positive for degrees east. Any values which are equal under modulo 360 refer to the same longitude. For instance, longitude ranges -360 to -180 and 0 to 180 are equivalent (both specify longitudes east of Greenwich and west of the date line). The latitude limits south and north are positive for degrees north. IDL> smallfield=pp_extract(bigfield,[10,50,40,80]) If the input bigfield is global, the result smallfield is a field which covers the longitude range 10E to 40E, latitude 50N to 80N. pp_extract actually extracts the subset of the input field grid points which fall inside the specified region. Hence, if the input was not global in longitude but covered 30E to 90E, the output field would have longitude range 30E to 40E. Many UKMO routines, including ppa and pa routines, have a region keyword, which is a four-element vector of the same form as the argument to pp_extract. pp_extract also works for cross-section fields, if appropriate limits in the x- and y-coordinates are specified. SlicesAs well as zonal or meridional mean cross-sections, pp_cross_sect can also extract slices along latitude or longitude circles. For instance, if fields is a vector of latitude-longitude fields on model levels, IDL> slice=pp_cross_sect(fields,long=0) slice is a latitude-level slice along the Greenwich meridian. RegriddingBilinear interpolation to change the 2D grid of PP fields can be done with pp_regrid. E.g. IDL> ; interpolate old to grid of like It also works with cross-section fields. In PV-WAVE, pp_regrid is able to transform between lat-long grids with true and rotated North Pole. The required Fortran program has not been rewritten for IDL. In PV-WAVE, pp_regrid_avg is an alternative for changing to 2D grid, by area-averaging. It uses the area-averaging routines from the UM to do this, but the link to IDL has not yet been written. pp_relayer can be used to change the vertical coordinate of a 3D field e.g. from depth coordinates to density coordinates in the ocean. pp_isolayer may be used first to find the new coordinate surfaces in terms of the old vertical coordinate. Differential operatorsDifferentiation in one spatial dimension can be done with pp_derive, and in time for fields with a time-axis by tsderive. Vector differential operators are implemented as pp_grad, pp_div, pp_curl and pp_laplace . Polynomial fitspp_fit will fit a polynomial function at each point to a vector of PP fields. tsfit will fit a polynomial in time to each point in a timeseries PP field. PP field utilitiespp_comparable tests whether two PP fields are on the same grid. isppfield can carry out a large number of tests on the properties of a PP field. Locating gridboxeswhichbox will tell you which gridbox of a PP field contains a point whose coordinates you specify, and whereis will tell you the coordinates of a point whose indices you specify.
Processing a lot of dataIf you want to carry out an operation on a number of fields, for instance averaging them together (with pp_avg), the obvious thing to do is read them all into memory with ppa and then process them. The problem with this, however, is that if there are a lot of fields, this may be very slow or may even cause IDL or your workstation to crash. There is a set of routines intended for bulk processing of large amounts of data. Their names begin with pa. Their advantage and approach is that they read in the fields one at a time as they do the processing. The pa routines will accept any of three kinds of alternative input:
Headers and handles "imply" PP fields contained in the files and/or metasplit directories they point to. In the reference information for pa routines, these alternatives are referred to together as a "source" of PP fields. Such routines also usually allow an optional second argument which is a "search expression" of the kind accepted by ppa and constructed by ss. The expression implies that the pa routine will process the subset of the PP fields from the source which satisfy the expression. For instance,
calculates the time mean of precipitation fields (stash code 5216) from the directory aaxzc.001000. This will produce the same result as
If a handle and search expression are passed to pa_timemean, however, it reads in the fields one at a time as it does the processing.
Naming conventions of Met Office library routines
As it may help navigate the library, the following naming conventions are worth pointing out. pp_ and ts routinesAlmost all routines whose names begin pp_ require PP fields as input. Routines whose names begin ts require timeseries PP fields as input. pa_ and pats_ routinesRoutines whose names begin pa_ or pats_ (these are in the Hadley Centre applications library rather than the core) and a few of the pp_ routines (e.g. pp_contour) are intended for processing large amounts of data. The pats_ routines return timeseries PP fields; the pa_ are less specific. ukmo_ routinesMost routines whose names start ukmo_ are Met Office routines which have the same general purpose as the IDL standard routines without the ukmo_. For instance, ukmo_histogram is a Met Office routine that has the same purpose as histogram but is more powerful.
Met Office system variables
System variables have names beginning ! and are available to all routines. For instance, a particularly useful one is !p, which controls many aspects of plotting by IDL. The Met Office setup defines a number of IDL system variables. Paths!pp_path, !data_path and !shared_path are copied from the Unix environment variables $WAVE_PP_PATH, $WAVE_DATA_PATH and $WAVE_SHARED_PATH respectively. They are colon-separated lists of directories used by UKMO routines to search respectively for PP files, other (non-PP) data files, and shared object files for linknload/call_external. UM version and stashcode translationsFor translating stash codes to an English description, the UM version has to be selected. The system variable !um_version is copied from the Unix environment variable $UMVERSION. You should ensure this is set in your Unix environment before you start IDL e.g. export UMVERSION=4.5. It is used to deduce the system variable !stashcodesfile, which is a pattern that matches the filenames of the stashmaster files to be read. If you have private stashmaster files from your own UM experiments and you would like these to be used to translate private stashcodes, you can do this by defining the system variable !userstashcodesfile in your IDL startup file. This should be a pattern which matches the files you want to be read e.g. /home/jonathan/etc/sm4.5/*. Stash code translation is done by several routines, including
IDL> stash,'omega' ; find all stashcodes whose descriptions mention "omega" CF-netCDF standard_name and long_name translationsThe routine codename converts between such names and PP stash codes and field codes e.g. IDL> codename,3236,1 The translation tables are stored in the files named by the system variable !codenamefile, which is a blank-separated list of filenames, such as "stashstdname thcmiplongnames". Each name should specify only the basename of the file, not the directory. Each file should be in a directory listed in the !data_path. The codename files are plain text. Each line has up to six fields, separated by !. The relevant fields for the translation are: first - submodel (0 for PP field code, 1 for atmosphere stash code, 2 for ocean stash code), second - PP field code or stash code, fifth - standard_name or long_name which the code corresponds to. E.g. 1!3236!!!air_temperature PP graphics controlThe system variable !ppp controls miscellaneous aspects of various UKMO PP graphics routines, especially pp_contour. See IDL> man,'defpp' for details and enter WAVE> info,!ppp,/structure or IDL> help,!ppp,/structure to see its current settings. Its default value is stored in !ppp_default. For example, you can make pp_contour plots larger by reducing the number of lines allowed for a title (default is 4) and/or eliminating the space for the colour bar if you're only doing monochrome plots. IDL> !ppp.title_lines=1 PP field code translations and default axis directionsWhen no stash code is available, the PP field code LBFC from the field header is used instead. Translations for these are stored in the file $WU_DATA/pp_codes. You can add or alter PP field code translations using set_pp_code e.g. IDL> set_pp_code,'codes',['1602','steric height/m'] ; (re)defines LBFC=1602 Another possibility is to set the defaults for whether particular kinds of axis are plotted with decreasing or increasing coordinates e.g. set_pp_code,'cross',['Latitude','10','176','0','0'] redefines the entry for latitude so that it is plotted increasing to the right when it is on the x-axis. The default is increasing to the left i.e. north on the left. This is changed by the last entry in the [] vector, whose default value is '1'. See man,'set_pp_code' for details. If you want to make changes to PP code translations automatically whenever you start a session, write a IDL script of set_pp_code commands and define the name of the script in the Unix environment with export WAVE_CODES_SCRIPT=filename before you start IDL. PP controlThe system variable !pp controls miscellaneous aspects of various UKMO PP routines. See IDL> man,'defpp' for details and enter WAVE> info,!pp,/structure or IDL> help,!pp,/structure to see its current settings. A particularly useful element of the structure is !pp.last_search, which contains the last search expression evaluated by ppa and similar routines when looking for a PP field. It can be useful to look at it after you have had a failure of this type: % PPA: No such field(s) as it will tell you what field could not be found. The element !pp.rmdi is the default missing data indicator . Physical constantsSystem variable !met_values contains some general physical constants and !um_values the values of some constants used in the UM. See IDL> man,'setup_consts' and IDL> man,'um_values' respectively for details, and enter WAVE> info,variable,/structure or IDL> help,variable,/structure to see their values.
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Last Updated on Tuesday, 11 October 2011 14:17 |





