Opened 5 months ago

Closed 3 months ago

#3431 closed help (answered)

Missing data

Reported by: dgalea Owned by: um_support
Component: UM Model Keywords:
Cc: Platform: ARCHER
UM Version: 11.0



I am running suite u-bz203, which is a copy of the standard suite u-ax053. I am interested in the u and v winds throughout the atmosphere on various pressure levels, which are configured to be output in files containing pc in their name. I noticed that the data on the lower half of the levels contain a fair number of missing values. The data are on ARCHER at /work/n02/n02/dgalea/archive/u-bz203 or on JASMIN at /group_workspaces/jasmin4/hiresgw/dg/archer_transfers/u-bz203. This happens in both of the suites mentioned. Is this usual and, if not, how can I fix these missing values?


Attachments (1)

example.png (17.6 KB) - added by dgalea 5 months ago.

Change History (11)

comment:1 Changed 5 months ago by dgalea


Just checking whether you've had some time to have a look at this.


comment:2 Changed 5 months ago by simon

I think it's due to the Heaviside function. The model doesn't write data on those levels whose pressure is below surface pressure (pstar) due to orography. There should be an associated Heaviside function diagnostic. From the code:

! The Heavyside function is defined as 1.0 if the pressure level
! is above the surface (i.e. pstar) and 0.0 if below, or if above the top of
! the model. A time mean of this gives the fraction of time a pressure
! level is above the land or sea surface.

(Btw, I wrote that about 20 years ago, and the typo in the name is still there!)


comment:3 Changed 5 months ago by simon

Sorry, the above should read: the model doesn't write data on those levels whose pressure is above surface pressure (pstar) due to orography.
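The definition quoted from the code comment can be illustrated with a minimal sketch. This is not the UM code: the function names, the `p_top` parameter and the surface pressures are made up for illustration, and treating a level exactly at pstar as below the surface is an assumption.

```python
def heaviside(p_level, pstar, p_top=0.0):
    """Return 1.0 if p_level lies between the model top and the surface, else 0.0.

    All pressures share one unit (hPa here). p_top is a hypothetical
    stand-in for the pressure at the top of the model.
    """
    # A pressure level is above the surface when its pressure is lower than pstar.
    return 1.0 if p_top <= p_level < pstar else 0.0


def fraction_above_surface(p_level, pstar_series, p_top=0.0):
    """Time mean of the Heaviside function at a single grid point."""
    values = [heaviside(p_level, pstar, p_top) for pstar in pstar_series]
    return sum(values) / len(values)


# Made-up surface pressures (hPa) at one mountainous grid point over four times.
pstar_series = [870.0, 860.0, 845.0, 840.0]

frac_850 = fraction_above_surface(850.0, pstar_series)  # level is sometimes underground
frac_500 = fraction_above_surface(500.0, pstar_series)  # level is always above the surface
print(frac_850, frac_500)
```

In this toy case the 850 hPa level is above the surface for two of the four times, so its time-mean Heaviside value is 0.5, while the 500 hPa level is always above the surface.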

Changed 5 months ago by dgalea

comment:4 Changed 5 months ago by dgalea

Hi Simon,

The data I am getting has some missing values in the field, but the field isn't completely empty. I've attached an example of what I'm getting for U at 850hPa. This is all filled during the model run itself, as I'm using it in my Deep Learning model. What I'm trying to do is save the data that my DL model sees so that I can run TRACK on it to compare results.


comment:5 Changed 5 months ago by simon

OK, it isn't the Heaviside. I've had a look through some of the output, and can't locate an example like you've shown with entire missing rows of data. Could you give me a filename?
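A quick way to check a field for entirely missing rows is sketched below. It works on a plain list of rows rather than on a PP file, the helper name `missing_summary` is hypothetical, and the missing-data value is assumed to be the UM's usual real missing-data indicator (-2**30); check the file header if in doubt.

```python
UM_RMDI = -1073741824.0  # assumed real missing-data indicator (-2**30)


def missing_summary(field, mdi=UM_RMDI):
    """Return (fraction of missing points, indices of rows that are entirely missing).

    field is a list of rows (latitudes), each a list of values (longitudes).
    """
    total = sum(len(row) for row in field)
    n_missing = sum(1 for row in field for value in row if value == mdi)
    full_rows = [j for j, row in enumerate(field)
                 if row and all(value == mdi for value in row)]
    return n_missing / total, full_rows


# Made-up 3x3 field: row 1 is entirely missing, row 2 has one missing point.
field = [
    [5.0, 4.5, 3.0],
    [UM_RMDI, UM_RMDI, UM_RMDI],
    [2.0, UM_RMDI, 1.5],
]

fraction, rows = missing_summary(field)
print(fraction, rows)
```

Isolated missing points over high orography would be consistent with the pressure level being below the surface, whereas whole rows of missing data point to a different cause.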

comment:6 Changed 5 months ago by dgalea

I have my data at /work/n02/n02/dgalea/archive/u-bz203 on ARCHER or /group_workspaces/jasmin4/hiresgw/dg/archer_transfers/u-bz203 on JASMIN. The variables I need are U, V throughout the atmosphere; U, V at 10m and MSLP. They should all be in the files containing "pc".

comment:7 Changed 5 months ago by simon

Could I have a filename? I've looked at a number of the files on Archer, picking some at random (for instance /work/n02/n02/dgalea/archive/u-bz203/19960801T0000Z/bz203a.pc19960808.pp), and don't see the issue.

comment:8 Changed 5 months ago by dgalea

It seems that cf-python wasn't handling the .pp files as I expected it to. I was trying to open the .pp files directly in cf-python on JASMIN, which produced the error. I have converted one of the files to .nc with the following script I've found:

#! /usr/bin/env convsh

#  Convsh script conv2nc_1to1.tcl
#  Convert each file, into a corresponding netCDF file.
#  File names are taken from the command arguments.
#  For example to convert all PP files in the current directory
#  into netCDF files use the following command:
#      ./conv2nc_1to1.tcl *.pp

#  Write out netCDF file
set outformat netcdf

#  Automatically work out input file type
set filetype 0

#  Convert all fields in input files to netCDF
set fieldlist -1

#  Read in each of the input files and write output file

foreach infile $argv {

#  Replace input file extension with .nc to get output filename
   set outfile [file tail [file rootname $infile].nc]

#  Read input file
   readfile $filetype $infile

#  Write out all input fields to a netCDF file
   writefile $outformat $outfile $fieldlist

#  Remove input file information from Convsh's memory
   clearall
}

The converted file then shows the right data.

Unrelated question: My model is spending a fair amount of time in the queue (at least 3 hrs on weekdays, sometimes 6). It is also taking around 2 hours 10 minutes to perform 1 simulated month using 8 full nodes. Is there a way to speed up either or both of these?


comment:9 Changed 5 months ago by simon

Could you report the cf-python issue at

Unfortunately there's not much you can do about the Archer throughput. It's oversubscribed at the moment. You could try removing any unneeded STASH items and turning off climate meaning if you don't require it. This might give a ~10% speed-up.
Fortunately Archer2 is ~10 times quicker and should be available soon.
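
As a rough back-of-the-envelope illustration of these figures, the sketch below converts the reported 2 hours 10 minutes per simulated month into wall-clock hours per simulated year. It assumes that "~10% speed-up" means about 10% less run time and that "~10 times quicker" means one tenth of the run time. Queue time is excluded, and all variable names are illustrative.

```python
MINUTES_PER_MONTH = 2 * 60 + 10  # reported: 2 h 10 min per simulated month on 8 nodes


def hours_per_model_year(minutes_per_month):
    """Wall-clock hours needed for 12 simulated months, excluding queue time."""
    return minutes_per_month * 12 / 60.0


baseline = hours_per_model_year(MINUTES_PER_MONTH)
# Assumption: trimming STASH and climate meaning removes ~10% of the run time.
trimmed = hours_per_model_year(MINUTES_PER_MONTH * 0.9)
# Assumption: "~10 times quicker" means one tenth of the run time.
archer2 = hours_per_model_year(MINUTES_PER_MONTH / 10.0)

print(round(baseline, 1), round(trimmed, 1), round(archer2, 1))
```

Under these assumptions, a simulated year takes roughly 26 hours of run time at present, about 23 hours after trimming the diagnostics, and under 3 hours at the quoted Archer2 speed.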


comment:10 Changed 3 months ago by ros

  • Resolution set to answered
  • Status changed from new to closed