Opened 6 years ago

Closed 5 years ago

#1701 closed help (fixed)

Job running but no output appears

Reported by: charlie
Owned by: willie
Component: UM Model
Keywords:
Cc:
Platform: ARCHER
UM Version: 6.6.3



Sorry to bother you, but I'm having trouble running some jobs on version 6.6.3.

I'm running 4 jobs at once. Regardless of the job, the model runs for its full 6 hours but only produces output for the first month, e.g. /work/n02/n02/cjrw09/result/xlyhaa.pah1jan. Although these files have non-zero size, inspection with xconv shows they are all actually empty. It's also not producing any restart dumps, which is unusual.

My jobs are exact replicas of other jobs I have, all of which work fine. They have all been recompiled correctly, and they are based on a brand-new full extract. They are basically set up from scratch. I have followed exactly the same method to get them running as with my old jobs.

The only difference between my new jobs and the old ones is the input ancillary files, namely SST, sea ice and soil moisture - not how they are read, but the files themselves. So, unless I'm missing something silly, the obvious assumption is that it's one of these which is causing the problem.

The only obvious error in the .leave files is the following, but I don't know what this means or even if it's relevant:

=>> PBS: job killed: walltime 21626 exceeded limit 21600
aprun: Apid 18381595: Caught signal Terminated, sending to application
Application 18381595 is crashing. ATP analysis proceeding...
/var/spool/PBS/mom_priv/jobs/3224075.sdb.SC[305]: .: line 277: 30224: Terminated
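
For reference, the first line of that log is the scheduler stopping the job at its time limit. A minimal sketch of reading the two numbers out of the message, assuming the message format is exactly as quoted above (the regular expression is an illustration, not an official PBS format specification):

```python
import re

# Log line copied verbatim from the .leave file quoted above.
line = "=>> PBS: job killed: walltime 21626 exceeded limit 21600"

# Pull out the elapsed and permitted walltime, both in seconds.
match = re.search(r"walltime (\d+) exceeded limit (\d+)", line)
used_s, limit_s = (int(g) for g in match.groups())

print(limit_s / 3600)    # 6.0  (the limit is the full 6 hours)
print(used_s - limit_s)  # 26   (seconds over the limit when killed)
```

In other words, the job was stopped for running out of time rather than for hitting an error of its own.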

Please can you advise?

Many thanks,


Change History (4)

comment:1 Changed 5 years ago by willie

  • Owner changed from um_support to willie
  • Status changed from new to accepted

Hi Charlie,

There are problems with the soil moisture ancillary.


There is not much UM output for xlyha, but the file xlyha.out complains a lot about the soil moisture. Looking at the ancillary in xconv, the data are all in one tiny clump rather than a smoothly changing global distribution. So I think you need to regenerate this ancillary and try again.



comment:2 Changed 5 years ago by charlie

Dear Willie,

Thanks a lot, yes I feared it was to do with that ancillary file.

However, I don't think it's because of the spatial coverage. That's exactly what I want it to look like - values over northern India and no values elsewhere - and this has worked in the past. If you compare




you will see that spatially they are identical. If I run with the former, it works absolutely fine.

The only difference between them is temporal. The first ancillary file, which works, is a normal timeseries of values: 12240 days (= 34 years), where each day is different. The second ancillary file, which doesn't work, is a timeseries of the same total number of days, but the same 360 days (i.e. one year) are repeated throughout. So, for example, 10 January is always the same.
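
For illustration, the repetition described above can be checked with a short sketch. It assumes the daily values at one grid point have already been read into a plain Python list (extracting them from the ancillary is not shown), the values used here are made up, and `repeats_every` is a hypothetical helper name:

```python
def repeats_every(series, period):
    """Return True if series[i] == series[i + period] for every valid i."""
    return all(series[i] == series[i + period]
               for i in range(len(series) - period))

DAYS_PER_YEAR = 360   # 360-day calendar, as in the ticket
N_YEARS = 34          # 12240 days / 360 days per year

# Made-up stand-ins for the two timeseries at a single grid point.
one_year = [float(d) for d in range(DAYS_PER_YEAR)]
repeated = one_year * N_YEARS                                  # same year, 34 times
distinct = [float(d) for d in range(DAYS_PER_YEAR * N_YEARS)]  # every day differs

print(len(repeated))                            # 12240
print(repeats_every(repeated, DAYS_PER_YEAR))   # True
print(repeats_every(distinct, DAYS_PER_YEAR))   # False
```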

Is this likely to be causing the problem? If so, why? The data are all there, so how does the model know it is repeating?

One thing that occurred to me: when creating the ancillary file (using xancil), I said No to "Is ancillary data periodic in time?" Should I have said Yes? If so, how can this make a difference?


comment:3 Changed 5 years ago by willie

Hi Charlie,

It's not getting as far as time step 2, according to the pe0 file. The problem is that the soil moisture ancillary file contains NaNs. This is evident in the xlyha.out file, but you can also see it for yourself by comparing the file with itself using cumf. The summary file should show no differences for a normal file, but will flag up the NaNs in a file that contains them. xconv also struggles to display the file under "view data", which is another indicator. I think the simplest thing to do is to re-create this file.
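
The self-comparison trick works because, under IEEE 754 floating-point rules, a NaN never compares equal to anything, including itself. The sketch below illustrates only that principle in plain Python; it is not how cumf is implemented, the sample values are made up, and `count_self_mismatches` is a hypothetical helper name:

```python
import math

def count_self_mismatches(values):
    """Count entries that differ from themselves; only NaNs do."""
    return sum(1 for v in values if v != v)

clean = [0.25, 0.31, 0.0]                     # ordinary values
broken = [0.25, float("nan"), 0.0, math.nan]  # two NaNs mixed in

print(count_self_mismatches(clean))   # 0
print(count_self_mismatches(broken))  # 2
```

A file of ordinary numbers compared with itself therefore shows no differences, while one containing NaNs reports a mismatch at each NaN.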



comment:4 Changed 5 years ago by willie

  • Resolution set to fixed
  • Status changed from accepted to closed