#1701 closed help (fixed)

Job running but no output appears

Reported by: charlie
Owned by: willie
Priority: normal
Component: UM Model
Keywords:
Cc:
Platform: ARCHER
UM Version: 6.6.3

Description

Hi,

Sorry to bother you, but I'm having trouble running some jobs on version 6.6.3.

I'm running 4 jobs at once. Regardless of the job, however, the model runs for its full 6 hours but only produces output for the first month, e.g. /work/n02/n02/cjrw09/result/xlyhaa.pah1jan. And, although these files have non-zero size, upon inspection with xconv they are all actually empty. It's also not producing any restart dumps, which is unusual.

My jobs are exact replicas of other jobs I have, all of which work fine. They have all been recompiled correctly, and they are based on a brand-new full extract. They are basically set up from scratch. I have followed exactly the same method to get them running as with my old jobs.

The only difference between my new jobs and the old ones is the input ancillary files, namely SST, sea ice and soil moisture - not how they are read, but the files themselves. So, unless I'm missing something silly, the obvious assumption is that it's one of these which is causing the problem.

The only obvious error in the .leave files is the following, but I don't know what this means or even if it's relevant:

=>> PBS: job killed: walltime 21626 exceeded limit 21600
aprun: Apid 18381595: Caught signal Terminated, sending to application
Application 18381595 is crashing. ATP analysis proceeding...
/var/spool/PBS/mom_priv/jobs/3224075.sdb.SC[305]: .: line 277: 30224: Terminated
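
For reference, the PBS line above can be read as simple arithmetic. The sketch below assumes the two walltime figures are in seconds; on that reading the limit is the 6-hour job limit mentioned earlier, so the message suggests the job was killed for running out of time rather than for a model error. The variable names are illustrative only.

```python
import re

# Line copied from the .leave file quoted above.
line = "=>> PBS: job killed: walltime 21626 exceeded limit 21600"

# Pull out the used and allowed walltime (assumed to be in seconds).
match = re.search(r"walltime (\d+) exceeded limit (\d+)", line)
used_s, limit_s = (int(g) for g in match.groups())

limit_hours = limit_s / 3600   # 21600 s / 3600 = 6 hours
overrun_s = used_s - limit_s   # seconds past the limit when PBS killed the job

print(f"limit = {limit_hours:g} h, overrun = {overrun_s} s")
```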


Please can you advise?

Many thanks,

Charlie

Change History (4)

comment:1 Changed 22 months ago by willie

  • Owner changed from um_support to willie
  • Status changed from new to accepted

Hi Charlie,

There are problems with the soil moisture ancillary:

/work/n02/n02/cjrw09/ancil/hydro.d/smow_jules_repeatingcyc_first5

There is not much UM output for xlyha, but the file xlyha.out complains a lot about the soil moisture. Looking at it in xconv, it is all in one tiny clump, rather than a smoothly changing global distribution. So I think you need to regenerate this ancillary and try again.

Regards,

Willie

comment:2 Changed 22 months ago by charlie

Dear Willie,

Thanks a lot, yes I feared it was to do with that ancillary file.

However, I don't think it's because of the spatial coverage. That's exactly what I want it to look like - values over northern India and no values elsewhere - and this has worked in the past. If you compare

/work/n02/n02/cjrw09/ancil/hydro.d/smow_jules_1971-2004

with

/work/n02/n02/cjrw09/ancil/hydro.d/smow_jules_repeatingcyc_first5

you will see that spatially they are identical. If I run with the former, it works absolutely fine.

The only difference between them is temporal. The first ancillary file, which works, is a normal timeseries of values - 12240 days (= 34 years), where each day is different. The second ancillary file, which doesn't work, is a timeseries of the same total number of days, but each block of 360 days (i.e. a year) is a repeat of the first. So, for example, 10 January is always the same.
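
To make the difference between the two files concrete, here is a minimal sketch using placeholder integers in place of the daily soil moisture fields (reading the real ancillary files is not shown). The helper name is_repeating and the sample data are hypothetical, introduced only for illustration.

```python
DAYS_PER_YEAR = 360   # 360-day calendar, as described above
N_YEARS = 34
total_days = DAYS_PER_YEAR * N_YEARS   # 34 * 360 = 12240

def is_repeating(series, period=DAYS_PER_YEAR):
    """Return True if every value equals the value one period earlier."""
    return all(series[i] == series[i - period]
               for i in range(period, len(series)))

# Placeholder data, NOT real soil moisture values.
normal = list(range(total_days))                              # every day different
repeating = [d % DAYS_PER_YEAR for d in range(total_days)]    # one year, repeated

print(total_days, is_repeating(normal), is_repeating(repeating))
```

Both series have the same length; only the second passes the repeat check, which matches the description of the two ancillaries.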

Is this likely to be causing the problem? If so, why - the data are all there, so how does the model know it is repeating?

One thing that occurred to me: when creating the ancillary file (using xancil), I said No to "Is ancillary data periodic in time?" Should I have said Yes? If so, how can this make a difference?

Charlie

comment:3 Changed 22 months ago by willie

Hi Charlie,

It's not getting as far as time step 2 according to the pe0 file. The problem is that the soil moisture ancillary file contains NaNs. This is evidenced in the xlyha.out file, but you can also see it for yourself by comparing the file with itself using cumf. The summary file for this should show no differences for normal files, but flag up the NaNs in files that contain them. Also, xconv > view data struggles to display the file, which is another indicator. I think the simplest thing to do is re-create this file.
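
The self-comparison trick relies on a general floating-point property: under IEEE 754, a NaN never compares equal to anything, including itself. The sketch below illustrates that property in plain Python; it is not a description of how cumf works internally, and the function name and sample values are made up for the example.

```python
import math

def self_differences(values):
    """Return the indices where a value compares unequal to itself.

    For ordinary numbers v != v is always False; it is True only for NaN.
    """
    return [i for i, v in enumerate(values) if v != v]

# Made-up sample data, not taken from the ancillary file.
clean = [0.12, 0.30, 0.25]
corrupt = [0.12, math.nan, 0.25, math.nan]

print(self_differences(clean))    # []
print(self_differences(corrupt))  # [1, 3]
```

A clean field reports no differences against itself, while any NaN entries show up as differences.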

Regards

Willie

comment:4 Changed 22 months ago by willie

  • Resolution set to fixed
  • Status changed from accepted to closed