Opened 4 years ago

Closed 2 years ago

#1589 closed help (wontfix)

REPLANCA: PP HEADERS ON ANCILLARY FILE DO NOT MATCH

Reported by: s1374103
Owned by: um_support
Priority: normal
Component: UM Model
Keywords: PP HEADERS ON ANCILLARY FILE DO NOT MATCH, time series, climatology
Cc:
Platform: ARCHER
UM Version: 7.3

Description

Dear CMS Helpdesk,

Original job: xjnjn
My job: xlnah

The original job (xjnjn) was due to run from September 1999 to December 2000. An exact copy of this job runs successfully. However, when I increase the run length to 6 years while keeping the same start date, the run crashes at December 2000.

In the .leave files the following error message is generated:

REPLANCA: PP HEADERS ON ANCILLARY FILE DO NOT MATCH

On the CMS helpdesk I saw that this error had been reported before (ticket 1417). In that case the cause was that some ancillary files were time series, not climatologies, and ended before the end date of the model run.

I looked at the ancillary files I am using and found it difficult to identify which ones need to be converted into climatologies. Are they the 2D sulphur emissions (DMSlandNH3SO218502000.N96), soot emissions (BC_hi_Nozawa_18502000.N96) and biomass emissions (Bio_AEROCOM_1750_2000.N96)?

I thought I would check the following two questions before making the changes to the ancillary files and re-running.

1) Is this the reason why the run is crashing?
2) Have I identified the correct ancillary files which need to be converted?

Kind Regards,

Jamie Kelly

Change History (11)

comment:1 Changed 4 years ago by grenville

Jamie

We don't have a quick solution to this - I can only suggest checking all the ancillary fields which are being updated to find which don't extend far enough.
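
The check can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the coverage table is hypothetical, with end dates guessed from the file names given in the description rather than read from the PP headers, and it uses the Gregorian calendar rather than whatever calendar the job is configured with. The function name `files_ending_before` is made up for this example.

```python
from datetime import date


def files_ending_before(run_end, coverage):
    """Return the ancillary files whose last valid date is earlier than run_end."""
    return sorted(name for name, (_start, end) in coverage.items() if end < run_end)


# Hypothetical coverage table: dates are inferred from the file names only,
# not from the actual PP headers, so treat them as placeholders.
coverage = {
    "DMSlandNH3SO218502000.N96": (date(1850, 1, 1), date(2000, 12, 31)),
    "BC_hi_Nozawa_18502000.N96": (date(1850, 1, 1), date(2000, 12, 31)),
    "Bio_AEROCOM_1750_2000.N96": (date(1750, 1, 1), date(2000, 12, 31)),
}

original_end = date(2000, 12, 1)  # xjnjn: September 1999 to December 2000
extended_end = date(2005, 9, 1)   # xlnah: same start date, 6-year run length

short_for_original = files_ending_before(original_end, coverage)
short_for_extended = files_ending_before(extended_end, coverage)

print("Too short for original run:", short_for_original)
print("Too short for extended run:", short_for_extended)
```

With real start and end dates taken from the headers of every updated ancillary, the second list would be the set of files to convert or extend.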

Grenville

comment:2 Changed 4 years ago by s1374103

Dear Grenville,

Some ancillary files (e.g. DMSlandNH3SO218502000.N96, covering 1850-2000) contain multiple fields (dimethyl sulphide emissions, ammonia gas emissions and sulphur dioxide emissions), and the instructions for changing time-series ancillaries into climatologies say to select only the year of interest (2000).

I can highlight a particular year for one field, then do the same for the next field, but when I return to the original field it is back in the un-highlighted form.

Is it possible to export only 1 year (2000) from a time-series ancillary file containing multiple fields or should they be exported separately?

Jamie

comment:3 Changed 4 years ago by s1374103

The instructions I am following are taken from http://www.ukca.ac.uk/wiki/index.php/Using_the_UMUI.

At the end of the page, it says to click on the radio button for "UMUI values for CFCs etc". However, this option is not available for my job (xlnah). Is this step important?

Jamie

comment:4 Changed 4 years ago by luke

Hi Jamie,

You're running a tropospheric chemistry job (CheT) with GLOMAP-mode aerosols at N96L63 resolution. The instructions at the end of the above webpage relate to a particular vn7.3 branch.

This job uses the branch

fcm:um-br/dev/gmann/vn7.3_glomap_ukca_mode_nitrate_GMdebug/src

which is owned by Graham Mann. It should be noted that this job appears to be under active development. Tracking back the history of this branch shows that it is eventually based on my VN7.3_UKCA_CheM_vn1.1 branch, so this hand-edit could be changed for this branch. However, those changes are mainly for stratospheric jobs, CheS or CheST, and so aren't really needed for this job.

From your 2nd comment, I think that you might not be clicking the "Apply" button when you select the times in Xconv. Could you try that and then see if you can extract just the year 2000 values? The error message that you are getting comes from the fact that your ancillary file ends in 2000 and so cannot be used past that time.

I think that there are year 2000 climatologies for these fields at N48L60, but not N96L63 I'm afraid. You will need to make these up, either by making a climatology (in some way) or by extending the timeseries.
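
To illustrate the difference between the two kinds of file, here is a minimal sketch. It is not the UM's actual ancillary update logic (which also interpolates between records in time); it assumes monthly records and uses two hypothetical helper functions, `timeseries_index` and `periodic_index`, just to show why a time series stops at its last year while a single-year periodic file repeats.

```python
def timeseries_index(year, month, first_year, last_year):
    """Record index in a monthly time series; fails outside the covered years."""
    if not first_year <= year <= last_year:
        raise ValueError(f"no ancillary data for {year}-{month:02d}")
    return (year - first_year) * 12 + (month - 1)


def periodic_index(year, month):
    """Record index in a 12-record single-year file; the year is ignored,
    so the same annual cycle is reused for every model year."""
    return month - 1


# A time series covering 1850-2000 has a record for December 2000 ...
last_record = timeseries_index(2000, 12, 1850, 2000)

# ... but nothing for January 2001, so the lookup fails.
try:
    timeseries_index(2001, 1, 1850, 2000)
    ran_past_end = True
except ValueError:
    ran_past_end = False

# A periodic file returns the same record for any model year.
march_2004 = periodic_index(2004, 3)
march_2050 = periodic_index(2050, 3)
```

In this picture, a periodic file built from a single year supplies that year's annual cycle for as long as the model runs.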

Thanks,
Luke

comment:5 Changed 4 years ago by s1374103

Hi Luke,

I have selected and exported the year 1999 for the ancillary files (2D sulphur emissions, soot emissions and biomass emissions) which I think are preventing my job from reaching the intended end date. I have opened these in Xancil and replaced them with files that are periodic in time. Provided my job is set up correctly, does this mean that it will run indefinitely using year 1999 values for the 2D sulphur, soot and biomass emissions?

With regard to the UKCA code change instructions: as I am using a different configuration to the one used in the instructions, should I ignore them or look for an alternative method?

Regards,

Jamie

comment:6 Changed 4 years ago by s1374103

Hi,

I have spoken with Graham Mann and he has suggested which files I should replace. I have run the job but it fails part way through. The .leave file says:

Application 16460206 is crashing. ATP analysis proceeding...

ATP Stack walkback for Rank 24 starting:
  bi_linear_h_@bi_linear_h.f90:569
ATP Stack walkback for Rank 24 done
Process died with signal 11: 'Segmentation fault'
Forcing core dumps of ranks 24, 0, 1, 116
View application merged backtrace tree with: stat-view atpMergedBT.dot
You may need to: module load stat

_pmiu_daemon(SIGCHLD): [NID 02194] [c3-1c1s4n2] [Sat Jul 11 12:55:55 2015] PE RANK 146 exit signal Killed
[NID 02194] 2015-07-11 12:55:55 Apid 16460206: initiated application termination
_pmiu_daemon(SIGCHLD): [NID 02195] [c3-1c1s4n3] [Sat Jul 11 12:55:55 2015] PE RANK 168 exit signal Killed
_pmiu_daemon(SIGCHLD): [NID 02071] [c2-1c2s5n3] [Sat Jul 11 12:55:55 2015] PE RANK 99 exit signal Killed
xlnao: Run failed

Any idea what this means?

Regards,

Jamie

comment:7 Changed 4 years ago by grenville

Jamie

Sorry for the delay. Is this still a problem?

Grenville

comment:8 Changed 3 years ago by s1374103

Hi Grenville,

I have attempted to run the job again (xlwms) and it has failed. The nrun completed, the crun started and ran for around 9 months but then crashed (last .pm file to be produced was April 2000).

The .leave file contains:

Application 18221066 is crashing. ATP analysis proceeding...

ATP Stack walkback for Rank 144 starting:
  bi_linear_h_@bi_linear_h.f90:569
ATP Stack walkback for Rank 144 done
Process died with signal 11: 'Segmentation fault'
Forcing core dumps of ranks 144, 0, 1
View application merged backtrace tree with: stat-view atpMergedBT.dot
You may need to: module load stat

_pmiu_daemon(SIGCHLD): [NID 01226] [c6-0c1s2n2] [Fri Oct  9 20:13:50 2015] PE RANK 98 exit signal Killed
_pmiu_daemon(SIGCHLD): [NID 01228] [c6-0c1s3n0] [Fri Oct  9 20:13:50 2015] PE RANK 146 exit signal Killed
[NID 01226] 2015-10-09 20:13:50 Apid 18221066: initiated application termination
_pmiu_daemon(SIGCHLD): [NID 01225] [c6-0c1s2n1] [Fri Oct  9 20:13:50 2015] PE RANK 72 exit signal Killed
_pmiu_daemon(SIGCHLD): [NID 01219] [c6-0c1s0n3] [Fri Oct  9 20:13:50 2015] PE RANK 24 exit signal Killed
xlwms: Run failed

It also contains:

/work/n02/n02/jimbo/um/xlwms/bin/qsfinal: Error in exit processing after model run
Failed in model executable

Any ideas?

Regards,

Jamie

comment:9 Changed 3 years ago by grenville

Jamie

This run only uses standard ancillary files as far as I can see (?), so this is a new problem — we need some more output to help diagnose it.

Please rerun the last month and get the model to write out daily dumps or ask for dumps at days 20, 21, 22….

Do you know what these messages mean?

Error density iwvolmethod=3: imode_density does not converge

rh= NaN t= 298.14999999999998
ions= 2*0., 5120000., 480020.00000000006, 2*0., 4640000.
soln_vol= NaN soln_tmass= NaN sol_dens= NaN
bin.dens= NaN, 5*0., NaN, 0., NaN
water mass= NaN

Grenville

comment:10 Changed 3 years ago by s1374103

Hi Grenville,

I have re-run the last few months of the job with daily dumping. The last dump created was for 25th May 2000 (xlwmsa.dak05p0).

The last .leave file created contains:

ATP Stack walkback for Rank 60 starting:
  bi_linear_h_@bi_linear_h.f90:561
ATP Stack walkback for Rank 60 done
Process died with signal 11: 'Segmentation fault'
Forcing core dumps of ranks 60, 0, 1
View application merged backtrace tree with: stat-view atpMergedBT.dot
You may need to: module load stat

_pmiu_daemon(SIGCHLD): [NID 02488] [c4-1c2s14n0] [Mon Oct 19 18:03:06 2015] PE RANK 168 exit signal Killed
_pmiu_daemon(SIGCHLD): [NID 01427] [c7-0c1s4n3] [Mon Oct 19 18:03:06 2015] PE RANK 24 exit signal Killed
[NID 02488] 2015-10-19 18:03:06 Apid 18381676: initiated application termination
_pmiu_daemon(SIGCHLD): [NID 01433] [c7-0c1s6n1] [Mon Oct 19 18:03:06 2015] PE RANK 72 exit signal Killed
_pmiu_daemon(SIGCHLD): [NID 02309] [c4-1c0s1n1] [Mon Oct 19 18:03:06 2015] PE RANK 145 exit signal Killed
xlwms: Run failed
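
For comparing the failures across runs (rank 24 and rank 144 at bi_linear_h.f90:569 previously, rank 60 at line 561 here), a small script can pull the crashing rank and source location out of each .leave file. This is a rough sketch that assumes the ATP walkback always has the two-line layout shown above; the function name `crash_sites` is made up for this example.

```python
import re

# Matches the "starting:" line followed by the first frame, e.g.
#   ATP Stack walkback for Rank 60 starting:
#     bi_linear_h_@bi_linear_h.f90:561
WALKBACK = re.compile(
    r"ATP Stack walkback for Rank (\d+) starting:\s*\n\s*(\S+)@(\S+):(\d+)"
)


def crash_sites(leave_text):
    """Return (rank, routine, source_file, line) for each ATP walkback found."""
    return [
        (int(rank), routine, source, int(line))
        for rank, routine, source, line in WALKBACK.findall(leave_text)
    ]


# Excerpt copied from the .leave output above.
sample = """ATP Stack walkback for Rank 60 starting:
  bi_linear_h_@bi_linear_h.f90:561
ATP Stack walkback for Rank 60 done
Process died with signal 11: 'Segmentation fault'
"""

sites = crash_sites(sample)
print(sites)
```

Running this over the .leave files from each attempt gives a quick table of which rank failed and where.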

I'm not sure what the NaN density error means. I know a bug was found for high nitrate in the stratosphere. However, this didn't prevent the job from running when I was using the previous ancillary files, so I don't think it is the cause of this error.

Regards,

Jamie

comment:11 Changed 2 years ago by grenville

  • Resolution set to wontfix
  • Status changed from new to closed

Jamie

I'll close this for lack of activity. Unfortunately, these kinds of errors can result from many possible causes. I'm assuming you got round the problem; please let me know if it is still a problem.

Grenville