Opened 6 years ago
Closed 4 years ago
#1589 closed help (wontfix)
REPLANCA: PP HEADERS ON ANCILLARY FILE DO NOT MATCH
| Reported by: | s1374103 | Owned by: | um_support |
|---|---|---|---|
| Component: | UM Model | Keywords: | PP HEADERS ON ANCILLARY FILE DO NOT MATCH, time series, climatology |
| Cc: | | Platform: | ARCHER |
| UM Version: | 7.3 | | |
Description
Dear CMS Helpdesk,
Original job: xjnjn
My job: xlnah
The original job (xjnjn) was set up to run from September 1999 to December 2000. An exact copy of this job runs successfully. However, when I increase the run length to 6 years and keep the same start date, the run crashes at December 2000.
In the .leave files the following error message is generated;
REPLANCA: PP HEADERS ON ANCILLARY FILE DO NOT MATCH
On the CMS helpdesk I saw that this error had been generated before (ticket 1417) and the reason for that was that some ancillary files were timeseries that ended before the end date of the model run and were not climatologies.
I looked at the ancillary files I am using and found it difficult to identify which ones need to be converted into climatologies. Are they the 2D sulphur emissions (DMSlandNH3SO218502000.N96), soot emissions (BC_hi_Nozawa_18502000.N96) and biomass emissions (Bio_AEROCOM_1750_2000.N96)?
I thought I would check the following two questions before changing the ancillary files and re-running.
1) Is this the reason why the run is crashing?
2) Have I identified the correct ancillary files which need to be converted?
Kind Regards,
Jamie Kelly
Change History (11)
comment:1 Changed 6 years ago by grenville
comment:2 Changed 6 years ago by s1374103
Dear Grenville,
Some ancillary files (e.g. DMSlandNH3SO218502000.N96, covering 1850-2000) contain multiple fields (dimethyl sulphide emissions, ammonia gas emissions and sulphur dioxide emissions), and the instructions for changing time-series ancillaries into climatologies say to select only the year of interest (2000).
I can highlight a particular year for one field, then do the same for the next field, but when I return to the original field it is back in the un-highlighted form.
Is it possible to export only 1 year (2000) from a time-series ancillary file containing multiple fields or should they be exported separately?
Jamie
comment:3 Changed 6 years ago by s1374103
The instructions I am following are taken from http://www.ukca.ac.uk/wiki/index.php/Using_the_UMUI.
At the end of the page, it says to click on the radio button for "UMUI values for CFCs etc". However, this option is not available for my job (xlnah). Is this step important?
Jamie
comment:4 Changed 6 years ago by luke
Hi Jamie,
You're running a tropospheric chemistry job (CheT) with GLOMAP-mode aerosols at N96L63 resolution. The instructions at the end of the above webpage relate to a particular vn7.3 branch.
This job uses the branch
fcm:um-br/dev/gmann/vn7.3_glomap_ukca_mode_nitrate_GMdebug/src
which is owned by Graham Mann. Note that this job appears to be under active development. Tracking back the history of this branch shows that it is ultimately based on my VN7.3_UKCA_CheM_vn1.1 branch, so this hand-edit could be changed for this branch. However, those changes are mainly for stratospheric jobs, CheS or CheST, and so aren't really needed for this job.
From your 2nd comment, I think that you might not be clicking the "Apply" button when you select the times in Xconv. Could you try that and then see if you can extract just the year 2000 values? The error message arises because your ancillary file ends in 2000 and so cannot be used past that time.
I think that there are year 2000 climatologies for these fields at N48L60, but not N96L63 I'm afraid. You will need to make these up, either by making a climatology (in some way) or by extending the timeseries.
Thanks,
Luke
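The reasoning above (a time-series ancillary that ends in 2000 cannot supply a run that continues past that date) can be sketched as a simple coverage check. This is only an illustration, not UM or REPLANCA code: the validity dates are assumptions read off the file names, the run dates assume a start on 1 September 1999 and a six-year run length, and `uncovered` and the `example_climatology.N96` entry are hypothetical.

```python
from datetime import date

# Assumed validity ranges (start, end, periodic?) inferred from the file names only.
# The real ranges should be read from the file headers, e.g. with Xconv.
ANCILLARIES = {
    "DMSlandNH3SO218502000.N96": (date(1850, 1, 1), date(2000, 12, 31), False),
    "BC_hi_Nozawa_18502000.N96": (date(1850, 1, 1), date(2000, 12, 31), False),
    "Bio_AEROCOM_1750_2000.N96": (date(1750, 1, 1), date(2000, 12, 31), False),
    # Hypothetical periodic (climatology) file: usable for any model date.
    "example_climatology.N96": (date(2000, 1, 1), date(2000, 12, 31), True),
}


def uncovered(ancillaries, run_start, run_end):
    """Return the sorted names of time-series files that do not span the whole run."""
    names = []
    for name, (start, end, periodic) in ancillaries.items():
        if periodic:
            continue  # a periodic file repeats, so it never runs out
        if start > run_start or end < run_end:
            names.append(name)
    return sorted(names)


RUN_START = date(1999, 9, 1)      # assumed start day
SHORT_RUN_END = date(2000, 12, 1)  # original run length (assumed end day)
LONG_RUN_END = date(2005, 9, 1)    # six-year run from the same start

print(uncovered(ANCILLARIES, RUN_START, SHORT_RUN_END))
print(uncovered(ANCILLARIES, RUN_START, LONG_RUN_END))
```

Under these assumed dates the short run is fully covered, while the six-year run flags the three time-series files.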
comment:5 Changed 6 years ago by s1374103
Hi Luke,
I have selected and exported the year 1999 for the ancillary files (2D sulphur emissions, soot emissions and biomass emissions) which I think are preventing my job from reaching the intended end date. I have opened these in Xancil and replaced them with files that are periodic in time. Provided my job is set up correctly, does this mean that it will run indefinitely using year 1999 emissions for 2D sulphur, soot and biomass?
With regard to the UKCA code change instructions: as I am using a different configuration from the one in the instructions, should I ignore them or seek an alternative method?
Regards,
Jamie
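One way to picture a file that is periodic in time: the model date is folded back onto the single year held in the file, so the same annual cycle is reused in every model year. The sketch below is a conceptual illustration only, not the UM's updating code. It assumes twelve monthly-mean values, a 360-day calendar, and linear interpolation between mid-month points; `periodic_value` and the sample numbers are hypothetical.

```python
import math

DAYS_PER_MONTH = 30   # assumption: 360-day calendar
DAYS_PER_YEAR = 360


def periodic_value(monthly, day_since_start):
    """Interpolate 12 mid-month values, wrapping from December back to January.

    `day_since_start` counts days from 1 January of the first model year and
    may run over any number of years.
    """
    day = day_since_start % DAYS_PER_YEAR           # fold onto one year
    pos = (day - DAYS_PER_MONTH / 2) / DAYS_PER_MONTH  # 0.0 at mid-January
    lower = math.floor(pos)
    frac = pos - lower
    i = lower % 12          # month before (or at) the requested day
    j = (i + 1) % 12        # following month, wrapping Dec -> Jan
    return (1 - frac) * monthly[i] + frac * monthly[j]


# Hypothetical monthly means for a single year of emissions.
monthly = [float(m) for m in range(1, 13)]

print(periodic_value(monthly, 15))            # mid-January, first year
print(periodic_value(monthly, 15 + 5 * 360))  # mid-January, five years later
print(periodic_value(monthly, 0))             # 1 January: halfway between Dec and Jan
```

The first two calls return the same value, which is the sense in which a periodic file supplies the same year of emissions indefinitely.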
comment:6 Changed 6 years ago by s1374103
Hi,
I have spoken with Graham Mann and he has suggested which files I should replace. I have run the job, but it fails part way through. In the .leave file it says;
Application 16460206 is crashing. ATP analysis proceeding...
ATP Stack walkback for Rank 24 starting:
bi_linear_h_@bi_linear_h.f90:569
ATP Stack walkback for Rank 24 done
Process died with signal 11: 'Segmentation fault'
Forcing core dumps of ranks 24, 0, 1, 116
View application merged backtrace tree with: stat-view atpMergedBT.dot
You may need to: module load stat
_pmiu_daemon(SIGCHLD): [NID 02194] [c3-1c1s4n2] [Sat Jul 11 12:55:55 2015] PE RANK 146 exit signal Killed
[NID 02194] 2015-07-11 12:55:55 Apid 16460206: initiated application termination
_pmiu_daemon(SIGCHLD): [NID 02195] [c3-1c1s4n3] [Sat Jul 11 12:55:55 2015] PE RANK 168 exit signal Killed
_pmiu_daemon(SIGCHLD): [NID 02071] [c2-1c2s5n3] [Sat Jul 11 12:55:55 2015] PE RANK 99 exit signal Killed
xlnao: Run failed
Any idea what this means?
Regards,
Jamie
comment:7 Changed 5 years ago by grenville
Jamie
Sorry for the delay. Is this still a problem?
Grenville
comment:8 Changed 5 years ago by s1374103
Hi Grenville,
I have attempted to run the job again (xlwms) and it has failed. The nrun completed, the crun started and ran for around 9 months but then crashed (last .pm file to be produced was April 2000).
The .leave file contains;
Application 18221066 is crashing. ATP analysis proceeding...
ATP Stack walkback for Rank 144 starting:
bi_linear_h_@bi_linear_h.f90:569
ATP Stack walkback for Rank 144 done
Process died with signal 11: 'Segmentation fault'
Forcing core dumps of ranks 144, 0, 1
View application merged backtrace tree with: stat-view atpMergedBT.dot
You may need to: module load stat
_pmiu_daemon(SIGCHLD): [NID 01226] [c6-0c1s2n2] [Fri Oct 9 20:13:50 2015] PE RANK 98 exit signal Killed
_pmiu_daemon(SIGCHLD): [NID 01228] [c6-0c1s3n0] [Fri Oct 9 20:13:50 2015] PE RANK 146 exit signal Killed
[NID 01226] 2015-10-09 20:13:50 Apid 18221066: initiated application termination
_pmiu_daemon(SIGCHLD): [NID 01225] [c6-0c1s2n1] [Fri Oct 9 20:13:50 2015] PE RANK 72 exit signal Killed
_pmiu_daemon(SIGCHLD): [NID 01219] [c6-0c1s0n3] [Fri Oct 9 20:13:50 2015] PE RANK 24 exit signal Killed
xlwms: Run failed
It also contains;
/work/n02/n02/jimbo/um/xlwms/bin/qsfinal: Error in exit processing after model run
Failed in model executable
Any ideas?
Regards,
Jamie
comment:9 Changed 5 years ago by grenville
Jamie
This run only uses standard ancillary files as far as I can see, so this is a new problem. We need some more output to help diagnose it.
Please rerun the last month and get the model to write out daily dumps, or ask for dumps at days 20, 21, 22 and so on.
Do you know what these messages mean?
Error density iwvolmethod=3: imode_density does not converge
rh= NaN t= 298.14999999999998
ions= 2*0., 5120000., 480020.00000000006, 2*0., 4640000.
soln_vol= NaN soln_tmass= NaN sol_dens= NaN
bin.dens= NaN, 5*0., NaN, 0., NaN
water mass= NaN
Grenville
comment:10 Changed 5 years ago by s1374103
Hi Grenville,
I have re-run the last few months of the job with daily dumping. The last dump created was for 25th May 2000 (xlwmsa.dak05p0).
The last .leave file created contains;
ATP Stack walkback for Rank 60 starting:
bi_linear_h_@bi_linear_h.f90:561
ATP Stack walkback for Rank 60 done
Process died with signal 11: 'Segmentation fault'
Forcing core dumps of ranks 60, 0, 1
View application merged backtrace tree with: stat-view atpMergedBT.dot
You may need to: module load stat
_pmiu_daemon(SIGCHLD): [NID 02488] [c4-1c2s14n0] [Mon Oct 19 18:03:06 2015] PE RANK 168 exit signal Killed
_pmiu_daemon(SIGCHLD): [NID 01427] [c7-0c1s4n3] [Mon Oct 19 18:03:06 2015] PE RANK 24 exit signal Killed
[NID 02488] 2015-10-19 18:03:06 Apid 18381676: initiated application termination
_pmiu_daemon(SIGCHLD): [NID 01433] [c7-0c1s6n1] [Mon Oct 19 18:03:06 2015] PE RANK 72 exit signal Killed
_pmiu_daemon(SIGCHLD): [NID 02309] [c4-1c0s1n1] [Mon Oct 19 18:03:06 2015] PE RANK 145 exit signal Killed
xlwms: Run failed
I'm not sure what the NaN density error means. I know a bug was found for high nitrate in the stratosphere. However, this didn't prevent the job from running when I was using the previous ancillary files, so I don't think it is the cause of this error.
Regards,
Jamie
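The three crash logs in this ticket can be compared by pulling out the rank and source location that ATP reports. A minimal sketch, using only strings copied from the logs above; the helper name `walkback_frames` is hypothetical, and the pattern is an assumption based on how these particular logs are laid out rather than a documented ATP format.

```python
import re

# Matches e.g. "ATP Stack walkback for Rank 24 starting: bi_linear_h_@bi_linear_h.f90:569"
# (the frame may follow on the same line or on the next one).
WALKBACK = re.compile(
    r"ATP Stack walkback for Rank (\d+) starting:\s+(\S+):(\d+)"
)


def walkback_frames(log_text):
    """Return (rank, frame, line_number) for each walkback found in the text."""
    return [
        (int(rank), frame, int(line))
        for rank, frame, line in WALKBACK.findall(log_text)
    ]


# Excerpts copied from the .leave output quoted in comments 6, 8 and 10.
LOGS = {
    "comment 6": "ATP Stack walkback for Rank 24 starting: "
                 "bi_linear_h_@bi_linear_h.f90:569 "
                 "ATP Stack walkback for Rank 24 done",
    "comment 8": "ATP Stack walkback for Rank 144 starting: "
                 "bi_linear_h_@bi_linear_h.f90:569 "
                 "ATP Stack walkback for Rank 144 done",
    "comment 10": "ATP Stack walkback for Rank 60 starting: "
                  "bi_linear_h_@bi_linear_h.f90:561 "
                  "ATP Stack walkback for Rank 60 done",
}

for label, text in LOGS.items():
    print(label, walkback_frames(text))
```

In these logs all three walkbacks name `bi_linear_h.f90`, at line 569 twice and at line 561 once.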
comment:11 Changed 4 years ago by grenville
- Resolution set to wontfix
- Status changed from new to closed
Jamie
I'll close this for lack of activity. Unfortunately, these kinds of errors can result from many possible causes. I'm assuming you got round the problem; please let me know if not (if it is still a problem).
Grenville
Jamie
We don't have a quick solution to this. I can only suggest checking all the ancillary fields which are being updated to find which don't extend far enough.
Grenville