Opened 5 months ago

Last modified 4 months ago

#3555 new help

NEMO Annual mean not possible

Reported by: Alcide.Zhao
Owned by: um_support
Component: PostProc
Keywords: NEMO postprocessing
Cc:
Platform: NEXCS
UM Version: 10.7

Description

Dear CMS desk,

My suite u-ce564 is stuck at NEMO postprocessing with the following error:

grid-T Annual mean for year ending December 1854 not possible as only got 3 file(s): 
nemo_ce564o_1s_18521201-18530301_grid-T.nc, nemo_ce564o_1s_18530301-18530601_grid-T.nc, nemo_ce564o_1s_18530901-18531201_grid-T.nc

This also causes the next cycle point to fail at postproc_nemo.

I tried to insert new tasks following #3226, but the suite does not respond to my insert action at all.

Are there any solutions?

Thanks, Alcide

Change History (11)

comment:1 in reply to: ↑ description Changed 4 months ago by Alcide.Zhao

Dear CMS desk,

I still have not got this issue resolved. In fact, I now have three simulations (u-ce557, u-ce562, u-ce564) all stuck for the same reason.

It would be great if you had any tips. I am trying to get these simulations finished before NEXCS goes down in two weeks.

Cheers,
Alcide

comment:2 follow-up: Changed 4 months ago by grenville

Alcide

u-ce564 has not created a JJA mean but /home/d04/alzha/cylc-run/u-ce564/log/job/18530401T0000Z has no logging information for the model or post-proc so it's hard to tell what happened. Have you moved the logs?
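
For reference, the missing season can be read straight off the file list in the error message. The snippet below is only a minimal sketch: the helper names are hypothetical, and it assumes the date stamp is always the fourth underscore-separated field of the file name, as it is in the three files listed.

```python
# File names copied from the error message in the ticket description.
files = [
    "nemo_ce564o_1s_18521201-18530301_grid-T.nc",
    "nemo_ce564o_1s_18530301-18530601_grid-T.nc",
    "nemo_ce564o_1s_18530901-18531201_grid-T.nc",
]

def period(name):
    """Return (start, end) date strings parsed from a seasonal-mean file name."""
    stamp = name.split("_")[3]  # e.g. "18521201-18530301"
    start, end = stamp.split("-")
    return start, end

def find_gaps(names):
    """Return (end_of_previous, start_of_next) pairs where periods do not join up."""
    periods = sorted(period(n) for n in names)
    return [
        (prev_end, next_start)
        for (_, prev_end), (next_start, _) in zip(periods, periods[1:])
        if prev_end != next_start
    ]

gaps = find_gaps(files)
print(gaps)  # the only gap runs from 18530601 to 18530901, i.e. JJA
```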

Do you understand why so many postproc_nemo tasks needed to retry 3 times - why did the first attempts fail?

u-ce557 failed for a different reason
Error message: ACUMPS1: Partial sum file inconsistent

u-ce562 failed with
[FAIL] No restart data avaliable in NEMO restart directory

Grenville

comment:3 in reply to: ↑ 2 Changed 4 months ago by Alcide.Zhao

Hi Grenville,

I did not remove any log files. The postproc_nemo task failed many times because I tried to trigger it many times.

In terms of u-ce557, how do I solve this Partial sum file inconsistent issue?
I should say that this suite failed before, so I restarted it from its 18640101 dump. It then only finished that first cycle and crashed there.

For u-ce562, I think I may have mistakenly removed all the NEMO restarts. Are they backed up somewhere? I suppose not, so would it be easier to restart it from its previously saved model dumps?

Many thanks,
Alcide


comment:4 follow-up: Changed 4 months ago by grenville

Alcide

u-ce562 appears to have got into a confused state - the coupled task at 18530401T0000Z clearly ran OK, and it would appear that post-proc did too, but without logs I can't tell for sure why it didn't create seasonal means. You can probably create the yearly means from other means (I can't see in /gws/pw/j05/wishbone/alcide/HadGEM_GC31.)

Partial sum file errors are notoriously difficult to diagnose. Your climate meaning profile is unusual - with 3-month dumping and a (3,3,4,10) climate meaning profile, the model will try to create 9-month, 27-month, 108-month… means. (A (3,3,4,10) profile is usually associated with 10-day dumping so as to create monthly, seasonal, yearly, and decadal means.)
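
The arithmetic behind those periods can be sketched as follows. This is an illustration only, not UM code: the function name is hypothetical, and it assumes a 360-day calendar (30-day months) and that each profile entry is the number of items from the previous level that go into one mean.

```python
def mean_periods(dump_interval_days, profile):
    """Return the length in days of each climate-mean period.

    Each profile entry is the number of items from the previous level
    (dumps for the first level, means thereafter) that make up one mean.
    """
    lengths = []
    length = dump_interval_days
    for multiple in profile:
        length *= multiple
        lengths.append(length)
    return lengths

profile = (3, 3, 4, 10)

# Assumption: 360-day calendar, so one month is 30 days.
for dump_days in (10, 90):
    months = [days // 30 for days in mean_periods(dump_days, profile)]
    print(dump_days, months)
# 10-day dumping gives 1, 3, 12 and 120 months;
# 90-day dumping gives 9, 27, 108 and 1080 months.
```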

It's probably worth checking how these suites are configured for climate meaning.

Switching off climate meaning will solve the problem - you have monthly means from which means over other periods can be derived.

Grenville

comment:5 in reply to: ↑ 4 Changed 4 months ago by Alcide.Zhao

Hi Grenville,

Many thanks for the help. I will give it a go!

In the meantime, I think you spotted a critical issue with my climate meaning. The reason I got the 9-month, 27-month, 108-month… means is that I changed the dumping frequency from the initial 10 days to every 90 days but did not touch the (3,3,4,10) climate meaning profile. As a result, none of my atmospheric climate means make much sense.

Unfortunately, I have already run this model set-up for 10 ensemble members. I wonder how I can exclude these climate means from my model output files?
I am concerned because once I switched off climate meaning, ../app/um/rose-app.conf changed a great deal (see a diff file here: /home/d04/alzha/roses/diff): it seems that almost every output stream has one field related to the meaning. It would be a mess if I later analysed these fields.

Cheers,
Alcide

comment:6 follow-up: Changed 4 months ago by grenville

Alcide

I'd try it on a test suite. I think it will not be a problem - I'm assuming that you won't use 9, 27, 108… month meaned data anyway?

Grenville

comment:7 in reply to: ↑ 6 Changed 4 months ago by Alcide.Zhao

Hi Grenville,

Many thanks! You are right, I would not want to use them, and would like to remove them from my outputs if possible (or mark them if I know which streams they went to).

Thanks,
Alcide

comment:8 follow-up: Changed 4 months ago by grenville

Alcide

Climate-mean data is output in its own file (or files) - so by switching off climate meaning, you will not get any fields output to "UPMEAN"; all other fields will be unaffected and will continue to be written.

Grenville

comment:9 in reply to: ↑ 8 Changed 4 months ago by Alcide.Zhao

Thanks Grenville,

Very useful to know these climate means are attached to the UPMEAN user profile.
I wonder if there is a quick way to spot them among the pp files?

Cheers,
Alcide

comment:10 follow-up: Changed 4 months ago by grenville

Alcide

The climate mean files are named as explained below:

Legacy-style absolute (%C)

This expands to a variety of possible strings.

For climate meaning files it expands to a pre-determined name
depending on the periodicity of the meaning. There are too many
possibilities to cover here but as a few examples:

%C ⇒ "m%Y%b" (monthly means)
%C ⇒ "s%Y%s" (seasonal means)
%C ⇒ "w%Y%m%d" (weekly means)
%C ⇒ "x%Y%m%d" (10 yearly means)
%C ⇒ "1%Y%m%d_%H" (unknown mean for period 1)
%C ⇒ "2%Y%m%d_%H" (unknown mean for period 2)

Note also that, using the above option, files will be named using
the date/time at the *end* of the meaning period (not the start),
*except* for the monthly and seasonal means - which will be
labelled using the start of the mean period. It is recommended
to replace the "%C" indicator with a suitable form such as those
above, based on knowledge of the periodicity.
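
As a rough illustration of how such a name expands, the sketch below feeds three of the templates above through Python's strftime. This is not how the UM itself builds names; it only works because %Y, %m, %d and %H mean the same thing in Python. The date is a hypothetical end-of-period time, and Python's datetime uses the Gregorian calendar, so dates that exist only in a 360-day calendar cannot be represented this way.

```python
from datetime import datetime

# Templates copied from the examples above; only those built from
# %Y, %m, %d and %H are used here.
templates = {
    "period 1": "1%Y%m%d_%H",
    "period 2": "2%Y%m%d_%H",
    "10 yearly": "x%Y%m%d",
}

# Hypothetical end-of-period date, chosen only for illustration.
end_of_period = datetime(1853, 3, 1, 0)

expanded = {label: end_of_period.strftime(t) for label, t in templates.items()}
for label, name in expanded.items():
    print(label, name)
# period 1 expands to 118530301_00, period 2 to 218530301_00,
# and 10 yearly to x18530301.
```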

Grenville

comment:11 in reply to: ↑ 10 Changed 4 months ago by Alcide.Zhao

Many thanks, Grenville,

That really helped. I have now found all of them; they match the patterns

p1*_00.pp; p2*_00.pp; p3*_00.pp;

I have checked their frequency and can confirm these are the dumb ones. I will remove them entirely from my model outputs.
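
For anyone doing the same, the selection can be sketched as below. It is a hedged example: the file names are hypothetical, and the regular expression assumes the period digit is followed by an 8-digit date and "_00", in line with the "1%Y%m%d_%H" form quoted in comment:10. It only lists candidates and deletes nothing.

```python
import re

# Assumed shape of an "unknown period" mean file name: p1/p2/p3,
# an 8-digit date, then "_00.pp" at the end of the name.
UNKNOWN_MEAN = re.compile(r"p[123]\d{8}_00\.pp$")

def split_outputs(names):
    """Split file names into (keep, unwanted) lists without touching any files."""
    keep, unwanted = [], []
    for name in names:
        (unwanted if UNKNOWN_MEAN.search(name) else keep).append(name)
    return keep, unwanted

# Hypothetical file names, for illustration only.
names = [
    "ce564a.p118530901_00.pp",
    "ce564a.p218550301_00.pp",
    "ce564a.p318620301_00.pp",
    "ce564a.pm1853mar.pp",
]

keep, unwanted = split_outputs(names)
print("keep:", keep)
print("unwanted:", unwanted)
```

Reviewing the unwanted list by eye before removing anything is the safer route, since the pattern is a guess at the naming rather than a confirmed convention.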

Cheers,
Alcide
