Opened 3 weeks ago

Last modified 2 days ago

#2894 new help

Copying a Met Office suite to Archer, vn10.7 GA7.1

Reported by: ha392
Owned by: um_support
Component: UM Model
Keywords: Archer, 10.7, GA7.1, AMIP, CMIP6
Cc:
Platform: ARCHER
UM Version: 10.7

Description

Hi,

I am currently trying to create a working copy of the MetO Cray suite u-au074 for ARCHER in the form of suite u-bi305.
So far I have changed the host machine in suite-conf to ARCHER, and attempted to change the meta in the um rose-app.conf file from

/data/users/moci/CMIP6_rose-meta/um-atmos/HEAD
to

um-atmos/vn10.7

However, I am unsure where to go from here or if this is even correct.

I am still getting an error related to /data/users/moci/CMIP6_rose-meta/um-atmos/HEAD/rose-meta.conf when I open the suite, but I cannot find where else to change this.

I also get an error related to post processing, but I plan to replace that app with a version of post processing better suited to ARCHER.

Any advice on this would be greatly appreciated.

Thank you,
Holly

Change History (17)

comment:1 Changed 3 weeks ago by ros

Hi Holly,

I have a copy of that metadata directory at: /home/ros/meta/moci/CMIP6_rose-meta/um-atmos/HEAD. I haven't looked to see if the Met Office have updated this for a while but it will be a good start at least. I'll check it tomorrow.

As well as in the app/um/rose-app.conf file it is also referenced in app/um/meta/rose-meta.conf.
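To check that no other references to the Met Office path remain, something like the following Python sketch could be run over the suite directory. It is only an illustration: the function name `find_references` and the choice of file patterns are assumptions of mine, not part of Rose.

```python
from pathlib import Path


def find_references(suite_dir, needle, patterns=("*.conf", "*.rc")):
    """Return (path, line_number, line) for every line that contains needle.

    suite_dir: top-level directory of the suite working copy.
    needle:    plain substring to look for, e.g. a Met Office-only path.
    patterns:  glob patterns of the files to scan (an assumption; extend as needed).
    """
    hits = []
    for pattern in patterns:
        for path in sorted(Path(suite_dir).rglob(pattern)):
            if not path.is_file():
                continue
            # errors="replace" avoids a crash on any non-UTF-8 bytes
            text = path.read_text(errors="replace")
            for number, line in enumerate(text.splitlines(), start=1):
                if needle in line:
                    hits.append((str(path), number, line.strip()))
    return hits
```

For example, `find_references("/home/ha392/roses/u-bi305", "/data/users/moci/CMIP6_rose-meta")` should list each file and line that still points at the old metadata location.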

Hope that helps.
Regards,
Ros.

comment:2 Changed 3 weeks ago by ha392

Hi Ros,

This seems to get rid of my main error, so as long as it's the right version it is definitely a good start, thank you.

Holly

comment:3 Changed 3 weeks ago by ros

Hi Holly,

I confirm the metadata directory is still at the same revision as the Met Office are working from.

Other things you will need to change, if you haven't already, are the domain decomposition…. The max number of processes on ARCHER needs setting to 24. Also check that the domain decompositions are a multiple of 24, and check the input file paths. I can point you to upgrade instructions for postproc when you get that far.
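As a quick sanity check on a decomposition, the arithmetic can be sketched in Python as below. This assumes the 24 quoted above is the number of cores on one ARCHER node and that each MPI task uses one core; the function and argument names are hypothetical, and IO server processes or OpenMP threads are not accounted for.

```python
CORES_PER_NODE = 24  # ARCHER value quoted in this ticket


def check_decomposition(ew, ns, cores_per_node=CORES_PER_NODE):
    """Report whether an ew x ns decomposition is a multiple of cores_per_node.

    ew, ns: number of processes in the east-west and north-south directions
            (hypothetical argument names, one core per process assumed).
    """
    tasks = ew * ns
    full_nodes, remainder = divmod(tasks, cores_per_node)
    return {
        "mpi_tasks": tasks,
        "full_nodes": full_nodes,
        "leftover_tasks": remainder,
        "fills_nodes": remainder == 0,
    }
```

A 12 x 8 decomposition gives 96 tasks, which is exactly 4 nodes of 24, whereas 10 x 10 gives 100 tasks and leaves 4 tasks on a partly used node.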

Regards,
Ros.

comment:4 Changed 11 days ago by ha392

Hi Ros,
I have gotten as far as changing the metadata directories, changing the domain decomposition and sorting through all the flagged-up errors with the suite.

I am currently having trouble with the fcm_make2_um task.

http://puma.nerc.ac.uk/rose-bush/view/ha392/u-bg509?&no_fuzzy_time=0&path=log/job/19790101T0000Z/fcm_make2_um/01/job.err

As far as I can tell, the main difference between the fcm_make_um files of this suite and those of suites I have working is the set of branches in the sources section:

branches/pkg/Share/vn10.7_CMIP6_production_mods@43043
branches/dev/martinandrews/vn10.7_nancillookup300000@45275
branches/dev/samcusworth/vn10.7_fix_defensive_programming@35126
branches/dev/irinalp/vn4.8_Copy_0376_to_8376@8494
branches/dev/benjohnson/um10.7_easyaerosol_cmip6@312

I would like to keep the suite as similar as possible to the Met Office original, so I don't think I want to change them, but I am not sure what impact it would have if I did, or whether they are in fact the problem.

Thank you,
Holly

comment:5 Changed 11 days ago by grenville

Hi Holly

Please see #2484 - the first comment.

Grenville

comment:6 Changed 11 days ago by ha392

Hi Grenville,

Great, thank you. I now have fcm_make2_um working, but I am having trouble with the reconfiguration:

[FAIL] env=SPECTRAL_FILE_DIR: CMIP6_ANCILS: unbound variable

http://puma.nerc.ac.uk/rose-bush/view/ha392/u-bi305?&no_fuzzy_time=0&path=log/job/19790101T0000Z/recon/01/job.err

There does not seem to be anything particularly helpful in the job.out file from what I can tell.

Thank you,
Holly

comment:7 Changed 11 days ago by grenville

Holly

I think you need to add in

site/archer.rc: CMIP6_ANCILS = $UMDIR/cmip6/ancils

There may be other similar issues - take a look at u-bc613 which is set up to work on Monsoon and ARCHER.
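To hunt for other variables of this kind, a rough Python sketch like the one below could list the `$NAME` and `${NAME}` references in a config file that are not in a set of names you know to be defined. The function names are hypothetical, and the script cannot know what the suite or the job environment defines at run time, so the `defined` set has to be supplied by hand.

```python
import re

# Matches $NAME and ${NAME}; also picks out NAME from ${NAME:-default}
VAR_PATTERN = re.compile(r"\$\{?([A-Za-z_][A-Za-z0-9_]*)\}?")


def referenced_variables(text):
    """Return the set of environment variable names referenced in text."""
    return set(VAR_PATTERN.findall(text))


def undefined_variables(text, defined):
    """Return, sorted, the referenced names that are not in defined."""
    return sorted(referenced_variables(text) - set(defined))
```

For instance, running `undefined_variables` on the text of app/um/rose-app.conf with the names set in site/archer.rc as `defined` would flag CMIP6_ANCILS until it is added there.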

Grenville

comment:8 Changed 10 days ago by ha392

Hi Grenville,

That seemed to fix one problem, but I am now getting a new error in recon:

[FAIL] file:STASHmaster=source=fcm:um.xm_br/pkg/Share/vn10.7_CMIP6_production_mods/rose-meta/um-atmos/HEAD/etc/stash/STASHmaster@43043: bad or missing value

From the output file I can see that it is going through the um/rose-app.conf file. The issue seems to be with the STASHmaster setting, specifically line 60?

Thank you,
Holly

comment:9 Changed 9 days ago by grenville

Holly

try removing these lines from /home/ha392/roses/u-bi305/app/um/rose-app.conf

[file:STASHmaster]
source=fcm:um.xm_br/pkg/Share/vn10.7_CMIP6_production_mods/rose-meta/um-atmos/HEAD/etc/stash/STASHmaster@43043

Grenville

comment:10 Changed 9 days ago by ha392

Hi Grenville,

Thank you for all of your help so far. A couple of errors later, I have come across another one in the recon that I am not too sure how to handle:

???!!!???!!!???!!!???!!!???!!!       ERROR        ???!!!???!!!???!!!???!!!???!!!
?  Error code: 1
?  Error from routine: io:file_open
?  Error message: Failed to open file
?  Error from processor: 0
?  Error number: 2
????????????????????????????????????????????????????????????????????????????????

My guess is that I have something wrong with my IO Server settings. I have tried changing them to match the settings of a similar AMIP 10.7 suite, but this does not seem to help. Is there any recommendation on this?

Hopefully by the end of setting up this suite I should get better with these error messages.

Thank you,
Holly

comment:11 Changed 9 days ago by grenville

Hi Holly

Not very helpful - you'd think it would say which file it couldn't open. Please set RCF_PRINTSTATUS to extra diagnostic messages (um→env→runtime controls→reconfig.. ) and rerun to see if that reveals the bad file.

Grenville

comment:13 Changed 9 days ago by grenville

Holly

I think the problem is that you are trying to read an empty AINITIAL and the reconfiguration will try to write to

/data/d01/ukcmip6/N96AMIP_ensemble1_dumps/au068a.da19790101_00

that's not an ARCHER path.
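A simple way to catch this sort of thing before submitting is to test each input path for existence and size on ARCHER. The Python sketch below does that; the function name is hypothetical and the list of paths has to be collected by hand from the suite.

```python
import os


def check_input_files(paths):
    """Return {path: problem} for paths that are missing or are empty files."""
    problems = {}
    for path in paths:
        if not os.path.exists(path):
            problems[path] = "missing"
        elif os.path.isfile(path) and os.path.getsize(path) == 0:
            problems[path] = "empty"
    return problems
```

An empty result means every listed path exists and no listed file has zero length; it says nothing about whether the contents are valid for the run.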

Grenville

comment:14 Changed 8 days ago by ha392

Hi Grenville,

Okay, is there any way to get hold of the file/path needed?

Thank you,
Holly

comment:15 Changed 8 days ago by grenville

Holly

I don't know what you are aiming to do, but I suggest you get hold of the start file for the original job (which I assume ran OK?) and set AINITIAL to that file. Set astart to $ROSE_DATA/$RUNID.astart so that the reconfiguration writes its output there.

I tried with what's set in site/archer.rc for AINITIAL_N96, but that had a start time mismatch and the model seemed to fail in an easy_aerosol routine.

Please check that the original suite ran.

Grenville

comment:16 Changed 8 days ago by ha392

Hi Grenville,

Contacts at the Met Office told us to use suite u-au074 (which has a master suite, u-as602) as the setup for an experiment using the 10.7 GA7.1 AMIP configuration. These suites on Rose were not set up for ARCHER, so I have made copies (u-bi305, and u-bg509 for the master suite) and have just been trying to get them to run on ARCHER to start with, which has led me to this point. So the original suites do not run on ARCHER. I will try to get hold of the start file for the original job.

Thank you,
Holly

comment:17 Changed 2 days ago by ha392

Hi Grenville,

We have fixed the start dump issue by using a dump copied over from the original suite, but we are not sure what is going on with the easy_aerosol failure.

Thank you,
Holly
