Opened 7 months ago

Closed 6 months ago

#2894 closed help (answered)

Copying a Met Office suite to Archer, vn10.7 GA7.1

Reported by: ha392 Owned by: um_support
Component: UM Model Keywords: Archer, 10.7, GA7.1, AMIP, CMIP6
Cc: Platform: ARCHER
UM Version: 10.7

Description

Hi,

I am currently trying to create a working copy of the MetO Cray suite u-au074 for ARCHER in the form of suite u-bi305.
So far I have changed the host machine in suite-conf to ARCHER, and attempted to change the meta in the um rose-app.conf file from

/data/users/moci/CMIP6_rose-meta/um-atmos/HEAD
to

um-atmos/vn10.7

However, I am unsure where to go from here or if this is even correct.

I still get an error relating to /data/users/moci/CMIP6_rose-meta/um-atmos/HEAD/rose-meta.conf when I open the suite, but I cannot find where else to change this path.

I also get an error related to post processing but I plan to update that file to a better version of post processing for ARCHER.

Any advice on this would be greatly appreciated.

Thank you,
Holly

Change History (25)

comment:1 Changed 7 months ago by ros

Hi Holly,

I have a copy of that metadata directory at: /home/ros/meta/moci/CMIP6_rose-meta/um-atmos/HEAD. I haven't looked to see if the Met Office have updated this for a while but it will be a good start at least. I'll check it tomorrow.

As well as in the app/um/rose-app.conf file it is also referenced in app/um/meta/rose-meta.conf.
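A quick way to confirm that no other file in the suite still points at the Met Office path is to search the whole suite directory. The following is a minimal sketch only: the function name `find_references` and the suite location `~/roses/u-bi305` are assumptions for illustration, not part of Rose.

```python
import os


def find_references(suite_dir, needle):
    """Return (path, line_number, line) for each line under suite_dir that contains needle."""
    hits = []
    for root, _dirs, files in os.walk(suite_dir):
        for name in sorted(files):
            path = os.path.join(root, name)
            try:
                with open(path, encoding="utf-8", errors="replace") as handle:
                    for number, line in enumerate(handle, start=1):
                        if needle in line:
                            hits.append((path, number, line.rstrip("\n")))
            except OSError:
                continue  # unreadable file: skip it
    return hits


if __name__ == "__main__":
    # Assumed suite location; adjust to wherever your working copy lives.
    suite = os.path.expanduser("~/roses/u-bi305")
    for path, number, line in find_references(suite, "/data/users/moci"):
        print(f"{path}:{number}: {line}")
```

Any file listed here still refers to the Met Office location and probably needs the same change.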

Hope that helps.
Regards,
Ros.

comment:2 Changed 7 months ago by ha392

Hi Ros,

This seems to get rid of my main error, so as long as it's the right version it is definitely a good start, thank you.

Holly


comment:3 Changed 7 months ago by ros

Hi Holly,

I confirm the metadata directory is still at the same revision as the Met Office are working from.

Other things you will need to change, if you haven't already, are under Domain decomposition…. The max number of processes on ARCHER needs setting to 24. Also check that the domain decompositions are a multiple of 24, and check the input file paths. I can point you to upgrade instructions for postproc when you get that far.
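The multiple-of-24 check can be done by hand, but a small sketch makes the arithmetic explicit. It assumes the check applies to the total number of processes (east-west times north-south); the function and argument names are hypothetical and are not UM or Rose settings.

```python
CORES_PER_NODE = 24  # the ARCHER figure quoted above


def check_decomposition(processes_ew, processes_ns, cores_per_node=CORES_PER_NODE):
    """Return (total processes, full nodes used, processes left over on a partial node)."""
    total = processes_ew * processes_ns
    full_nodes, leftover = divmod(total, cores_per_node)
    return total, full_nodes, leftover


if __name__ == "__main__":
    # Illustrative decompositions only, not values taken from the suite.
    for ew, ns in [(12, 8), (10, 10)]:
        total, nodes, leftover = check_decomposition(ew, ns)
        status = "ok" if leftover == 0 else f"{leftover} processes spill onto a partial node"
        print(f"{ew} x {ns} = {total} processes, {nodes} full nodes: {status}")
```

A leftover of zero means the decomposition fills whole nodes.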

Regards,
Ros.

comment:4 Changed 7 months ago by ha392

Hi Ros,
I have gotten as far as changing the metadata directories, changing the domain decomposition and sorting through all the flagged-up errors with the suite.

I am currently having trouble with the fcm_make_um2 process.

http://puma.nerc.ac.uk/rose-bush/view/ha392/u-bg509?&no_fuzzy_time=0&path=log/job/19790101T0000Z/fcm_make2_um/01/job.err

As far as I can tell, the main difference between the fcm_make_um files of this suite and the ones that I have working is the set of branches in the sources section:

branches/pkg/Share/vn10.7_CMIP6_production_mods@43043
branches/dev/martinandrews/vn10.7_nancillookup300000@45275
branches/dev/samcusworth/vn10.7_fix_defensive_programming@35126
branches/dev/irinalp/vn4.8_Copy_0376_to_8376@8494
branches/dev/benjohnson/um10.7_easyaerosol_cmip6@312

I would like to keep the suite as similar as possible to the Met Office original, so I don’t think I want to change them, but I am not sure what impact it would have if I did, or whether they are in fact the problem.

Thank you,
Holly

comment:5 Changed 7 months ago by grenville

Hi Holly

Please see #2484 - the first comment.

Grenville

comment:6 Changed 7 months ago by ha392

Hi Grenville,

Great thank you, I now have fcm_make_um2 working. I am now having trouble with reconfiguration-

[FAIL] env=SPECTRAL_FILE_DIR: CMIP6_ANCILS: unbound variable

http://puma.nerc.ac.uk/rose-bush/view/ha392/u-bi305?&no_fuzzy_time=0&path=log/job/19790101T0000Z/recon/01/job.err

There does not seem to be anything particularly helpful in the job.out file from what I can tell.

Thank you,
Holly

comment:7 Changed 7 months ago by grenville

Holly

I think you need to add in

site/archer.rc: CMIP6_ANCILS = $UMDIR/cmip6/ancils

There may be other similar issues - take a look at u-bc613 which is set up to work on Monsoon and ARCHER.
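To look for similar issues before submitting the job, one option is to list the variables a configuration file refers to and compare them with those you know are defined for ARCHER. This is a rough sketch: `unbound_variables` is a hypothetical helper, the example value for SPECTRAL_FILE_DIR is invented for illustration, and the pattern only recognises simple `$NAME` and `${NAME}` references.

```python
import re

# Matches $NAME and ${NAME}; anything more elaborate is not handled.
VAR_PATTERN = re.compile(r"\$\{?([A-Za-z_][A-Za-z0-9_]*)\}?")


def unbound_variables(text, defined):
    """Return the variable names referenced in text that are not in defined, in first-seen order."""
    missing = []
    for name in VAR_PATTERN.findall(text):
        if name not in defined and name not in missing:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Hypothetical configuration text, for illustration only.
    sample = "SPECTRAL_FILE_DIR=$CMIP6_ANCILS/spectral\nOTHER=${UMDIR}/something\n"
    print(unbound_variables(sample, defined={"UMDIR"}))
```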

Grenville

comment:8 Changed 7 months ago by ha392

Hi Grenville,

That seemed to fix one problem, I am now getting a new error in recon-

[FAIL] file:STASHmaster=source=fcm:um.xm_br/pkg/Share/vn10.7_CMIP6_production_mods/rose-meta/um-atmos/HEAD/etc/stash/STASHmaster@43043: bad or missing value

From the output file I can see that it is going through the um/rose-app.conf file. The issue seems to be with the STASHMASTER, specifically line 60?

Thank you,
Holly

comment:9 Changed 7 months ago by grenville

Holly

try removing these lines from /home/ha392/roses/u-bi305/app/um/rose-app.conf

[file:STASHmaster]
source=fcm:um.xm_br/pkg/Share/vn10.7_CMIP6_production_mods/rose-meta/um-atmos/HEAD/etc/stash/STASHmaster@43043
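Removing the section by hand in an editor is simplest. If the same change has to be repeated across several copies of a suite, it could be scripted along the following lines. This is a sketch that treats the file as plain text with `[section]` headers; `remove_section` is a hypothetical helper, not a Rose command, and it assumes no value line itself looks like a section header.

```python
def remove_section(conf_text, section_name):
    """Return conf_text without the named [section] and the lines that belong to it."""
    header = "[" + section_name + "]"
    kept = []
    skipping = False
    for line in conf_text.splitlines(keepends=True):
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # A new section starts here; skip it only if it is the one to remove.
            skipping = stripped == header
        if not skipping:
            kept.append(line)
    return "".join(kept)


if __name__ == "__main__":
    # Small made-up example; the real file is app/um/rose-app.conf.
    sample = "[env]\nA=1\n\n[file:STASHmaster]\nsource=somewhere\n\n[namelist:foo]\nb=2\n"
    print(remove_section(sample, "file:STASHmaster"))
```

Work on a copy of the file and compare the result with the original before committing the change.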

Grenville

comment:10 Changed 7 months ago by ha392

Hi Grenville,

Thank you for all of your help so far. A couple of errors later, I have come across another one in the recon that I am not sure how to handle:

???!!!???!!!???!!!???!!!???!!!       ERROR        ???!!!???!!!???!!!???!!!???!!!
?  Error code: 1
?  Error from routine: io:file_open
?  Error message: Failed to open file
?  Error from processor: 0
?  Error number: 2
????????????????????????????????????????????????????????????????????????????????

My guess is that I have something wrong with my IO Server settings, I have tried to change them so that they match a similar AMIP 10.7 suite's settings but this does not seem to help. Is there any recommendation on this?

Hopefully by the end of setting up this suite I should get better with these error messages.

Thank you,
Holly

comment:11 Changed 7 months ago by grenville

Hi Holly

Not very helpful - you'd think it would say which file it couldn't open - please set RCF_PRINTSTATUS to extra diagnostic messages (um→env→runtime controls→reconfig.. ) & re run - see if that reveals the bad file

Grenville

comment:13 Changed 7 months ago by grenville

Holly

I think the problem is that you are trying to read an empty AINITIAL and the reconfiguration will try to write to

/data/d01/ukcmip6/N96AMIP_ensemble1_dumps/au068a.da19790101_00

that's not an ARCHER path.

Grenville

comment:14 Changed 7 months ago by ha392

Hi Grenville,

Okay, is there any way to get hold of the file or path that is needed?

Thank you,
Holly

comment:15 Changed 7 months ago by grenville

Holly

I don't know what you are aiming to do, but I suggest you get hold of the start file for the original job (which I assume ran OK?) and set AINITIAL to that file. Set astart to $ROSE_DATA/$RUNID.astart so that the reconfiguration writes its output here.
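Before rerunning the reconfiguration it may be worth confirming that the file AINITIAL points to actually exists on ARCHER and is not empty. A minimal sketch, assuming you pass in the already expanded path; `check_start_dump` is a hypothetical helper and does not inspect the dump contents or its validity time.

```python
import os


def check_start_dump(path):
    """Return 'ok' or a short description of what is wrong with the start dump path."""
    if not path:
        return "AINITIAL is empty"
    if not os.path.isfile(path):
        return "no such file: " + path
    if os.path.getsize(path) == 0:
        return "file is zero bytes: " + path
    return "ok"


if __name__ == "__main__":
    # The Met Office path from this ticket, which is not expected to exist on ARCHER.
    print(check_start_dump("/data/d01/ukcmip6/N96AMIP_ensemble1_dumps/au068a.da19790101_00"))
```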

I tried with what's set in site/archer.rc for AINITIAL_N96 but that had start time mismatch and the model seemed to fail in an easy_aerosol routine.

Please check that the original suite ran.

Grenville


comment:16 Changed 7 months ago by ha392

Hi Grenville,

Contacts at the Met Office told us to use the suite u-au074 (which has a master suite u-as602) as the setup for an experiment using the 10.7 GA7.1 AMIP configuration. These suites were not set up for ARCHER, so I have made copies (u-bi305, and u-bg509 for the master suite) and have just been trying to get them to run on ARCHER to start with, which has led me to this point. So the original suites do not run on ARCHER. I will try to get hold of the start file for the original job.

Thank you,
Holly

comment:17 Changed 7 months ago by ha392

Hi Grenville,

We have fixed the start dump issue using one copied over from the original suite, but we are not sure what is going on with the easy_aerosol failure.

Thank you,
Holly

comment:18 Changed 7 months ago by grenville

Hi Holly

The problem is in easyaerosol_read_input_mod.F90 where the array easyaerosol_dist(l)%values_4d is being accessed out of bounds. The size of its 4th dimension is 6 but the model is trying to access values up to 643 before it fails.

I don't see how this could ever have worked - easyaerosol_read_input_mod.F90 has been modified in later versions of the UM, possibly because of this (?) - are you in contact with the code owner? Do you have access to the logs of a run which claims to have used this code?

Grenville

comment:19 Changed 7 months ago by ha392

Hi Grenville,

I have contacted the suite owner and I am now waiting on a reply. A question I should perhaps have asked earlier: is there an atmosphere-only vn10.7, GA7.1 N96 suite already set up on ARCHER? Looking at the standard suite GA7.0 configuration page, I could not find one apart from u-av321, which seems to have some problems of its own.

Thank you,
Holly

comment:20 Changed 7 months ago by grenville

Holly

Do you need a 10.7 version?

Grenville

comment:21 Changed 7 months ago by ha392

Hi Grenville,

10.7 is preferred as we will be comparing results to a coupled model of the same configuration (based on the cmip6 picontrol suite).

Thank you,
Holly

comment:22 Changed 7 months ago by grenville

Hi Holly

u-av321 has been fixed - it ran OK with the GC7.1 configuration (I ran for ~500 timesteps is all). If you make a copy you'll need to fix app/install_ancil/opt/rose-app-ga7p1.conf to be

[env]
ANCILRES=n96e_orca1
ANCILREV=
ANCILROOT=$UMDIR/ancil/data/ancil_versions
ANCILVN=GA7.1/v2

[file:$ROSE_DATA/etc/um_ancils_gl]
source=$ANCILROOT/$ANCILRES/$ANCILVN/ancils$ANCILREV
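To see which directory the source line above resolves to, the variables can be substituted by hand or with a few lines of Python. This sketch only mimics simple `$NAME` substitution with `string.Template`; it is not how Rose itself processes the file, and `$UMDIR` is deliberately left unexpanded because its value depends on the machine.

```python
from string import Template

# Values copied from the [env] section above.
env = {
    "ANCILRES": "n96e_orca1",
    "ANCILREV": "",
    "ANCILROOT": "$UMDIR/ancil/data/ancil_versions",
    "ANCILVN": "GA7.1/v2",
}

# safe_substitute leaves unknown names such as $UMDIR untouched instead of raising.
source = Template("$ANCILROOT/$ANCILRES/$ANCILVN/ancils$ANCILREV").safe_substitute(env)
print(source)  # $UMDIR/ancil/data/ancil_versions/n96e_orca1/GA7.1/v2/ancils
```

With ANCILREV empty, the path simply ends in `ancils`.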

I don't know why u-bi305 doesn't work. I'll try to compare with u-av321 if time permits.

Grenville

comment:23 Changed 6 months ago by ha392

Hi Grenville,

Thank you for this, I think I am all set up on this suite now. One last question before you close the ticket: I am trying to read in some new ancillary files for SIC and SST, but I cannot find any documentation on this. I have the .anc files ready, produced by colleagues at the Met Office, but how do I correctly apply them to my suite?

Thank you,
Holly

comment:24 Changed 6 months ago by grenville

Holly

Go to um→namelist→reconfiguration and …→configure ancils.. - right click on the SST entry (for example), then view namelist items and set the ancil filename.

Grenville

comment:25 Changed 6 months ago by grenville

  • Resolution set to answered
  • Status changed from new to closed