Opened 5 years ago

Closed 5 years ago

#1283 closed help (fixed)

Problem with FCM extraction on submitting UM job

Reported by: jcrook
Owned by: ros
Component: UM Model
Keywords:
Cc:
Platform: ARCHER
UM Version: 6.6.3

Description

I have copied a job that I used to run on Hector and have modified it to run on ARCHER. When I submit the job, the FCM extraction fails. The ext.out file says:

[FAIL] /home/um/fcm/etc/um_tutorial_revisions.cfg: cannot locate config file, abort at /home/um/fcm-2014-02/bin/../lib/FCM1/Config.pm line 842.

Change History (14)

comment:1 Changed 5 years ago by ros

  • Owner changed from um_support to ros
  • Status changed from new to accepted

Hi Julia,

You need to remove the file /home/jcrook/.fcm on PUMA; it is no longer required.

This is a result of FCM being upgraded a few weeks ago. I did try to email everyone who was affected.

Cheers,
Ros.

comment:2 Changed 5 years ago by jcrook

I have removed the file and tried again. It still says the base extract failed, but when I look at the ext.out file in umbase it looks like it has started OK. There are files under cfg and src with today's date on PUMA but nothing on ARCHER. I previously did some setup to allow ssh access to Hector without having to type in the Hector password. I have still run ssh-add on PUMA, but my account is now on ARCHER with a different username, and the UMUI prompts me for the ARCHER password. I did not copy the authorized_keys file from Hector to ARCHER, as I only started using ARCHER after Hector closed. Could the problems be related to this?

comment:3 Changed 5 years ago by ros

Hi Julia,

Run the following to copy your public key to ARCHER:

puma$ cat ~/.ssh/id_dsa.pub | ssh <username>@login.archer.ac.uk 'mkdir -p .ssh ; cat - >> ~/.ssh/authorized_keys'
[Enter your ARCHER password]

As you have already run ssh-add to add your key to the agent, this should now work. Try ssh'ing to ARCHER from PUMA to check that the ssh keys are set up correctly and that you are logged in without being prompted for a password or passphrase.

Cheers,
Ros.
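
The login test described above can also be done non-interactively. The following is a minimal sketch, not part of the original reply: check_key_login is a hypothetical helper name, and it assumes an OpenSSH client on PUMA. With BatchMode=yes, ssh fails instead of asking for a password or passphrase, so a zero exit status means the key from the agent was accepted.

```shell
# Hypothetical helper (not from the ticket): succeeds only if key-based
# login works without any prompt.
check_key_login() {
    # BatchMode=yes makes ssh fail rather than ask for a password or passphrase.
    # ConnectTimeout=10 stops the check hanging if the host is unreachable.
    ssh -o BatchMode=yes -o ConnectTimeout=10 "$1" true
}

# Usage (replace <username> with your ARCHER username):
#   check_key_login <username>@login.archer.ac.uk && echo "key login OK"
```

If the helper returns a non-zero status, the key has probably not been appended to authorized_keys on ARCHER, or it has not been added to the agent on PUMA.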

comment:4 Changed 5 years ago by jcrook

Thanks. Having done this the extraction now works and I am compiling on ARCHER!

comment:5 Changed 5 years ago by jcrook

Well it built ok so I tried to just run it with the executable. This time it failed to do the extract:

Extract command started on Tue Apr 29 15:06:43 2014.
→Parse configuration: start
Config file (ext): svn://puma/UM_svn/UM/branches/dev/um/HG6.6.3_machine_cfg/src/configs/bindings/container.cfg@13658
[FAIL] /home/jcrook/umui_jobs/xgezn/FCM_UMUI_BASE_CFG: cannot locate config file, abort at /home/um/fcm-2014-02/bin/../lib/FCM1/ConfigSystem.pm line 539.

Config file (ext): svn://puma/UM_svn/UM/branches/dev/um/HG6.6.3_machine_cfg/src/configs/machines/cray-xc30-cce-archer/machine.cfg@13658
Config file (ext): /dev/null

comment:6 Changed 5 years ago by ros

Hi Julia,

I'm guessing you've changed settings in job xgezn and resubmitted since sending this, as there are no error messages in the ext.out files I just looked at, so I can't be totally sure what's going on. My guess is that you only switched off compilation of the model executable. You also need to set the reconfiguration to "Run from existing executable".

Cheers,
Ros.

comment:7 Changed 5 years ago by jcrook

Yes, sorry. I set it back to compile again, but this time to run as well, just to check that the extract worked. The extract did work, so it is now running. I did only switch off the compilation. Next time I will make sure I switch both compile and reconfig to run from the existing executable.

Thanks
Julia

comment:8 Changed 5 years ago by jcrook

Yesterday my xgezn job compiled but then it wouldn't run. It crashes really early. There is some file it wants that doesn't exist but I don't know what it is. I noticed that there doesn't seem to be a qxreconf file in the bin directory even though I asked it to build this.

comment:9 Changed 5 years ago by ros

Hi Julia,

Can you please change the permissions on your /home and /work directories so that we can read them?

chmod -R g+rX /home/n02/n02/jcrook2
chmod -R g+rX /work/n02/n02/jcrook2

Cheers,
Ros.
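
For readers unsure what g+rX does, here is a small sketch run in a scratch directory rather than the real /home or /work trees. The directory and file names are made up for illustration. The capital X adds execute (search) permission for the group only on directories and on files that are already executable, so plain data files become group-readable without becoming executable.

```shell
# Scratch area for the demonstration; the names below are illustrative only.
demo=$(mktemp -d)
mkdir "$demo/sub"
touch "$demo/sub/data.txt"

# Start with no group/other access, then apply the same mode as in the ticket.
chmod -R go-rwx "$demo"
chmod -R g+rX "$demo"

# Characters 5-7 of the ls mode string are the group permission bits.
dir_group=$(ls -ld "$demo/sub" | cut -c5-7)
file_group=$(ls -l "$demo/sub/data.txt" | cut -c5-7)
echo "directory group bits: $dir_group"
echo "file group bits:      $file_group"

# Clean up the scratch area.
rm -rf "$demo"
```

The directory gains read and search permission for the group, while the non-executable file gains read permission only.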

comment:10 Changed 5 years ago by jcrook

Done.

comment:11 Changed 5 years ago by ros

Hmm, interesting. I can't currently see which file it is struggling with either. I will investigate further.

In answer to your question about the reconfiguration: you have not requested to run the reconfiguration for either atmos or ocean, so that is why it hasn't built the reconfiguration executable.

Cheers,
Ros.

comment:12 Changed 5 years ago by ros

Hi Julia,

It looks like there's an environment variable ($RCP45L60_ancils) that's not getting expanded when it's trying to read in the Ocean Carbon Cycle file. There are certain places in the UMUI where variables cannot be used in pathnames. Can you try specifying the absolute pathname to the rcp45_co2_conc.dat file in window Ocean → Scientific Parameters → Carbon Cycle and see if that fixes the problem?

Cheers,
Ros.
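
One way to get the absolute pathname is to let the shell expand the variable and then paste the result into the UMUI window. This is only a sketch: the directory value below is a placeholder, and it assumes rcp45_co2_conc.dat sits directly inside the directory that RCP45L60_ancils points to, which the ticket does not confirm.

```shell
# ASSUMPTION: placeholder value; use the directory your job actually
# assigns to RCP45L60_ancils.
RCP45L60_ancils=/path/to/ancils

# ASSUMPTION: the data file sits directly inside that directory.
abs_path="${RCP45L60_ancils}/rcp45_co2_conc.dat"

# Print the expanded pathname, ready to paste into the Carbon Cycle window.
echo "$abs_path"
```

The printed string contains no variable reference, so nothing is left for the UMUI or the model scripts to expand.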

comment:13 Changed 5 years ago by jcrook

Yes, that has fixed the problem. That was something I changed because my username on ARCHER is different from the one on Hector. I had probably tried to use the environment variable there before, because I wondered why I hadn't used it there.

Thanks
Julia

comment:14 Changed 5 years ago by ros

  • Resolution set to fixed
  • Status changed from accepted to closed