Opened 6 months ago

Closed 6 months ago

#2771 closed help (fixed)

Error in rose suite-run on xcslc0; SITE is undefined (u-bf393)

Reported by: langtont Owned by: um_support
Component: UM Model Keywords:
Cc: Platform: Monsoon2
UM Version: 11.2

Description

Hi, since exvmsrose was retired I've not been able to run any suites on xcs. When I run rose suite-run it gives the error shown in the screenshot. I think this is because the suite is not copying the relevant files over to the ~/cylc-run directory properly, but I'm not sure how to fix it. It may have something to do with the mosrs-gpg-agent not being set up on xcslc0, but again I'm unsure.

Thanks in advance

Attachments (2)

Screenshot 2019-02-13 at 13.39.35.png (246.7 KB) - added by langtont 6 months ago.
Error message
tomsRoseConfig.txt (1.6 KB) - added by langtont 6 months ago.
Rose config settings


Change History (14)

Changed 6 months ago by langtont

Error message

comment:1 Changed 6 months ago by dcase

I copied your suite u-bf393 and tried to run it. The SITE variable is set in rose-suite.conf, and the Jinja2 logic resolved correctly for me.
You mentioned copying files over to the cylc-run directory, so there may be an issue there. You don't appear to have run through many cycles, so if you are willing to start afresh you could run rose suite-run --new, which would clear out the directory and may clear any conflict.

If this suggestion doesn't work, let me know and I'll dig deeper.

Dave


comment:2 Changed 6 months ago by langtont

Hi Dave,

Thanks for the suggestion. I've been digging a bit more myself and it's definitely due to incorrect copying of files. None of the files held in my ~/roses/u-bf393 directory are being copied over to the relevant cylc-run directory. I'm not sure of the cause; maybe some incorrect settings in rose config?

Tom

comment:3 Changed 6 months ago by dcase

Tom,

are you saying that if you run rose suite-run --new it won't copy suite.rc and suite.rc.processed (and so on) to cylc-run/u-bf393? If so, could you run rose config > ~/tomsRoseConfig.txt and I'll compare it to mine to see if anything is different.
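
To confirm exactly which files are missing, the two directories can be compared directly. This is only a sketch: missing_in_run_dir is a made-up helper name, not a Rose or cylc command, and the paths in the example call are the ones mentioned in this ticket. Some differences may be legitimate (for example generated files), so treat the output as a hint rather than a verdict.

```shell
# List files that exist under the suite source directory but are
# absent from the run directory. Subversion metadata is skipped.
missing_in_run_dir() {
    src=$1
    run=$2
    ( cd "$src" && find . -type f ! -path './.svn/*' ) | sort |
    while IFS= read -r f; do
        # Report the path (without the leading ./) if it is not in the run dir.
        [ -e "$run/$f" ] || printf '%s\n' "${f#./}"
    done
}

# Example call, using the directories from this ticket:
# missing_in_run_dir ~/roses/u-bf393 ~/cylc-run/u-bf393
```

An empty result would mean every source file has a counterpart in the run directory.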

Dave

Changed 6 months ago by langtont

Rose config settings

comment:4 Changed 6 months ago by langtont

Dave,

No copying is done, unfortunately; only the log, share and work directories are created. I've attached the Rose config file to the ticket now.


comment:5 Changed 6 months ago by dcase

Tom,

I looked at your config, but saw nothing untoward. If you run the suite with rose suite-run -vvv --new and put the output here then I'll look at it.
Alternatively you could make another copy of the suite and run that. I know there's no reason why that should make a difference, but your suite did run for me, and if you've followed the Yammer channel you'll have seen all of the problems that the xcs filesystem has been having this week, so it may be that rolling the dice again brings more luck.
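
For the verbose run, it helps to capture both stdout and stderr in one file so the whole output can be pasted or attached here. A minimal sketch, assuming nothing beyond standard shell tools: run_verbose is a hypothetical wrapper name and the log file path in the example call is arbitrary.

```shell
# Run a command, showing its output on screen while also saving
# stdout and stderr together in a log file.
run_verbose() {
    log=$1
    shift
    "$@" 2>&1 | tee "$log"
}

# Example call for this ticket (log path is just an illustration):
# run_verbose ~/u-bf393-suite-run.log rose suite-run -vvv --new
```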

If none of this works, then I'll see if there's someone I can ask at the Monsoon end.

Dave

comment:6 Changed 6 months ago by langtont

Dave,

I copied the suite into suite u-bg004 and I'm running into the same issue as before. It seems to be specific to this suite, as I've run suite u-ay476 from scratch with no issues. Perhaps it is just down to an error with the xcs filesystem. Still as confused as before.

Tom

comment:7 Changed 6 months ago by dcase

I've just been comparing u-ay476 (which works) with u-bg004 (which doesn't). You actually set the variable MONSOON_ACCOUNT='ukca-ox' in u-bg004/rose-suite.conf but to 'project-ukesm' in u-bg004/rose-suite.conf_monsoon.

For u-ay476, ACCOUNT_MONSOON='ukca-ox' is set in rose-suite.conf, but not anywhere else.

As a quick thing, could you remove this from u-bg004/rose-suite.conf_monsoon and do a rose suite-run -vvv --new?

comment:8 Changed 6 months ago by langtont

Tried changing the line in rose-suite.conf_monsoon to say ukca-ox and files still aren't being copied to the directory properly. I was under the impression that the .conf_monsoon files etc. were used as example .conf files for different servers but I may be wrong.

Also, interestingly I tried manually copying all the files over to the cylc-run directory and running from there, but none of the variables in rose-suite.conf were being applied to suite.rc and I was still getting the SITE not recognised error. It seems that even when the copying issue is fixed, the suite.rc isn't interacting properly with my rose-suite.conf file which is puzzling.

Sorry this issue is taking so long to resolve.
Tom

comment:9 Changed 6 months ago by dcase

Tom,

I'm sure you're correct; that was probably not a good suggestion. I've been debugging something today and wanted to say something before leaving, without thinking it through properly.

Another difference is in site/monsoon.rc (MONSooN.rc) under [[EXTRACT_RESOURCE]]: the line host = $(rose host-select $ROSE_ORIG_HOST) under [[[remote]]] is possibly failing. It is set to host = 'exvmsrose' for ay476, so I would set it to localhost (or maybe 'exvmsrose' as in ay476, though that is surely not as good).

I will ask around for help tomorrow if changing this doesn't work.
Dave

comment:10 Changed 6 months ago by dcase

Also,

the reason I was concerned about things being set twice was that you are writing to two places. The real cause is that the line export DATADIR="/projects/ukca-ox/tlangton" sits below [[ $- != *i* ]] && return in your ~/.bashrc, and hence is not picked up in non-interactive sessions.

Please place only the lines about the MOSRS agent and the bash completion below the line which tests for an interactive session. DATADIR and other variables should be set above.
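
The effect of the ordering can be reproduced outside the real ~/.bashrc. The sketch below writes two throwaway files, one with the export below the interactive test and one with it above, then sources each from a non-interactive bash. The file names bashrc_bad and bashrc_good and the helper datadir_after_sourcing are made up for this illustration; only the DATADIR line and the guard come from this ticket.

```shell
# Demonstrate why an export placed below the interactive guard is
# invisible to non-interactive shells (such as those running suite tasks).
tmpdir=$(mktemp -d)

# Layout A: export BELOW the guard (the problem case).
cat > "$tmpdir/bashrc_bad" <<'EOF'
[[ $- != *i* ]] && return
export DATADIR="/projects/ukca-ox/tlangton"
EOF

# Layout B: export ABOVE the guard (the suggested layout).
cat > "$tmpdir/bashrc_good" <<'EOF'
export DATADIR="/projects/ukca-ox/tlangton"
[[ $- != *i* ]] && return
EOF

# Source a file from a non-interactive bash and print the resulting DATADIR.
datadir_after_sourcing() {
    env -u DATADIR bash -c 'source "$1"; printf "%s" "${DATADIR:-}"' _ "$1"
}

bad=$(datadir_after_sourcing "$tmpdir/bashrc_bad")
good=$(datadir_after_sourcing "$tmpdir/bashrc_good")

echo "export below guard: '${bad}'"
echo "export above guard: '${good}'"

rm -rf "$tmpdir"
```

In the first layout the guard returns before the export is reached, so DATADIR stays empty; in the second it is set before the guard fires.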

comment:11 Changed 6 months ago by langtont

Dave,

Thanks for the help - one of these two changes worked! I wonder if it's to do with the interactive session: my .bashrc worked previously on exvmsrose, but maybe the change is needed now that I've moved to xcslc0/1. All is working now though, so thanks again!

comment:12 Changed 6 months ago by dcase

  • Resolution set to fixed
  • Status changed from new to closed

That seems likely, as things were being written to both /projects/ukca-ox and /projects/phdcases, rather than to one or the other.

I'll shut this ticket, but let me know if there are further problems.
