Opened 2 years ago
Closed 2 years ago
#2750 closed help (fixed)
STASHmaster addition: Error "No stashmaster record".
| Reported by: | cbellisario | Owned by: | um_support |
|---|---|---|---|
| Component: | UM Model | Keywords: | stashmaster, um |
| Cc: | | Platform: | NEXCS |
| UM Version: | 11.0 | | |
Description
Dear NCAS team,
I am struggling with the addition of a STASHmaster item in suite be515.
I added the item to STASHmaster_A and STASHmaster-meta.conf.
I then passed on the new location in:
- um/meta
- um/env/Runtime Controls (add latent variable)
- um/namelist/Reconfiguration and Ancillary controls/Configure ancils and initialise dump fields (with the addition of the new section)
- um/namelist/Model Input and Output/STASH Requests and Profiles/Stash requests (with the addition of the stash request, followed by running the corresponding transform/tidy/validate macros)
And yet, I get the following error message:
????????????????????????????????????????????????????????????????????????????????
?????????????????????????????? WARNING ??????????????????????????????
? Warning code: -10
? Warning from routine: PRELIM
? Warning message:
? Field - Section:0, Item:512 discarded.
? No stashmaster record.
? Warning from processor: 0
? Warning number: 24
????????????????????????????????????????????????????????????????????????????????
I have checked that jlw_down_band (the new index added to read the variable lw_down_band) does not get a valid value: it starts at 0.
So I think I am missing one step (or more) in the configuration in rosie go, but I cannot figure out which one.
I did something similar in suite bd497, where the index starts at 40000-something, so I tried to do the same here, without success.
Any help is more than welcome,
Best regards,
Christophe
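A minimal sketch, in Python, of how the section and item numbers could be pulled out of PRELIM warnings like the one quoted in the description, so they can be compared against the STASHmaster file. The regular expression is based only on the message text shown in this ticket; the function name `discarded_fields` is hypothetical and not part of the UM or Rose.

```python
import re

# Pattern built from the message text quoted in this ticket:
# "Field - Section:0, Item:512 discarded."
DISCARDED = re.compile(r"Field - Section:\s*(\d+),\s*Item:\s*(\d+)\s+discarded")


def discarded_fields(log_text):
    """Return the sorted, unique (section, item) pairs reported as discarded."""
    pairs = {(int(sec), int(item)) for sec, item in DISCARDED.findall(log_text)}
    return sorted(pairs)


# Excerpt of the warning from this ticket.
sample = """\
? Warning from routine: PRELIM
? Warning message:
? Field - Section:0, Item:512 discarded.
? No stashmaster record.
"""

print(discarded_fields(sample))  # [(0, 512)]
```

Each pair returned is a request for which the model found no matching STASHmaster record at run time.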
Change History (15)
comment:1 Changed 2 years ago by grenville
comment:2 Changed 2 years ago by cbellisario
Dear Grenville,
Thanks for your help,
So I guess there is somewhere where I do not refer to the correct STASHmaster file.
In STASHmaster_A located in ~/GA7.1_UM11.0_AMIP/Branch_Seq03_check/vn11.0_check/rose-meta/um-atmos/vn11.0_HEAD/etc/stash/STASHmaster/, I do have the line:
1| 1 | 0 | 512 |SURFACE DOWNWARD LW RADIATIONbdsW/M2|
And I do refer to this STASHmaster directory in rosie go, in um/meta:
/home/d04/chrbe/GA7.1_UM11.0_AMIP/Branch_Seq03_check/vn11.0_check/rose-meta/um-atmos/vn11.0_HEAD/
And I could find it in the STASH requests in rosie go.
Is there another place where I have to point the STASHmaster folder to the correct one?
Thank you for your help,
Best regards,
Christophe
comment:3 Changed 2 years ago by grenville
But your suite points to:
/home/d04/chrbe/GA7.1_UM11.0_AMIP/Branch_Seq03_check/vn11.0_check/rose-meta/um-atmos/HEAD/etc/stash/STASHmaster
Grenville
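A small sketch of how two long paths like these could be compared component by component to find where they diverge. The two paths are taken from this thread (the home-relative STASHmaster_A location from comment 2 is expanded with the /home/d04/chrbe prefix quoted there); the helper `first_difference` is hypothetical.

```python
from itertools import zip_longest
from pathlib import PurePosixPath


def first_difference(path_a, path_b):
    """Return (index, part_a, part_b) for the first differing component, or None."""
    parts_a = PurePosixPath(path_a).parts
    parts_b = PurePosixPath(path_b).parts
    # zip_longest pads the shorter path with None, so a prefix also counts as a difference.
    for index, (a, b) in enumerate(zip_longest(parts_a, parts_b)):
        if a != b:
            return index, a, b
    return None


# Directory holding the edited STASHmaster_A (comment 2).
EDITED = ("/home/d04/chrbe/GA7.1_UM11.0_AMIP/Branch_Seq03_check/vn11.0_check/"
          "rose-meta/um-atmos/vn11.0_HEAD/etc/stash/STASHmaster")
# Directory the suite points to (comment 3).
SUITE = ("/home/d04/chrbe/GA7.1_UM11.0_AMIP/Branch_Seq03_check/vn11.0_check/"
         "rose-meta/um-atmos/HEAD/etc/stash/STASHmaster")

print(first_difference(EDITED, SUITE))
```

Here the only differing component is `vn11.0_HEAD` against `HEAD`, which is easy to miss by eye in paths of this length.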
comment:4 Changed 2 years ago by cbellisario
Dear Grenville,
Thank you for your help. I traced the error back to the path set in um/env/Runtime Controls.
However, I still have trouble connecting to NEXCS, and when rosie go is launched it sometimes crashes due to memory issues. Today the problem is the connection to NEXCS, which takes a very long time before giving up. Yesterday, cylc runs were not visible although they were still running, and "Connect Now" did not work either. Could the configuration of my suite have somehow broken the memory allocation of my runs? Or is this only related to NEXCS?
Thank you in advance,
Best regards,
Christophe
comment:5 Changed 2 years ago by grenville
Christophe
I don't know what you mean by connection to NEXCS - from where? Where are you running Rosie go? Have you moved away from running from exvmsrose?
Grenville
comment:6 Changed 2 years ago by cbellisario
Dear Grenville,
Sorry for not being very clear:
- To run on NEXCS, I connect to Puma, then from Puma to exvmslander, and from exvmslander to exvmsrose. From Puma to exvmslander I have no problem, but today I cannot get from exvmslander to exvmsrose.
- A problem I had yesterday: on exvmsrose, I launched rosie go &. When it opened, the cylc button looked slightly different. When I launched the suite directly from the suite list, the cylc window appeared but was blank, even though the suite was reported as running.
- A problem I had in the days before: on exvmsrose, after opening the suite with rosie go &, the suite crashed (about one time in two) when I tried to run it. I could, however, run it directly from the suite list without any problem.
Now I am wondering whether these are temporary HPC resource issues, and whether they are related.
Christophe
comment:7 Changed 2 years ago by grenville
Christophe
The xcs is down today:
"Here is a reminder for tomorrow's extended Monsoon and NEXCS outage, starting from 04:00 on 6th February through to 11:00 on 7th February 2019, details on Yammer."
Contact Monsoon to arrange access to the Yammer group.
It is no longer necessary to use exvmsrose - see https://collab.metoffice.gov.uk/twiki/bin/view/Support/MONSooN. The new system is much faster and hopefully the answer to your problems.
Grenville
comment:8 Changed 2 years ago by cbellisario
Dear Grenville,
Following the end of the exvmsrose era and the move to xcslc0 / xcslc1, I now face a -new- problem:
When trying to access xcslc* from exvmslander, I get the following error:
[chrbe@exvmslander:~]$ ssh -Y xcslc1
Last login: Thu Feb 7 15:53:18 2019 from 10.168.5.6
This computer is provided for the processing of Official Information.
Unauthorized access may constitute a criminal offence.
All activity on the system is liable to monitoring.
-bash: mosrs-cache-password: command not found
Met Office Science Repository Service password:
gpg-preset-passphrase: problem with the agent
gpg-preset-passphrase: caching passphrase failed: Invalid response
gpg-preset-passphrase: problem with the agent
gpg-preset-passphrase: caching passphrase failed: Invalid response
svn: E215004: Authentication failed and interactive prompting is disabled; see the --force-interactive option
svn: E215004: Unable to connect to a repository at URL 'https://code.metoffice.gov.uk/svn/test'
svn: E215004: No more credentials or we tried too many times.
Authentication failed
Error: Unable to access Subversion with given password
Run "mosrs-cache-password" to try caching your password again
Met Office Science Repository Service password:
I do sometimes get onto xcslc1, but without being able to run anything.
I followed https://collab.metoffice.gov.uk/twiki/bin/view/Support/RetirementOfRoseCylcVMs and the associated https://code.metoffice.gov.uk/trac/home/wiki/AuthenticationCaching#Monsoon .
In mosrs-cache-password, I changed the path to gpg-preset-passphrase from
#!/bin/bash
set -u
gpgpresetpassphrase="/usr/libexec/gpg-preset-passphrase"
to
#!/bin/bash
set -u
gpgpresetpassphrase="/usr/lib64/gpg-preset-passphrase"
It does not work either.
When I try to run mosrs-cache-password, it logs me out of xcslc1.
Needless to say, rosie does not run while I am still on xcslc1.
On the other hand, I can get back onto exvmsrose as I used to.
But rosie go there still behaves strangely (the same troubles as described in http://cms.ncas.ac.uk/ticket/2758).
So I still cannot run or see anything on that side.
It is becoming as annoying as it is depressing, in the sense that wherever I try to run the UM, nothing works (and I have not even made progress on the STASHmaster changes, which crash the UM at some point).
So any ideas about how to solve these problems are more than welcome.
I have contacted the Monsoon team about the first part of the problem; I will post their answer here, as it could help other people.
Best regards,
Christophe
comment:9 Changed 2 years ago by grenville
Christophe
Please delete /home/d04/chrbe/mosrs-cache-password and /home/d04/chrbe/mosrs-setup-gpg-agent, then follow the instructions again.
Grenville
comment:10 Changed 2 years ago by cbellisario
Thank you for your answer.
I removed both mosrs-cache-password and mosrs-setup-gpg-agent, downloaded them again from https://code.metoffice.gov.uk/trac/home/wiki/AuthenticationCaching/GpgAgent, copied them to xcslc1 with scp and tried to run mosrs-cache-password, but without success:
chrbe@xcslc1:~> . mosrs-cache-password
Met Office Science Repository Service password:
gpg-preset-passphrase: problem with the agent
gpg-preset-passphrase: caching passphrase failed: Invalid response
gpg-preset-passphrase: problem with the agent
gpg-preset-passphrase: caching passphrase failed: Invalid response
svn: E215004: Authentication failed and interactive prompting is disabled; see the --force-interactive option
svn: E215004: Unable to connect to a repository at URL 'https://code.metoffice.gov.uk/svn/test'
svn: E215004: No more credentials or we tried too many times.
Authentication failed
Error: Unable to access Subversion with given password
basename: invalid option -- 'b'
Try `basename --help' for more information.
Run "" to try caching your password again
Connection to xcslc1 closed.
[chrbe@exvmslander:~]$
comment:11 Changed 2 years ago by grenville
Christophe
Those aren't the instructions.
These are the instructions - https://collab.metoffice.gov.uk/twiki/bin/view/Support/RetirementOfRoseCylcVMs and associated https://code.metoffice.gov.uk/trac/home/wiki/AuthenticationCaching#Monsoon.
Grenville
comment:12 Changed 2 years ago by cbellisario
Yes, I followed these instructions:
- From https://collab.metoffice.gov.uk/twiki/bin/view/Support/RetirementOfRoseCylcVMs , I updated my $HOME/.bashrc according to the guidance at https://code.metoffice.gov.uk/trac/home/wiki/AuthenticationCaching#Monsoon .
- I ensured that I had configured my ~/.subversion/servers file (https://code.metoffice.gov.uk/trac/home/wiki/FAQ#ConfiguringSubversionaccess).
- My .bash_profile and .bashrc files are up to date.
So when I try to connect to xcslc1, I get:
[chrbe@exvmslander:~]$ ssh -Y xcslc1
Last login: Thu Feb 7 17:22:04 2019 from 10.168.5.6
This computer is provided for the processing of Official Information.
Unauthorized access may constitute a criminal offence.
All activity on the system is liable to monitoring.
-bash: mosrs-cache-password: command not found
Met Office Science Repository Service password:
gpg-preset-passphrase: problem with the agent
gpg-preset-passphrase: caching passphrase failed: Invalid response
gpg-preset-passphrase: problem with the agent
gpg-preset-passphrase: caching passphrase failed: Invalid response
svn: E215004: Authentication failed and interactive prompting is disabled; see the --force-interactive option
svn: E215004: Unable to connect to a repository at URL 'https://code.metoffice.gov.uk/svn/test'
svn: E215004: No more credentials or we tried too many times.
Authentication failed
Error: Unable to access Subversion with given password
Run "mosrs-cache-password" to try caching your password again
Met Office Science Repository Service password:
comment:13 Changed 2 years ago by grenville
Christophe
Please delete /home/d04/chrbe/mosrs-cache-password and /home/d04/chrbe/mosrs-setup-gpg-agent (again) then follow instructions - https://collab.metoffice.gov.uk/twiki/bin/view/Support/RetirementOfRoseCylcVMs and associated https://code.metoffice.gov.uk/trac/home/wiki/AuthenticationCaching#Monsoon.
Don't follow https://code.metoffice.gov.uk/trac/home/wiki/AuthenticationCaching/GpgAgent; the instructions do not direct you to follow this link.
Grenville
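A minimal sketch of a check for leftover private copies of the two scripts named above in a home directory. The script names come from this thread; the helper `stale_copies` and its `is_file` parameter are hypothetical and only illustrate the idea.

```python
import os

# Script names taken from this thread.
SCRIPT_NAMES = ("mosrs-cache-password", "mosrs-setup-gpg-agent")


def stale_copies(home_dir, names=SCRIPT_NAMES, is_file=os.path.isfile):
    """Return the paths under home_dir that hold a private copy of a script.

    is_file can be replaced with another predicate, for example in tests.
    """
    found = []
    for name in names:
        candidate = os.path.join(home_dir, name)
        if is_file(candidate):
            found.append(candidate)
    return found


if __name__ == "__main__":
    # Report any private copies found in the current user's home directory.
    for path in stale_copies(os.path.expanduser("~")):
        print("local copy found:", path)
```

An empty result means no private copies of either script are present in the directory that was checked.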
comment:14 Changed 2 years ago by cbellisario
Dear Grenville,
Thank you! It works now. My mistake was putting mosrs-cache-password and mosrs-setup-gpg-agent back in /home/d04/chrbe/ after deleting them. I still get the message
-bash: mosrs-setup-gpg-agent: No such file or directory
but it does not impact the following steps.
When opening Rose, I now get the display of the run, unlike on exvmsrose, so I guess that problem is solved too.
I am now back to my segmentation fault problem that I relate to the STASHmaster modifications. But at least I can work on the code now.
Thank you for your help!
Best regards,
Christophe
comment:15 Changed 2 years ago by grenville
- Resolution set to fixed
- Status changed from new to closed
Christophe
In STASH requests, you request section 0, item 512, but there is no such entry in the STASHmaster file. There are entries for item 512 in sections 1, 2, 3 and 38 only:
1| 1 | 1 | 512 |CLEAR DOWN SW FLUX ON LEVS AND BANDS|
1| 1 | 2 | 512 |CLEAR DOWN LW FLUX ON LEVS AND BANDS|
1| 1 | 3 | 512 |STABILITY FUNCTION FOR MOMENTUM |
1| 1 | 38 | 512 |H2O Aitken-sol mode (kgm-3) |
Looks like you have mistyped some metadata.
Grenville
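A minimal sketch of the kind of check described in this comment: scan STASHmaster record lines and list which sections define a given item. It assumes the first line of each record has the pipe-separated layout of the lines quoted in this ticket (`1| model | section | item | name |`); the function names are hypothetical.

```python
def parse_record_header(line):
    """Parse '1| 1 | 0 | 512 |NAME|' into (model, section, item, name).

    Returns None for any line that is not the first line of a record.
    """
    fields = [field.strip() for field in line.split("|")]
    if len(fields) < 5 or fields[0] != "1":
        return None
    try:
        model, section, item = (int(field) for field in fields[1:4])
    except ValueError:
        return None
    return model, section, item, fields[4]


def sections_with_item(lines, item):
    """Return the sorted section numbers that contain the given item."""
    sections = set()
    for line in lines:
        record = parse_record_header(line)
        if record is not None and record[2] == item:
            sections.add(record[1])
    return sorted(sections)


# The four record lines quoted in this ticket.
SAMPLE = [
    "1| 1 | 1 | 512 |CLEAR DOWN SW FLUX ON LEVS AND BANDS|",
    "1| 1 | 2 | 512 |CLEAR DOWN LW FLUX ON LEVS AND BANDS|",
    "1| 1 | 3 | 512 |STABILITY FUNCTION FOR MOMENTUM |",
    "1| 1 | 38 | 512 |H2O Aitken-sol mode (kgm-3) |",
]

print(sections_with_item(SAMPLE, 512))       # [1, 2, 3, 38]
print(0 in sections_with_item(SAMPLE, 512))  # False: no section 0, item 512
```

To check a real file, the same function could be given the lines of the STASHmaster_A file that the suite actually points to.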