#2878 closed help (answered)
Remote login to ARCHER from PUMA; Rose UM submission
Reported by: | Leighton_Regayre | Owned by: | um_support |
---|---|---|---|
Component: | ARCHER | Keywords: | PUMA ssh |
Cc: | Platform: | PUMA | |
UM Version: | 11.1 |
Description
Hello,
I'm having trouble submitting a job (u-bh909) to ARCHER from within Rose on PUMA. Error messages in the log suggest this is because the ssh to ARCHER is failing, ending with a "no hosts selected" error.
I've followed the advice in the FAQ entry:
ssh-add gives error message: Could not open a connection to your authentication agent.
which was relevant to me.
I've confirmed that I can access ARCHER without entering a password or passphrase.
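The "Could not open a connection to your authentication agent" message usually means the current shell cannot reach an ssh-agent through the SSH_AUTH_SOCK variable. A minimal sketch of a check is given here; the function name agent_socket_status is hypothetical and not part of the FAQ, and the linked FAQ remains the authoritative fix.

```shell
# Report whether this shell can see an ssh-agent socket.
# SSH_AUTH_SOCK is the variable ssh-add uses to locate the agent.
# agent_socket_status is an illustrative name, not an existing tool.
agent_socket_status() {
    if [ -z "${SSH_AUTH_SOCK:-}" ]; then
        echo "no agent: SSH_AUTH_SOCK is not set"
        return 1
    fi
    if [ ! -S "$SSH_AUTH_SOCK" ]; then
        # The variable points at a path that is not a live socket.
        echo "stale agent: $SSH_AUTH_SOCK is not a socket"
        return 1
    fi
    echo "agent socket found: $SSH_AUTH_SOCK"
}
```

If this reports no agent, the usual remedy is to start one with eval "$(ssh-agent -s)" and then run ssh-add, although the site-specific setup on PUMA may differ.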
Change History (20)
comment:1 in reply to: ↑ description Changed 22 months ago by Leighton_Regayre
comment:2 Changed 21 months ago by grenville
Leighton
Have you done this
http://cms.ncas.ac.uk/wiki/RoseCylc/Hints#Settinguprosehost-selectarcher
Grenville
comment:3 Changed 21 months ago by Leighton_Regayre
Hi Grenville,
No, I'd overlooked the required setup. I presumed that a job already configured for ARCHER would be ready to run as soon as I picked it up.
I've included the suggested lines in my config file and also included the following:
Host dtn*.rdf.ac.uk
User lre
which Chris Symonds suggested was needed for postprocessing and transfer of data to the dtn.
I followed all instructions on the link you sent and can now log into rdf as well as ARCHER without password/passphrase. I can also log into JASMIN from rdf without password/passphrase.
The original problem has been solved, but I'm still having trouble with the submission. The current error is a "RosePopenError"/"bash: rose: command not found" and looks to be related to ssh. Am I right in thinking this is related to postproc? I've attempted to follow the advice given in ticket #2746 by including cylc and rose versions in my .profile and .bash_profile which match my PUMA versions, but the error persists.
Thanks for your help,
Leighton.
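One way to confirm that a wildcard entry such as Host dtn*.rdf.ac.uk is being picked up is to ask ssh which settings it would use, without connecting. The sketch below assumes an OpenSSH client recent enough to support ssh -G; the helper name effective_user and the host name dtn02.rdf.ac.uk in the usage line are illustrative assumptions.

```shell
# Print the user name ssh would use for a host, without connecting.
# ssh -G prints the configuration after Host blocks have been applied.
# Usage: effective_user HOST [CONFIG_FILE]
# effective_user is a hypothetical helper, not part of Rose or OpenSSH.
effective_user() {
    if [ -n "${2:-}" ]; then
        # -F reads only the named file and ignores the system-wide config.
        ssh -G -F "$2" "$1"
    else
        ssh -G "$1"
    fi | awk '$1 == "user" { print $2 }'
}

# Example (not run here): effective_user dtn02.rdf.ac.uk
```

With the two config lines quoted in this comment, the example call would be expected to print the RDF user name rather than the local PUMA one.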
comment:4 Changed 21 months ago by ros
Hi Leighton,
I'm not sure what the line . /work/n02/n02/lre/rose/ in your ~/.profile on the dtn is for. That file doesn't exist. Please remove this line and try again.
Cheers,
Ros.
comment:5 Changed 21 months ago by Leighton_Regayre
Hi Ros,
I've removed that line but have the same problem. Do I need something like this in my .profile?
# Setup environment for running the UM under Rose
. /work/y07/y07/umshared/bin/rose-um-env
That was part of Dave's suggestion in ticket #2746.
Thanks,
Leighton.
comment:6 Changed 21 months ago by ros
Hi Leighton.
Sorry I thought you were talking about the DTN not ARCHER login nodes!
Yes you need
. /work/y07/y07/umshared/bin/rose-um-env
in your ~/.profile on ARCHER. Please don't set cylc, rose, fcm versions in there otherwise you will not pick up essential new versions when they are released. The system is set up to pick up the correct defaults across PUMA and ARCHER. You should only set the versions explicitly if you need to use non-standard versions.
Cheers,
Ros.
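The two points in this reply (source rose-um-env, and avoid pinning versions) can be checked mechanically. The following is only a sketch: check_profile is a hypothetical helper, and it assumes that version pins would appear as CYLC_VERSION, ROSE_VERSION or FCM_VERSION assignments, which the thread does not spell out.

```shell
# Check a profile file for the rose-um-env line and for version pins.
# Usage: check_profile [FILE]   (defaults to ~/.profile)
# Assumption: pins look like CYLC_VERSION=..., ROSE_VERSION=... or FCM_VERSION=...
check_profile() {
    profile=${1:-$HOME/.profile}
    rc=0
    if [ ! -r "$profile" ]; then
        echo "unreadable: $profile"
        return 2
    fi
    if grep -Eq '^[[:space:]]*(\.|source)[[:space:]]+/work/y07/y07/umshared/bin/rose-um-env' "$profile"; then
        echo "ok: rose-um-env is sourced"
    else
        echo "missing: rose-um-env is not sourced"
        rc=1
    fi
    if grep -Eq '^[[:space:]]*(export[[:space:]]+)?(CYLC|ROSE|FCM)_VERSION=' "$profile"; then
        echo "warning: explicit version pin found"
        rc=1
    fi
    return $rc
}
```

Run on ARCHER as check_profile ~/.profile; a clean result is the single "ok" line.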
comment:7 Changed 21 months ago by Leighton_Regayre
Hi Ros,
OK, thanks. Sorry for the confusion. I was mixing up the ARCHER and DTN nodes.
I've added the line
. /work/y07/y07/umshared/bin/rose-um-env
to my ARCHER ~/.profile and have successfully submitted the suite.
Thanks for your help,
Leighton.
comment:8 Changed 21 months ago by Leighton_Regayre
Hi again,
I'm having some additional problems with this suite that are related to the above ssh help request.
My suite (u-bh914) now runs to completion but fails on the pptransfer step. It looks like this is because I haven't correctly set up the connection from the data transfer nodes to JASMIN. I've followed all steps here:
http://cms.ncas.ac.uk/wiki/Docs/PostProcessingAppArcherSetup#sshdtntojasmin
However, I am denied permission when attempting to log into JASMIN using:
ssh jasmin-xfer2.ceda.ac.uk
Is this the sort of command implied by step 7 in the above link?
I took some advice from Jeremy Walton and Mark Richardson and included the following in my PUMA ~/.ssh/config file:
Host jasmin-login1.ceda.ac.uk
User lregayre
Host jasmin-xfer2.ceda.ac.uk
# PubKeyAuthentication yes
User lregayre
IdentityFile ~/.ssh/id_rsa_jasmin
ForwardAgent no
Host jasmin-xfer1.ceda.ac.uk
# PubKeyAuthentication yes
User lregayre
IdentityFile ~/.ssh/id_rsa_jasmin
ForwardAgent no
but am still denied access to JASMIN from rdf.
Thanks in advance for your help with this,
Leighton.
comment:9 Changed 21 months ago by ros
Hi Leighton,
On dtn02 can you please run:
ssh-add -l
and then also:
ssh -vvv jasmin-xfer2.ceda.ac.uk
and send me the output of both so that I can check your ssh-agent is running OK and hopefully see what might be going on.
Cheers,
Ros.
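The output of ssh -vvv is long, so it can help to pull out only the authentication lines before posting it. This is a rough sketch, assuming the usual OpenSSH client debug wording; summarise_ssh_debug is a made-up name, and the full log is still what support will want to see.

```shell
# Keep only the lines of an ssh debug log that describe key authentication.
# Reads the log on standard input.
# The patterns follow common OpenSSH client messages and may need adjusting.
summarise_ssh_debug() {
    grep -E 'Offering .*public key|Authentications that can continue|Server accepts key|Authentication succeeded|Permission denied'
}

# Example (not run here); ssh writes its debug output to stderr:
#   ssh -vvv jasmin-xfer2.ceda.ac.uk 2>&1 | summarise_ssh_debug
```

If a key is offered and the log still ends in "Permission denied", the key is reaching the server but is not being accepted for that account or service.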
comment:10 follow-up: ↓ 13 Changed 21 months ago by grenville
Leighton
Have you registered to use the fast transfer service (https://help.jasmin.ac.uk/article/4414-data-transfer-hpxfer)?
comment:11 follow-up: ↓ 12 Changed 21 months ago by ros
P.S. Please also try ssh jasmin-xfer1.ceda.ac.uk
comment:12 in reply to: ↑ 11 Changed 21 months ago by Leighton_Regayre
Here is the output from all 3 commands:
[lre@dtn02 ~]$ ssh-add -l
[lre@dtn02 ~]$ ssh -vvv jasmin-xfer2.ceda.ac.uk
[lre@dtn02 ~]$ ssh jasmin-xfer1.ceda.ac.uk
[Information removed]
So, access is denied for xfer2 but fine for xfer1 (after including the above "Host" entries in my RDF ~/.ssh/config file as well).
comment:13 in reply to: ↑ 10 Changed 21 months ago by Leighton_Regayre
Hi Grenville,
I think so: I can access xfer1, so I presume I have. Is there a way for me to check?
comment:14 Changed 21 months ago by grenville
xfer1 is not one of the fast transfer machines. Sounds like you haven't.
Grenville
comment:15 follow-up: ↓ 16 Changed 21 months ago by ros
Hi Leighton,
Log in to the JASMIN accounts portal (https://accounts.jasmin.ac.uk), click on manage services and check that you have access to the hpxfer service.
Cheers,
Ros.
comment:16 in reply to: ↑ 15 Changed 21 months ago by Leighton_Regayre
Hi both,
No, I haven't got access to the hpxfer service. I'll register as soon as I've worked out the IP address.
Thanks for your patience,
Leighton
comment:17 follow-up: ↓ 18 Changed 21 months ago by ros
Hi Leighton,
You don't need to worry about the IP address as the DTN is already registered with JASMIN.
Cheers,
Ros.
comment:18 in reply to: ↑ 17 Changed 21 months ago by Leighton_Regayre
Thanks for the tip Ros. The application page won't accept my request without a valid IP address, so I've got to enter something. Any suggestions?
Thanks,
Leighton.
comment:19 Changed 21 months ago by Leighton_Regayre
Hello,
Thanks for your help with this.
I've applied for hpxfer service access.
My mistake was in my interpretation of instruction 8 here:
http://cms.ncas.ac.uk/wiki/Docs/PostProcessingAppArcherSetup#sshdtntojasmin
where I assumed being able to log into xfer1 was sufficient. It might be useful to provide an example with instruction 8 to help others avoid this mistake.
Also, I'm not sure how I would ssh into hpxfer1/2/3?
Thanks again,
Leighton.
comment:20 Changed 21 months ago by grenville
- Resolution set to answered
- Status changed from new to closed
Leighton
ssh jasmin-xfer[1, 2, or 3].ceda.ac.uk
(xfer1 is not a fast transfer machine)
Grenville
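To try each of the transfer hosts named above in turn from the DTN, something like the following could be used. It is a sketch only: the function names are hypothetical, the host list is taken directly from this reply, and BatchMode is used so that a missing key shows up as a failure instead of a password prompt.

```shell
# List the transfer hosts named in this ticket, one per line.
xfer_hosts() {
    for n in 1 2 3; do
        printf 'jasmin-xfer%s.ceda.ac.uk\n' "$n"
    done
}

# Try a non-interactive login to each host and report the result.
# -n stops ssh reading stdin; BatchMode=yes disables password prompts.
probe_xfer_hosts() {
    for host in $(xfer_hosts); do
        if ssh -n -o BatchMode=yes -o ConnectTimeout=10 "$host" true 2>/dev/null; then
            echo "$host: ok"
        else
            echo "$host: failed"
        fi
    done
}

# Example (not run here): probe_xfer_hosts
```

A "failed" result for a host is consistent with not yet having access to the corresponding service, as was the case with xfer2 earlier in this ticket.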
Please note that I've deleted suite u-bh909 which is a copy of a suite which was passed to me with incorrect postprocessing settings. Suite u-bh914 has the same issue with ssh to ARCHER, but no postprocessing errors.
Thanks,
Leighton.