Opened 8 months ago

Closed 7 months ago

Last modified 7 weeks ago

#2878 closed help (answered)

Remote login to ARCHER from PUMA; Rose UM submission

Reported by: Leighton_Regayre
Owned by: um_support
Component: ARCHER
Keywords: PUMA ssh
Cc:
Platform: PUMA
UM Version: 11.1

Description

Hello,

I'm having trouble submitting a job (u-bh909) to ARCHER from within Rose on PUMA. Error messages in the log suggest this is because the ssh to ARCHER is failing, ending with a "no hosts selected" error.

I've followed the advice in the FAQ entry
"ssh-add gives error message: Could not open a connection to your authentication agent",
which was relevant to me.

I've confirmed that I can access ARCHER without entering a password or passphrase.

Change History (20)

comment:1 in reply to: ↑ description Changed 8 months ago by Leighton_Regayre

Please note that I've deleted suite u-bh909, which was a copy of a suite that was passed to me with incorrect postprocessing settings. Suite u-bh914 has the same issue with ssh to ARCHER, but no postprocessing errors.

Thanks,

Leighton.

comment:2 Changed 8 months ago by grenville

comment:3 Changed 8 months ago by Leighton_Regayre

Hi Grenville,

No, I'd overlooked the required setup. I presumed that a job already configured for ARCHER use would be ready to run.

I've included the suggested lines in my config file and also included the following:

Host dtn*.rdf.ac.uk
   User lre

which Chris Symonds suggested was needed for postprocessing and transfer of data to the dtn.

I followed all instructions on the link you sent and can now log into rdf as well as ARCHER without password/passphrase. I can also log into JASMIN from rdf without password/passphrase.

The original problem has been solved, but I'm still having trouble with the submission. The current error is a "RosePopenError" / "bash: rose: command not found" and looks to be related to ssh. Am I right in thinking this is related to postproc? I've attempted to follow the advice given in ticket #2746 by including cylc and rose versions in my .profile and .bash_profile that match my PUMA versions, but the error persists.

Thanks for your help,

Leighton.

comment:4 Changed 8 months ago by ros

Hi Leighton,

I'm not sure what the line ". /work/n02/n02/lre/rose/" in your ~/.profile on the DTN is for. That file doesn't exist. Please remove this line and try again.

Cheers,
Ros.

comment:5 Changed 8 months ago by Leighton_Regayre

Hi Ros,

I've removed that line but have the same problem. Do I need something like this in my .profile?

# Setup environment for running the UM under Rose
. /work/y07/y07/umshared/bin/rose-um-env

That was part of Dave's suggestion in ticket #2746.

Thanks,

Leighton.

comment:6 Changed 8 months ago by ros

Hi Leighton.

Sorry I thought you were talking about the DTN not ARCHER login nodes!

Yes you need

. /work/y07/y07/umshared/bin/rose-um-env

in your ~/.profile on ARCHER. Please don't set cylc, rose, or fcm versions in there; otherwise you will not pick up essential new versions when they are released. The system is set up to pick up the correct defaults across PUMA and ARCHER. You should only set the versions explicitly if you need to use non-standard versions.

Cheers,
Ros.
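
As a rough sketch of the ~/.profile entry described above: the path is the one quoted in this ticket, while the variable name UM_ENV and the readability guard are illustrative additions, not part of the official setup. The guard only avoids an error at login on a machine where the file is not present.

```shell
# ~/.profile on ARCHER (illustrative sketch)
# Setup environment for running the UM under Rose.
# UM_ENV is a hypothetical variable name used only in this example.
UM_ENV=/work/y07/y07/umshared/bin/rose-um-env

# Source the shared environment script only if it is readable,
# so that login does not fail where the file is missing.
if [ -r "$UM_ENV" ]; then
    . "$UM_ENV"
fi
```

Note that no cylc, rose, or fcm versions are set here, in line with the advice above.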

comment:7 Changed 8 months ago by Leighton_Regayre

Hi Ros,

OK, thanks. Sorry for the confusion. I was mixing up the ARCHER and DTN nodes.

I've added the line

. /work/y07/y07/umshared/bin/rose-um-env

to my ARCHER ~/.profile and have successfully submitted the suite.

Thanks for your help,

Leighton.

comment:8 Changed 8 months ago by Leighton_Regayre

Hi again,

I'm having some additional problems with this suite that are related to the above ssh help request.

My suite (u-bh914) now runs to completion but fails on the pptransfer step. It looks like this is because I haven't correctly set up the connection from the data transfer nodes to JASMIN. I've followed all steps here:
http://cms.ncas.ac.uk/wiki/Docs/PostProcessingAppArcherSetup#sshdtntojasmin

However, I am denied permission when attempting to log into JASMIN using:
ssh jasmin-xfer2.ceda.ac.uk

Is this the sort of command implied by step 7 in the above link?

I took some advice from Jeremy Walton and Mark Richardson and included the following in my PUMA ~/.ssh/config file:

Host jasmin-login1.ceda.ac.uk
   User lregayre

Host jasmin-xfer2.ceda.ac.uk
   # PubKeyAuthentication yes
   User lregayre
   IdentityFile ~/.ssh/id_rsa_jasmin
   ForwardAgent no

Host jasmin-xfer1.ceda.ac.uk
   # PubKeyAuthentication yes
   User lregayre
   IdentityFile ~/.ssh/id_rsa_jasmin
   ForwardAgent no

but am still denied access to JASMIN from rdf.

Thanks in advance for your help with this,

Leighton.
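
One way to see which ~/.ssh/config entries ssh would actually apply to a given host is to ask ssh to print its resolved configuration. This is a minimal sketch, assuming an OpenSSH client of version 6.8 or later (which provides the -G option); show_ssh_opts is a hypothetical helper name.

```shell
# Hypothetical helper: print the user, identity file(s) and agent
# forwarding setting that ssh would use for the given host.
# Relies on "ssh -G" (OpenSSH 6.8+); no connection is made.
show_ssh_opts() {
    ssh -G "$1" | grep -E '^(user|identityfile|forwardagent) '
}

# Example (run on the machine whose ~/.ssh/config you want to check):
#   show_ssh_opts jasmin-xfer2.ceda.ac.uk
```

Because the config file is read on the machine where ssh is run, the check is only meaningful on the host you connect from, for example the DTN rather than PUMA.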

comment:9 Changed 8 months ago by ros

Hi Leighton,

On dtn02 can you please run:

ssh-add -l

and then also:

ssh -vvv jasmin-xfer2.ceda.ac.uk

and send me the output of both so that I can check that your ssh-agent is running OK and hopefully see what might be going on.

Cheers,
Ros.
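
The ssh-add -l check can also be read from its exit status, which the ssh-add manual page documents as 0 on success, 1 if the command fails (for example, when the agent holds no identities) and 2 if the agent cannot be contacted. A small sketch; agent_status is a hypothetical helper name.

```shell
# Hypothetical helper: summarise the state of the ssh-agent
# based on the exit status of "ssh-add -l".
agent_status() {
    ssh-add -l >/dev/null 2>&1
    case $? in
        0) echo "agent running, at least one key loaded" ;;
        1) echo "agent running, but no keys loaded" ;;
        2) echo "cannot contact an ssh-agent" ;;
        *) echo "unexpected ssh-add exit status" ;;
    esac
}

agent_status
```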

comment:10 follow-up: Changed 8 months ago by grenville

Leighton,

Have you registered to use the fast transfer service (https://help.jasmin.ac.uk/article/4414-data-transfer-hpxfer)?

comment:11 follow-up: Changed 8 months ago by ros

P.S. Please also try ssh jasmin-xfer1.ceda.ac.uk

comment:12 in reply to: ↑ 11 Changed 8 months ago by Leighton_Regayre

Here is the output from all 3 commands:

[lre@dtn02 ~]$ ssh-add -l

[lre@dtn02 ~]$ ssh -vvv jasmin-xfer2.ceda.ac.uk

[lre@dtn02 ~]$ ssh jasmin-xfer1.ceda.ac.uk

[Information removed]

So, access is denied for xfer2 but fine for xfer1 (after also including the above "Host" entries in my RDF ~/.ssh/config file).

comment:13 in reply to: ↑ 10 Changed 8 months ago by Leighton_Regayre

Hi Grenville,

I think so. I can access xfer1, so I presume so. Is there a way for me to check?

comment:14 Changed 8 months ago by grenville

xfer1 is not one of the fast transfer machines. Sounds like you haven't.

Grenville

comment:15 follow-up: Changed 8 months ago by ros

Hi Leighton,

Log in to the JASMIN accounts portal (https://accounts.jasmin.ac.uk), click on "manage services" and check that you have access to the hpxfer service.

Cheers,
Ros.

comment:16 in reply to: ↑ 15 Changed 8 months ago by Leighton_Regayre

Hi both,

No, I haven't got access to the hpxfer service. I'll register as soon as I've worked out the IP address.

Thanks for your patience,

Leighton

comment:17 follow-up: Changed 8 months ago by ros

Hi Leighton,

You don't need to worry about the IP address as the DTN is already registered with JASMIN.

Cheers,
Ros.

comment:18 in reply to: ↑ 17 Changed 8 months ago by Leighton_Regayre

Thanks for the tip, Ros. The application page won't accept my request without a valid IP address, so I've got to enter something. Any suggestions?

Thanks,

Leighton.

comment:19 Changed 8 months ago by Leighton_Regayre

Hello,

Thanks for your help with this.

I've applied for hpxfer service access.

My mistake was in my interpretation of instruction 8 here:
http://cms.ncas.ac.uk/wiki/Docs/PostProcessingAppArcherSetup#sshdtntojasmin

where I assumed being able to log into xfer1 was sufficient. It might be useful to provide an example with instruction 8 to help others avoid this mistake.

Also, I'm not sure how I would ssh into hpxfer1/2/3?

Thanks again,

Leighton.

comment:20 Changed 7 months ago by grenville

  • Resolution set to answered
  • Status changed from new to closed

Leighton,

ssh jasmin-xfer[1, 2, or 3].ceda.ac.uk

(xfer1 is not a fast transfer machine)

Grenville
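
To try the fast transfer hosts in turn, something like the following could be run on the DTN. It is only a sketch: xfer_host and check_hpxfer are hypothetical helper names, and it assumes, as the reply above implies, that xfer2 and xfer3 are the fast transfer machines. BatchMode and ConnectTimeout are standard OpenSSH options, used here so that a failed login returns promptly instead of asking for a password.

```shell
# Hypothetical helper: build the host name for transfer server N.
xfer_host() {
    echo "jasmin-xfer$1.ceda.ac.uk"
}

# Hypothetical helper: attempt a non-interactive login to each
# assumed fast transfer host and report the result.
check_hpxfer() {
    for n in 2 3; do
        host=$(xfer_host "$n")
        if ssh -o BatchMode=yes -o ConnectTimeout=10 "$host" true 2>/dev/null; then
            echo "$host: login ok"
        else
            echo "$host: login failed"
        fi
    done
}

# Run the check manually once hpxfer access has been granted:
#   check_hpxfer
```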
