Opened 2 months ago

Closed 6 weeks ago

#2878 closed help (answered)

Remote login to ARCHER from PUMA; Rose UM submission

Reported by: Leighton_Regayre
Owned by: um_support
Component: ARCHER
Keywords: PUMA ssh
Cc:
Platform: PUMA
UM Version: 11.1

Description

Hello,

I'm having trouble submitting a job (u-bh909) to ARCHER from within Rose on PUMA. Error messages in the log suggest this is because the ssh to ARCHER is failing, ending with a "no hosts selected" error.

I've followed the advice in the FAQ entry
"ssh-add gives error message: Could not open a connection to your authentication agent."
which was relevant to me.

I've confirmed that I can access ARCHER without entering a password or passphrase.
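
For anyone hitting the same ssh-add message, here is a minimal sketch for checking whether an agent is reachable. It relies on the documented exit status of ssh-add -l (0 = keys listed, 1 = agent has no keys, 2 = agent cannot be contacted). The function name agent_status is only illustrative and is not part of any NCAS setup.

```shell
# Report whether an ssh-agent is reachable from the current shell.
agent_status() {
    rc=0
    # Capture the exit status without aborting under "set -e".
    ssh-add -l >/dev/null 2>&1 || rc=$?
    case $rc in
        0) echo "agent running, keys loaded" ;;
        1) echo "agent running, no keys loaded" ;;
        *) echo "no agent reachable" ;;
    esac
}

agent_status
```

If it reports that no agent is reachable, starting one with eval "$(ssh-agent -s)" and then re-running ssh-add is the usual next step.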

Change History (20)

comment:1 in reply to: ↑ description Changed 2 months ago by Leighton_Regayre

Please note that I've deleted suite u-bh909 which is a copy of a suite which was passed to me with incorrect postprocessing settings. Suite u-bh914 has the same issue with ssh to ARCHER, but no postprocessing errors.

Thanks,

Leighton.

comment:2 Changed 2 months ago by grenville

comment:3 Changed 2 months ago by Leighton_Regayre

Hi Grenville,

No, I'd overlooked the required setup. I presumed that a job already configured for ARCHER would be ready to run.

I've included the suggested lines in my config file and also included the following:

Host dtn*.rdf.ac.uk
User lre

which Chris Symonds suggested was needed for postprocessing and transfer of data to the dtn.

I followed all instructions on the link you sent and can now log into rdf as well as ARCHER without password/passphrase. I can also log into JASMIN from rdf without password/passphrase.

The original problem has been solved, but I'm still having trouble with the submission. The current error is a "RosePopenError" with "bash: rose: command not found", and it looks to be related to ssh. Am I right in thinking this is related to postproc? I've attempted to follow the advice given in ticket #2746 by including cylc and rose versions in my .profile and .bash_profile which match my PUMA versions, but the error persists.

Thanks for your help,

Leighton.

comment:4 Changed 2 months ago by ros

Hi Leighton,

I'm not sure what the line . /work/n02/n02/lre/rose/ in your ~/.profile on dtn is for. That file doesn't exist. Please remove this line and try again.

Cheers,
Ros.

comment:5 Changed 2 months ago by Leighton_Regayre

Hi Ros,

I've removed that line but have the same problem. Do I need something like this in my .profile?

# Setup environment for running the UM under Rose
. /work/y07/y07/umshared/bin/rose-um-env

That was part of Dave's suggestion in ticket #2746.

Thanks,

Leighton.


comment:6 Changed 2 months ago by ros

Hi Leighton.

Sorry I thought you were talking about the DTN not ARCHER login nodes!

Yes you need

. /work/y07/y07/umshared/bin/rose-um-env

in your ~/.profile on ARCHER. Please don't set cylc, rose, fcm versions in there otherwise you will not pick up essential new versions when they are released. The system is set up to pick up the correct defaults across PUMA and ARCHER. You should only set the versions explicitly if you need to use non-standard versions.

Cheers,
Ros.
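
A quick way to confirm that the line is present and not commented out in a profile file is sketched below. This is only an illustration: the function name profile_sources_rose_env is made up here, and the only path assumed is the rose-um-env one quoted above.

```shell
# Path quoted in this ticket; the profile file to inspect is passed as $1.
ROSE_ENV=/work/y07/y07/umshared/bin/rose-um-env

profile_sources_rose_env() {
    # Succeeds only if the file has an uncommented line that sources
    # rose-um-env, written either as ". PATH" or "source PATH".
    grep -Eq "^[[:space:]]*(\.|source)[[:space:]]+${ROSE_ENV}[[:space:]]*$" "$1"
}

# Example use on the ARCHER login node:
#   profile_sources_rose_env ~/.profile && echo "rose-um-env is sourced"
```

Note that this only checks the file contents. It does not prove that a non-interactive ssh session actually reads that file.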

comment:7 Changed 2 months ago by Leighton_Regayre

Hi Ros,

OK, thanks. Sorry for the confusion. I was mixing up the ARCHER and DTN nodes.

I've added the line

. /work/y07/y07/umshared/bin/rose-um-env

to my ARCHER ~/.profile and have successfully submitted the suite.

Thanks for your help,

Leighton.

comment:8 Changed 2 months ago by Leighton_Regayre

Hi again,

I'm having some additional problems with this suite that are related to the above ssh help request.

My suite (u-bh914) now runs to completion but fails on the pptransfer step. It looks like this is because I haven't correctly set up the connection from the data transfer nodes to JASMIN. I've followed all steps here:
http://cms.ncas.ac.uk/wiki/Docs/PostProcessingAppArcherSetup#sshdtntojasmin

However, I am denied permission when attempting to log into JASMIN using:
ssh jasmin-xfer2.ceda.ac.uk

Is this the sort of command implied by step 7 in the above link?

I took some advice from Jeremy Walton and Mark Richardson and included the following in my PUMA ~/.ssh/config file:

Host jasmin-login1.ceda.ac.uk
   User lregayre

Host jasmin-xfer2.ceda.ac.uk
   # PubKeyAuthentication yes
   User lregayre
   IdentityFile ~/.ssh/id_rsa_jasmin
   ForwardAgent no

Host jasmin-xfer1.ceda.ac.uk
   # PubKeyAuthentication yes
   User lregayre
   IdentityFile ~/.ssh/id_rsa_jasmin
   ForwardAgent no

but am still denied access to JASMIN from rdf.

Thanks in advance for your help with this,

Leighton.


comment:9 Changed 2 months ago by ros

Hi Leighton,

On dtn02 can you please run:

ssh-add -l

and then also:

ssh -vvv jasmin-xfer2.ceda.ac.uk

and send me the output of both so that I can check your ssh-agent is running ok and hopefully see what might be going on.

Cheers,
Ros.

comment:10 follow-up: Changed 2 months ago by grenville

Leighton,

Have you registered to use the fast transfer service (https://help.jasmin.ac.uk/article/4414-data-transfer-hpxfer)?

comment:11 follow-up: Changed 2 months ago by ros

P.S. Please also try ssh jasmin-xfer1.ceda.ac.uk

comment:12 in reply to: ↑ 11 Changed 2 months ago by Leighton_Regayre

Replying to ros:

P.S. Please also try ssh jasmin-xfer1.ceda.ac.uk

Here is the output from all 3 commands:

[lre@dtn02 ~]$ ssh-add -l
2048 0c:4f:74:55:46:6b:c3:b3:f8:f0:04:1a:fd:9f:48:76 /nerc/n02/n02/lre/.ssh/id_rsa_jasmin (RSA)

[lre@dtn02 ~]$ ssh -vvv jasmin-xfer2.ceda.ac.uk
OpenSSH_5.3p1, OpenSSL 1.0.1e-fips 11 Feb 2013
debug1: Reading configuration data /nerc/n02/n02/lre/.ssh/config
debug1: Applying options for jasmin-xfer2.ceda.ac.uk
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: Applying options for *
debug2: ssh_connect: needpriv 0
debug1: Connecting to jasmin-xfer2.ceda.ac.uk [130.246.128.244] port 22.
debug1: Connection established.
debug3: Not a RSA1 key file /nerc/n02/n02/lre/.ssh/id_rsa_jasmin.
debug2: key_type_from_name: unknown key type '-----BEGIN'
debug3: key_read: missing keytype
debug2: key_type_from_name: unknown key type 'Proc-Type:'
debug3: key_read: missing keytype
debug2: key_type_from_name: unknown key type 'DEK-Info:'
debug3: key_read: missing keytype
debug3: key_read: missing whitespace
debug3: key_read: missing whitespace
debug3: key_read: missing whitespace
debug3: key_read: missing whitespace
debug3: key_read: missing whitespace
debug3: key_read: missing whitespace
debug3: key_read: missing whitespace
debug3: key_read: missing whitespace
debug3: key_read: missing whitespace
debug3: key_read: missing whitespace
debug3: key_read: missing whitespace
debug3: key_read: missing whitespace
debug3: key_read: missing whitespace
debug3: key_read: missing whitespace
debug3: key_read: missing whitespace
debug3: key_read: missing whitespace
debug3: key_read: missing whitespace
debug3: key_read: missing whitespace
debug3: key_read: missing whitespace
debug3: key_read: missing whitespace
debug3: key_read: missing whitespace
debug3: key_read: missing whitespace
debug3: key_read: missing whitespace
debug3: key_read: missing whitespace
debug3: key_read: missing whitespace
debug2: key_type_from_name: unknown key type '-----END'
debug3: key_read: missing keytype
debug1: identity file /nerc/n02/n02/lre/.ssh/id_rsa_jasmin type 1
debug1: identity file /nerc/n02/n02/lre/.ssh/id_rsa_jasmin-cert type -1
debug1: Remote protocol version 2.0, remote software version OpenSSH_5.3
debug1: match: OpenSSH_5.3 pat OpenSSH*
debug1: Enabling compatibility mode for protocol 2.0
debug1: Local version string SSH-2.0-OpenSSH_5.3
debug2: fd 3 setting O_NONBLOCK
debug1: SSH2_MSG_KEXINIT sent
debug3: Wrote 864 bytes for a total of 885
debug1: SSH2_MSG_KEXINIT received
debug2: kex_parse_kexinit: diffie-hellman-group-exchange-sha256,diffie-hellman-group-exchange-sha1,diffie-hellman-group14-sha1,diffie-hellman-group1-sha1
debug2: kex_parse_kexinit: ssh-rsa-cert-v01@…,ssh-dss-cert-v01@…,ssh-rsa-cert-v00@…,ssh-dss-cert-v00@…,ssh-rsa,ssh-dss
debug2: kex_parse_kexinit: aes128-ctr,aes192-ctr,aes256-ctr,aes128-cbc,3des-cbc,blowfish-cbc,cast128-cbc,aes192-cbc,aes256-cbc,rijndael-cbc@…
debug2: kex_parse_kexinit: aes128-ctr,aes192-ctr,aes256-ctr,aes128-cbc,3des-cbc,blowfish-cbc,cast128-cbc,aes192-cbc,aes256-cbc,rijndael-cbc@…
debug2: kex_parse_kexinit: hmac-sha1,umac-64@…,hmac-sha2-256,hmac-sha2-512,hmac-ripemd160,hmac-ripemd160@…,hmac-sha1-96
debug2: kex_parse_kexinit: hmac-sha1,umac-64@…,hmac-sha2-256,hmac-sha2-512,hmac-ripemd160,hmac-ripemd160@…,hmac-sha1-96
debug2: kex_parse_kexinit: none,zlib@…,zlib
debug2: kex_parse_kexinit: none,zlib@…,zlib
debug2: kex_parse_kexinit:
debug2: kex_parse_kexinit:
debug2: kex_parse_kexinit: first_kex_follows 0
debug2: kex_parse_kexinit: reserved 0
debug2: kex_parse_kexinit: diffie-hellman-group-exchange-sha256,diffie-hellman-group-exchange-sha1,diffie-hellman-group14-sha1,diffie-hellman-group1-sha1
debug2: kex_parse_kexinit: ssh-rsa,ssh-dss
debug2: kex_parse_kexinit: aes128-ctr,aes192-ctr,aes256-ctr,arcfour256,arcfour128,aes128-cbc,3des-cbc,blowfish-cbc,cast128-cbc,aes192-cbc,aes256-cbc,arcfour,rijndael-cbc@…
debug2: kex_parse_kexinit: aes128-ctr,aes192-ctr,aes256-ctr,arcfour256,arcfour128,aes128-cbc,3des-cbc,blowfish-cbc,cast128-cbc,aes192-cbc,aes256-cbc,arcfour,rijndael-cbc@…
debug2: kex_parse_kexinit: hmac-md5,hmac-sha1,umac-64@…,hmac-sha2-256,hmac-sha2-512,hmac-ripemd160,hmac-ripemd160@…,hmac-sha1-96,hmac-md5-96
debug2: kex_parse_kexinit: hmac-md5,hmac-sha1,umac-64@…,hmac-sha2-256,hmac-sha2-512,hmac-ripemd160,hmac-ripemd160@…,hmac-sha1-96,hmac-md5-96
debug2: kex_parse_kexinit: none,zlib@…
debug2: kex_parse_kexinit: none,zlib@…
debug2: kex_parse_kexinit:
debug2: kex_parse_kexinit:
debug2: kex_parse_kexinit: first_kex_follows 0
debug2: kex_parse_kexinit: reserved 0
debug2: mac_setup: found hmac-sha1
debug1: kex: server->client aes128-ctr hmac-sha1 none
debug2: mac_setup: found hmac-sha1
debug1: kex: client->server aes128-ctr hmac-sha1 none
debug1: SSH2_MSG_KEX_DH_GEX_REQUEST(1024<2048<8192) sent
debug1: expecting SSH2_MSG_KEX_DH_GEX_GROUP
debug3: Wrote 24 bytes for a total of 909
debug2: dh_gen_key: priv key bits set: 150/320
debug2: bits set: 979/2048
debug1: SSH2_MSG_KEX_DH_GEX_INIT sent
debug1: expecting SSH2_MSG_KEX_DH_GEX_REPLY
debug3: Wrote 272 bytes for a total of 1181
debug3: check_host_in_hostfile: host jasmin-xfer2.ceda.ac.uk filename /nerc/n02/n02/lre/.ssh/known_hosts
debug3: check_host_in_hostfile: host jasmin-xfer2.ceda.ac.uk filename /nerc/n02/n02/lre/.ssh/known_hosts
debug3: check_host_in_hostfile: match line 4
debug3: check_host_in_hostfile: host 130.246.128.244 filename /nerc/n02/n02/lre/.ssh/known_hosts
debug3: check_host_in_hostfile: host 130.246.128.244 filename /nerc/n02/n02/lre/.ssh/known_hosts
debug3: check_host_in_hostfile: match line 4
debug1: Host 'jasmin-xfer2.ceda.ac.uk' is known and matches the RSA host key.
debug1: Found key in /nerc/n02/n02/lre/.ssh/known_hosts:4
debug2: bits set: 1003/2048
debug1: ssh_rsa_verify: signature correct
debug2: kex_derive_keys
debug2: set_newkeys: mode 1
debug1: SSH2_MSG_NEWKEYS sent
debug1: expecting SSH2_MSG_NEWKEYS
debug3: Wrote 16 bytes for a total of 1197
debug2: set_newkeys: mode 0
debug1: SSH2_MSG_NEWKEYS received
debug1: SSH2_MSG_SERVICE_REQUEST sent
debug3: Wrote 52 bytes for a total of 1249
debug2: service_accept: ssh-userauth
debug1: SSH2_MSG_SERVICE_ACCEPT received
debug2: key: /nerc/n02/n02/lre/.ssh/id_rsa_jasmin (0x2b3237e8ddd0)
debug3: Wrote 68 bytes for a total of 1317
debug3: input_userauth_banner

Access to this system is monitored and restricted to
authorised users. If you do not have authorisation
to use this system, you should not proceed beyond
this point and should disconnect immediately.

Unauthorised use could lead to prosecution.

(See also - http://www.stfc.ac.uk/aup)

debug1: Authentications that can continue: publickey,gssapi-keyex,gssapi-with-mic
debug3: start over, passed a different list publickey,gssapi-keyex,gssapi-with-mic
debug3: preferred gssapi-keyex,gssapi-with-mic,publickey,keyboard-interactive,password
debug3: authmethod_lookup gssapi-keyex
debug3: remaining preferred: gssapi-with-mic,publickey,keyboard-interactive,password
debug3: authmethod_is_enabled gssapi-keyex
debug1: Next authentication method: gssapi-keyex
debug1: No valid Key exchange context
debug2: we did not send a packet, disable method
debug3: authmethod_lookup gssapi-with-mic
debug3: remaining preferred: publickey,keyboard-interactive,password
debug3: authmethod_is_enabled gssapi-with-mic
debug1: Next authentication method: gssapi-with-mic
debug3: Trying to reverse map address 130.246.128.244.
debug1: Unspecified GSS failure. Minor code may provide more information
Credentials cache file '/tmp/krb5cc_15323' not found

debug1: Unspecified GSS failure. Minor code may provide more information
Credentials cache file '/tmp/krb5cc_15323' not found

debug2: we did not send a packet, disable method
debug3: authmethod_lookup publickey
debug3: remaining preferred: keyboard-interactive,password
debug3: authmethod_is_enabled publickey
debug1: Next authentication method: publickey
debug1: Offering public key: /nerc/n02/n02/lre/.ssh/id_rsa_jasmin
debug3: send_pubkey_test
debug2: we sent a publickey packet, wait for reply
debug3: Wrote 372 bytes for a total of 1689
debug1: Authentications that can continue: publickey,gssapi-keyex,gssapi-with-mic
debug2: we did not send a packet, disable method
debug1: No more authentication methods to try.
Permission denied (publickey,gssapi-keyex,gssapi-with-mic).

[lre@dtn02 ~]$ ssh jasmin-xfer1.ceda.ac.uk

Access to this system is monitored and restricted to
authorised users. If you do not have authorisation
to use this system, you should not proceed beyond
this point and should disconnect immediately.

Unauthorised use could lead to prosecution.

(See also - http://www.stfc.ac.uk/aup)

Last login: Fri Apr 26 14:49:44 2019 from dtn02-priv.hector.ac.uk

RAL High Performance Computing Services Group

Configured by PXE/Kickstart: 2012-08-14 15:57

Admin contact: Cristina Novales <cristina.del-cano-novales@…>

Additional information about JASMIN can be found at: http://jasmin.ac.uk

For support please contact CEDA Helpdesk: support@…
[lregayre@jasmin-xfer1 ~]$

So, access is denied for xfer2, but fine for xfer1 (after including the above "Host" entries in my RDF ~/.ssh/config file as well).
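
Most of a -vvv trace like the one above is key-exchange noise. As a rough sketch, the handful of lines that show which key was offered and how the server answered can be pulled out with a filter like this. The function name summarise_ssh_debug is hypothetical, and the patterns are simply the message texts seen in this trace plus the usual OpenSSH success message.

```shell
# Keep only the authentication-related lines from "ssh -v" output on stdin.
summarise_ssh_debug() {
    grep -E 'Offering public key|Authentications that can continue|Authentication succeeded|Permission denied'
}

# Example use (ssh writes debug output to stderr, hence 2>&1):
#   ssh -vvv jasmin-xfer2.ceda.ac.uk 2>&1 | summarise_ssh_debug
```

In the xfer2 trace this would leave the "Offering public key" line followed by the server's "Authentications that can continue" and the final "Permission denied", i.e. the key was offered and the server did not accept it.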

comment:13 in reply to: ↑ 10 Changed 2 months ago by Leighton_Regayre

Replying to grenville:

Leighton,

Have you registered to use the fast transfer service (https://help.jasmin.ac.uk/article/4414-data-transfer-hpxfer)?

Hi Grenville,

I think so. I can access xfer1, so I presume so. Is there a way for me to check?

comment:14 Changed 2 months ago by grenville

xfer1 is not one of the fast transfer machines. Sounds like you haven't.

Grenville

comment:15 follow-up: Changed 2 months ago by ros

Hi Leighton,

Log in to the JASMIN accounts portal (https://accounts.jasmin.ac.uk), click on manage services and check that you have access to the hpxfer service.

Cheers,
Ros.

comment:16 in reply to: ↑ 15 Changed 2 months ago by Leighton_Regayre

Replying to ros:

Hi Leighton,

Log in to the JASMIN accounts portal (https://accounts.jasmin.ac.uk), click on manage services and check that you have access to the hpxfer service.

Cheers,
Ros.

Hi both,

No, I haven't got access to the hpxfer service. I'll register as soon as I've worked out the ip address.

Thanks for your patience,

Leighton

comment:17 follow-up: Changed 2 months ago by ros

Hi Leighton,

You don't need to worry about the IP address as the DTN is already registered with JASMIN.

Cheers,
Ros.

comment:18 in reply to: ↑ 17 Changed 2 months ago by Leighton_Regayre

Replying to ros:

Hi Leighton,

You don't need to worry about the IP address as the DTN is already registered with JASMIN.

Cheers,
Ros.

Thanks for the tip Ros. The application page won't accept my request without a valid IP address, so I've got to enter something. Any suggestions?

Thanks,

Leighton.

comment:19 Changed 2 months ago by Leighton_Regayre

Hello,

Thanks for your help with this.

I've applied for hpxfer service access.

My mistake was in my interpretation of instruction 8 here:
http://cms.ncas.ac.uk/wiki/Docs/PostProcessingAppArcherSetup#sshdtntojasmin

where I assumed being able to log into xfer1 was sufficient. It might be useful to provide an example with instruction 8 to help others avoid this mistake.

Also, I'm not sure how I would ssh into hpxfer1/2/3?

Thanks again,

Leighton.


comment:20 Changed 6 weeks ago by grenville

  • Resolution set to answered
  • Status changed from new to closed

Leighton,

ssh jasmin-xfer[1, 2, or 3].ceda.ac.uk

(xfer1 is not a fast transfer machine)

Grenville
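
To see at a glance which of these hosts accept your key from the DTN, something along the following lines may help. It is a sketch only: probe_host and report_access are made-up names, and it assumes the host names used in this ticket. BatchMode=yes makes ssh fail rather than prompt, so a host without hpxfer access shows up as denied instead of hanging on a password prompt.

```shell
# Try a non-interactive login; succeeds only if key-based auth works.
probe_host() {
    ssh -o BatchMode=yes -o ConnectTimeout=10 "$1" true >/dev/null 2>&1
}

# Usage: report_access PROBE_COMMAND host...
report_access() {
    probe=$1
    shift
    for host in "$@"; do
        if "$probe" "$host"; then
            echo "$host: ok"
        else
            echo "$host: denied or unreachable"
        fi
    done
}

# Example use on dtn02:
#   report_access probe_host jasmin-xfer1.ceda.ac.uk jasmin-xfer2.ceda.ac.uk jasmin-xfer3.ceda.ac.uk
```

The probe is passed in as an argument so the reporting loop can be tried out with a stand-in command before pointing it at real hosts.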
