Opened 6 weeks ago

Closed 4 weeks ago

#3065 closed error (fixed)

No hosts selected on archer

Reported by: luciad
Owned by: um_support
Component: PUMA
Keywords: hosts unknown
Cc:
Platform: ARCHER
UM Version: 11.1
Description

Hello,

Recently my suites were transferred from PUMA to pumatest, and now I can't run them. I have followed the instructions at http://cms.ncas.ac.uk/wiki/RoseCylc/Hints to set up the known hosts for ssh, and everything looks good until I run the model. This is the error:
[INFO] REGISTERED u-bo228 → /home/luciad/cylc-run/u-bo228
[FAIL] bash -ec H=$(rose\ host-select\ archer);\ echo\ $H # return-code=1, stderr=
[FAIL] [WARN] login2.archer.ac.uk: (ssh failed)
[FAIL] [WARN] login1.archer.ac.uk: (ssh failed)
[FAIL] [WARN] login3.archer.ac.uk: (ssh failed)
[FAIL] [WARN] login5.archer.ac.uk: (ssh failed)
[FAIL] [WARN] login8.archer.ac.uk: (ssh failed)
[FAIL] [WARN] login4.archer.ac.uk: (ssh failed)
[FAIL] [WARN] login6.archer.ac.uk: (ssh failed)
[FAIL] [WARN] login7.archer.ac.uk: (ssh failed)
[FAIL] [WARN] login.archer.ac.uk: (ssh failed)
[FAIL] [FAIL] No hosts selected.

Could you please advise on this matter?

Thank you,
Lucia

Change History (9)

comment:1 Changed 6 weeks ago by luciad

It's what I have been trying to say: I followed those instructions and it didn't work.
But I just received Grenville's email regarding the number of files for n02 users, and I assume that could be the issue?

Lucia

comment:2 Changed 6 weeks ago by dcase

Can you just confirm that if you type:

rose config rose-host-select group{archer}

you get a list of nodes back. And if you type rose host-select archer you get one of those nodes back?
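
If it helps, the two checks can be combined in a small bash sketch. The function name host_in_group is made up for illustration and is not part of rose; it only compares the selected host against the list of configured hosts.

```shell
#!/usr/bin/env bash
# Sketch only: host_in_group is a hypothetical helper, not a rose command.
# Usage: host_in_group SELECTED HOST [HOST...]
# Returns 0 if SELECTED is one of the HOST arguments, 1 otherwise.
host_in_group() {
    local selected=$1 host
    shift
    for host in "$@"; do
        # Exact string match against each configured host name.
        [ "$host" = "$selected" ] && return 0
    done
    return 1
}
```

It could be called as host_in_group "$(rose host-select archer)" $(rose config rose-host-select group{archer}), leaving the second substitution unquoted so the space-separated list splits into one argument per host.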

comment:3 Changed 6 weeks ago by luciad

This is what I get when I type those commands.
So the list of nodes is there, but none is available.

-bash-4.1$ rose config rose-host-select group{archer}
-bash: {{{rose: command not found
-bash-4.1$ rose config rose-host-select group{archer}
login1.archer.ac.uk login2.archer.ac.uk login3.archer.ac.uk login4.archer.ac.uk login5.archer.ac.uk login6.archer.ac.uk login7.archer.ac.uk login8.archer.ac.uk login.archer.ac.uk
-bash-4.1$ rose host-select archer
[WARN] login3.archer.ac.uk: (ssh failed)
[WARN] login2.archer.ac.uk: (ssh failed)
[WARN] login4.archer.ac.uk: (ssh failed)
[WARN] login8.archer.ac.uk: (ssh failed)
[WARN] login5.archer.ac.uk: (ssh failed)
[WARN] login6.archer.ac.uk: (ssh failed)
[WARN] login.archer.ac.uk: (ssh failed)
[WARN] login7.archer.ac.uk: (ssh failed)
[WARN] login1.archer.ac.uk: (ssh failed)
[FAIL] No hosts selected.

Lucia

comment:4 Changed 6 weeks ago by dcase

In the setup notes that you mentioned, there's the command:

~um/um-training/setup-archer-hosts

Could you retry this, and see if the

rose host-select archer

command works now?

comment:5 Changed 6 weeks ago by luciad

So, it connects, but then fails to recognize any of the hosts.

-bash-4.1$ ~um/um-training/setup-archer-hosts
Connecting to ARCHER hosts…
Enter passphrase for key '/home/luciad/.ssh/id_rsa':
Connected to login1.archer.ac.uk
Enter passphrase for key '/home/luciad/.ssh/id_rsa':
Connected to login2.archer.ac.uk
Enter passphrase for key '/home/luciad/.ssh/id_rsa':
Connected to login3.archer.ac.uk
Enter passphrase for key '/home/luciad/.ssh/id_rsa':
Connected to login4.archer.ac.uk
Enter passphrase for key '/home/luciad/.ssh/id_rsa':
Connected to login5.archer.ac.uk
Enter passphrase for key '/home/luciad/.ssh/id_rsa':
Connected to login6.archer.ac.uk
Enter passphrase for key '/home/luciad/.ssh/id_rsa':
Connected to login7.archer.ac.uk
Enter passphrase for key '/home/luciad/.ssh/id_rsa':
Connected to login8.archer.ac.uk
Enter passphrase for key '/home/luciad/.ssh/id_rsa':
Connected to login.archer.ac.uk
-bash-4.1$
-bash-4.1$
-bash-4.1$ rose host-select archer
[WARN] login1.archer.ac.uk: (ssh failed)
[WARN] login4.archer.ac.uk: (ssh failed)
[WARN] login5.archer.ac.uk: (ssh failed)
[WARN] login6.archer.ac.uk: (ssh failed)
[WARN] login2.archer.ac.uk: (ssh failed)
[WARN] login3.archer.ac.uk: (ssh failed)
[WARN] login.archer.ac.uk: (ssh failed)
[WARN] login8.archer.ac.uk: (ssh failed)
[WARN] login7.archer.ac.uk: (ssh failed)
[FAIL] No hosts selected.

comment:6 Changed 6 weeks ago by dcase

And if you type ssh-add -l, do you see the .ssh/id_rsa key (the ARCHER one)?
And if you pick one of those hosts yourself, e.g. ssh login2.archer.ac.uk, do you get into ARCHER with no need for a password and no comments from ssh?
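
A minimal bash sketch of those two checks, assuming the hosts need to be reachable without any passphrase or password prompt. The function names agent_status and check_hosts and the SSH_BIN variable are made up for illustration; the ssh options BatchMode and ConnectTimeout and the ssh-add exit codes are the standard OpenSSH ones.

```shell
#!/usr/bin/env bash
# Sketch only: agent_status, check_hosts and SSH_BIN are hypothetical names.
# SSH_BIN lets the ssh client be swapped for a stub when trying this out.
SSH_BIN=${SSH_BIN:-ssh}

# Report whether an ssh-agent is reachable and holds any keys.
# ssh-add -l exits 0 if keys are listed, 1 if the agent has none,
# and 2 if no agent can be contacted.
agent_status() {
    ssh-add -l >/dev/null 2>&1
    case $? in
        0) echo "agent has keys loaded" ;;
        1) echo "agent is running but has no keys" ;;
        *) echo "no agent could be contacted" ;;
    esac
}

# Try a non-interactive login to each host given as an argument.
# Returns 1 if any host fails, 0 if all succeed.
check_hosts() {
    local host failed=0
    for host in "$@"; do
        # BatchMode=yes makes ssh fail rather than prompt for a
        # passphrase or password; -n stops it reading from stdin.
        if "$SSH_BIN" -n -o BatchMode=yes -o ConnectTimeout=10 "$host" true; then
            echo "OK   $host"
        else
            echo "FAIL $host"
            failed=1
        fi
    done
    return "$failed"
}
```

Running agent_status and then check_hosts login1.archer.ac.uk login2.archer.ac.uk would show which hosts accept a login without prompting. A host that connects interactively but reports FAIL here may point to the key not being loaded into the agent.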

Sorry, I'm about to leave. I'll check tomorrow morning if you haven't debugged it by then.

Dave

comment:7 Changed 4 weeks ago by dcase

Lucia, you currently have 2 tickets open relating to ssh issues. Is this one solved? Can you run the commands that I suggested above satisfactorily?

If so I will close the ticket, but I will continue to help you with your other ticket.

comment:8 Changed 4 weeks ago by luciad

Hi,

Yes, this was solved a while ago. Thank you.
You can close this ticket.

Lucia

comment:9 Changed 4 weeks ago by dcase

  • Resolution set to fixed
  • Status changed from new to closed