Opened 8 weeks ago

Closed 8 weeks ago

#3243 closed help (fixed)

Error submitting jobs to archer

Reported by: emnicki
Owned by: um_support
Component: UM Model
Keywords:
Cc:
Platform: PUMA
UM Version: 8.4

Description

Dear CMS,

When attempting to submit a job using the UMUI I get the following error:
ERROR: Timed out, login.archer.ac.uk not responding while attempting to access account emnicki on host login.archer.ac.uk. Note that repeated failures may result in expiry of password due to security procedures on some machines. Check user id, hostname and password for your account on the host machine.

The error seems to be intermittent: on one occasion the job submitted successfully, but the next attempt failed again.

Would you have any ideas how I can resolve this?

Thanks,

Nicola

Change History (10)

comment:1 Changed 8 weeks ago by dcase

Can you ssh to archer without any password issues (is ssh-add -l showing your id)?

If the connection is ok, then please give me the suite id and I can try to read the files.
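For scripting this check, the listing printed by ssh-add -l can be parsed to confirm that a key is loaded in the agent. The following is a minimal sketch, assuming Python 3 and the usual one-line-per-key output ("<bits> <fingerprint> <comment> (<TYPE>)"); the function name parse_ssh_add_listing is hypothetical and the fingerprint in the usage note is a made-up placeholder.

```python
import re

# One line of `ssh-add -l` output: "<bits> <fingerprint> <comment> (<TYPE>)".
# The comment is usually the key file path and may contain spaces.
_KEY_LINE = re.compile(r"^(\d+)\s+(\S+)\s+(.*)\s+\((\S+)\)$")


def parse_ssh_add_listing(text):
    """Return one dict per key found in the text printed by `ssh-add -l`.

    Lines that do not look like a key entry (for example the message
    printed when the agent holds no identities) are ignored, so an
    empty list means no key was found.
    """
    keys = []
    for line in text.splitlines():
        match = _KEY_LINE.match(line.strip())
        if match:
            bits, fingerprint, comment, key_type = match.groups()
            keys.append({
                "bits": int(bits),
                "fingerprint": fingerprint,
                "comment": comment,
                "type": key_type,
            })
    return keys
```

Passing it a line such as "2048 SHA256:EXAMPLE /home/emnicki/.ssh/id_rsa (RSA)" should yield a single entry whose comment is the key path; an empty result suggests the agent has no identity loaded.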

comment:2 Changed 8 weeks ago by emnicki

Yes, I can ssh to archer without any issue.
If I type ssh-add -l, I get a list of letters/numbers, followed by: /home/emnicki/.ssh/id_rsa (RSA)
My UMUI job ids are xnbvf, xolbe. I've been running both jobs for a while and I've not had any trouble before today (no changes to anything other than length of run).
I've just noticed that I was over quota in my home directory on archer, but reducing that doesn't seem to have fixed the issue.

Thanks,
Nicola

comment:3 Changed 8 weeks ago by dcase

It does sound like something's at fault at either PUMA or ARCHER. We are aware of issues at the PUMA end, and my colleague emailed ARCHER yesterday, so I'll let you know if there are announcements.

Hopefully it starts working as it did previously. I'm sorry I can't diagnose anything more precisely at this moment, but I'll keep an eye on this ticket.

comment:4 Changed 8 weeks ago by dcase

As a minor update: there may be an issue with connections from Reading to ARCHER generally. Anecdotally, using login1.archer.ac.uk as the host is working for someone, so it may be worth changing your host explicitly to this node.
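To compare the two host names outside the UMUI, a quick TCP check on the SSH port can show whether one of them times out. This is only a rough sketch, assuming Python 3 is available on the submitting machine; the helper name ssh_port_reachable, port 22 and the 10-second timeout are illustrative choices, not anything the UMUI itself does.

```python
import socket


def ssh_port_reachable(host, port=22, timeout=10.0,
                       connect=socket.create_connection):
    """Return True if a TCP connection to host:port opens within timeout.

    `connect` is injectable so the function can be exercised without a
    network; by default it is socket.create_connection.
    """
    try:
        # Open the connection and close it again straight away.
        with connect((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and name-resolution errors.
        return False


# Example use (performs real network connections):
# for host in ("login.archer.ac.uk", "login1.archer.ac.uk"):
#     print(host, ssh_port_reachable(host))
```

A False result for one name and True for the other would point at the connection rather than at the job setup. Note that this only tests that the port opens; it does not test authentication.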

comment:5 Changed 8 weeks ago by emnicki

As an update, I am still unable to submit anything from puma to archer today. I tried changing the host name to login1.archer.ac.uk in Model selection>User Information and Submit Method>Job submission details, but I then get the error message below on processing. Is there somewhere else I also need to change the host name?

Thanks,
Nicola

ERROR: Execution error in processing file nds_prog_env, line 7: can't read "nm_method": no such variable
STACK TRACE:
can't read "nm_method": no such variable
    while executing
"putl "# Load default programming environment for $nm_method""
    (procedure "run_processing" line 38069)
    invoked from within
"run_processing"

comment:6 Changed 8 weeks ago by dcase

Ah, maybe it's not possible with the UMUI; sorry for the false lead.

There is a certain amount going on behind the scenes on this, so maybe we all have to wait? I'll let you know if I find out more.

comment:7 Changed 8 weeks ago by emnicki

Ok, thanks.

comment:8 Changed 8 weeks ago by andy

Hi Nicola,

Can you try again with the original login.archer.ac.uk? We hope this was caused by an incorrect reverse DNS entry for PUMA, which has now been fixed.
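A reverse DNS entry can be sanity-checked by looking up the name recorded for an IP address and confirming that the name resolves back to the same address. The sketch below assumes Python 3; the function name reverse_dns_consistent is hypothetical, and the address to test would be that of the submitting host, which is not given in this ticket.

```python
import socket


def reverse_dns_consistent(ip, reverse_lookup=socket.gethostbyaddr,
                           forward_lookup=socket.gethostbyname_ex):
    """Return (ptr_name, ok) for the given IPv4 address.

    ptr_name is the host name from the reverse (PTR) lookup, or None if
    the lookup fails. ok is True only when that name resolves forward
    to a list of addresses that includes `ip`.
    The lookup functions are injectable so the logic can be tested
    without touching real DNS.
    """
    try:
        ptr_name, _aliases, _addresses = reverse_lookup(ip)
    except OSError:
        return None, False
    try:
        _name, _aliases, addresses = forward_lookup(ptr_name)
    except OSError:
        return ptr_name, False
    return ptr_name, ip in addresses
```

A result with ok set to False means the reverse and forward records disagree, or that one of the lookups failed. That kind of mismatch can lead a remote SSH server to delay or reject connections, depending on how it is configured.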

Thanks
Andy

comment:9 Changed 8 weeks ago by emnicki

Hi Andy,
I've just tried and it worked first time, so hopefully that was the problem!

Thanks,
Nicola

comment:10 Changed 8 weeks ago by andy

  • Resolution set to fixed
  • Status changed from new to closed

Hi Nicola,

Great! I'll close the ticket, but do get back to us if it recurs.

Thanks
Andy
