Opened 8 years ago

Closed 8 years ago

#970 closed help (fixed)

Submitting job to MONSooN using PUMA failed

Reported by: pjn Owned by: um_support
Component: MONSooN Keywords:
Cc: PUMA Platform: MONSooN
UM Version: 7.3

Description

I tried to save, process and submit a job to the IBM machine on MONSooN, but the connection always timed out when submitting. Error: puma.nerc.ac.uk is not responding. I also cannot ssh into PUMA from lander (ssh: connect to host puma.nerc.ac.uk port 22: Connection refused).

I've set up the ssh config files etc. as given here:
http://cms.ncas.ac.uk/index.php/monsoon/1471-ssh-setup

Change History (3)

comment:1 Changed 8 years ago by ros

Dear Peer,

The issue of UM jobs timing out on submission is being investigated, but it is proving very tricky to track down because it cannot be reproduced reliably; in fact, it appears to occur at random. Usually, if you just try submitting again, the job will submit fine. I've just tried submitting a job and it went through OK.

You can't ssh from lander to PUMA. You can ssh from ibm02 and postproc to PUMA.
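To tell an intermittent timeout apart from a host that simply cannot reach PUMA, a quick check with a few retries can help. The following is a minimal sketch, assuming Python 3 is available on the host you run it from; the function names `port_open` and `retry`, and the attempt count and delay, are hypothetical choices for illustration and not part of any MONSooN or PUMA tooling.

```python
import socket
import time


def port_open(host, port=22, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def retry(check, attempts=3, delay=10.0, sleep=time.sleep):
    """Call check() up to `attempts` times.

    Returns the 1-based number of the attempt that succeeded,
    or 0 if every attempt failed.
    """
    for attempt in range(1, attempts + 1):
        if check():
            return attempt
        if attempt < attempts:
            # Wait before trying again; skipped after the final attempt.
            sleep(delay)
    return 0
```

For example, `retry(lambda: port_open("puma.nerc.ac.uk"))` run from ibm02 or postproc would report which attempt, if any, reached port 22. A result of 0 from a host that should be able to connect suggests something more than a one-off timeout.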

Regards,
Ros.

comment:2 Changed 8 years ago by pjn

OK, thank you.

comment:3 Changed 8 years ago by ros

  • Resolution set to fixed
  • Status changed from new to closed

Hi Peer,

The Met Office put a fix in a couple of weeks ago which we believe has resolved the timeout problems with the MONSooN HPC/lander. If you continue to experience this problem, please do let us know. I will close this ticket now, but you can re-open it if the problem persists.

Regards,
Ros.
