Opened 6 years ago
Closed 6 years ago
#1435 closed error (answered)
weird glitch in submitting jobs to ARCHER
Reported by: | swr04ojb | Owned by: | annette |
---|---|---|---|
Component: | UM Model | Keywords: | puma archer ssh |
Cc: | | Platform: | ARCHER |
UM Version: | 8.5 | | |
Description
I'm finding today that when I submit jobs to ARCHER, the job will sometimes fail. The failure happens at a point where it is trying to ssh something to ARCHER, but exactly which point seems to vary. If, upon encountering the error, I do nothing other than close the information window (by hitting OK) and press submit again, then sometimes it fails with the same type of error but at a different point (e.g. doing an ssh for UMRECON rather than for UMSCRIPTS), and sometimes it goes through. It typically doesn't seem to take more than 3 attempts to work, so it's not crippling, but it is a bit weird. Is this something to do with my passwordless ssh setup, something on PUMA, something on ARCHER, or something else entirely?
For info I've been mainly submitting xini#f and xini#g today.
Change History (4)
comment:1 Changed 6 years ago by swr04ojb
comment:2 Changed 6 years ago by annette
- Keywords puma archer ssh added
Oliver,
Sorry not to post a response sooner. We have seen this issue as well, so I don't /think/ it's to do with your setup. The problem is intermittent and non-repeatable, so it is difficult to track down. We have done some investigation but are still not quite sure why it happens.
We are monitoring the problem, however, and at some point plan to upgrade PUMA, which may fix things.
Annette
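Because the failures are intermittent, one way to put a number on them is to run the same non-interactive command several times and count how often it fails. The following is only a rough sketch, not part of the UM submission scripts: `probe` and `summarise` are hypothetical helper names, and the attempt count and pause are arbitrary choices.

```python
import subprocess
import time


def probe(command, attempts=10, pause=1.0):
    """Run `command` `attempts` times and return the list of exit codes."""
    codes = []
    for i in range(attempts):
        # Discard output; only the exit status matters for this check.
        result = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        codes.append(result.returncode)
        if i < attempts - 1:
            time.sleep(pause)  # small gap between attempts
    return codes


def summarise(codes):
    """Return (number of non-zero exit codes, total attempts)."""
    failures = sum(1 for code in codes if code != 0)
    return failures, len(codes)
```

As a usage example, run from PUMA with the command `["ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=10", "user@login.archer.ac.uk", "true"]`, where the user and host are placeholders to be replaced with your own. `BatchMode=yes` makes ssh fail instead of prompting for a password, and ssh itself exits with status 255 when the connection fails. If every attempt fails, the passwordless setup is the more likely suspect. If only some attempts fail, that matches the intermittent behaviour described in this ticket.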
comment:3 Changed 6 years ago by annette
- Owner changed from um_support to annette
- Status changed from new to assigned
comment:4 Changed 6 years ago by annette
- Resolution set to answered
- Status changed from assigned to closed
Encountering the same problem again today. It's a new PUMA session, same job from yesterday (xini#f) and same problem - sometimes submit works, sometimes it doesn't.