Opened 5 years ago

Closed 5 years ago

Last modified 5 years ago

#1337 closed help (answered)

um6.1 hanging on hector

Reported by: jonny Owned by: um_support
Component: UM Model Keywords:
Cc: Platform: ARCHER
UM Version: 6.1

Description

Hi,
I have a UM job (xjurb) which runs for about a model month, then stops producing output until it hits its walltime limit. The .leave file is here, but is not particularly informative:

/home/n02/n02/jonny/um/umui_out/xjurb000.xjurb.d14216.t150623.leave

This long control run was running fine, as a crun, until Jul 30th.

Thanks
Jonny

Change History (3)

comment:1 Changed 5 years ago by willie

Hi Jonny,

I think the issue is that the last four CRUNs have all hit the 6 hr queue limit and been terminated. The rest of the runs have all taken less than 5.5 hrs. If you could reduce the amount of work in each CRUN so that it fits comfortably within the 6 hr limit, then it should work.

Regards

Willie

comment:2 Changed 5 years ago by grenville

  • Resolution set to answered
  • Status changed from new to closed

Since there has been no activity, we've closed the ticket.

comment:3 Changed 5 years ago by jonny

Hi Willie,
To clarify, I am talking about recent nruns with the same ID, and I am running one currently. According to the queue on ARCHER, the job is running, but it has not produced any output in the last 2 hours. That's what I mean by it "hanging". I wouldn't expect this behaviour if the job were simply running out of wall time: in that case, I would expect it to run as normal, producing output until it ran out of time.
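
One way to make the "no output in the last 2 hours" check concrete is to look at the modification time of the newest file under the job's output directory. The following is only a minimal Python sketch, not part of the UM or ARCHER tooling: the function names are hypothetical, the directory to inspect is whatever the job writes to, and the 2-hour default simply mirrors the figure mentioned above.

```python
import os
import time


def seconds_since_last_output(directory, now=None):
    """Return seconds since the newest file under `directory` was modified.

    Returns None if no files are found. `now` can be passed explicitly
    (as a Unix timestamp) to make the check reproducible.
    """
    now = time.time() if now is None else now
    newest = None
    for root, _dirs, files in os.walk(directory):
        for name in files:
            try:
                mtime = os.path.getmtime(os.path.join(root, name))
            except OSError:
                continue  # file disappeared between listing and stat
            if newest is None or mtime > newest:
                newest = mtime
    return None if newest is None else now - newest


def looks_stalled(directory, threshold_hours=2.0, now=None):
    """True if files exist but none changed within `threshold_hours`."""
    age = seconds_since_last_output(directory, now=now)
    return age is not None and age > threshold_hours * 3600
```

If the queue reports the job as running while `looks_stalled` returns True for its output directory, that would match the "hanging" described here rather than a job that keeps writing output until the wall-time limit. A threshold longer than the normal gap between output dumps would be needed to avoid false alarms.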

Any thoughts?

Cheers,
Jonny
