Opened 21 months ago

Closed 21 months ago

Last modified 21 months ago

#2897 closed help (fixed)

suite u-be699 - coupled task fails

Reported by: xd904476 Owned by: ros
Component: UM Model Keywords:
Cc: Platform: ARCHER
UM Version:


I am having issues with my suite run again: since yesterday the coupled task for 2090 gets submitted and then fails after a short amount of run time. I am not sure why.
Any suggestions?

Change History (6)

comment:1 Changed 21 months ago by willie

Hi, what computer are you running the suite on? What error messages are you seeing?


comment:2 Changed 21 months ago by ros

  • Owner changed from um_support to ros
  • Platform set to ARCHER
  • Status changed from new to accepted

Hi Dani,

You have somehow managed to get multiple occurrences of the coupled task running for the current cycle and, unsurprisingly, they are interfering with each other. Can you please kill the currently running coupled task in the cylc GUI and then on ARCHER do:

qstat -u dflocco

and then kill the listed jobs with qdel <job id>.

Then retrigger the coupled task.
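
If qstat lists several stale jobs, the qdel step above can be wrapped in a small loop. This is only a sketch: kill_jobs is a hypothetical helper (not part of cylc or PBS), and the DRY_RUN switch is an assumption added so the commands can be inspected before anything is deleted.

```shell
#!/bin/sh
# Sketch: delete a list of job IDs one at a time with qdel.
# kill_jobs is a hypothetical helper name, not a cylc or PBS command.
# Set DRY_RUN=1 to print the qdel commands instead of running them.
kill_jobs() {
    for job_id in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            # Dry run: show what would be executed.
            echo "qdel $job_id"
        else
            # Real run: ask the batch system to delete the job.
            qdel "$job_id"
        fi
    done
}
```

For example, DRY_RUN=1 kill_jobs 111 222 prints the two qdel commands without running them (111 and 222 are made-up placeholders; use the job IDs that qstat reports).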


comment:3 Changed 21 months ago by xd904476

  • Resolution set to fixed
  • Status changed from accepted to closed

Hi Willie and Ros,
Yesterday the coupled task was showing a failure and was retrying very often. I'm not sure why it would be running multiple times. I think at some point I did set it to "ready" after a failure.

I've set it to run again now. Thanks

comment:4 Changed 21 months ago by ros

Hi Dani,

I took a look at your logs, and it looks like there was an issue with qsub on ARCHER which meant that cylc couldn't check on the status of the job and so deemed it to have failed. It then tried to resubmit the job 5 or so times until it succeeded, and so you ended up with 2 jobs running.

I've sent the details to ARCHER.


comment:5 Changed 21 months ago by ros

CMS note for completeness: Cray identified that this problem was due to a faulty blade, which has now been resolved.

comment:6 Changed 21 months ago by xd904476

Thanks for the follow up.
