Opened 22 months ago
Closed 22 months ago
#2848 closed help (fixed)
Job stuck at 'submitted' and having problems at 'postproc': u-bh050
Reported by: | cwc46 | Owned by: | um_support |
---|---|---|---|
Component: | UM Model | Keywords: | |
Cc: | | Platform: | ARCHER |
UM Version: | 11.0 | | |
Description
Dear Helpdesk,
I am having trouble with one of my runs: u-bh050 is currently stuck at 'submitted', and I sometimes get an error at 'postproc' when I kill the job and try to restart.
I have been unable to solve the problem. In particular, I am wondering why it is not creating the required directory on my ARCHER /work, which is probably what causes it to fail at 'postproc'.
Thank you for your help!
Glen
Change History (4)
comment:1 Changed 22 months ago by ros
comment:2 Changed 22 months ago by ros
Hi Glen,
Overnight your suite has progressed fine: the 6103953.sdb atmos_main job finished, postproc ran successfully, and the next 3 cycles have run.
Cheers,
Ros.
comment:3 Changed 22 months ago by cwc46
Dear Ros,
Ah, thank you very much for your help!
Best wishes,
Glen
comment:4 Changed 22 months ago by grenville
- Resolution set to fixed
- Status changed from new to closed
Hi Glen,
I don't know what has happened, but you have 7 atmos_main jobs for u-bh050 sitting in the queue on ARCHER (qstat -u glenchua).
It looks like you have been retriggering the atmos_main tasks and thus submitting multiple instances. So postproc is failing because the atmos_main task for that cycle has not run yet; it is still in the 'submitted' state.
I would suggest stopping your suite (choosing the option to kill all running tasks). Then on ARCHER use qdel (e.g. qdel 6103953.sdb) to kill all your queuing jobs in the list above.
Then restart the suite, retrigger the atmos_main task in the 19910801 cycle and change the status of the associated postproc task to waiting. That should get it back on track.
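If there are many duplicate jobs, the qstat/qdel step could be scripted along these lines. This is only a rough sketch: the helper name list_job_ids is made up for illustration, and it assumes the qstat -u listing has the job ID (of the form 6103953.sdb) in the first column and the job name in the fourth, which may not match your output exactly. It only prints the IDs, so you can check them before deleting anything.

```shell
# Hypothetical helper: read qstat-style output on stdin and print the IDs of
# jobs whose name (assumed to be column 4) matches the pattern given as $1.
# Only lines whose first column looks like "<digits>.sdb" are considered,
# so header and separator lines are skipped.
list_job_ids() {
    awk -v pat="$1" '$1 ~ /^[0-9]+\.sdb$/ && $4 ~ pat { print $1 }'
}

# Dry run on ARCHER (prints the qdel commands rather than running them):
#   qstat -u "$USER" | list_job_ids atmos_main | sed 's/^/qdel /'
```

Once the printed list looks right, each ID can be passed to qdel as described above.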
Regards,
Ros.