Opened 3 years ago

Closed 3 years ago

#2383 closed help (fixed)

Can't submit job from Puma through Rose

Reported by: apm
Owned by: um_support
Component: UM Model
Keywords: GO6, postproc
Cc:
Platform: ARCHER
UM Version:


I have been unable to submit jobs on Archer from Puma since the start of this week.

When I click the submit button on the Rose edit screen for either of my suites, u-ao882 or u-ao922, the cylc window opens and, a moment later, both the fcm_make_ocean and fcm_make_pp boxes turn magenta with the status "submit-failed". If I click on View Job Activity Log, I get this:

[job-submit cmd] cylc jobs-submit -- /home/apm/cylc-run/u-ao882/log/job 19920101T0000Z/fcm_make_ocean/01 19920101T0000Z/fcm_make_pp/01
[job-submit ret_code] 1
[job-submit out] 2018-02-02T14:55:08Z|19920101T0000Z/fcm_make_ocean/01|1|None
2018-02-02T14:55:08Z [STDERR] warning: commands will be executed using /bin/sh
2018-02-02T14:55:08Z [STDERR] job 9029 at 2018-02-02 14:55
2018-02-02T14:55:08Z [STDERR] Warning: at daemon not running
[(('event-handler-00', 'submission failed'), 1) cmd] rose suite-hook 'submission failed' 'u-ao882' 'fcm_make_ocean.19920101T0000Z' 'job submission failed'
[(('event-handler-00', 'submission failed'), 1) ret_code] 0

What is going on? What does "at daemon not running" mean? This is a new failure mode to me.
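
For readers who hit the same symptom, the log above can be triaged mechanically. The following is a minimal sketch, not part of cylc or Rose: the function names summarize_submit_failure and at_daemon_down are hypothetical, and the line patterns are assumed only from the excerpt quoted in this ticket. It pulls out the job-submit return code and the STDERR messages, then checks for the "at daemon not running" warning.

```python
import re

# Excerpt copied from the job activity log quoted in this ticket.
SAMPLE_LOG = """\
[job-submit ret_code] 1
[job-submit out] 2018-02-02T14:55:08Z|19920101T0000Z/fcm_make_ocean/01|1|None
2018-02-02T14:55:08Z [STDERR] warning: commands will be executed using /bin/sh
2018-02-02T14:55:08Z [STDERR] job 9029 at 2018-02-02 14:55
2018-02-02T14:55:08Z [STDERR] Warning: at daemon not running
"""


def summarize_submit_failure(log_text):
    """Return (ret_code, stderr_messages) parsed from a job activity log.

    Hypothetical helper: the patterns are assumed from the excerpt above,
    not taken from any cylc documentation.
    """
    ret_code = None
    stderr = []
    for line in log_text.splitlines():
        match = re.match(r"\[job-submit ret_code\]\s+(-?\d+)", line)
        if match:
            ret_code = int(match.group(1))
            continue
        match = re.search(r"\[STDERR\]\s+(.*)", line)
        if match:
            stderr.append(match.group(1))
    return ret_code, stderr


def at_daemon_down(log_text):
    """True if submission failed and STDERR reports that the at daemon is down."""
    ret_code, stderr = summarize_submit_failure(log_text)
    failed = ret_code is not None and ret_code != 0
    return failed and any("at daemon not running" in msg.lower() for msg in stderr)


if __name__ == "__main__":
    print(summarize_submit_failure(SAMPLE_LOG))
    print(at_daemon_down(SAMPLE_LOG))
```

If at_daemon_down reports a match, the problem lies with the at service on the submitting host rather than with the suite itself.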



Change History (7)

comment:1 Changed 3 years ago by andy

  • Resolution set to fixed
  • Status changed from new to closed

Hi Alex,

The at daemon had died on puma. I've restarted it so you should be able to submit jobs again. I'll close the ticket on the assumption that this is now fixed but please reopen if it still doesn't work.


comment:2 Changed 3 years ago by apm

Thanks, Andy.

I'm afraid I still get exactly the same error, even after closing Rose, logging out of Puma and starting over again. It still claims the at daemon is not running.


comment:3 Changed 3 years ago by jeff

Hi Alex

See ticket #2377 for a way to fix this.


comment:4 Changed 3 years ago by apm

Hi Jeff,

Thanks for that. I replaced "at" with "background" in the suite.rc files, and as a result the compile job no longer fails immediately.

Now fcm_make_pp fails with this error:

[FAIL] config-file=/home/apm/cylc-run/u-ao922/work/20060101T0000Z/fcm_make_pp/fcm-make.cfg:4
[FAIL] config-file= - svn://puma/moci.xm_svn/main/branches/dev/rosalynhatcher/postproc_2.0_pp_with_rose_bunch/Postprocessing/fcm_make/postproc.cfg@HEAD
[FAIL] svn://puma/moci.xm_svn/main/branches/dev/rosalynhatcher/postproc_2.0_pp_with_rose_bunch/Postprocessing/fcm_make/postproc.cfg@HEAD: cannot load config file
[FAIL] svn://puma/moci.xm_svn/main/branches/dev/rosalynhatcher/postproc_2.0_pp_with_rose_bunch/Postprocessing/fcm_make/postproc.cfg@HEAD: not found
[FAIL] svn: warning: W170000: URL 'svn://puma/moci.xm_svn/main/branches/dev/rosalynhatcher/postproc_2.0_pp_with_rose_bunch/Postprocessing/fcm_make/postproc.cfg' non-existent in revision 2450
[FAIL] svn: E200009: Could not display info for all targets because some targets don't exist

[FAIL] fcm make -f /home/apm/cylc-run/u-ao922/work/20060101T0000Z/fcm_make_pp/fcm-make.cfg -C /home/apm/cylc-run/u-ao922/share/fcm_make_pp -j 4…:cylc-run/u-ao922/share/fcm_make_pp mirror.prop{}=2 # return-code=1
Received signal ERR


comment:5 Changed 3 years ago by ros

  • Resolution fixed deleted
  • Status changed from closed to reopened

Hi Alex,

Sorry, I deleted that branch the other day. You can still use it, but you will need to specify the revision number 1859.

In the fcm_make_pp → Configuration panel, set config_rev to @1859, and also append @1859 to the branch in pp_sources.
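
To illustrate what pinning the revision does to the location, here is a minimal sketch. It assumes the usual Subversion/FCM "@REV" suffix and uses the config file URL from the error log above as its example. The helper name pin_revision is hypothetical, and the exact form of the value that pp_sources expects in the suite is not shown in this ticket.

```python
def pin_revision(url, revision):
    """Return url with an explicit @revision suffix.

    Any existing suffix on the last path component (for example @HEAD) is
    replaced. An "@" earlier in the URL, such as user@host, is left alone.
    Hypothetical helper for illustration only.
    """
    head, slash, tail = url.rpartition("/")
    name = tail.split("@", 1)[0]
    return f"{head}{slash}{name}@{revision}"


if __name__ == "__main__":
    # URL taken from the [FAIL] lines in comment 4; 1859 is the revision
    # given in this comment.
    failing = (
        "svn://puma/moci.xm_svn/main/branches/dev/rosalynhatcher/"
        "postproc_2.0_pp_with_rose_bunch/Postprocessing/fcm_make/postproc.cfg@HEAD"
    )
    print(pin_revision(failing, 1859))
```

Because the branch was deleted, @HEAD no longer resolves (the log reports the path as non-existent in revision 2450), whereas a revision from before the deletion still does.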


comment:6 Changed 3 years ago by apm

Thanks, Ros - that's fixed it.



comment:7 Changed 3 years ago by ros

  • Keywords GO6, postproc added
  • Platform set to ARCHER
  • Resolution set to fixed
  • Status changed from reopened to closed
  • UM Version <select version> deleted