Opened 5 months ago

Closed 5 months ago

#2806 closed help (fixed)

Suite will not run - empty gcylc window

Reported by: shakka
Owned by: ros
Component: UM Model
Keywords: submission, cylc
Cc:
Platform: Monsoon2
UM Version: 11.1

Description

Hi helpdesk,

I am trying to re-run a suite that has previously run fine, u-ba502. When I submit the run, either via the GUI or from the command line, I get the usual output messages, but then an empty gcylc window pops up with the message 'Stopped with queued' at the bottom. I am currently able to run other suites, so I am fairly certain it is something to do with that particular suite. Do you have any ideas what is going on?

Best,
Ella

Change History (3)

comment:1 Changed 5 months ago by ros

  • Owner changed from um_support to ros
  • Status changed from new to accepted

The log/suite/log file shows an error accessing a temporary directory, which is causing the suite to shut down.

2019-03-11T11:18:39Z ERROR - [Errno 2] No such file or directory: '/working/d04/elgil/jtmp/tmp.FynWZnXoSv/tmpFexvGw'
        Traceback (most recent call last):
          File "/common/fcm/cylc-7.8.1/lib/cylc/scheduler.py", line 261, in start
            self.run()
          File "/common/fcm/cylc-7.8.1/lib/cylc/scheduler.py", line 1664, in run
            self.process_task_pool()
          File "/common/fcm/cylc-7.8.1/lib/cylc/scheduler.py", line 1283, in process_task_pool
            self.suite, itasks, self.run_mode == 'simulation')
          File "/common/fcm/cylc-7.8.1/lib/cylc/task_job_mgr.py", line 214, in submit_task_jobs
            is_init = self.task_remote_mgr.remote_init(host, owner)
          File "/common/fcm/cylc-7.8.1/lib/cylc/task_remote_mgr.py", line 200, in remote_init
            tmphandle = NamedTemporaryFile()
          File "/usr/lib64/python2.6/tempfile.py", line 444, in NamedTemporaryFile
            (fd, name) = _mkstemp_inner(dir, prefix, suffix, flags)
          File "/usr/lib64/python2.6/tempfile.py", line 228, in _mkstemp_inner
            fd = _os.open(file, flags, 0600)
        OSError: [Errno 2] No such file or directory: '/working/d04/elgil/jtmp/tmp.FynWZnXoSv/tmpFexvGw'
2019-03-11T11:18:39Z ERROR - error caught: cleaning up before exit
2019-03-11T11:18:39Z INFO - Suite shutting down - ERROR: [Errno 2] No such file or directory: '/working/d04/elgil/jtmp/tmp.FynWZnXoSv/tmpFexvGw'
2019-03-11T11:18:42Z INFO - [('suite-event-handler-00', 'shutdown') cmd] rose suite-hook --mail 'shutdown' 'u-ba502' 'ERROR: [Errno 2] No such file or directory: '/working/d04/elgil/jtmp/tmp.FynWZnXoSv/tmpFexvGw''

The problem is with the cold start task: it is being run on postproc, which is known to be troublesome in some setups. Please try changing it to run on localhost; the suite works for me then. To do that, change IDL_SERVER to localhost at the top of site/monsoon-cray-xc40/suite-adds.rc.
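
For reference, the failure mode in the traceback above can be reproduced in isolation. The following is only a minimal sketch, not cylc's code: it assumes the temporary directory existed at some point and was gone by the time the file was requested (the ticket does not confirm why the directory was missing). The helper name `try_named_tempfile` is hypothetical, and the directory is passed explicitly via `dir=` so the example is self-contained, whereas the call in the traceback relies on the default location.

```python
import errno
import shutil
import tempfile


def try_named_tempfile(directory):
    """Return None on success, or the errno if the temp file cannot be created."""
    try:
        with tempfile.NamedTemporaryFile(dir=directory):
            return None
    except OSError as exc:
        return exc.errno


# Create a scratch directory and confirm a temp file can be made inside it.
scratch = tempfile.mkdtemp()
ok_result = try_named_tempfile(scratch)

# Remove the directory to mimic a temp location that has vanished,
# then try again with the same (now stale) path.
shutil.rmtree(scratch)
missing_result = try_named_tempfile(scratch)

print("existing directory:", ok_result)
print("missing directory:", errno.errorcode.get(missing_result))
```

If the second call reports ENOENT, that corresponds to the "[Errno 2] No such file or directory" seen in the suite log.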

Cheers,
Ros.

comment:2 Changed 5 months ago by shakka

Thanks Ros, this has worked. Ella

comment:3 Changed 5 months ago by shakka

  • Resolution set to fixed
  • Status changed from accepted to closed