#3175 closed help (answered)
u-bq683 failure
Reported by: | jonathan | Owned by: | um_support
---|---|---|---
Component: | UM Model | Keywords: |
Cc: | | Platform: |
UM Version: | | |
Description
After a few days of not progressing because of the slowness of pptransfer, my HadGEM3 job on NEXCS has failed. Supposing it might just have timed out in some way, I tried to restart it unchanged, but got the following error messages:
xcslc0$ pwd
/home/d01/hadsa/roses/u-bq683
xcslc0$ rose suite-restart
[FAIL] cylc restart u-bq683 # return-code=1, stderr=
[FAIL] Traceback (most recent call last):
[FAIL] File "/common/fcm/cylc-7.8.3/bin/cylc-restart", line 25, in <module>
[FAIL] main(is_restart=True)
[FAIL] File "/common/fcm/cylc-7.8.3/lib/cylc/scheduler_cli.py", line 134, in main
[FAIL] scheduler.start()
[FAIL] File "/common/fcm/cylc-7.8.3/lib/cylc/scheduler.py", line 237, in start
[FAIL] self.suite_db_mgr.restart_upgrade()
[FAIL] File "/common/fcm/cylc-7.8.3/lib/cylc/suite_db_mgr.py", line 524, in restart_upgrade
[FAIL] pri_dao.vacuum()
[FAIL] File "/common/fcm/cylc-7.8.3/lib/cylc/rundb.py", line 1031, in vacuum
[FAIL] return self.connect().execute("VACUUM")
[FAIL] sqlite3.OperationalError: disk I/O error
Do you know what this means?
Thanks
Jonathan
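
The traceback shows the restart failing while cylc runs VACUUM on the suite's SQLite run database, so the "disk I/O error" points at that database file or the filesystem it lives on rather than at the UM job itself. A minimal diagnostic sketch follows; the paths assume the standard cylc-7 run-directory layout for this suite, and none of these commands come from the original report:

# Assumed cylc-7 layout: the private suite database sits in
# ~/cylc-run/<suite>/.service/db and the public copy in log/db.
cd ~/cylc-run/u-bq683
# Check that the database files exist and are readable/writable.
ls -l .service/db log/db
# Ask SQLite whether the private database is intact.
sqlite3 .service/db "PRAGMA integrity_check;"
# A full or read-only filesystem also surfaces as "disk I/O error".
df -h .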
Change History (3)
comment:1 Changed 13 months ago by grenville
Jonathan
Do you see anything after typing 'cylc gscan' on NEXCS?
I don't know what the trace means - please try 'rose suite-run --restart' (from '/home/d01/hadsa/roses/u-bq683')
Grenville
comment:2 Changed 13 months ago by jonathan
Dear Grenville
Thanks. No, cylc gscan shows nothing. There's a similar error from suite-run --restart:
xcslc0$ pwd
/home/d01/hadsa/roses/u-bq683
xcslc0$ rose suite-run --restart
[INFO] export CYLC_VERSION=7.8.3
[INFO] export ROSE_ORIG_HOST=xcslc0
[INFO] export ROSE_SITE=
[INFO] export ROSE_VERSION=2019.01.2
[INFO] delete: log/rose-suite-run.conf
[INFO] symlink: rose-conf/20200205T124439-restart.conf <= log/rose-suite-run.conf
[INFO] delete: log/rose-suite-run.version
[INFO] symlink: rose-conf/20200205T124439-restart.version <= log/rose-suite-run.version
[INFO] chdir: log/
[FAIL] cylc restart u-bq683 # return-code=1, stderr=
[FAIL] Traceback (most recent call last):
[FAIL] File "/common/fcm/cylc-7.8.3/bin/cylc-restart", line 25, in <module>
[FAIL] main(is_restart=True)
[FAIL] File "/common/fcm/cylc-7.8.3/lib/cylc/scheduler_cli.py", line 134, in main
[FAIL] scheduler.start()
[FAIL] File "/common/fcm/cylc-7.8.3/lib/cylc/scheduler.py", line 237, in start
[FAIL] self.suite_db_mgr.restart_upgrade()
[FAIL] File "/common/fcm/cylc-7.8.3/lib/cylc/suite_db_mgr.py", line 524, in restart_upgrade
[FAIL] pri_dao.vacuum()
[FAIL] File "/common/fcm/cylc-7.8.3/lib/cylc/rundb.py", line 1031, in vacuum
[FAIL] return self.connect().execute("VACUUM")
[FAIL] sqlite3.OperationalError: disk I/O error
[FAIL] ERROR: command terminated by signal 1: ssh -oBatchMode=yes -oConnectTimeout=8 -oStrictHostKeyChecking=no -n xcslc1 env CYLC_VERSION=7.8.3 bash --login -c "'"'exec "$0" "$@"'"'" cylc restart u-bq683 --host=localhost
Cheers
Jonathan
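
Both failures die at the same point: restart_upgrade() asks SQLite to VACUUM the suite database, and VACUUM rebuilds the database into a temporary copy before swapping it in, so it needs a writable filesystem with roughly the database's size free. A hedged way to reproduce the failing step outside cylc, assuming the same run-directory path as in the sketch above:

# Hypothetical manual reproduction: run the same VACUUM by hand.
# If this also reports "disk I/O error", the problem is the file or
# the filesystem, not cylc.
sqlite3 ~/cylc-run/u-bq683/.service/db "VACUUM;"
# Quota exhaustion on the home filesystem is one common cause; the
# exact quota command on the XCS login node is an assumption here.
quota -s 2>/dev/null || df -h "$HOME"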
comment:3 Changed 13 months ago by grenville
- Resolution set to answered
- Status changed from new to closed
see #3182