
Version 29 (modified by annette, 19 months ago)

Useful information for running with Rose

See also (redirects to Collaboration Twiki):

Hints and tips

Switching versions of Rose and/or cylc

  • On puma and MONSooN export the variables: CYLC_VERSION=x.y.z and ROSE_VERSION=YYYY.MM.DD
  • On Archer use the module switch command. It is advisable to use the same versions of Rose and Cylc on puma and Archer.
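
As a sketch, the two exports could be placed in your ~/.profile on puma or MONSooN. The version numbers below are illustrative only (they are the ones that appear in paths elsewhere on this page); substitute whichever versions are installed on your system.

```shell
# Illustrative versions only: replace with the versions you actually need.
export CYLC_VERSION=6.1.2
export ROSE_VERSION=2015.04.1

# Confirm what the current shell will pick up.
echo "cylc: $CYLC_VERSION  rose: $ROSE_VERSION"
```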

Viewing the suite run graph without running

When developing suites, it can be useful to check what the run graph looks like after Jinja2 evaluation etc. To do this without running the suite:

rose suite-run -i --name=puma-aa045  # install suite in cylc db only
cylc graph puma-aa045                # view graph in browser       

To just view the dependencies on the command line:

cylc ls -t puma-aa045

Setting the default size of the rose edit window

Setting the default size of the rose edit window and the width of the rose edit left hand menu pane can be very helpful.

Edit ~/.metomi/rose.conf

Adding the following information to the file sets the default size and width of the rose config-edit (rose edit) window:

[rose-config-edit]
SIZE_WINDOW = (1100, 650)
WIDTH_TREE_PANEL = 400

For details of further customisations that can be made to the rose edit window see: http://metomi.github.io/rose/doc/rose-rug-config-edit.html#customisation
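
If you prefer to script this, the following is a minimal sketch that appends the section only when it is not already present. The function name add_rose_edit_size is hypothetical and not part of Rose; the settings are the ones shown above.

```shell
# Hypothetical helper (not part of Rose): add the [rose-config-edit]
# section to a Rose user config file unless the section already exists.
add_rose_edit_size() {
    conf="$1"
    mkdir -p "$(dirname "$conf")"
    touch "$conf"
    if grep -q '^\[rose-config-edit\]' "$conf"; then
        # Do not duplicate the section; leave existing settings alone.
        echo "[rose-config-edit] already present in $conf; edit it by hand" >&2
        return 0
    fi
    cat >> "$conf" <<'EOF'
[rose-config-edit]
SIZE_WINDOW = (1100, 650)
WIDTH_TREE_PANEL = 400
EOF
}
```

For example: add_rose_edit_size ~/.metomi/rose.conf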

Launching Rose commands

It is possible to launch many of the Rose tools from the various GUIs. For example you can run or edit suites from rosie go, run suites from rose edit, and view log files from rose suite-gcontrol whilst the suite is running.

When running Rose from the command line, either run from the appropriate roses/ directory or specify the suite with --name, e.g.

rose suite-shutdown --name puma-aa015

Stop archiving of log files

By default, when a suite is run, the log files from the previous run will be tarred up. To avoid this, run rose suite-run with the flag --no-log-archive.

Diff'ing suites

There is no formal mechanism for this as yet. However, the tool rose config-dump sorts all of the app files in a suite into a common format, which then allows diff to be run on the command line between suite files. For more info see: http://metomi.github.io/rose/doc/rose-command.html#rose-config-dump
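
One possible workflow is sketched below. It assumes that rose config-dump rewrites configuration files in place and accepts -C DIR to select the directory to work in (check rose config-dump --help on your system), so it operates on temporary copies rather than on your working copies. The function name diff_suites is hypothetical.

```shell
# Hypothetical helper: normalise copies of two suites with rose config-dump,
# then diff them. The original suite directories are left untouched.
diff_suites() {
    work=$(mktemp -d) || return 2
    cp -r "$1" "$work/a" && cp -r "$2" "$work/b" || { rm -rf "$work"; return 2; }
    # Assumption: -C DIR makes rose config-dump work in DIR and below.
    rose config-dump -C "$work/a" >/dev/null &&
    rose config-dump -C "$work/b" >/dev/null || { rm -rf "$work"; return 2; }
    status=0
    (cd "$work" && diff -ru a b) || status=$?
    rm -rf "$work"
    # 0 = no differences, 1 = differences found, 2 = something went wrong.
    return "$status"
}
```

For example: diff_suites ~/roses/puma-aa045 ~/roses/puma-aa046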

Adding UM user diagnostics

This works in a different way to the old UMUI and no longer uses user-STASHmaster files.

Instead, the STASHmaster file is held in the UM trunk. To make changes, place your modified version in the file/ subdirectory of the app, e.g.

~/roses/puma-aa045/app/um/file/STASHmaster

Passing arguments to fcm_make

Rose deals with fcm_make as a special app, see: http://metomi.github.io/rose/doc/rose-rug-task-run.html#rose-task-run.built-in-app.fcm_make

To pass arguments, such as -vvv for full verbose output:

  • Set the environment variable ROSE_TASK_OPTIONS=-vvv
  • Or add args=-vvv at the top of the fcm_make rose-app.conf file.

Troubleshooting common errors

Rosie go asks for "username for u"

By default rosie is set up to load suites from the local puma repository and the Met Office Science Repository Service (MOSRS). If your MOSRS password isn't cached, Rosie will prompt for it at startup. Clicking 'cancel' then produces an error:

Traceback (most recent call last):
  File "/home/fcm/rose-2015.04.1/lib/python/rosie/browser/main.py", line 994, in handle_update_treemodel_local_status
    self.display_box.update_treemodel_local_status(local_suites,
AttributeError: 'MainWindow' object has no attribute 'display_box'
get_known_keys: {}

There are two potential solutions:

  1. Re-cache your MOSRS password.
  2. Tell Rosie to only load puma suites:

rosie go --prefix=puma

Users that don't have a MOSRS account may wish to set this up as an alias.
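
A minimal sketch of such an alias, which could be added to your shell start-up file. The alias name rosiego is an arbitrary choice.

```shell
# Arbitrary alias name: launch rosie go restricted to the puma repository.
alias rosiego='rosie go --prefix=puma'
```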

Unable to submit jobs (MONSooN)

The suite will fail straight away and the following error appears in the log/suite/err file:

Host key verification failed.
2015-01-21T14:56:23Z ERROR - [fcm_make.1] -Failed to construct job submission command
2015-01-21T14:56:23Z WARNING - Command '['ssh', '-oBatchMode=yes', '-oConnectTimeout=10', 'exvmsrose.monsoon-metoffice.co.uk', 'mkdir -p "$HOME/cylc-run/nemovar_build" "$HOME/cylc-run/nemovar_build/log/job"']' returned non-zero exit status 255
2015-01-21T14:56:23Z ERROR - [fcm_make.1] -submission failed 

This happens because the Cylc VM is unable to ssh into the Rose VM non-interactively.

To solve this, log in to the Cylc VM and then back to the Rose VM, specifying the full host names, so that both are added to the known_hosts file.

  1. Check whether exvmscylc or exvmsrose already appear in the known_hosts file. If so, delete these entries, especially if you accessed the VMs before their rebuild:
    cd ~/.ssh
    mv known_hosts known_hosts.OLD
    sed '/^exvmsrose/d;/exvmscylc/d' known_hosts.OLD > known_hosts
    
  2. Now from exvmsrose, ssh into exvmscylc using the full host name:
    ssh exvmscylc.monsoon-metoffice.co.uk
    
    This should provide output something like this:
    The authenticity of host 'exvmscylc.monsoon-metoffice.co.uk (10.168.64.4)' can't be established.
    RSA key fingerprint is 98:c8:5e:b9:b3:d2:2f:c4:9c:89:78:08:d6:78:70:3a.
    Are you sure you want to continue connecting (yes/no)? 
    
    Type yes.
  3. Now from exvmscylc, log in to exvmsrose using the full host name:
    ssh exvmsrose.monsoon-metoffice.co.uk
    
    And again type yes at the prompt.
  4. Type exit to get back to the Cylc VM, then ssh into exvmsrose again. This should now succeed without any interactive prompts.
  5. Now type exit twice to get back to the original Rose terminal, and try re-submitting the Rose suite.
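
To confirm the fix without re-submitting the suite, you can try the same kind of non-interactive connection that the job submission makes. The sketch below reuses the ssh options shown in the error log above; the function name check_batch_ssh is hypothetical.

```shell
# Hypothetical helper: test whether ssh to a host works without any prompt.
# BatchMode=yes makes ssh fail rather than ask for a password or host key
# confirmation, which mirrors how the suite submits jobs.
check_batch_ssh() {
    if ssh -oBatchMode=yes -oConnectTimeout=10 "$1" true; then
        echo "OK: $1"
    else
        echo "FAILED: $1" >&2
        return 1
    fi
}
```

For example, from the Cylc VM: check_batch_ssh exvmsrose.monsoon-metoffice.co.uk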

No gcylc window

When submitting a job, no gcylc window appears.

Sometimes the GUI is slow to load. If it does not appear at all, check that X11 forwarding is set up both from your initial location and from the lander.

To do so, ssh with the -Y option or, alternatively, append the following lines to your ~/.ssh/config file:

Host *
ForwardX11 yes
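
A quick way to see whether forwarding is active on the host you have logged in to is to check the DISPLAY variable, which ssh sets when X11 forwarding is enabled. This is only a sketch, and the function name check_x11 is hypothetical.

```shell
# Hypothetical helper: report whether DISPLAY is set in this shell.
check_x11() {
    if [ -n "${DISPLAY:-}" ]; then
        echo "DISPLAY is set to $DISPLAY"
    else
        echo "DISPLAY is not set; X11 forwarding is probably not active" >&2
        return 1
    fi
}
```

Run check_x11 on each host in the chain; a failure shows which hop is missing the -Y option or the ForwardX11 setting.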

Rose suite running but can't shutdown

A rose suite is supposedly running, i.e. rose suite-scan gives something like:

puma-aa046 gmslis@exvmscylc:7767 

Or trying to re-run the suite with rose suite-run gives an error:

[FAIL] Suite "puma-aa046" may still be running.
[FAIL] Host "exvmscylc" has process:
[FAIL]     9468 python /home/fcm/cylc-6.1.2/bin/cylc-run puma-aa046
[FAIL]     9469 python /home/fcm/cylc-6.1.2/bin/cylc-run puma-aa046
[FAIL] Try "rose suite-shutdown --name=puma-aa046" first? 

However, when trying to shut down the suite, rose suite-stop reports that the suite isn't running:

Really shutdown puma-aa046 at exvmscylc? [y/n] y
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
'ERROR, remote port file not found' 

This is due to orphaned tasks on the Cylc VM, which can occur when exvmscylc and exvmsrose cannot communicate non-interactively.

To solve this, log in to exvmscylc and run cylc scan, which should show the running tasks. To stop these, type:

cylc shutdown --now <suite-name>

This may report something like "Command queued", but re-running cylc scan will show that the tasks are now finished.

Can't run rose suite-log on MONSooN

On the MONSooN Rose VM (exvmsrose) running rose suite-log may do nothing.

To launch, instead run:

firefox http://localhost:8080/

And search for the suite id.

Device or resource busy when running suite

The suite fails to run with an error like this:

exmsrose puma-aa045$ rose suite-run
[INFO] create: log.20150121T164500Z
[INFO] delete: log
[INFO] symlink: log.20150121T164500Z <= log
[INFO] log.20150121T163546Z.tar.gz <= log.20150121T163546Z
[FAIL] [Errno 16] Device or resource busy: 'log.20150121T163546Z/job/1/fcm_make/01/.nfs0000000000451b5d00000065'

One of the output files from the previous run is still open somewhere, which means Rose can't archive the old output. Close the file and run the suite again.
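
Files named .nfs* are left behind on NFS when a file is deleted while a process still has it open. The sketch below lists any such files under a log directory so you can see which output is affected; the function name list_nfs_leftovers is hypothetical.

```shell
# Hypothetical helper: list .nfs* placeholder files under a directory
# (defaults to the current directory).
list_nfs_leftovers() {
    find "${1:-.}" -name '.nfs*' -print
}
```

For example, from the suite's cylc-run directory: list_nfs_leftovers log.20150121T163546Z. Where tools such as fuser or lsof are available, they may help identify the process holding a listed file open, although that process could be running on a different host.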

Warning when opening gcylc

A warning appears when the Rose/cylc run-time task manager, called gcylc, opens:

ParseError: File not found: /home/annette/.cylc/gcylc.rc
WARNING: user config parsing failed (continuing)

This is harmless, but to avoid it create an empty file in your home space:

mkdir -p ~/.cylc && touch ~/.cylc/gcylc.rc

.vimrc error with fcm commit

When trying to commit changes to a rose suite the following error occurs:

exmsrose puma-aa045$ fcm commit
[info] vi: starting commit message editor...
Error detected while processing /home/aospre/.vimrc:
line    5:
E518: Unknown option: foldlevelstart=99
Press ENTER or type command to continue
[FAIL] log message is empty

This error occurs with the Cylc syntax highlighting for Vim. Changing the default FCM editor to be vim rather than vi stops this error.

In your .profile add the following line:

export SVN_EDITOR=vim

Jinja error from rose suite-run

After editing the suite, a cryptic Jinja error message appears from rose suite-run:

[FAIL] cylc validate -v --strict puma-aa069 # return-code=1, stderr=
[FAIL] Jinja2 Error:
[FAIL]   File "<unknown>", line 58, in template
[FAIL] TemplateSyntaxError: expected token 'end of print statement', got '='

This means there is an error in the suite.rc file, either in the Jinja2 syntax itself or in the Rose variables passed into it.

To debug, go to ~/cylc-run/<suite-name>, open the suite.rc file and navigate to the line number causing the error.

If the suite.rc file uses includes, then to generate the parsed file run:

cylc view -i <suite-name>

After identifying the error, fix it in the original suite.rc or rose-suite.conf file in the roses directory. Editing the file in the cylc-run directory will have no effect!
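
To jump straight to the reported line without opening an editor, a small helper like the one below prints the lines around it. This is only a sketch, and the function name show_context is hypothetical.

```shell
# Hypothetical helper: print N lines (default 5) either side of a given
# line number, marking the reported line with ">".
show_context() {
    file="$1"; line="$2"; n="${3:-5}"
    start=$((line - n))
    [ "$start" -lt 1 ] && start=1
    end=$((line + n))
    awk -v s="$start" -v e="$end" -v l="$line" \
        'NR >= s && NR <= e { printf "%s%5d  %s\n", (NR == l ? ">" : " "), NR, $0 }' "$file"
}
```

For the error shown above, this would be: show_context ~/cylc-run/puma-aa069/suite.rc 58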