Changes between Version 28 and Version 29 of RoseCylc/Hints


Timestamp:
07/11/15 15:33:53 (22 months ago)
Author:
annette
Comment:

  • RoseCylc/Hints

    v28 v29  
    22 
    33= Useful information for running with Rose =  
     4 
     5== Links ==  
    46 
    57See also (redirects to Collaboration Twiki):   
     
    810* [http://collab.metoffice.gov.uk/twiki/bin/viewfile/Static/TWikiAdmin/AttachmentPool/RoseFAQ-594715l9t39032o573.html Rose FAQ] 
    911 
     12== Hints and tips ==  
     13 
    1014=== Switching versions of Rose and/or cylc === 
    1115 
     
    7882* Or add {{{args=-vvv}}} at the top of the fcm_make {{{rose-app.conf}}} file. 
    7983 
    80 === Rose tip of the day ===  
    81  
    82 For more Rose hints see the "tip of the day" from the Rose team: http://collab.metoffice.gov.uk/projects/rose/wiki/RoseTipOfDay 
    83  
     84== Troubleshooting common errors ==  
     85 
     86=== Rosie go asks for "username for u" === 
     87 
     88By default rosie is set up to load suites from the local puma repository and the Met Office Science Repository Service (MOSRS).  
     89If your MOSRS password isn't cached, Rosie will prompt for it at startup. Clicking 'cancel' then produces an error:  
     90{{{ 
     91Traceback (most recent call last): 
     92  File "/home/fcm/rose-2015.04.1/lib/python/rosie/browser/main.py", line 994, in handle_update_treemodel_local_status 
     93    self.display_box.update_treemodel_local_status(local_suites, 
     94AttributeError: 'MainWindow' object has no attribute 'display_box' 
     95get_known_keys: {} 
     96}}}  
     97 
     98There are two potential solutions:  
     99 
     1001. Re-cache your MOSRS password  
     101 
     1022. Tell Rosie to only load puma suites:  
     103 
     104   {{{rosie go --prefix=puma}}} 
     105 
     106   Users that don't have a MOSRS account may wish to set this up as an alias.   
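A minimal sketch of such an alias, assuming a Bourne-style shell such as bash. The alias name {{{rosiepuma}}} is made up for illustration, and the line would normally go in a startup file such as {{{~/.bashrc}}}:

```shell
# Hypothetical alias name; only the rosie command itself comes from the hint above.
alias rosiepuma='rosie go --prefix=puma'
```

After starting a new shell, typing {{{rosiepuma}}} would then launch Rosie with only the puma suites loaded.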
     107 
     108=== Unable to submit jobs (MONSooN) ===  
     109 
     110The suite will fail straight away and the following error appears in the {{{log/suite/err}}} file:  
     111{{{ 
     112Host key verification failed. 
     1132015-01-21T14:56:23Z ERROR - [fcm_make.1] -Failed to construct job submission command 
     1142015-01-21T14:56:23Z WARNING - Command '['ssh', '-oBatchMode=yes', '-oConnectTimeout=10', 'exvmsrose 
     115.monsoon-metoffice.co.uk', 'mkdir -p "$HOME/cylc-run/nemovar_build" "$HOME/cylc-run/nemovar_build/lo 
     116g/job"']' returned non-zero exit status 255 
     1172015-01-21T14:56:23Z ERROR - [fcm_make.1] -submission failed  
     118}}} 
     119 
     120This is because ssh cannot connect from the Cylc VM to the Rose VM non-interactively: the host key is missing from (or out of date in) the known_hosts file.  
     121 
     122To solve this, ssh from the Rose VM to the Cylc VM and then back to the Rose VM, specifying the full host names, so that both are added to the known_hosts file.  
     123 
     1241. Check whether exvmscylc or exvmsrose appear in the known_hosts file already. If so delete these entries, especially if you accessed the VMs before their rebuild:  
     125{{{ 
     126cd .ssh 
     127mv known_hosts known_hosts.OLD 
     128sed '/^exvmsrose/d;/exvmscylc/d' known_hosts.OLD > known_hosts 
     129}}} 
     130 
     1312. Now from exvmsrose, ssh into exvmscylc using the full host name:  
     132{{{ 
     133ssh exvmscylc.monsoon-metoffice.co.uk 
     134}}} 
     135  This should provide output something like this:  
     136{{{ 
     137The authenticity of host 'exvmscylc.monsoon-metoffice.co.uk (10.168.64.4)' can't be established. 
     138RSA key fingerprint is 98:c8:5e:b9:b3:d2:2f:c4:9c:89:78:08:d6:78:70:3a. 
     139Are you sure you want to continue connecting (yes/no)?  
     140}}} 
     141  Type {{{yes}}}.   
     142 
     1433. Now from exvmscylc, log in to exvmsrose using the full host name:  
     144{{{ 
     145ssh exvmsrose.monsoon-metoffice.co.uk 
     146}}} 
     147  And again type {{{yes}}} at the prompt.  
     148 
     1494. Type {{{exit}}} to get back to the Cylc VM, then ssh into exvmsrose again; this should now succeed without any interactive prompts.  
     150 
     1515. Now type {{{exit}}} twice to get back to the original Rose terminal, and try re-submitting the Rose suite.  
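Step 1 above can be wrapped in a small helper so that the backup and the clean-up always happen together. This is only a sketch: the function name {{{purge_known_hosts}}} is hypothetical, and it assumes each entry starts with the host name, as in the {{{sed}}} command above.

```shell
# Remove stale entries for the given host-name prefixes from a known_hosts
# file, keeping a backup copy next to it as <file>.OLD.
purge_known_hosts() {
    file=$1
    shift
    cp "$file" "$file.OLD" || return 1
    for host in "$@"; do
        # Delete every line that starts with this host name.
        sed "/^$host/d" "$file" > "$file.tmp" && mv "$file.tmp" "$file"
    done
}

# Example (not run here):
#   purge_known_hosts ~/.ssh/known_hosts exvmsrose exvmscylc
```

The interactive ssh logins in steps 2 to 5 are still needed afterwards to record the new host keys.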
     152 
     153=== No gcylc window ===  
     154 
     155When submitting a job, no gcylc window appears.  
     156 
     157Sometimes the GUI is slow to load. If it does not appear at all, however, check that you have X11 forwarding set up from both your **initial location and the lander**.  
     158 
     159To do so, ssh with the {{{-Y}}} option or, alternatively, add the following lines to your {{{~/.ssh/config}}} file:  
     160{{{ 
     161Host * 
     162ForwardX11 yes 
     163}}} 
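A quick way to check whether forwarding is active on the machine you have logged in to is to look at the {{{DISPLAY}}} variable, which ssh sets when X11 forwarding is in effect. The sketch below is illustrative only; the function name {{{check_x11}}} is made up.

```shell
# Report whether an X display is available in the current session.
check_x11() {
    if [ -n "${DISPLAY:-}" ]; then
        echo "X11 forwarding looks active (DISPLAY=$DISPLAY)"
    else
        echo "DISPLAY is not set; X11 forwarding is not active" >&2
        return 1
    fi
}

# Example (not run here): call check_x11 on the lander and again on the Rose VM.
```

If the check fails on any hop, that is the ssh connection that needs {{{-Y}}} or the config entry above.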
     164 
     165=== Rose suite running but can't shutdown === 
     166 
     167A rose suite is supposedly running, i.e. {{{rose suite-scan}}} gives something like:  
     168{{{ 
     169puma-aa046 gmslis@exvmscylc:7767  
     170}}} 
     171Or trying to re-run the suite with {{{rose suite-run}}} gives an error:  
     172{{{ 
     173[FAIL] Suite "puma-aa046" may still be running. 
     174[FAIL] Host "exvmscylc" has process: 
     175[FAIL]     9468 python /home/fcm/cylc-6.1.2/bin/cylc-run puma-aa046 
     176[FAIL]     9469 python /home/fcm/cylc-6.1.2/bin/cylc-run puma-aa046 
     177[FAIL] Try "rose suite-shutdown --name=puma-aa046" first?  
     178}}} 
     179 
     180However, when trying to shut down the suite, {{{rose suite-stop}}} reports that the suite isn't running:  
     181{{{ 
     182Really shutdown puma-aa046 at exvmscylc? [y/n] y 
     183@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 
     184'ERROR, remote port file not found'  
     185}}} 
     186 
     187This is due to orphaned tasks on the Cylc VM, which can occur when exvmscylc and exvmsrose cannot communicate non-interactively.  
     188 
     189To solve this, log in to exvmscylc and run {{{cylc scan}}}, which should show the running tasks. To stop these, type:  
     190{{{ 
     191cylc shutdown --now 
     192}}} 
     193This may report something like "Command queued", but re-running {{{cylc scan}}} will show that the tasks are now finished.  
     194 
     195=== Can't run rose suite-log on MONSooN ===  
     196 
     197On the MONSooN Rose VM (exvmsrose) running {{{rose suite-log}}} may do nothing.  
     198 
     199To launch, instead run:  
     200{{{ 
     201firefox http://localhost:8080/ 
     202}}} 
     203And search for the suite id.  
     204 
     205=== Device or resource busy when running suite === 
     206 
     207Unable to run suite.  
     208 
     209{{{ 
     210exmsrose puma-aa045$ rose suite-run 
     211[INFO] create: log.20150121T164500Z 
     212[INFO] delete: log 
     213[INFO] symlink: log.20150121T164500Z <= log 
     214[INFO] log.20150121T163546Z.tar.gz <= log.20150121T163546Z 
     215[FAIL] [Errno 16] Device or resource busy: 'log.20150121T163546Z/job/1/fcm_make/01/.nfs0000000000451b5d00000065' 
     216}}} 
     217 
     218You have one of the output files open somewhere, which means rose can't archive the old output. Close the file.  
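The {{{.nfs...}}} name in the error is the placeholder an NFS client leaves behind when a file is deleted while a process still has it open. A minimal sketch for locating such placeholders follows; the function name {{{find_nfs_placeholders}}} is hypothetical.

```shell
# List NFS placeholder files (.nfsXXXX) under a directory.
find_nfs_placeholders() {
    find "$1" -type f -name '.nfs*'
}

# Example (not run here), from the suite's cylc-run directory:
#   find_nfs_placeholders log.20150121T163546Z
# If lsof is installed (an assumption), "lsof <file>" on a reported file
# can show which process still holds it open.
```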
     219 
     220=== Warning when opening gcylc ===  
     221 
     222A warning appears when the Rose/cylc run-time task manager, called gcylc, opens:  
     223 
     224{{{ 
     225ParseError: File not found: /home/annette/.cylc/gcylc.rc 
     226WARNING: user config parsing failed (continuing) 
     227}}} 
     228 
     229This is harmless, but to avoid it, create an empty file in your home space:  
     230{{{ 
     231mkdir -p ~/.cylc && touch ~/.cylc/gcylc.rc 
     232}}} 
     233 
     234 
     235=== .vimrc error with fcm commit ===  
     236 
     237When trying to commit changes to a rose suite the following error occurs: 
     238{{{ 
     239exmsrose puma-aa045$ fcm commit 
     240[info] vi: starting commit message editor... 
     241Error detected while processing /home/aospre/.vimrc: 
     242line    5: 
     243E518: Unknown option: foldlevelstart=99 
     244Press ENTER or type command to continue 
     245[FAIL] log message is empty 
     246}}}  
     247 
     248This error occurs when the Cylc syntax highlighting settings for Vim are read by vi, which does not recognise them. Changing the default FCM editor from vi to vim stops this error.  
     249 
     250In your {{{.profile}}} add the following line: 
     251{{{ 
     252export SVN_EDITOR=vim 
     253}}} 
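To avoid adding the same line twice when the hint is applied more than once, the edit can be made idempotent. This is a sketch only; the helper name {{{add_line_once}}} is made up for illustration.

```shell
# Append a line to a file only if that exact line is not already present.
add_line_once() {
    file=$1
    line=$2
    # -x matches whole lines, -F treats the text literally, -q suppresses output.
    grep -qxF -e "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Example (not run here):
#   add_line_once ~/.profile 'export SVN_EDITOR=vim'
```

The setting takes effect in new login shells, or after sourcing {{{.profile}}} in the current one.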
     254 
     255=== Jinja error from rose suite-run ===  
     256 
     257After editing the suite, a cryptic Jinja error message appears from {{{rose suite-run}}}:  
     258 
     259{{{ 
     260[FAIL] cylc validate -v --strict puma-aa069 # return-code=1, stderr= 
     261[FAIL] Jinja2 Error: 
     262[FAIL]   File "<unknown>", line 58, in template 
     263[FAIL] TemplateSyntaxError: expected token 'end of print statement', got '=' 
     264}}} 
     265 
     266This is caused by an error in the {{{suite.rc}}} file, either in the Jinja syntax or in the Rose variables.  
     267 
     268To debug, go to {{{~/cylc-run/<suite-name>}}}, open the {{{suite.rc}}} file and navigate to the line number causing the error.  
     269 
     270If the {{{suite.rc}}} file uses includes, then to generate the parsed file run:  
     271{{{ 
     272cylc view -i <suite-name> 
     273}}} 
     274 
     275After identifying the error, fix it in the original {{{suite.rc}}} or {{{rose-suite.conf}}} file in the {{{roses}}} directory.  
     276Editing the file in the {{{cylc-run}}} directory will have no effect!
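
To jump straight to the reported line (58 in the example above), it can help to print just the surrounding lines with their numbers. The following is a minimal sketch; the function name {{{show_context}}} and its arguments are hypothetical.

```shell
# Print the lines around a given line number, with line numbers.
# Usage: show_context FILE LINE [CONTEXT]   (CONTEXT defaults to 5)
show_context() {
    file=$1
    line=$2
    ctx=${3:-5}
    start=$((line - ctx))
    if [ "$start" -lt 1 ]; then
        start=1
    fi
    end=$((line + ctx))
    awk -v s="$start" -v e="$end" \
        'NR >= s && NR <= e { printf "%5d  %s\n", NR, $0 }' "$file"
}

# Example (not run here):
#   show_context ~/cylc-run/puma-aa069/suite.rc 58
```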