Opened 19 months ago
Closed 18 months ago
#3022 closed help (fixed)
rosie go crashes when changing stash profiles
Reported by: | eelrm | Owned by: | um_support |
---|---|---|---|
Component: | Rose/Cylc | Keywords: | UKESM, STASH |
Cc: | andy.heaps@… | Platform: | PUMA |
UM Version: | 11.1 |
Description
Hi,
I am currently trying to add some new stash requests to a UKESM-AMIP suite (u-bm943). I have added the new items, but every time I try to change the profiles, rosie go crashes. Is there anything I may have done that could be causing this?
Many thanks,
Lauren
Change History (21)
comment:1 Changed 19 months ago by ros
comment:2 Changed 19 months ago by eelrm
Hi Ros,
I've just opened the suite with rose edit and I now see that when it crashes it's saying Memory fault.
It happens when I add in just one extra diagnostic - e.g. 38290 on DALLTH, TMONMN and UP4.
Thanks,
Lauren
comment:3 Changed 19 months ago by eelrm
Hi Ros,
Just a quick update - I removed all STASH items that were listed but were not being included. This was almost a third of all requests. Unfortunately it didn't make a difference and I am still getting a memory error when trying to add new STASH.
Thanks,
Lauren
comment:4 Changed 19 months ago by willie
Hi Lauren,
What do you get when you run the command
quota -v | awk 'END {printf " Using %6.2f%% of quota\n",100* ($2/$3)}'
on PUMA? If it is anywhere near 100%, then you'll need to delete some files e.g. old cylc_run directories and old UM branches that you no longer use.
Willie
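For reference, the quota check in comment:4 can be wrapped in a small function. This is only a sketch: `quota_pct` is a hypothetical name, and it assumes, as the one-liner does, that the last line of `quota -v` output carries the usage in field 2 and the limit in field 3. That layout is not guaranteed on every system (a long filesystem name can wrap onto its own line, for example).

```shell
# Hypothetical helper: reads `quota -v`-style output on stdin and prints the
# usage percentage taken from the last line.
# Assumption: field 2 is current usage and field 3 is the quota limit.
quota_pct() {
    awk '
        { used = $2; limit = $3 }          # remember the fields of the latest line
        END {
            if (limit > 0) printf "%.2f\n", 100 * (used / limit)
            else           print "n/a"      # no usable limit found
        }
    '
}

# Usage on the login host:
#   quota -v | quota_pct
```

Keeping the fields in variables avoids relying on `$2` and `$3` still being set inside `END`, which older awk implementations do not all guarantee.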
comment:5 Changed 19 months ago by eelrm
Hi Willie,
It's 79.13%. Is that still too close?
Thanks,
Lauren
comment:6 Changed 19 months ago by willie
Hi Lauren,
That should be OK. STASH is quite tiny and should not overflow memory or disc. I have taken a copy of your job and added that STASH item successfully. You should be using the STASH Requests page in Rose to add the STASH. At what point is it crashing? Is it when you press "New +"?
Willie
comment:7 Changed 19 months ago by ros
Hi Lauren, Willie
Just to add: I did manage to recreate your problem on Friday, but only intermittently. Since then I've not managed to get it to crash, which makes diagnosing/advising very tricky.
Regards,
Ros.
comment:8 Changed 19 months ago by eelrm
Hi Willie,
It's crashing after I've added the new item, when it has the red crosses and I am adding the profiles. I can add DALLTH and TMONMN most of the time successfully, but it crashes on adding the usage profile. It's normally after two changes (to any of the profiles) that it crashes.
Thanks,
Lauren
comment:9 Changed 19 months ago by willie
Hi Lauren,
It is still working for me with your new STASH. Could you try again please, and if it crashes paste the exact error message you're getting.
Willie
comment:10 Changed 19 months ago by eelrm
Hi Willie,
Just tried adding 38290 again and was successful this time. However, I then added a second item (38294) and it then crashed when adding the usage profile.
All I get to command line is: [1] + Memory fault rose edit&
Lauren
comment:11 Changed 19 months ago by willie
Hi Lauren,
As Ros has said, intermittent errors are tricky to diagnose. If you look at your existing processes,
ps -flu eelrm
F S UID PID PPID C PRI NI ADDR SZ WCHAN STIME TTY TIME CMD
1 S eelrm 4402 1 0 84 4 - 80223 ? Jul03 ? 00:08:17 python -m rosie.browser.main
1 S eelrm 6268 1 0 80 0 - 5861 ? May23 ? 00:00:47 ssh-agent
1 S eelrm 6433 1 0 80 0 - 5861 ? Apr02 ? 00:00:13 ssh-agent
0 S eelrm 6800 1 0 80 0 - 2020 pipe_w Apr23 ? 00:00:00 bash -ec H=$(rose host-select archer); echo $H
1 S eelrm 6801 6800 0 80 0 - 2020 wait Apr23 ? 00:00:00 bash -ec H=$(rose host-select archer); echo $H
0 S eelrm 6802 6801 0 80 0 - 18121 wait Apr23 ? 00:00:00 python -m rose.host_select archer
0 T eelrm 6832 6802 0 80 0 - 6162 signal Apr23 ? 00:00:00 ssh -oBatchMode=yes login.archer.ac.uk true
0 S eelrm 7701 1 0 80 0 - 2020 pipe_w Aug30 ? 00:00:00 bash -ec H=$(rose host-select archer); echo $H
1 S eelrm 7702 7701 0 80 0 - 2020 wait Aug30 ? 00:00:00 bash -ec H=$(rose host-select archer); echo $H
0 S eelrm 7703 7702 0 80 0 - 18121 wait Aug30 ? 00:00:00 python -m rose.host_select archer
0 T eelrm 7752 7703 0 80 0 - 6162 signal Aug30 ? 00:00:00 ssh -oBatchMode=yes login.archer.ac.uk true
1 S eelrm 16468 1 0 80 0 - 5861 ? Jul04 ? 00:00:30 ssh-agent
5 S eelrm 16848 16817 0 80 0 - 8873 ? 10:42 ? 00:00:04 sshd: eelrm@pts/182
0 S eelrm 16849 16848 0 80 0 - 5023 ? 10:42 pts/182 00:00:00 -ksh
1 S eelrm 16884 1 0 80 0 - 4768 ? 10:42 ? 00:00:00 gpg-agent --daemon --allow-preset-passphrase --batch --ma
1 S eelrm 17288 1 0 84 4 - 105096 ? Aug14 ? 00:00:00 python -m rose.config_editor.main -C /home/eelrm/roses/u-
1 S eelrm 19969 1 0 80 0 - 5861 ? Feb14 ? 00:00:04 ssh-agent
1 S eelrm 22414 1 0 80 0 - 5861 ? 2018 ? 00:00:00 ssh-agent
1 S eelrm 27918 1 0 84 4 - 102797 ? Aug14 ? 00:00:00 python -m rose.config_editor.main -C /home/eelrm/roses/u-
1 S eelrm 30487 1 0 80 0 - 5861 ? Aug15 ? 00:00:36 ssh-agent
you can see you have several (old) instances of the rose editor running. You could try to close down all your editors and, if any such processes remain, kill -9 PID for each. Then log out and back in again.
It is not a good idea to edit the same suite twice simultaneously with rose edit, e.g. processes 27918 and 17288 are both editing u-bk467.
Try that and see if the problem persists.
Willie
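The clean-up suggested in comment:11 could be sketched roughly as follows. The function name `rose_edit_pids` is hypothetical, and the pattern assumes that rose edit shows up in the process list as `rose.config_editor.main`, as it does in the listing above. The sketch only prints PIDs, so the list can be reviewed before anything is killed.

```shell
# Hypothetical helper: reads "PID COMMAND..." lines on stdin and prints the
# PIDs of rose edit processes (matched as rose.config_editor.main).
rose_edit_pids() {
    awk '/rose\.config_editor\.main/ { print $1 }'
}

# Usage: list your own leftover editors, then stop them one by one.
#   ps -u "$USER" -o pid=,args= | rose_edit_pids
#   for pid in $(ps -u "$USER" -o pid=,args= | rose_edit_pids); do
#       kill "$pid"        # plain kill first; kill -9 only if it refuses to exit
#   done
```

Trying a plain `kill` before `kill -9` gives the editor a chance to shut down normally; `kill -9` cannot be caught by the process, so it is better kept as a last resort.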
comment:12 Changed 19 months ago by eelrm
Hi Willie,
Ok, thanks. I have no other editors open - do you know how I have got all of these processes running?
I'll try and kill them all and let you know.
Lauren
comment:13 Changed 19 months ago by eelrm
Hi Willie,
Still no luck I'm afraid. I've now killed all of the processes but am still getting the Memory fault.
Lauren
comment:14 Changed 19 months ago by willie
Hi Lauren,
Could you try adding STASH to a different, unrelated suite, please? Does that also fail?
Willie
comment:15 Changed 19 months ago by eelrm
Hi Willie,
Just tried with suite u-bl775. It still crashed, but after three additional stash items were added and profiles changed.
Lauren
comment:16 Changed 19 months ago by willie
Hi Lauren,
I think you should move to a later version of Rose. Add the following to your .profile
export ROSE_VERSION=2018.02.0
and then log out and back in again. You can check the version you're using with
rose --version
Try that and let us know how it goes.
Willie
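One way to apply the .profile change from comment:16 without adding the line twice is sketched below. `set_rose_version` is a hypothetical helper, not part of Rose; it assumes the profile uses a line starting with `export ROSE_VERSION=` and leaves any existing setting untouched.

```shell
# Hypothetical helper: append "export ROSE_VERSION=<version>" to a profile file
# unless the file already contains an export of ROSE_VERSION.
set_rose_version() {
    profile=$1
    version=$2
    grep -q '^export ROSE_VERSION=' "$profile" 2>/dev/null ||
        printf 'export ROSE_VERSION=%s\n' "$version" >> "$profile"
}

# Usage:
#   set_rose_version "$HOME/.profile" 2018.02.0
#   # log out and back in, then confirm with: rose --version
```

If a different version is already exported in the file, this sketch does not replace it, so that line would still need editing by hand.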
comment:17 Changed 19 months ago by eelrm
Hi Willie,
Still no luck I'm afraid. I have updated the version but it's still crashing.
Lauren
comment:18 Changed 19 months ago by willie
- Cc andy.heaps@… added
Hi Lauren,
The next step is to move you from puma to a newer computer, pumatest, which has the latest version of Rose editor. I have tried your suite there and have not yet had a crash. Andy Heaps will be in touch to organise this.
Willie
comment:19 Changed 18 months ago by taubry
Dear Willie, Ros and Andy,
I have the exact same problem as described by Lauren with my suite u-bn966. When I add a new stash request, I can set dom_name and tim_name, but then rosie crashes when I add use_name. If no solution exists, that means saving/reopening rosie twice for every new stash request, and I have 40+ to add in 5 suites.
Is there any update to this ticket? Does switching to pumatest indeed solve the problem? If so, is it possible to post instructions on this ticket?
Many thanks,
Thomas
comment:20 Changed 18 months ago by willie
- Component changed from UKESM to Rose/Cylc
- Keywords UKESM, STASH added
- Platform set to PUMA
comment:21 Changed 18 months ago by willie
- Resolution set to fixed
- Status changed from new to closed
Hi Lauren,
I've just opened up your suite and can edit the profiles fine.
Can you give us some more information please.
Regards,
Ros.