#3022 closed help (fixed)

rosie go crashes when changing stash profiles

Reported by: eelrm Owned by: um_support
Component: Rose/Cylc Keywords: UKESM, STASH
Cc: andy.heaps@… Platform: PUMA
UM Version: 11.1

Description

Hi,

I am currently trying to add some new stash requests to a UKESM-AMIP suite (u-bm943). I have added the new items, but every time I try to change the profiles, rosie go crashes. Is there anything I have done that could be causing this?

Many thanks,

Lauren

Change History (21)

comment:1 Changed 11 months ago by ros

Hi Lauren,

I've just opened up your suite and can edit the profiles fine.

Can you give us some more information, please?

  • Which items have you added and are trying to change the profiles of?
  • Is there any error message printed to a dialog box or the terminal window?

Regards,
Ros.

comment:2 Changed 11 months ago by eelrm

Hi Ros,

I've just opened the suite with rose edit and I now see that when it crashes it's saying Memory fault.
It happens when I add in just one extra diagnostic - e.g. 38290 on DALLTH, TMONMN and UP4.

Thanks,
Lauren

comment:3 Changed 11 months ago by eelrm

Hi Ros,

Just a quick update - I removed all STASH items that were listed but were not being included. This was almost a third of all requests. Unfortunately it didn't make a difference and I am still getting a memory error when trying to add new STASH.

Thanks,

Lauren

comment:4 Changed 11 months ago by willie

Hi Lauren,

What do you get when you run the command

quota -v | awk 'END {printf " Using %6.2f%% of quota\n",100* ($2/$3)}'

on PUMA? If it is anywhere near 100%, then you'll need to delete some files e.g. old cylc_run directories and old UM branches that you no longer use.
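
As a rough sketch of what that one-liner does: it takes the last line of the quota -v output and divides the second field (blocks used) by the third (the quota limit). The helper name quota_percent below is made up for illustration, and the field layout is assumed to match what PUMA prints.

```shell
# Hypothetical helper: reads `quota -v`-style output on stdin and prints
# usage as a percentage of quota. It uses the 2nd and 3rd fields of the
# last line, as the one-liner above does (assumed layout, not verified
# for every system).
quota_percent() {
    awk 'END {
        # Guard against a missing or zero quota to avoid dividing by zero.
        if ($3 + 0 == 0) { print "no quota set"; exit }
        printf "%.2f\n", 100 * ($2 / $3)
    }'
}

# Possible use on PUMA:
#   quota -v | quota_percent
```

For instance, a made-up last line of "/dev/sda1 7913 10000 12000" would be reported as 79.13.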

Willie

comment:5 Changed 11 months ago by eelrm

Hi Willie,

It's 79.13%. Is that still too close?

Thanks,

Lauren

comment:6 Changed 11 months ago by willie

Hi Lauren,

That should be OK. STASH is quite tiny and should not overflow memory or disc. I have taken a copy of your job and added that STASH item successfully. You should be using the STASH Requests page in Rose to add the STASH. At what point is it crashing? Is it when you press "New +"?

Willie

comment:7 Changed 11 months ago by ros

Hi Lauren, Willie

Just to add, I did manage to recreate your problem on Friday, but only intermittently. Since then I've not managed to get it to crash, which makes diagnosing/advising very tricky.

Regards,
Ros.

comment:8 Changed 11 months ago by eelrm

Hi Willie,

It's crashing after I've added the new item, when it has the red crosses and I am adding the profiles. I can add DALLTH and TMONMN most of the time successfully, but it crashes on adding the usage profile. It's normally after two changes (to any of the profiles) that it crashes.

Thanks,

Lauren

comment:9 Changed 11 months ago by willie

Hi Lauren,

It is still working for me with your new STASH. Could you try again please, and if it crashes paste the exact error message you're getting.

Willie

comment:10 Changed 11 months ago by eelrm

Hi Willie,

Just tried adding 38290 again and was successful this time. However, I then added a second item (38294) and it then crashed when adding the usage profile.

All I get on the command line is: [1] + Memory fault rose edit&

Lauren

comment:11 Changed 11 months ago by willie

Hi Lauren,

As Ros has said, intermittent errors are tricky to diagnose. If you look at your existing processes,

ps -flu eelrm
F S UID        PID  PPID  C PRI  NI ADDR SZ WCHAN  STIME TTY          TIME CMD
1 S eelrm     4402     1  0  84   4 - 80223 ?      Jul03 ?        00:08:17 python -m rosie.browser.main
1 S eelrm     6268     1  0  80   0 -  5861 ?      May23 ?        00:00:47 ssh-agent
1 S eelrm     6433     1  0  80   0 -  5861 ?      Apr02 ?        00:00:13 ssh-agent
0 S eelrm     6800     1  0  80   0 -  2020 pipe_w Apr23 ?        00:00:00 bash -ec H=$(rose host-select archer); echo $H
1 S eelrm     6801  6800  0  80   0 -  2020 wait   Apr23 ?        00:00:00 bash -ec H=$(rose host-select archer); echo $H
0 S eelrm     6802  6801  0  80   0 - 18121 wait   Apr23 ?        00:00:00 python -m rose.host_select archer
0 T eelrm     6832  6802  0  80   0 -  6162 signal Apr23 ?        00:00:00 ssh -oBatchMode=yes login.archer.ac.uk true
0 S eelrm     7701     1  0  80   0 -  2020 pipe_w Aug30 ?        00:00:00 bash -ec H=$(rose host-select archer); echo $H
1 S eelrm     7702  7701  0  80   0 -  2020 wait   Aug30 ?        00:00:00 bash -ec H=$(rose host-select archer); echo $H
0 S eelrm     7703  7702  0  80   0 - 18121 wait   Aug30 ?        00:00:00 python -m rose.host_select archer
0 T eelrm     7752  7703  0  80   0 -  6162 signal Aug30 ?        00:00:00 ssh -oBatchMode=yes login.archer.ac.uk true
1 S eelrm    16468     1  0  80   0 -  5861 ?      Jul04 ?        00:00:30 ssh-agent
5 S eelrm    16848 16817  0  80   0 -  8873 ?      10:42 ?        00:00:04 sshd: eelrm@pts/182                                     
0 S eelrm    16849 16848  0  80   0 -  5023 ?      10:42 pts/182  00:00:00 -ksh
1 S eelrm    16884     1  0  80   0 -  4768 ?      10:42 ?        00:00:00 gpg-agent --daemon --allow-preset-passphrase --batch --ma
1 S eelrm    17288     1  0  84   4 - 105096 ?     Aug14 ?        00:00:00 python -m rose.config_editor.main -C /home/eelrm/roses/u-
1 S eelrm    19969     1  0  80   0 -  5861 ?      Feb14 ?        00:00:04 ssh-agent
1 S eelrm    22414     1  0  80   0 -  5861 ?       2018 ?        00:00:00 ssh-agent
1 S eelrm    27918     1  0  84   4 - 102797 ?     Aug14 ?        00:00:00 python -m rose.config_editor.main -C /home/eelrm/roses/u-
1 S eelrm    30487     1  0  80   0 -  5861 ?      Aug15 ?        00:00:36 ssh-agent

you can see you have several (old) instances of the rose editor running. You could try to close down all your editors and, if any such processes remain, kill -9 PID for each. Then log out and back in again.

It is not a good idea to edit the same suite twice simultaneously with rose edit, e.g. processes 27918 and 17288 are both editing u-bk467.
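
One way to spot this, as a sketch only: filter the process list for config editor processes and report any suite directory that appears more than once. The helper name dup_edit_dirs is hypothetical. It assumes each editor was started with -C followed by the suite directory, as in the listing above, and that ps prints the full command line.

```shell
# Hypothetical helper: reads command lines on stdin and prints each
# directory that more than one rose config editor process was given
# via -C. Editors started without -C are not detected.
dup_edit_dirs() {
    awk '/rose\.config_editor\.main/ {
        # Find the argument that follows -C on this command line.
        for (i = 1; i < NF; i++) if ($i == "-C") count[$(i + 1)]++
    }
    END { for (d in count) if (count[d] > 1) print d }'
}

# Possible use (add -ww if your ps truncates long command lines):
#   ps -u "$USER" -o args | dup_edit_dirs
```

No output means no suite is open in two editors at once, at least as far as this check can tell.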

Try that and see if the problem persists.

Willie

comment:12 Changed 11 months ago by eelrm

Hi Willie,

Ok, thanks. I have no other editors open - do you know how I have got all of these processes running?

I'll try and kill them all and let you know.

Lauren

comment:13 Changed 11 months ago by eelrm

Hi Willie,

Still no luck I'm afraid. I've now killed all of the processes but am still getting the Memory fault.

Lauren

comment:14 Changed 11 months ago by willie

Hi Lauren,

Could you try adding STASH to a different, unrelated suite, please? Does that also fail?

Willie

comment:15 Changed 11 months ago by eelrm

Hi Willie,

Just tried with suite u-bl775. It still crashed, but after three additional stash items were added and profiles changed.

Lauren

comment:16 Changed 11 months ago by willie

Hi Lauren,

I think you should move to a later version of Rose. Add the following to your .profile:

export ROSE_VERSION=2018.02.0

and then log out and back in again. You can check the version you're using with

rose --version

Try that and let us know how it goes.
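
If it helps, here is a minimal sketch that adds the line only when .profile does not already set ROSE_VERSION, so repeated runs do not pile up duplicate exports. The function name ensure_rose_version is hypothetical and not part of Rose.

```shell
# Hypothetical helper: append an export of ROSE_VERSION to the given
# profile file unless the file already contains one.
ensure_rose_version() {
    profile=$1
    version=$2
    if grep -q '^export ROSE_VERSION=' "$profile" 2>/dev/null; then
        # Leave an existing setting alone rather than adding a second one.
        echo "ROSE_VERSION is already set in $profile; edit that line by hand"
    else
        printf 'export ROSE_VERSION=%s\n' "$version" >> "$profile"
    fi
}

# Possible use:
#   ensure_rose_version "$HOME/.profile" 2018.02.0
```

The new setting still only takes effect in a fresh login shell.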

Willie

comment:17 Changed 11 months ago by eelrm

Hi Willie,

Still no luck I'm afraid. I have updated the version but it's still crashing.

Lauren

comment:18 Changed 11 months ago by willie

  • Cc andy.heaps@… added

Hi Lauren,

The next step is to move you from puma to a newer computer, pumatest, which has the latest version of Rose editor. I have tried your suite there and have not yet had a crash. Andy Heaps will be in touch to organise this.

Willie

comment:19 Changed 10 months ago by taubry

Dear Willie, Ros and Andy,

I have the exact same problem as described by Lauren with my suite u-bn966. When I add a new stash request, I can set dom_name and tim_name, but then rosie crashes when I add use_name. If no solution exists, that means saving/reopening rosie twice for every new stash request, and I have 40+ to add in 5 suites.

Is there any update to this ticket? Does switching to pumatest indeed solve the problem? If so, is it possible to post instructions on this ticket?

Many thanks,

Thomas

comment:20 Changed 10 months ago by willie

  • Component changed from UKESM to Rose/Cylc
  • Keywords UKESM, STASH added
  • Platform set to PUMA

comment:21 Changed 10 months ago by willie

  • Resolution set to fixed
  • Status changed from new to closed