Opened 5 months ago

Last modified 5 months ago

#3179 accepted help

Increasing Suite runtime and memory

Reported by: NoelClancy Owned by: pmcguire
Component: JULES Keywords: rose, cylc, FLUXNET, JULES, resources
Cc: Platform: JASMIN
UM Version:



I have added variables to suites which have been successful.
Sometimes, if I add too many variables, the suite does not request sufficient time, memory, and other resources. However, I am not sure how to modify the requested time and memory for a suite.

Do you know where this is specified in u-al752 and related suites?


Change History (4)

comment:1 Changed 5 months ago by pmcguire

  • Status changed from new to accepted

comment:2 Changed 5 months ago by pmcguire

  • Keywords rose, cylc, FLUXNET, JULES, resources added
  • Platform set to JASMIN

Hi Noel:
In the u-al752 suite, in the file ~/u-al752/site/suite.rc.CEDA_JASMIN, you will see these lines:

        inherit = None, JASMIN

            -m = ivybridge128G
            -q = short-serial

            batch system = lsf

        inherit = None, JASMIN_LOTUS

            -W = 2:00
            -n = 1

This means that it is asking for 2 hours of wallclock time to run the job on one (1) of CEDA JASMIN's LOTUS nodes in the short-serial queue. If your job fails because it needs more wallclock time, you can re-run it with a higher number of hours.
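For example (a sketch; 4:00 is an illustrative value, pick whatever limit fits your runs), asking for four hours of wallclock time would look like:

```
        inherit = None, JASMIN_LOTUS

            # -W is the LSF wallclock limit in H:MM; 4:00 is an illustrative value
            -W = 4:00
            -n = 1
```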

This CEDA JASMIN webpage is useful for understanding how to specify memory requests:

For example, in ~/u-al752/site/suite.rc.CEDA_JASMIN, if you want to request 15GB of memory (if that is enough), you can expand the JASMIN_LOTUS section so that it reads:

      inherit = None, JASMIN_LOTUS

            -W = 2:00
            -n = 1
            -R = "rusage[mem=15000]"
            -M = 15000

Does this help?

comment:3 Changed 5 months ago by NoelClancy

Thanks very much Patrick,

I'm running on MONSOON, so I suppose I can do it in a similar way.

nclancy@xcslc0:~/roses/u-bm066/site> vi suite.rc.MONSOON


inherit = None, METO_XC40

{#- We need different directives for the shared queue #}
-q = shared
-l ncpus = 2
-l walltime = 02:00:00

I've changed the above field to "-l walltime = 03:00:00", so I will see if that works.

The error message I got was as follows:
=>> PBS: job killed: walltime 7225 exceeded limit 7200
2020-02-09T04:06:22Z CRITICAL - failed/TERM

7200 seconds is 2 hours, so I need slightly more walltime.
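The arithmetic behind that limit can be sketched with a small helper (a hypothetical function, not part of the suite; PBS walltimes are HH:MM:SS strings):

```python
def walltime_to_seconds(walltime: str) -> int:
    """Convert a PBS-style HH:MM:SS walltime string to seconds."""
    hours, minutes, seconds = (int(part) for part in walltime.split(":"))
    return hours * 3600 + minutes * 60 + seconds

# The old limit and the new request:
print(walltime_to_seconds("02:00:00"))  # 7200 seconds (2 hours)
print(walltime_to_seconds("03:00:00"))  # 10800 seconds (3 hours)
```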

I'm re-running to see if that works and I will let you know the result.



comment:4 Changed 5 months ago by NoelClancy


I made the change, re-ran the suite and it worked.


Ticket Closed
