Opened 11 months ago

Closed 9 months ago

#2323 closed help (fixed)

Help with Lotus Queue Settings

Reported by: dmhg Owned by: pmcguire
Priority: normal Component: JULES
Keywords: JASMIN/Lotus Cc:
Platform: Other UM Version: <select version>

Description

I hope it's OK to ask a couple of questions about running Jules on Jasmin-lotus.

1)

I am attempting to run a Jules job which works on Monsoon2.

The suite.rc is set up differently for Monsoon (the -q setting, among others).

Here is the Monsoon one:

http://www.precisrcm.com/DMHG/suite.rc.monsoon

Here is my attempt to port that to Jasmin:

http://www.precisrcm.com/DMHG/suite.rc.jasmin

This post from Charlie makes me think the scheduler is killing my running job

http://cms.ncas.ac.uk/ticket/2204

I have tried changing the -W setting to the maximum for par-multi (24:00).

When running on Monsoon2, here are the stats reported at the end of job.out, which show the number of cores used and the memory:

http://www.precisrcm.com/DMHG/stats.txt

On Monsoon the whole run (1980-2010) takes about 36 hours.

How might I alter the suite.rc for Jasmin-lotus so that it runs optimally and the scheduler doesn't kill it?
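
One way to reason about the 24:00 limit: the full run takes about 36 hours on Monsoon2, so it cannot fit in a single par-multi job and would have to be split into shorter pieces. The sketch below is only back-of-envelope arithmetic. It assumes that 1980-2010 is inclusive (31 years), that run time scales evenly with simulated years, and that Jasmin-lotus is about as fast as Monsoon2; none of these is confirmed here. The function name `plan_chunks` and the 0.8 safety factor are hypothetical choices for illustration.

```python
import math


def plan_chunks(start_year, end_year, total_hours, limit_hours, safety=0.8):
    """Split an inclusive year range into chunks that each fit the wall-clock limit.

    Assumes run time is spread evenly over the simulated years.
    """
    n_years = end_year - start_year + 1
    hours_per_year = total_hours / n_years
    # Largest whole number of years that fits in the usable share of the limit.
    years_per_chunk = max(1, math.floor(limit_hours * safety / hours_per_year))
    chunks = []
    year = start_year
    while year <= end_year:
        last = min(year + years_per_chunk - 1, end_year)
        chunks.append((year, last))
        year = last + 1
    return chunks


# Figures from this ticket: about 36 hours in total, 24:00 limit on par-multi.
chunks = plan_chunks(1980, 2010, total_hours=36, limit_hours=24)
for first, last in chunks:
    print(first, last)
```

Under these assumptions the run splits into two jobs. Each later job would need to restart from the state saved by the previous one, and the real timings on Jasmin-lotus may well differ.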

2)

Patrick posted this query, which I don't think was resolved. I emailed him and he said the problem went away, but he wasn't sure why.

http://cms.ncas.ac.uk/ticket/2281

I am also getting these same errors, and I don't know why. Have you seen this before?

http://www.precisrcm.com/DMHG/job.err

Change History (3)

comment:1 in reply to: ↑ description Changed 11 months ago by pmcguire

  • Owner changed from um_support to pmcguire
  • Status changed from new to accepted

Hi David

Are you still having problems with #1 or #2? I actually told you that I still sometimes have problems with #2, and I don't know how to fix it.

Patrick McGuire

comment:2 Changed 9 months ago by pmcguire

FYI, regarding issue #2:
Ticket #2281 is still being worked on. The latest information is that, for the virtual machine, we should be using jasmin-cylc instead of jasmin-sci*. Those https communication errors then go away. However, jasmin-cylc doesn't currently support the Rose/Cylc GUIs, and its Python setup is different from that of jasmin-sci*.

comment:3 Changed 9 months ago by pmcguire

  • Resolution set to fixed
  • Status changed from accepted to closed