Opened 2 years ago

Closed 2 years ago

#2719 closed help (fixed)

Unable to submit run

Reported by: anmcr Owned by: um_support
Component: UM Model Keywords:
Cc: Platform: Monsoon2
UM Version: 10.4



I had a vn10.4 of the nested suite running a couple of years ago on Monsoon (pre Monsoon 2), called u-ag300. I would like to re-run it, but am unable to submit the run. I get the error 'RosePopenError' 'No hosts selected'. See attachment for more info. I would be grateful for some advice.

Many thanks,


Attachments (2)

ncas_ticket.PNG (128.1 KB) - added by anmcr 2 years ago.
for_willie.PNG (118.1 KB) - added by anmcr 2 years ago.


Change History (12)

Changed 2 years ago by anmcr

comment:1 Changed 2 years ago by willie

Hi Andrew,

The clue is in the first part of the error message: SSH has failed because it does not recognise the computer 'xcf'.

In your site/monsoon-cray-xc40/suite-adds.rc file, change 'xcf' in the first line to 'xcs' and try again.


comment:2 Changed 2 years ago by anmcr

Dear Willie,

Thanks for the reply and looking at the problem.

I made the change you suggested, but I'm still getting the 'no hosts selected' error when I submit. See attachment.

Best wishes,


Changed 2 years ago by anmcr

comment:3 Changed 2 years ago by willie

Hi Andrew,

The first problem has been solved. This is a new problem of the same type: it does not understand what 'linux' is. If you type

rose host-select linux

on exvmsrose you will see the error.

Looking at your site/monsoon-cray-xc40/suite-adds.rc you will see

{% set IDL_SERVER = "linux" %}

near the top. I think you can just change linux to postproc and then it will find the computer.
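
Both host problems above came from stale host names in suite-adds.rc. As a minimal sketch (not part of Rose or the suite), the Python below lists quoted-string Jinja2 'set' statements of the form shown above and reports those whose value is in a set of names known to be stale. The helper names, the OTHER setting in the sample text and the set of stale names are assumptions for illustration. The sketch only handles simple quoted-string assignments, so it may not match the first line of the file where 'xcf' appeared, since that line is not shown in this ticket.

```python
import re

# Matches simple Jinja2 assignments such as: {% set IDL_SERVER = "linux" %}
SET_RE = re.compile(r'{%-?\s*set\s+(\w+)\s*=\s*["\']([^"\']*)["\']\s*-?%}')


def jinja2_string_settings(text):
    """Return {name: value} for every quoted-string 'set' statement in text."""
    return dict(SET_RE.findall(text))


def stale_settings(text, stale_names):
    """Return the settings whose value is one of the given stale host names."""
    return {
        name: value
        for name, value in jinja2_string_settings(text).items()
        if value in stale_names
    }


# Sample text: IDL_SERVER is from the ticket, OTHER is hypothetical.
sample = '{% set IDL_SERVER = "linux" %}\n{% set OTHER = "postproc" %}\n'
print(stale_settings(sample, {"xcf", "linux"}))
```

Any name reported this way can then be checked with rose host-select, as described above, before the suite is resubmitted.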


comment:4 Changed 2 years ago by anmcr

Hi again Willie,

Thank you again for looking at this issue, and resolving it.

I'm afraid that I've met another hitch in the reconfiguration stage of the global model, that has me stumped. The STDOUT file is here: /home/d01/amworr/cylc-run/u-ag300/log/job/20110118T0000Z/glm_um_recon/02/job.out. The STDERR file is here: /home/d01/amworr/cylc-run/u-ag300/log/job/20110118T0000Z/glm_um_recon/02/job.err.

You will see that there is almost no information as to why this step has failed. The only clue is 'line 80: /home/d01/amworr/cylc-run/u-ag300/share/data/etc/ancil_versions_dm: No such file or directory'. I have never come across this problem, and couldn't find it in any previous tickets.

I would be grateful if you could please advise.

Many thanks,


comment:5 Changed 2 years ago by willie

Hi Andrew,

Your ancil_versions_dm file is missing. This is because it is a link

 /home/d01/amworr/cylc-run/u-ag300/share/data/etc/ancil_versions_dm -> /home/swebst/CAP/ancil_versions/n512e/GA6.0/latest/ancils

Stuart's user name was changed from swebst to hadsw, so you will find this file under /home/d03/hadsw.
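
A 'No such file or directory' error for a path that is visible in a directory listing is typical of a symbolic link whose target has moved, as happened here. The following is a minimal sketch, assuming plain Python on a POSIX file system, of how dangling links under a directory could be listed together with their targets. The function name broken_symlinks is an assumption for illustration and is not part of Rose or cylc.

```python
import os


def broken_symlinks(root):
    """Yield (link_path, target) for each symlink under root whose target is missing."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # islink() looks at the link itself; exists() follows the link,
            # so it is False when the target has been moved or deleted.
            if os.path.islink(path) and not os.path.exists(path):
                yield path, os.readlink(path)
```

Calling list(broken_symlinks(d)) on a suite's share/data/etc directory would be expected to report a link such as ancil_versions_dm along with the old target, which shows which path needs updating.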


comment:6 Changed 2 years ago by anmcr

Hi Willie,

Thanks for fixing this.

I have one small remaining problem. The run is failing when it tries to archive the files with the error 'attempted to archive a zero-length file', which refers to a file with the ending 'pc000.pp'. See error output below. I don't know how this is being produced, as I am not archiving any files with 'pc000'. I was in a previous run, so I ran 'rose-suite-run --new' in order to have a clean start, but the problem still persists.

Are you able to please advise?



This computer is provided for the processing of official information.
Unauthorised access described in Met Office SyOps may constitute a criminal offence.
All activity on the system is liable to monitoring.
[FAIL] moo put -f /home/d01/amworr/cylc-run/u-be146/work/20140701T0000Z/AntarcticCORDEX_0p44deg_ga6_archive/tmpQETzXU/20140701T0000Z_AntarcticCORDEX_0p44deg_ga6_pvera000.pp /home/d01/amworr/cylc-run/u-be146/work/20140701T0000Z/AntarcticCORDEX_0p44deg_ga6_archive/tmpQETzXU/20140701T0000Z_AntarcticCORDEX_0p44deg_ga6_pb000.pp /home/d01/amworr/cylc-run/u-be146/work/20140701T0000Z/AntarcticCORDEX_0p44deg_ga6_archive/tmpQETzXU/20140701T0000Z_AntarcticCORDEX_0p44deg_ga6_pverb000.pp /home/d01/amworr/cylc-run/u-be146/work/20140701T0000Z/AntarcticCORDEX_0p44deg_ga6_archive/tmpQETzXU/20140701T0000Z_AntarcticCORDEX_0p44deg_ga6_pverd000.pp /home/d01/amworr/cylc-run/u-be146/work/20140701T0000Z/AntarcticCORDEX_0p44deg_ga6_archive/tmpQETzXU/20140701T0000Z_AntarcticCORDEX_0p44deg_ga6_pverc000.pp /home/d01/amworr/cylc-run/u-be146/work/20140701T0000Z/AntarcticCORDEX_0p44deg_ga6_archive/tmpQETzXU/20140701T0000Z_AntarcticCORDEX_0p44deg_ga6_pa000.pp /home/d01/amworr/cylc-run/u-be146/work/20140701T0000Z/AntarcticCORDEX_0p44deg_ga6_archive/tmpQETzXU/20140701T0000Z_AntarcticCORDEX_0p44deg_ga6_pc000.pp moose:/devfc/u-be146/field.pp/ # return-code=11, stderr=
[FAIL] /home/d01/amworr/cylc-run/u-be146/work/20140701T0000Z/AntarcticCORDEX_0p44deg_ga6_archive/tmpQETzXU/20140701T0000Z_AntarcticCORDEX_0p44deg_ga6_pc000.pp: (ERROR_CLIENT_ZERO_LENGTH_FILE) attempted to archive a zero-length file.
[FAIL] put: failed (11)
[FAIL] ! moose:/devfc/u-be146/field.pp/ [compress=None, t(init)=2019-01-16T08:55:40Z, dt(tran)=1s, dt(arch)=3s, ret-code=11]
[FAIL] ! 20140701T0000Z_AntarcticCORDEX_0p44deg_ga6_pa000.pp (umnsaa_pa000)
[FAIL] ! 20140701T0000Z_AntarcticCORDEX_0p44deg_ga6_pb000.pp (umnsaa_pb000)
[FAIL] ! 20140701T0000Z_AntarcticCORDEX_0p44deg_ga6_pc000.pp (umnsaa_pc000)
[FAIL] ! 20140701T0000Z_AntarcticCORDEX_0p44deg_ga6_pvera000.pp (umnsaa_pvera000)
[FAIL] ! 20140701T0000Z_AntarcticCORDEX_0p44deg_ga6_pverb000.pp (umnsaa_pverb000)
[FAIL] ! 20140701T0000Z_AntarcticCORDEX_0p44deg_ga6_pverc000.pp (umnsaa_pverc000)
[FAIL] ! 20140701T0000Z_AntarcticCORDEX_0p44deg_ga6_pverd000.pp (umnsaa_pverd000)
2019-01-16T08:55:46Z CRITICAL - failed/EXIT

comment:7 Changed 2 years ago by willie

Hi Andrew,

If you look at the file


in xconv, you will see it has no data. This will cause the pp conversion to fail. Did you run for long enough to get output?


comment:8 Changed 2 years ago by anmcr

Hi Willie,

Thanks for the reply.

What I don't understand is how the 'umsaa_pc000' file is being produced. In my usage profile I am only using '60_diags' and '61_diags', which are associated with 'pp0' and 'pp1', which in the model output streams refer to 'umsaa_pa000' and 'umsaa_pb000'.

Best wishes,


comment:9 Changed 2 years ago by anmcr

Dear Willie,

I just deleted the file, and it archived properly. The 'umsaa_pc000' file was left over from an earlier run.

Please close this ticket, and thanks again for your help.
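
The archive step failed because moo put rejected a zero-length file, which turned out to be left over from an earlier run. A minimal sketch of a check that could be run on a directory before archiving is given below. The function name zero_length_files and the default '.pp' suffix are assumptions for illustration. The function only reports empty files and does not delete anything, since it cannot tell whether a file is stale.

```python
import os


def zero_length_files(directory, suffix=".pp"):
    """Return sorted paths of regular files in directory that end with suffix and are empty."""
    empty = []
    for entry in os.scandir(directory):
        # Only regular files (or links to them) with the wanted suffix are checked.
        if entry.is_file() and entry.name.endswith(suffix) and entry.stat().st_size == 0:
            empty.append(entry.path)
    return sorted(empty)
```

Any path returned can then be inspected, for example in xconv, and removed by hand if it is a leftover that the current run does not produce.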


comment:10 Changed 2 years ago by willie

  • Resolution set to fixed
  • Status changed from new to closed