Opened 6 weeks ago

Closed 6 weeks ago

#3480 closed help (fixed)

pptransfer fail

Reported by: yb19052 Owned by: um_support
Component: SSH Keywords:
Cc: Platform: NEXCS
UM Version: 10.7

Description

Hi,

I am running three suites (u-by975, u-bz669, and u-bo840), and the pptransfer task failed in all of them with the following message:

[WARN] file:atmospp.nl: skip missing optional source: namelist:moose_arch
[WARN] file:nemocicepp.nl: skip missing optional source: namelist:moose_arch
[WARN]  [SUBPROCESS]: Command: rsync -av --stats --rsync-path=mkdir -p /gws/nopw/j04/pmip4_vol2/users/kizumi/u-by975/20090701T0000Z && rsync /projects/nexcs-n02/kizumi/u-by975/20090701T0000Z/ jasmin-xfer2.ceda.ac.uk:/gws/nopw/j04/pmip4_vol2/users/kizumi/u-by975/20090701T0000Z
[SUBPROCESS]: Error = 255:
	ssh: connect to host jasmin-xfer2.ceda.ac.uk port 22: Connection timed out
rsync: connection unexpectedly closed (0 bytes received so far) [sender]
rsync error: unexplained error (code 255) at io.c(641) [sender=3.0.4]

[WARN]  Transfer command failed: rsync -av --stats --rsync-path="mkdir -p /gws/nopw/j04/pmip4_vol2/users/kizumi/u-by975/20090701T0000Z && rsync" /projects/nexcs-n02/kizumi/u-by975/20090701T0000Z/ jasmin-xfer2.ceda.ac.uk:/gws/nopw/j04/pmip4_vol2/users/kizumi/u-by975/20090701T0000Z
[ERROR]  transfer.py: Unknown Error - Return Code=255
[FAIL]  Command Terminated
[FAIL] Terminating PostProc...
[FAIL] transfer.py # return-code=1
2021-02-25T08:51:10Z CRITICAL - failed/EXIT

When I used "Trigger (run now)" in the GUI, the task started and then failed a few minutes later.

Our GWS storage still has enough free space:

quobyte@sds.jc.rl.ac.uk/gws_pmip4_vol2     219T  187T   33T  86% /gws/nopw/j04/pmip4_vol2


How do I fix the problem?

Thanks
Kenji

Change History (7)

comment:1 Changed 6 weeks ago by ros

Hi Kenji,

Did you have pptransfer working ok previously or is this the first time?

Please make sure you can run ssh jasmin-xfer2.ceda.ac.uk on the NEXCS command line and get to JASMIN without being prompted for a passphrase.
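
One way to make this check non-interactive is sketched below. It assumes a standard OpenSSH client on NEXCS; check_xfer_host is a hypothetical helper name, not part of pptransfer or Rose. With BatchMode enabled, ssh fails instead of prompting, which is closer to how the transfer task sees the connection.

```shell
#!/bin/sh
# Sketch only: check_xfer_host is a hypothetical helper, not part of pptransfer.
# Usage: check_xfer_host <host> [timeout-in-seconds]
check_xfer_host() {
    host=$1
    timeout=${2:-10}   # seconds to wait for the TCP connection
    # BatchMode=yes: fail rather than prompt for a passphrase or password.
    # ConnectTimeout: give up quickly if the host cannot be reached.
    if ssh -o BatchMode=yes -o ConnectTimeout="$timeout" "$host" true; then
        echo "$host: OK"
    else
        status=$?
        echo "$host: FAILED (ssh exit status $status)"
        return 1
    fi
}

# Example (run on the NEXCS command line):
#   check_xfer_host jasmin-xfer2.ceda.ac.uk
```

If this prints FAILED, the message ssh writes just before it usually indicates whether the host was unreachable or the login was rejected.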

Regards,
Ros.

comment:2 Changed 6 weeks ago by ros

Hi Kenji,

Sorry, I've just realised that the hostname is incorrect. The JASMIN high performance transfer server is now called hpxfer1.jasmin.ac.uk.

Cheers,
Ros.

comment:3 Changed 6 weeks ago by yb19052

Hi Ros,

Thank you for the responses.

I have now rerun the pptransfer task using hpxfer1.jasmin.ac.uk, but I still have the same problem.

[WARN] file:atmospp.nl: skip missing optional source: namelist:moose_arch
[WARN] file:nemocicepp.nl: skip missing optional source: namelist:moose_arch
[WARN]  [SUBPROCESS]: Command: rsync -av --stats --rsync-path=mkdir -p /gws/nopw/j04/pmip4_vol2/users/kizumi/u-by975/20090701T0000Z && rsync /projects/nexcs-n02/kizumi/u-by975/20090701T0000Z/ hpxfer1.jasmin.ac.uk:/gws/nopw/j04/pmip4_vol2/users/kizumi/u-by975/20090701T0000Z
[SUBPROCESS]: Error = 255:
	ssh: connect to host hpxfer1.jasmin.ac.uk port 22: Connection timed out
rsync: connection unexpectedly closed (0 bytes received so far) [sender]
rsync error: unexplained error (code 255) at io.c(641) [sender=3.0.4]

[WARN]  Transfer command failed: rsync -av --stats --rsync-path="mkdir -p /gws/nopw/j04/pmip4_vol2/users/kizumi/u-by975/20090701T0000Z && rsync" /projects/nexcs-n02/kizumi/u-by975/20090701T0000Z/ hpxfer1.jasmin.ac.uk:/gws/nopw/j04/pmip4_vol2/users/kizumi/u-by975/20090701T0000Z
[ERROR]  transfer.py: Unknown Error - Return Code=255
[FAIL]  Command Terminated
[FAIL] Terminating PostProc...
[FAIL] transfer.py # return-code=1
2021-02-25T11:49:00Z CRITICAL - failed/EXIT

pptransfer worked OK before, and "hpxfer" access is active for my account.

Thanks
Kenji


comment:4 Changed 6 weeks ago by ros

Hi Kenji,

Can you ssh to hpxfer1.jasmin.ac.uk from the NEXCS command line without being prompted for a passphrase?

If you haven't already, you will need to update your ~/.ssh/config file appropriately.
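
As a rough illustration only, an entry for the new hostnames might look like the fragment below. The username and key path are placeholders (assumptions, not taken from this ticket), and your existing entry for the old hostname may carry other options that should be copied across.

```
# ~/.ssh/config (sketch; the User and IdentityFile values are placeholders)
Host hpxfer1.jasmin.ac.uk hpxfer2.jasmin.ac.uk
    # Your JASMIN username, if it differs from your NEXCS username
    User your_jasmin_username
    # The private key registered with your JASMIN account
    IdentityFile ~/.ssh/your_jasmin_key
```

Listing both hosts on the Host line applies the same settings to either transfer server.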

Regards,
Ros.

comment:5 Changed 6 weeks ago by yb19052

Hi Ros,

Finally, the pptransfer task works ok using hpxfer2.jasmin.ac.uk.
I do not know why hpxfer1.jasmin.ac.uk does not work for me.

I appreciate your help.

Thanks
Kenji

comment:6 Changed 6 weeks ago by ros

Hi Kenji,

Glad you've got it working again.

Authentication to hpxfer1 and hpxfer2 is the same, so I'm unable to shed any light on that.
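
For anyone hitting something similar: the logs in this ticket show "Connection timed out" on port 22, which ssh reports when the TCP connection is never established, i.e. before authentication is attempted. The sketch below separates the common cases from a pasted job log. classify_ssh_error is a hypothetical helper name, and the three patterns are the usual OpenSSH client messages, not an exhaustive list.

```shell
#!/bin/sh
# Sketch only: classify_ssh_error is a hypothetical helper.
# Reads log text on stdin and prints a coarse failure category.
classify_ssh_error() {
    log=$(cat)
    case $log in
        *"Connection timed out"*)
            echo "network: no TCP connection to the host" ;;
        *"Could not resolve hostname"*)
            echo "dns: hostname could not be resolved" ;;
        *"Permission denied"*)
            echo "auth: connected, but the key or account was rejected" ;;
        *)
            echo "unknown" ;;
    esac
}

# Example: classify_ssh_error < job.err
```

A "network" result points at reachability of that particular host from NEXCS rather than at keys or account settings.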

I'll close this ticket now.

Cheers,
Ros.

comment:7 Changed 6 weeks ago by ros

  • Resolution set to fixed
  • Status changed from new to closed