Opened 2 years ago
Closed 2 years ago
#2745 closed help (fixed)
Error archiving to mass
| Reported by: | nx902220 | Owned by: | um_support |
|---|---|---|---|
| Component: | Archiving | Keywords: | |
| Cc: | | Platform: | Monsoon2 |
| UM Version: | 10.5 | | |
Description
Hi,
I am running my nesting suite u-bc220 on Monsoon. It runs successfully until the very last stage, which archives the 55 m nest to MASS.
The error message is:
[FAIL] ????????????????????????????????????????????????????????????????????????????????
[FAIL] ???!!!???!!!???!!!???!!!???!!! ERROR ???!!!???!!!???!!!???!!!???!!!
[FAIL] ? Error code: 20
[FAIL] ? Error from routine: READFLDS
[FAIL] ? Error message: READFLDS: start address of field not given
[FAIL] ? Error from processor: 0
[FAIL] ? Error number: 0
[FAIL] ????????????????????????????????????????????????????????????????????????????????
I have tried running
moo put -f -c umpp /home/d04/lblunn/cylc-run/u-bc220/share/cycle/20160504T0300Z/55m_um/55m.pp0 moose:/crum/u-bc220/apa.pp/20160504T0300Z_55m.pp0
in the command line and I get the same error.
The 55 m archive has worked in the past. The only change I can think of is that I have added extra STASH output, which makes the .pp0 file bigger (25 GB). Is there a file size limit?
Please can you help with this?
Thanks,
Lewis
Change History (7)
comment:1 Changed 2 years ago by grenville
comment:2 Changed 2 years ago by nx902220
Hi Grenville,
Thank you for getting back to me. The conversion to 32-bit works. However, when I run moo put on the 55m_um.pp0.pp file, it fails with this error message:
IO: Open: /home/d04/lblunn/55m_um.pp0.pp on unit 11
Request for 1464027307505440089 words, is not supported by a buffer of size 100
IO Error Report *
Unit Generating error= 11
----File States ---------------------------------------
Unit 11 open on filename /home/d04/lblunn/55m_um.pp0.pp
---> File Type: 0 , Read Only: T , Write Only: F
---> Local: T AllLocal: F Remote: F Broadcast: T
---> Local: T AllLocal: F Remote: F Broadcast: T
----End File States ---------------------------------
????????????????????????????????????????????????????????????????????????????????
???!!!???!!!???!!!???!!!???!!! ERROR ???!!!???!!!???!!!???!!!???!!!
? Error code: 24
? Error from routine: io:buffin
? Error message: Supplied buffer too small
? Error from processor: 0
? Error number: 0
????????????????????????????????????????????????????????????????????????????????
Lewis
comment:3 Changed 2 years ago by nx902220
I have done some more tests:
1) I have tried archiving a smaller 55 m file (~0.5GB) but this also fails.
lblunn@xcslc0:~/cylc-run/u-bc220/share/cycle/20160504T0300Z/55m_um> moo put -f -c umpp /home/d04/lblunn/cylc-run/u-bc220/share/cycle/20160504T0300Z/55m_um/55m.pp1 moose:/crum/u-bc220/apa.pp/20160504T0300Z_55m.pp1
### put, command-id=668269981, estimated-cost=426696704byte(s), files=1
### /home/d04/lblunn/cylc-run/u-bc220/share/cycle/20160504T0300Z/55m_um/55m.pp1: converted file format.
/home/d04/lblunn/cylc-run/u-bc220/share/cycle/20160504T0300Z/55m_um/55m.pp1: file transfer failure.
- task #0 (attempt 1 of 3): transfer failed (ERROR_TRANSFER).
/home/d04/lblunn/cylc-run/u-bc220/share/cycle/20160504T0300Z/55m_um/55m.pp1: file transfer failure.
- task #0 (attempt 2 of 3): transfer failed (ERROR_TRANSFER).
/home/d04/lblunn/cylc-run/u-bc220/share/cycle/20160504T0300Z/55m_um/55m.pp1: file transfer failure.
- task #0 (attempt 3 of 3): transfer failed (ERROR_TRANSFER).
uk.gov.meto.moose.business.command.exception.RetryableFileTransferException: uk.gov.meto.moose.business.ftpclient.service.RetryableFtpClientException: FTPService error: unable to login reply: 550 Requested action not taken.
put: failed (3)
2) I tried archiving a 100 m file and this succeeded:
lblunn@xcslc0:~/cylc-run/u-bc220/share/cycle/20160504T0300Z/100m_um> moo put -f -c umpp /home/d04/lblunn/cylc-run/u-bc220/share/cycle/20160504T0300Z/100m_um/100m.pp1 moose:/crum/u-bc220/apa.pp/20160504T0300Z_100m.pp1
### put, command-id=668270940, estimated-cost=148312064byte(s), files=1
### /home/d04/lblunn/cylc-run/u-bc220/share/cycle/20160504T0300Z/100m_um/100m.pp1: converted file format.
lblunn@xcslc0:~/cylc-run/u-bc220/share/cycle/20160504T0300Z/100m_um> cd ../55m_um
The error I'm getting seems to be associated with my 55 m nest only.
3) My 55 m nest has worked in the past as can be seen from a moo ls:
lblunn@xcslc0:~/cylc-run/u-bc220/share/cycle/20160504T0300Z/55m_um> moo ls -lt moose:/crum/u-bc220/apa.pp/
F adrian.hill 0.08 GBP 3020656144 2018-11-05 01:36:58 GMT moose:/crum/u-bc220/apa.pp/20160504T0300Z_55m.pp0
F adrian.hill 0.01 GBP 319708560 2019-01-24 18:34:10 GMT moose:/crum/u-bc220/apa.pp/20160504T0300Z_ukv.pp1
F adrian.hill 0.02 GBP 886195448 2019-01-24 18:34:27 GMT moose:/crum/u-bc220/apa.pp/20160504T0300Z_ukv_dymeaa_pd015.cutout
F adrian.hill 0.01 GBP 282094904 2019-01-24 18:35:09 GMT moose:/crum/u-bc220/apa.pp/20160504T0300Z_ukv_dymeaa_pe015.cutout
……
I'm not sure if these tests help.
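For reference, the byte counts in the listing above can be converted to GB with a short Python sketch. The sizes are copied from the `moo ls` output; the helper name `to_gb` and the use of decimal GB (10^9 bytes) are my own assumptions.

```python
def to_gb(n_bytes: int) -> float:
    """Convert a byte count to decimal gigabytes (assumes 1 GB = 10**9 bytes)."""
    return n_bytes / 1e9

# Sizes in bytes, copied from the moo ls listing above.
sizes = {
    "20160504T0300Z_55m.pp0": 3020656144,
    "20160504T0300Z_ukv.pp1": 319708560,
    "20160504T0300Z_ukv_dymeaa_pd015.cutout": 886195448,
    "20160504T0300Z_ukv_dymeaa_pe015.cutout": 282094904,
}

for name, n_bytes in sizes.items():
    print(f"{name}: {to_gb(n_bytes):.2f} GB")
```

On these numbers the previously archived 55m.pp0 is about 3.02 GB, compared with roughly 25 GB for the current file.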
Thanks,
Lewis
comment:4 Changed 2 years ago by grenville
Lewis
Request for 1464027307505440089 words, is not supported by a buffer of size 100 - this looks like an endianness problem - but that file seems to have disappeared?
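To illustrate why a nonsensical word count like this points to byte order, here is a minimal Python sketch. It assumes only that a length field is stored as an 8-byte integer in one byte order and read back in the other. The value 256 and the function name `read_both_orders` are illustrative; they are not taken from the UM code or from this file.

```python
import struct

def read_both_orders(raw: bytes) -> tuple[int, int]:
    """Interpret 8 raw bytes as a signed 64-bit integer in both byte orders."""
    big = struct.unpack(">q", raw)[0]     # big-endian interpretation
    little = struct.unpack("<q", raw)[0]  # little-endian interpretation
    return big, little

# Hypothetical length field: 256 written as a big-endian 64-bit integer.
raw = struct.pack(">q", 256)
big, little = read_both_orders(raw)

print(big)     # 256
print(little)  # 281474976710656, i.e. 2**48
```

A small, sensible length read with the wrong byte order becomes an enormous one, which is the kind of request no buffer can satisfy.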
transfer failed (ERROR_TRANSFER) - this appears to be a new problem; please email Monsoon - they may have more information.
Grenville
comment:5 Changed 2 years ago by nx902220
Hi Grenville,
Thanks, will do. I'll let you know if the issue is resolved.
Lewis
comment:6 Changed 2 years ago by willie
- Component changed from Monsoon to Archiving
- Platform set to Monsoon2
- UM Version set to 10.5
comment:7 Changed 2 years ago by willie
- Resolution set to fixed
- Status changed from new to closed
Hi Lewis
Odd - it may be the file size. Try converting it to 32-bit pp with:
/projects/um1/vn10.5/xc40/utilities/um-convpp /home/d04/lblunn/cylc-run/u-bc220/share/cycle/20160504T0300Z/55m_um/55m.pp0 ~/55m_um.pp0.pp
then run the moo put on ~/55m_um.pp0.pp and see what that does.
Grenville