Opened 8 years ago

Closed 8 years ago

#962 closed error (fixed)

corrupted files on hector's /nerc disk

Reported by: jonny
Owned by: um_support
Component: HECToR
Keywords: corrupt UM files
Cc:
Platform: HECToR
UM Version: 6.1

Description

Hi,
I have some issues with UM files I transferred to the /nerc disk using rsync. I am not sure what has caused this.

Many of my files on hector e.g.
/nerc/n02/n02/jonny/xhbma/archive/xhbmao.pmq8dec
seem to have a problem i.e.

jonny@hector-xe6-3:/nerc/n02/n02/jonny/xhbma/archive> uminfo xhbmao.pmq8dec

Error file xhbmao.pmq8dec has unknown file type
Error in getting file type
forrtl: error (76): IOT trap signal
Aborted

For this file (though not for all of the affected files) I have a copy that I downloaded directly from the work disk onto jasmin, and that copy seems fine, i.e.:

jasmin1$ uminfo xhbmao.pmq8dec | more

header ( 1) = 20
header ( 2) = 2
header ( 3) = 4
header ( 4) = 0
header ( 5) = 3
header ( 8) = 2
header ( 9) = 2
header ( 10) = 0
header ( 12) = 601
header ( 21) = 2068
header ( 22) = 12
header ( 23) = 1
header ( 24) = 0
header ( 25) = 0
header ( 26) = 0
header ( 27) = 331
header ( 28) = 2069
header ( 29) = 1
header ( 30) = 1
header ( 31) = 0
header ( 32) = 0

Has this problem been observed before? Do you have any idea what might have caused this? And do you think any such afflicted files are recoverable?

Thanks
Jonny

Change History (11)

comment:1 Changed 8 years ago by grenville

Jonny

Try running md5sum on both files. Do you get the same checksum?

Grenville
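When many files need the same check, the comparison can be scripted. The following is a minimal Python sketch, not part of any UM tool; the names `md5_of` and `same_checksum` are hypothetical. It reads each file in chunks so that large UM files do not need to fit in memory. A file that cannot be read raises `OSError` rather than returning a checksum.

```python
import hashlib


def md5_of(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_checksum(path_a, path_b):
    """Return True if both files have identical MD5 digests."""
    return md5_of(path_a) == md5_of(path_b)
```

Calling `same_checksum` with the path of each copy gives the same answer as comparing the two `md5sum` outputs by eye.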

comment:2 Changed 8 years ago by jonny

Hi Grenville,
I get:
jasmin1$ md5sum xhbmao.pmq8dec
d79aff1f2da6c61a480c073dda54d4f0 xhbmao.pmq8dec

for the good one on jasmin, and
jonny@hector-xe6-3:/nerc/n02/n02/jonny/xhbma/archive> md5sum xhbmao.pmq8dec
md5sum: xhbmao.pmq8dec: Input/output error

for the one on hector.

cheers
Jonny

comment:3 Changed 8 years ago by grenville

Jonny

Your data on jasmin also seems to be corrupted - uminfo works on xhbmao.pmq8dec, but xconv produces lots of errors, and won't display data (try viewing potential temperature for example), whereas the same file on /nerc can be coaxed to show the data.

Grenville

comment:4 Changed 8 years ago by jonny

Hi Grenville,
That's a bit weird: I can view the potential temperature fields in xhbmao.pmq8dec fine on jasmin in xconv. Which version are you using? I am using /home/jeff/linux_x86_64/bin/xconv1.92 on jasmin1.

cheers
Jonny

comment:5 Changed 8 years ago by grenville

I copied /home/jonny/links/hadgem1_2/ctrl/raw/um/xhbmao.pmq8dec to my home directory and the two files don't have the same checksum. My copy is bad!

comment:6 Changed 8 years ago by grenville

Oh, I see what happened - I managed to mangle my copy of xhbmao.pmq8dec - all is fine on jasmin

comment:7 Changed 8 years ago by emb66

Dear Grenville and Jonny,

I'm having the same problem - my files on the /nerc disk got corrupted and I can't open a large proportion of them. I ran md5sum on them when I moved the files from /work to /nerc and it was all fine. Now md5sum shows loads of I/O errors. Moreover, a while ago I copied some files from /nerc directly to our local computer and these files are absolutely fine, while the ones on /nerc are corrupted.

Ewa
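The check described above (record checksums at copy time, then re-check later) can be sketched in Python. This is only an illustration under stated assumptions: `md5_or_error`, `build_manifest` and `verify_manifest` are hypothetical names, the manifest is a plain in-memory dictionary, and only files directly inside one directory are considered. Read failures such as I/O errors are caught as `OSError` and reported instead of aborting the scan.

```python
import hashlib
import os


def md5_or_error(path, chunk_size=1 << 20):
    """Return ("ok", hexdigest), or ("error", message) if the file cannot be read."""
    digest = hashlib.md5()
    try:
        with open(path, "rb") as handle:
            for chunk in iter(lambda: handle.read(chunk_size), b""):
                digest.update(chunk)
    except OSError as exc:
        # Covers missing files and low-level read failures alike.
        return ("error", str(exc))
    return ("ok", digest.hexdigest())


def build_manifest(directory):
    """Map each readable regular file name in `directory` to its MD5 digest."""
    manifest = {}
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            status, value = md5_or_error(path)
            if status == "ok":
                manifest[name] = value
    return manifest


def verify_manifest(directory, manifest):
    """Return {file name: problem} for files that are unreadable or have changed."""
    problems = {}
    for name, expected in manifest.items():
        status, value = md5_or_error(os.path.join(directory, name))
        if status == "error":
            problems[name] = "unreadable: " + value
        elif value != expected:
            problems[name] = "checksum mismatch"
    return problems
```

Building the manifest right after a transfer and running `verify_manifest` later separates files that can no longer be read from files whose contents have silently changed.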

comment:8 Changed 8 years ago by emb66

And I don't know whether this is of any help, but the corruption must have happened during the last 7 days, as I have another set of files on /nerc that I was working on last week and they were 100% fine back then. Now they are also partly corrupted…

comment:9 Changed 8 years ago by jonny

Dear Grenville and Ewa,
I can second that! I did some work with my files on /nerc on 1st Nov which suggests that they were fine then.

Jonny

comment:10 Changed 8 years ago by grenville

Jonny, Ewa

I have forwarded your messages to HECToR - please direct queries to HECToR and cc us about this.

Grenville

comment:11 Changed 8 years ago by grenville

  • Platform changed from <select platform> to HECToR
  • Resolution set to fixed
  • Status changed from new to closed

HECToR problem with the RDF now fixed
