#1722 closed help (answered)

Archiving error and "Stale NFS Handle" issue on ARCHER RDF archive

Reported by: gmann
Owned by: um_support
Priority: highest
Component: UM Model
Keywords: rdf
Cc:
Platform: ARCHER
UM Version: 8.4

Description (last modified by ros)

This evening I logged in to ARCHER to check the status of a model run I
submitted earlier in the day and found that it had crashed with an error whilst copying files over to the RDF archive.

The job is xlrfv — it's running UM-UKCA at v8.4 in 8-month continuation steps.

I'd submitted the NRUN on Thursday, which completed its 8 months OK.

Earlier today (Friday 6th Nov) I then submitted the CRUN, and the 1st 8-month
continuation step ran for a few months OK but then failed with an archiving error whilst
copying the files over to the RDF. See the .leave file for that 1st CRUN step at:

/home/n02/n02/gmann/output/xlrfy000.xlrfy.d15303.t191744.leave

The error is "T qsserver failure at Fri Nov 6 21:57:12 GMT 2015"
"qscasedisp: return code after calling qshector_arch RCARC=2"

I just went to see whether the files had been copied over to the RDF and, strangely,
I can't list the files on the archive: it gives a "Stale NFS file handle" error, as shown
in the session below, so I'm wondering whether there might be
a problem with the network connection to the RDF.

Thanks for any help or advice you can give here.

Many thanks
Graham


gmann@eslogin005:~/output> cd
gmann@eslogin005:~> cd /nerc/n02/n02/gmann/xlrfv
-bash: cd: /nerc/n02/n02/gmann/xlrfv: Stale NFS file handle
gmann@eslogin005:~> cd /nerc/n02/n02/gmann/archive
-bash: cd: /nerc/n02/n02/gmann/archive: Stale NFS file handle
gmann@eslogin005:~> cd /nerc/n02/n02/
-bash: cd: /nerc/n02/n02/: Stale NFS file handle
gmann@eslogin005:~> ls


file xlrfva.pa19911021 is a byte swapped 64 bit ieee um file 
qscasedisp: return code after calling qshector_arch RCARC=0
qshector_arch: Successfull xlrfva.pa19911021 time taken 6.8539999999999992 seconds.
qsserver: Fri Nov  6 21:20:16 GMT 2015:  xlrfva.pa19911021 DELETE
xlrfva.pa19911021 deleted
qsserver: Fri Nov  6 21:24:02 GMT 2015:  xlrfva.pa19911022 ARCHIVE PPNOCHART

file xlrfva.pa19911022 is a byte swapped 64 bit ieee um file 
%%% /work/n02/n02/gmann/xlrfv/xlrfva.pa19911022 DELETE             
 ============================================================================== 
 =================================== ERRFLAG ================================== 
 ============================================================================== 
T qsserver failure at Fri Nov 6 21:24:08 GMT 2015
qscasedisp: return code after calling qshector_arch RCARC=2
 ============================================================================== 
 =================================== ERRFLAG ================================== 
 ============================================================================== 
T qsserver failure at Fri Nov 6 21:57:12 GMT 2015
0+1 records in
0+1 records out
35515 bytes (36 kB) copied, 0.00331741 s, 10.7 MB/s
Archiving failure:restart file moved to 
mv: missing destination file operand after `/xlrfv*'
Try `mv --help' for more information.
*****************************************************************
****************************************************************
     Job ended at :  Fri Nov  6 21:57:20 GMT 2015
****************************************************************
*****************************************************************
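
The "Stale NFS file handle" message in the session above corresponds to the operating system error code ESTALE. As a minimal sketch only, and not part of the UM archiving scripts, a check along the following lines could tell a stale mount apart from other access failures before an archiving step is attempted. The function names `is_stale_error` and `archive_dir_usable` are hypothetical and chosen purely for illustration.

```python
import errno
import os


def is_stale_error(exc):
    """Return True if exc is an OSError carrying ESTALE (stale NFS file handle)."""
    return isinstance(exc, OSError) and exc.errno == errno.ESTALE


def archive_dir_usable(path):
    """Hypothetical pre-archive check: can the directory be listed at all?"""
    try:
        os.listdir(path)
    except OSError as exc:
        if is_stale_error(exc):
            # Same condition the shell reports as "Stale NFS file handle".
            print(f"{path}: stale NFS file handle")
        else:
            print(f"{path}: not accessible ({exc.strerror})")
        return False
    return True
```

For example, `archive_dir_usable("/nerc/n02/n02/gmann/archive")` would be expected to return False while the mount is in the state shown above. A stale handle usually points to a problem on the file system or mount side rather than in the job itself.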

Change History (2)

comment:1 Changed 18 months ago by grenville

Hi Graham

I hope you got ARCHER's message - they have detected a problem with the RDF and are working on it.

Grenville

comment:2 Changed 17 months ago by ros

  • Description modified (diff)
  • Resolution set to answered
  • Status changed from new to closed