Opened 11 years ago
Closed 10 years ago
#385 closed help (fixed)
xconv job for pp to nc processing running out of time in serial queue
| Reported by: | seruth | Owned by: | lois |
|---|---|---|---|
| Component: | UM Tools | Keywords: | convsh |
| Cc: | | Platform: | |
| UM Version: | 4.5 | | |
Description
I'm not able to get a fairly simple script to run on HECToR. The same
sort of script ran fine on HPCx. I'm converting UM output (pp) files to
NetCDF format. I can process 1 year of files in 15 minutes in the serial
queue, but when I try to process all 6 years, even a 6-hour job in the
serial queue doesn't finish. We tried changing the memory allocation too,
but that didn't help.
Change History (3)
comment:1 Changed 11 years ago by lois
- Owner changed from um_support to lois
- Status changed from new to assigned
comment:2 Changed 10 years ago by lois
A post-processing service should be available on HECToR by March 2011. The negotiations have taken a very long time!
comment:3 Changed 10 years ago by lois
- Resolution set to fixed
- Status changed from assigned to closed
Sorry for the delay in replying, Ruth; there was a bit of confusion as to who would reply. This is not a new problem on HECToR; it is really a feature of the system!
If the files that you are converting are on /work, that is, on the Lustre file system, then the conversion can be excruciatingly slow. The design of the Lustre file system is perhaps not HECToR's finest feature, as it has only one metadata server, which is a bottleneck. So converting files on /work is not a good idea.
One alternative would be to copy the data to /home on HECToR and convert it there; however, we don't have a large /home allocation. Or you could ftp your raw data back to your own local workstation and convert it there.
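As a rough illustration of the "copy it elsewhere and convert it there" approach, here is a minimal Python sketch that stages one file at a time, so only a single input has to fit in a small allocation. Everything named here is an assumption for illustration: `stage_and_convert`, the directory arguments, and the `convert_cmd` template with its `{in}` and `{out}` placeholders are hypothetical, and the actual convsh invocation is left to you because the ticket does not show it.

```python
import shutil
import subprocess
from pathlib import Path


def stage_and_convert(src_files, staging_dir, out_dir, convert_cmd):
    """Copy each input to a staging area, convert it there, keep only the output.

    convert_cmd is a hypothetical command template: a list of strings in which
    "{in}" and "{out}" are replaced by the staged input and output paths.
    """
    staging_dir = Path(staging_dir)
    out_dir = Path(out_dir)
    staging_dir.mkdir(parents=True, exist_ok=True)
    out_dir.mkdir(parents=True, exist_ok=True)

    converted = []
    for src in map(Path, src_files):
        staged = staging_dir / src.name
        shutil.copy2(src, staged)  # one sequential read from the slow file system

        # Append ".nc" rather than swapping the suffix, so that names such as
        # run.pm1990jan and run.pm1990feb do not collide.
        staged_out = staging_dir / (src.name + ".nc")
        cmd = [arg.replace("{in}", str(staged)).replace("{out}", str(staged_out))
               for arg in convert_cmd]
        subprocess.run(cmd, check=True)  # raises CalledProcessError on failure

        final = out_dir / staged_out.name
        shutil.move(str(staged_out), str(final))
        staged.unlink()  # free the staging space before the next file
        converted.append(final)
    return converted
```

Because the staged copy is deleted after each conversion, the staging area only ever holds one input and one output. Whether this is actually faster depends on where the time goes on your system, so it is worth timing on a single year of files first.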
There is an ongoing discussion that NERC should have a workstation, with lots of disk, local to HECToR to resolve this issue. Everyone agrees and there is money, but progress is slow.
Lois