Opened 2 years ago

Closed 2 years ago

#2261 closed help (answered)

"Disk quota exceeded" stops job building

Reported by: pmcjs
Owned by: grenville
Component: Disk Space
Keywords:
Cc:
Platform: ARCHER
UM Version: 10.7

Description

Hello CMS,

I am building UM job u-ap955. About an hour into the compile it stopped with the following messages (from the stderr log):

Switching to pmi/5.0.6-1.0000.10439.140.2.ari.
Switching to atp/1.8.3.
Switching to cray-libsci/13.2.0.
Switching to cray-mpich/7.2.6.
Switching to craype/2.4.2.
Switching to cce/8.4.1.
Switching to modules/3.2.10.3.
[FAIL] ftn -oo/ukca_calcnucrate_mod.o -c -I./include -s default64 -e m -J ./include -I/work/y07/y07/umshared/gcom/cce8.4.1/gcom6.2/archer_xc30_cce_mpp/build/include -O2 -Ovector1 -hfp0 -hflex_mp=strict -h omp /work/n02/n02/pmcjs/cylc-run/u-ap955/share/fcm_make_um/preprocess-atmos/src/um/src/atmosphere/UKCA/ukca_calcnucrate.F90 # rc=1
[FAIL] ftn-2116 crayftn: INTERNAL  
[FAIL]   "/opt/cray/cce/8.4.1/cftn/x86-64/lib/optcg" was terminated due to receipt of signal 013:  Segmentation fault.
[FAIL] compile    2.8 ! ukca_calcnucrate_mod.o <- um/src/atmosphere/UKCA/ukca_calcnucrate.F90
[FAIL] ftn -oo/thp_det_4a5a.o -c -I./include -s default64 -e m -J ./include -I/work/y07/y07/umshared/gcom/cce8.4.1/gcom6.2/archer_xc30_cce_mpp/build/include -O2 -Ovector1 -hfp0 -hflex_mp=strict -h omp /work/n02/n02/pmcjs/cylc-run/u-ap955/share/fcm_make_um/preprocess-atmos/src/um/src/atmosphere/convection/thp_det-thpdet4a.F90 # rc=1
[FAIL] ftn-2116 crayftn: INTERNAL  
[FAIL]   "/opt/cray/cce/8.4.1/cftn/x86-64/lib/optcg" was terminated due to receipt of signal 013:  Segmentation fault.
[FAIL] compile    0.4 ! thp_det_4a5a.o       <- um/src/atmosphere/convection/thp_det-thpdet4a.F90
[FAIL] ftn -oo/term_con_6a_mod.o -c -I./include -s default64 -e m -J ./include -I/work/y07/y07/umshared/gcom/cce8.4.1/gcom6.2/archer_xc30_cce_mpp/build/include -O2 -Ovector1 -hfp0 -hflex_mp=strict -h omp /work/n02/n02/pmcjs/cylc-run/u-ap955/share/fcm_make_um/preprocess-atmos/src/um/src/atmosphere/convection/term_con_mod-6a.F90 # rc=1
[FAIL] ftn-382 crayftn: ERROR in command line
[FAIL]   Cannot open Compiler Information File "<aux CIF>".
[FAIL] Reason: Disk quota exceeded

... [some more output] ...

My /home, /work and PUMA allocations all have plenty of space, but a check of SAFE seems to suggest that n02 has reached its file allocation on /work and is very close to its disk space limit on /nerc. I wonder whether this is the cause of the problem. In the meantime I'll delete some unneeded files from ARCHER, although I don't think this will make much of a dent overall!

Thanks,
Chris

Change History (3)

comment:1 Changed 2 years ago by grenville

Chris

n02 has plenty of space on /work (not sure what you're looking at). There have been a few odd compilation-related events on ARCHER today. There are other places on ARCHER where a quota could be exceeded, so I'll ask ARCHER about them.

Grenville

comment:2 Changed 2 years ago by pmcjs

Hi Grenville,

The number I was looking at was the number of files on /work. I might have misinterpreted this, but it looks like there is a limit of 45 million files (I assume shared between all users), and the total usage is very close to this (44.93M as of this morning). I don't know whether these numbers mean anything or whether the limits are enforced.

Cheers,
Chris

comment:3 Changed 2 years ago by grenville

  • Resolution set to answered
  • Status changed from new to closed

Chris

You're right - n02 has hit the number-of-files limit. ARCHER will address this.

Grenville
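
For anyone hitting the same error: a "Disk quota exceeded" message can come from a limit on the number of files as well as from a limit on bytes, which is what happened here. The sketch below is only an illustration, not an ARCHER or SAFE tool. It works out the headroom from the figures quoted in comment:2 (45 million limit, 44.93M used) and counts the entries under a directory tree so you can see which of your own directories hold the most files. The function names are made up for this example, and it assumes each file, directory and symlink counts as one entry against the limit.

```python
import os
import sys


def quota_headroom(limit, used):
    """Return (files_remaining, percent_used) for a file-count quota."""
    remaining = limit - used
    percent_used = 100.0 * used / limit
    return remaining, percent_used


def count_files(root):
    """Count entries (files, directories, symlinks) below root.

    Symlinked directories are counted as one entry and not followed.
    The root directory itself is not counted.
    """
    total = 0
    for _dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        total += len(dirnames) + len(filenames)
    return total


if __name__ == "__main__":
    # Figures quoted in comment:2 of this ticket.
    remaining, pct = quota_headroom(45_000_000, 44_930_000)
    print(f"{remaining} files remaining, {pct:.2f}% of the limit used")

    # Optionally count entries under a directory given on the command line,
    # e.g. an old cylc-run directory you are considering deleting.
    if len(sys.argv) > 1:
        print(f"{sys.argv[1]}: {count_files(sys.argv[1])} entries")
```

With the numbers from this ticket the group had about 70,000 files of headroom left, so a build that writes many source, object and module files could plausibly exhaust it even with plenty of disk space free.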
