Opened 3 years ago

Closed 3 years ago

#1832 closed help (answered)

Disc quota exceeded on ARCHER

Reported by: MikeN
Owned by: ros
Priority: normal
Component: ARCHER
Keywords:
Cc:
Platform: ARCHER
UM Version: 7.3

Description

Hello,

I'm getting an error when running my job xlkes. The error seems to be in the writing of the output. It says disk quota is exceeded but I'm not sure if this is necessarily true.

I have included what seem to be the relevant parts of the .leave file below. The whole file is at /home/mnewland/output/xlkes000.xlkes.d16074.t113604.leave

I'd be very grateful for any help,

Thanks,

Mike

BUFFOUT: Write Failed: Disk quota exceeded
Rank 22 [Mon Mar 14 12:12:11 2016] [c1-3c0s13n2] application called MPI_Abort(MPI_COMM_WORLD, 9) - process 22
Application 20924676 is crashing. ATP analysis proceeding…

BUFFOUT: Write Failed: Disk quota exceeded
Rank 61 [Mon Mar 14 12:12:11 2016] [c1-3c2s11n3] application called MPI_Abort(MPI_COMM_WORLD, 9) - process 61

ATP Stack walkback for Rank 22 starting:

_start@…:113
libc_start_main@…:242
flumemain_@…:38
um_shell_@…:3817
u_model_@…:6579
ereport_@…:384
gc_abort_@…:136
mpl_abort_@…:46
pmpi_abort@0xe65ffc
MPI_Abort@0xe80e1c
MPID_Abort@0xea8101
abort@…:92
raise@…:42

ATP Stack walkback for Rank 22 done
Process died with signal 6: 'Aborted'
Forcing core dumps of ranks 22, 1, 16, 23, 0
View application merged backtrace tree with: stat-view atpMergedBT.dot
You may need to: module load stat

_pmiu_daemon(SIGCHLD): [NID 04855] [c1-3c0s13n3] [Mon Mar 14 12:12:43 2016] PE RANK 24 exit signal Killed
[NID 04855] 2016-03-14 12:12:43 Apid 20924676: initiated application termination
_pmiu_daemon(SIGCHLD): [NID 04854] [c1-3c0s13n2] [Mon Mar 14 12:12:43 2016] PE RANK 2 exit signal Killed
diff: /work/n02/n02/mnewland/tmp/tmp.mom2.17752/xlkes.xhist: No such file or directory
qsexecute: Copying /work/n02/n02/mnewland/um/xlkes/xlkes.thist to backup thist file /work/n02/n02/mnewland/um/xlkes/xlkes.thist_keep
xlkes: Run failed

%PE22 OUTPUT%

TRANSOUT: Error in data transfer to disk
MEANCTL: RESTART AT PERIOD_ 1
U_MODEL: interim history file deleted due to failure writing partial sum files
*
UM ERROR (Model aborting) :
Routine generating error: U_MODEL
Error code: 1
Error message:

TRANSOUT: I/O write error

%PE61 OUTPUT%

TRANSOUT: Error in data transfer to disk
MEANCTL: RESTART AT PERIOD_ 1
U_MODEL: interim history file deleted due to failure writing partial sum files
*
UM ERROR (Model aborting) :
Routine generating error: U_MODEL
Error code: 1
Error message:

TRANSOUT: I/O write error
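For anyone triaging a similar .leave file, the quota messages and the aborting ranks can be pulled out with a few lines of Python. This is only a rough sketch: the helper name `summarize_leave` is hypothetical (it is not part of the UM or ARCHER tooling), and the patterns are based solely on the lines quoted above.

```python
import re

# Hypothetical helper, not part of the UM or ARCHER tooling.
# Patterns are taken from the log lines quoted in this ticket.
QUOTA_PATTERN = re.compile(r"Disk quota exceeded")
RANK_PATTERN = re.compile(r"^Rank (\d+) \[")


def summarize_leave(lines):
    """Count quota-exceeded messages and collect ranks that called MPI_Abort."""
    quota_hits = 0
    ranks = []
    for line in lines:
        if QUOTA_PATTERN.search(line):
            quota_hits += 1
        match = RANK_PATTERN.match(line)
        if match and "MPI_Abort" in line:
            ranks.append(int(match.group(1)))
    return quota_hits, ranks


# Sample input copied from the .leave extract above.
sample = [
    "BUFFOUT: Write Failed: Disk quota exceeded",
    "Rank 22 [Mon Mar 14 12:12:11 2016] [c1-3c0s13n2] application called "
    "MPI_Abort(MPI_COMM_WORLD, 9) - process 22",
    "BUFFOUT: Write Failed: Disk quota exceeded",
    "Rank 61 [Mon Mar 14 12:12:11 2016] [c1-3c2s11n3] application called "
    "MPI_Abort(MPI_COMM_WORLD, 9) - process 61",
]

quota_hits, ranks = summarize_leave(sample)
print(f"quota errors: {quota_hits}, aborting ranks: {ranks}")
```

To run it against a real file, pass an open file handle instead of `sample`, since the function only needs an iterable of lines.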

Change History (4)

comment:1 Changed 3 years ago by MikeN

Actually, I think it is just a case of needing more disc space:

Disk quotas for user mnewland (uid 15687):

Filesystem  kbytes     quota  limit     grace  files  quota  limit  grace
/work       10588316*  0      10485760  -      23554  0      0      -

Could this be arranged, please?

Thanks,

Mike
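The quota line in comment:1 can be checked arithmetically. The sketch below assumes the usual column order of this kind of quota output (filesystem, kbytes used, soft quota, hard limit) and that the trailing `*` marks usage over the limit; `parse_quota_line` is a hypothetical helper written for illustration only.

```python
def parse_quota_line(line):
    """Split one quota row into (filesystem, used_kb, soft_kb, hard_kb).

    Assumes whitespace-separated columns in the order shown in the ticket;
    a trailing '*' on the usage figure is stripped before conversion.
    """
    fields = line.split()
    filesystem = fields[0]
    used_kb = int(fields[1].rstrip("*"))
    soft_kb = int(fields[2])
    hard_kb = int(fields[3])
    return filesystem, used_kb, soft_kb, hard_kb


# Row copied from the quota output in comment:1.
row = "/work 10588316* 0 10485760 - 23554 0 0 -"
fs, used, soft, hard = parse_quota_line(row)

over_kb = used - hard
print(f"{fs}: used {used} kB, hard limit {hard} kB, over by {over_kb} kB")
```

If a kbyte here means 1024 bytes, the hard limit of 10485760 kB is exactly 10 GiB, and usage is about 100 MiB beyond it, which is consistent with the "Disk quota exceeded" write failures.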

comment:2 Changed 3 years ago by MikeN

  • Component changed from UKCA to ARCHER
  • Summary changed from Output error to Disc quota exceeded on ARCHER

comment:3 Changed 3 years ago by ros

  • Owner changed from um_support to ros
  • Status changed from new to accepted

Hi Mike,

Sorry for the delay in responding to this. I have put in a request to increase your quota. It will take a few hours to take effect.

Regards,
Ros.

comment:4 Changed 3 years ago by ros

  • Resolution set to answered
  • Status changed from accepted to closed