Opened 4 years ago

Closed 3 years ago

#1824 closed help (answered)

MONC on ARCHER: Slow Writing

Reported by: CCC_MC
Owned by: willie
Component: MONC
Keywords: NetCDF, Parallel I/O
Cc:
Platform: ARCHER

Description

I am experimenting with MONC on ARCHER. Writing the model output seems to take far too long. The run itself takes about 25 minutes on average, yet writing the output can sometimes take more than 50 minutes.

Please see the attached PBS output files for details of the jobs. Job 3544899 is a MONC run with the diagnostics files written at intervals, while job 3540451 is another MONC run with the diagnostics written in the same NetCDF file at the end of the model run.

Attachments (2)

monctest.o3544899 (55.4 KB) - added by CCC_MC 4 years ago.
MONC ARCHER PBS Output Job 3544899
monctest.o3540451 (80.1 KB) - added by CCC_MC 4 years ago.
MONC ARCHER PBS Output Job 3540451

Change History (5)

Changed 4 years ago by CCC_MC

MONC ARCHER PBS Output Job 3544899

Changed 4 years ago by CCC_MC

MONC ARCHER PBS Output Job 3540451

comment:1 Changed 4 years ago by willie

  • Component changed from Data to MONC
  • Keywords MONC removed
  • Owner changed from um_support to willie
  • Status changed from new to accepted

Hi Jonathan,

I'm glad you've got this going. You can just increase the wall time to ensure that the job completes. If you are concerned about efficiency then you will need to experiment with the number of processors and IO servers to find the best combination.

To help you better, could you please run the following:

chmod -R g+rX /home/n02/n02/cccmc825
chmod -R g+rX /work/n02/n02/cccmc825

then we will be able to see your setup and there will be less need to attach files.
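For anyone unsure what `g+rX` does, here is a minimal sketch on a throwaway directory. The `demo`, `run` and `config.txt` names are made up for illustration and have nothing to do with the real ARCHER paths above. The capital `X` adds execute permission for the group only on directories (and on files that are already executable), so ordinary files become group-readable without becoming executable.

```shell
# Throwaway directory tree; all names here are hypothetical.
demo=$(mktemp -d)
mkdir "$demo/run"
touch "$demo/run/config.txt"

# Start from owner-only permissions.
chmod 700 "$demo" "$demo/run"
chmod 600 "$demo/run/config.txt"

# Same option set as in the commands above, applied to the demo tree.
chmod -R g+rX "$demo"

# Keep only the ten-character mode string from the ls output.
dir_mode=$(ls -ld "$demo/run" | cut -c1-10)
file_mode=$(ls -l "$demo/run/config.txt" | cut -c1-10)
echo "directory: $dir_mode"
echo "file:      $file_mode"

rm -rf "$demo"
```

The directory should end up readable and searchable by the group, while the plain file gains group read permission only.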

Regards

Willie

comment:2 Changed 4 years ago by CCC_MC

Hi Willie,

Thank you for your help. I have been sticking to the ARCHER-ready settings provided in the MONC course. The recommendation there is one I/O server per node, which on ARCHER means 1 core acting as an I/O server for every 11 cores running MONC. However, that does not seem to work for my run.

I have changed the file permissions on my directories. Could you check whether any of my settings are incorrect and might be causing the issue?
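As a rough illustration of that ratio, the sketch below splits a total core count into I/O servers and MONC processes, assuming groups of 12 cores (1 I/O server plus 11 MONC processes). The `split_cores` helper is hypothetical and is not part of MONC or the course material; it only does the arithmetic.

```shell
# Hypothetical helper: prints "<io_servers> <monc_processes>" for a core count,
# assuming the 1 I/O server : 11 MONC ratio, i.e. groups of 12 cores.
split_cores() {
    total=$1
    group=12    # 1 I/O server + 11 MONC processes
    if [ $((total % group)) -ne 0 ]; then
        echo "core count $total is not a multiple of $group" >&2
        return 1
    fi
    io=$((total / group))
    monc=$((total - io))
    echo "$io $monc"
}

split_cores 48
```

For 48 cores this gives 4 I/O servers and 44 MONC processes. Trying other ratios would mean changing `group`, which is the kind of experiment suggested in comment:1.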

Regards,
Jonathan

comment:3 Changed 3 years ago by willie

  • Resolution set to answered
  • Status changed from accepted to closed