Opened 5 years ago

Closed 5 years ago

#1334 closed help (fixed)

Job stuck on ARCHER

Reported by: pclark Owned by: um_support
Component: UM Model Keywords:
Cc: Platform: ARCHER
UM Version: 8.6

Description

Hi
Back to job xkfza.
This has built and reconfigured OK and appears to be running. However, a run that took just 133 s wallclock (2826 CPU s) on 24 processors on the Met Office HPC has now been running for 1 hour 20 minutes on ARCHER (439845.sdb). Any thoughts?

Thanks

Change History (7)

comment:1 follow-up: Changed 5 years ago by grenville

Peter

Please let us have read permission on your ARCHER home and work spaces.

Thanks

Grenville

comment:2 in reply to: ↑ 1 Changed 5 years ago by pclark

Replying to grenville:

Peter

Please let us have read permission on your ARCHER home and work spaces.

Thanks

Grenville

Done, I believe.

comment:3 Changed 5 years ago by grenville

Peter

Please try changing the GCOM collectives limit to 1 (Miscellaneous Sections…) or increasing the number of MPI tasks to more than 64. (We normally do the former.)

Grenville

comment:4 follow-up: Changed 5 years ago by grenville

Peter

The permissions are still too restrictive - please do this:

chmod -R g+rX /home/n02/n02/paclark
chmod -R g+rX /work/n02/n02/paclark

Grenville
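
One way to confirm that the chmod commands above took effect is to list anything that is still not group-accessible. This is only a sketch: `check_group_access` is a hypothetical helper name, not part of any ARCHER or UM tooling, and it assumes a POSIX `find`. It prints nothing when every entry is group-readable and every directory is group-searchable.

```shell
# Hypothetical helper (not an ARCHER/UM tool): report entries under a
# directory tree that the group still cannot read or traverse.
check_group_access() {
    dir=$1
    # Files or directories lacking group read (octal bit 040).
    find "$dir" ! -perm -040 -print
    # Directories lacking group search/execute (octal bit 010).
    find "$dir" -type d ! -perm -010 -print
}

# Example, using the paths from the comment above:
#   check_group_access /home/n02/n02/paclark
#   check_group_access /work/n02/n02/paclark
```

Empty output from both calls would suggest that `chmod -R g+rX` has been applied to the whole tree.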

comment:5 Changed 5 years ago by pclark

Changing the GCOM collectives limit to 1 seems to have done the trick: 97 s wallclock, 4626 s CPU.

comment:6 in reply to: ↑ 4 Changed 5 years ago by pclark

Replying to grenville:

Peter

The permissions are still too restrictive - please do this:

chmod -R g+rX /home/n02/n02/paclark
chmod -R g+rX /work/n02/n02/paclark

Grenville

Done

comment:7 Changed 5 years ago by ros

  • Resolution set to fixed
  • Status changed from new to closed