Opened 9 years ago
Closed 9 years ago
#783 closed help (fixed)
Shared Nodes Performance Slowdown
Reported by: | pliojop | Owned by: | jeff |
---|---|---|---|
Component: | UM Model | Keywords: | |
Cc: | Platform: | ||
UM Version: | <select version> |
Description
Hi,
Not sure if this is the right place for this, but thought I'd start here.
I run UM4.5 on an HPC cluster at the University of Leeds called ARC1. Recently, and only for the UM, when a 16-processor job shares a node with another 16-processor job, the CPU speeds on the shared node slow down.
For example:
Node 1 = 12 cores used
Node 2 = 4 cores and 4 cores used
Node 3 = 12 cores used
Nodes 1 and 3 run at close to 100%, while node 2 drops to 33-50%.
I was wondering whether this had been encountered before during any of the changes to the HECToR computers over the years and, if it had been, whether a fix was applied.
Many Thanks
James Pope
eejop@…
Change History (2)
comment:1 Changed 9 years ago by jeff
- Owner changed from um_support to jeff
- Status changed from new to accepted
comment:2 Changed 9 years ago by jeff
- Resolution set to fixed
- Status changed from accepted to closed
Hi James
This situation can't arise on HECToR, as sharing of nodes between jobs is not allowed.
The problem appears to be that the jobs sharing a node are running on the same cores instead of using separate cores. This is probably a problem with whatever program you use to launch the MPI executable, so you will need to talk to your local support people.
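One way to check this before going to local support is to look at which cores each process is allowed to run on. The following is only a rough sketch, assuming the nodes run Linux and that Python is available on them. The function names and the job labels are made up for illustration and are not part of the UM or of any MPI launcher. Note that the comparison only means something if the launcher pins each job to specific cores. Unpinned processes will all report every core on the node.

```python
import os
import socket


def overlapping_cores(affinities):
    """Return {core: [labels]} for every core claimed by more than one process.

    `affinities` maps a label (hypothetical, e.g. "jobA") to the set of
    core numbers that process is allowed to run on.
    """
    owners = {}
    for label, cores in affinities.items():
        for core in cores:
            owners.setdefault(core, []).append(label)
    return {core: sorted(labels) for core, labels in owners.items() if len(labels) > 1}


def report_own_affinity():
    """Describe the cores the current process may run on (Linux only)."""
    if not hasattr(os, "sched_getaffinity"):
        return "CPU affinity query not available on this platform"
    cores = sorted(os.sched_getaffinity(0))  # 0 means the calling process
    return f"{socket.gethostname()} pid={os.getpid()} cores={cores}"


if __name__ == "__main__":
    # Run this inside each job on the shared node and compare the core lists.
    print(report_own_affinity())
```

If both jobs on the shared node report the same core numbers, for example both pinned to cores 0 to 3, that would be consistent with the two jobs competing for the same cores while the rest of the node sits idle.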
Jeff.