Opened 12 years ago
Closed 10 years ago
#277 closed help (fixed)
Memory leak in UM4.5 on HECToR?
Reported by: | robin | Owned by: | um_support |
---|---|---|---|
Component: | UM Model | Keywords: | HECToR, UM4.5, memory |
Cc: | | Platform: | |
UM Version: | 4.5 | | |
Description (last modified by robin)
Initial symptom: FAMOUS runs are frequently being killed with the
_pmii_daemon(SIGCHLD): PE 6 exit signal Killed
[NID 12315]Apid 120510: initiated application termination
[NID 12315] Apid 120510: OOM killer terminated this process.
message after ~7,8 wallclock hours of running. The model was compiled with PathScale -O3, as suggested by Annette Osprey.
Information collected from /proc/self/status during the run suggests that up to 32k of extra memory is claimed every atmosphere timestep (the amount depends on the processor), which adds up over long runs. The growth seems to be made up of 4k contributions, mostly from mpp_filter, but also from other places in atm_dyn and (even less often) atm_phys.
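For anyone wanting to reproduce this kind of measurement, here is a minimal sketch in Python of sampling memory fields from /proc/self/status and differencing them between timesteps. It is an illustration only, not the instrumentation used in the run: the ticket does not say which fields were read or how, so the choice of VmSize, VmRSS and VmHWM and all function names (parse_status_kb, read_self_status, growth_per_step, steps_to_exhaust) are assumptions. In the model itself the same read would have to be done from the model code at each timestep.

```python
import re

def parse_status_kb(text, fields=("VmSize", "VmRSS", "VmHWM")):
    """Extract selected memory fields (in kB) from /proc/<pid>/status text.

    The field selection is an assumption; the ticket does not name the fields.
    """
    values = {}
    for line in text.splitlines():
        m = re.match(r"^(\w+):\s+(\d+)\s+kB", line)
        if m and m.group(1) in fields:
            values[m.group(1)] = int(m.group(2))
    return values

def read_self_status(path="/proc/self/status"):
    """Sample the current process (Linux only)."""
    with open(path) as f:
        return parse_status_kb(f.read())

def growth_per_step(samples, field="VmRSS"):
    """Differences in kB between consecutive samples, one sample per timestep."""
    vals = [s[field] for s in samples]
    return [b - a for a, b in zip(vals, vals[1:])]

def steps_to_exhaust(headroom_kb, leak_kb_per_step):
    """Whole timesteps before a steady leak uses up the given headroom.

    headroom_kb is hypothetical: the free memory per process is not given in the ticket.
    """
    return headroom_kb // leak_kb_per_step
```

Calling read_self_status() once per timestep and passing the collected samples to growth_per_step() gives the per-step increase; growth that stays positive across many steps would be consistent with the behaviour described above.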
It should be possible to work around this with more frequent resubmissions, of course.
cheers,
robin
Change History (2)
comment:1 Changed 12 years ago by robin
- Description modified (diff)
comment:2 Changed 10 years ago by lois
- Resolution set to fixed
- Status changed from new to closed
There is an issue with the memory usage of MPI on the XT systems, which Cray are aware of. Hopefully this will be resolved in new releases of the MPT libraries.