Opened 3 years ago

Closed 3 years ago

#2243 closed help (fixed)

Nested model recon step fails with UNRECOVERABLE library error

Reported by: shakka
Owned by: um_support
Component: UM Model
Keywords: memory, reconfiguration
Cc:
Platform: Monsoon2
UM Version: 10.4



I am running the nested suite and repeatedly get the error below. I have tried adjusting the number of cores per node as advised in an earlier ticket, but I still get the same error message every time I reload the suite and re-trigger the job.

Full error message:

lib-4205 : UNRECOVERABLE library error 
  The program was unable to request more memory space.
tcmalloc: large alloc 567348002619392 bytes == (nil)

Is there an optimum number of processors for something like this? Honestly, I was quite surprised to get this error message in the reconfiguration step because my understanding is that it is a relatively low-memory task compared to others.

Is there likely another cause of the problem? Are there perhaps unnecessary switches turned on as in the ticket above?



Change History (2)

comment:1 Changed 3 years ago by shakka

I have now solved this problem - I had created the ancil with the wrong endianness.
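
For anyone hitting the same error, here is a minimal sketch of how the byte order of a file could be checked before running the reconfiguration. It is not a UM tool. It assumes the file begins with a 64-bit integer header word holding a small value, and the function names, the `max_plausible` threshold and the file path are all hypothetical. The idea is that a small integer read with the wrong byte order turns into an enormous number, which would be consistent with the absurdly large allocation request in the error message.

```python
import struct


def guess_byte_order(first_word: bytes, max_plausible: int = 2**31) -> str:
    """Heuristically guess the byte order of an 8-byte integer header word.

    Assumption (not taken from the UM documentation): the first header
    word is a small non-zero integer, so only the correct byte order
    yields a value below max_plausible.
    """
    if len(first_word) != 8:
        raise ValueError("expected exactly 8 bytes")

    # Interpret the same 8 bytes in both byte orders.
    as_big = struct.unpack(">q", first_word)[0]
    as_little = struct.unpack("<q", first_word)[0]

    big_ok = 0 < abs(as_big) < max_plausible
    little_ok = 0 < abs(as_little) < max_plausible

    if big_ok and not little_ok:
        return "big"
    if little_ok and not big_ok:
        return "little"
    return "unknown"  # ambiguous or implausible in both byte orders


def guess_file_byte_order(path: str) -> str:
    """Read the first 8 bytes of a file and guess its byte order."""
    with open(path, "rb") as handle:
        return guess_byte_order(handle.read(8))
```

Calling `guess_file_byte_order` on the ancillary file (substitute your own path) and comparing the result with the byte order the model expects on your platform should show a mismatch quickly. An "unknown" result means the heuristic does not apply and the file needs inspecting by other means.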

comment:2 Changed 3 years ago by shakka

  • Resolution set to fixed
  • Status changed from new to closed