Opened 9 months ago

Closed 9 months ago

#3483 closed help (fixed)

UNRECOVERABLE library error - unable to request more memory

Reported by: Leighton_Regayre
Owned by: um_support
Component: UM Model
Keywords: memory library error
Cc:
Platform: PUMA
UM Version: 11.1

Description

Hello,

My suite u-cc125 did not run the 'recon' task, submitted from pumatest to ARCHER2.

The recon task job.err file has multiple instances of the error:
"lib-4205 UNRECOVERABLE library error

The program was unable to request more memory space"
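For anyone triaging a similar log, a quick count of how often the message appears can help show whether one process or many hit the error. This is only a sketch: the function name `count_lib_errors` and the log path in the usage comment are hypothetical, not part of the suite.

```shell
# Hypothetical helper: count lines in a log file that contain the
# "UNRECOVERABLE library error" message quoted above.
count_lib_errors() {
    # grep -c prints the number of matching lines (0 if there are none)
    grep -c 'UNRECOVERABLE library error' "$1"
}

# Example usage (path is an assumption; point it at the task's job.err):
#   count_lib_errors /path/to/recon/job.err
```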

Other tickets that cite this error (e.g. #2570) suggest it may be a temporary issue related to the queue. My suite's tasks were taking a long time to get picked up by the ARCHER2 queue, relative to historical ARCHER queue times. Should I restart this task?

The job.out file gives no indication of any problems specific to the suite.

Thanks,

Leighton

Change History (6)

comment:1 Changed 9 months ago by Leighton_Regayre

Hello,

I cleared the logs and resubmitted the suite. It crashes with the same error in the recon task.

I can't see any clues as to the cause of the error, so I would appreciate some advice, please.

Thanks,

Leighton

Last edited 9 months ago by Leighton_Regayre

comment:2 Changed 9 months ago by Leighton_Regayre

  • Keywords memory library error added
  • Platform set to PUMA
  • UM Version set to 11.1

comment:3 Changed 9 months ago by grenville

Leighton

Please run:

chmod -R g+rX /home/n02/n02/<your-username>
chmod -R g+rX /work/n02/n02/<your-username>
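After running the two commands, it may be worth confirming that nothing was missed. The following is a minimal sketch, assuming a POSIX `find`; the function name `list_missing_group_access` is hypothetical. It lists any entry that still lacks group read permission, and any directory that still lacks group search (execute) permission.

```shell
# Hypothetical helper: list entries under a directory that still lack
# the group access that "chmod -R g+rX" is meant to grant.
list_missing_group_access() {
    dir="$1"
    # Octal 040 is the group-read bit; octal 010 is the group-execute bit.
    # Print anything without group read, or any directory without group execute.
    find "$dir" \( ! -perm -040 -o -type d ! -perm -010 \) -print
}

# Example usage (substitute your own directory); no output means nothing is missing:
#   list_missing_group_access "$HOME"
```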

Grenville

comment:4 Changed 9 months ago by Leighton_Regayre

Hi Grenville,

I changed access for these folders as requested.

I once again cleared logs and resubmitted the suite, which produces the same error.

Thanks,

Leighton

comment:5 Changed 9 months ago by Leighton_Regayre

I have reviewed the other tickets related to this error again. Tickets #3095, #2588, #1937 and #2570 all report the same error, apparently for different underlying problems.

In my case, the solution in #3095 prompted me to review all of the ancillaries. I found that an ancillary file that had been shared with me was corrupted, which is presumably the cause of my suite error.
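A first-pass check on a list of ancillary files could look like the sketch below. It only catches gross problems (a file that is missing, unreadable, or empty) and cannot detect corruption inside a file of plausible size. The function name `check_files` and the paths in the usage comment are hypothetical.

```shell
# Hypothetical helper: report files that are missing, unreadable, or empty.
# Returns non-zero if any problem is found.
check_files() {
    status=0
    for f in "$@"; do
        if [ ! -e "$f" ]; then
            echo "MISSING: $f"; status=1
        elif [ ! -r "$f" ]; then
            echo "UNREADABLE: $f"; status=1
        elif [ ! -s "$f" ]; then
            echo "EMPTY: $f"; status=1
        fi
    done
    return "$status"
}

# Example usage (paths are assumptions; list the ancillaries your suite reads):
#   check_files /path/to/ancil_one /path/to/ancil_two
```

If the original copy of a shared file is still available, comparing `cksum` output for the two copies is another simple way to spot a damaged transfer.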

Thanks,

Leighton

comment:6 Changed 9 months ago by Leighton_Regayre

  • Resolution set to fixed
  • Status changed from new to closed