Opened 4 days ago

Last modified 5 hours ago

#2151 new help

Problem with a Global Reconfiguration job

Reported by: sam89
Owned by: um_support
Priority: high
Component: UM Model
Keywords: Global
Cc:
Platform: Monsoon2
UM Version: 8.2

Description

When I set the reconfig job running it falls over straight away, and I keep getting this error:

ERROR: the Ancil filenames version /projects/um1/ancil/ancil_versions/filenames/v2 not found

I then went looking on Monsoon to find out where the ancillaries are now, and I found what I think is them in
/projects/um1/ancil/data/ancil_versions/N512/v2
so I changed the orography and land-sea mask file directory to this path and then changed the filename to ancils. I reran the job and got the same error again, so I was wondering if you could look into it to figure out the issue.
The build job is xnjja and the reconfig job I tried is xnjjj.
I assume I have missed changing something in the job, since it still appears to be looking for the /projects/um1/ancil/ancil_versions/filenames/v2 directory, but I can't see where, and I am not sure I have changed it to the right thing anyway…

Thanks

Sam Clarke

Change History (7)

comment:1 Changed 4 days ago by willie

Hi Sam,

It looks like the ancil version files have been moved on the new Monsoon. You should revert to the initial set-up and then change the location on the UMUI page Ancillary and Input data > infile related > Ancillary version files. Change

$UMDIR/ancil/ancil_versions/... -> $UMDIR/ancil/data/ancil_versions/...

in both boxes. That should fix it.

Regards
Willie

comment:2 Changed 4 days ago by sam89

I changed both boxes to $UMDIR/ancil/data/ancil_versions/n512/ps30/v2 but I now get this error:

ERROR: the Ancil filenames version /projects/um1/ancil/data/ancil_versions/n512/ps30/v2 not found

Last edited 4 days ago by sam89

comment:3 Changed 4 days ago by willie

Hi Sam,

I think you need to add /ancils to the end - it's looking for a file rather than a directory.

Willie
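
The file-versus-directory distinction can be checked before rerunning the job. The following is a minimal sketch, not part of the UM or UMUI: the helper name `describe_ancil_path` is hypothetical, and the example path is simply the one from comment:2 with /ancils appended as suggested above.

```python
import os


def describe_ancil_path(path):
    """Classify a path as 'file', 'directory' or 'missing' (hypothetical helper)."""
    # Expand environment variables such as $UMDIR if they are set.
    expanded = os.path.expandvars(path)
    if os.path.isfile(expanded):
        return "file"
    if os.path.isdir(expanded):
        return "directory"
    return "missing"


if __name__ == "__main__":
    # Path assumed from comment:2 plus the "/ancils" suffix.
    candidate = "/projects/um1/ancil/data/ancil_versions/n512/ps30/v2/ancils"
    print(candidate, "->", describe_ancil_path(candidate))
```

If the result is "directory" or "missing", the value entered in the UMUI boxes probably does not yet point at the version file itself.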

comment:4 Changed 4 days ago by sam89

Hi Willie

Thanks for that, it is working now!

Sam

comment:5 Changed 4 days ago by sam89

I seem to be having a separate issue now:

lib-4205 : UNRECOVERABLE library error

The program was unable to request more memory space.

tcmalloc: large alloc 16744049999630761984 bytes == (nil)
tcmalloc: large alloc 16744049999630761984 bytes == (nil)
tcmalloc: large alloc 16744049999630761984 bytes == (nil)
tcmalloc: large alloc 16744049999630761984 bytes == (nil)

I have not seen this error before…
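
One thing worth noting about the message is the size being requested. It is larger than 2^63 bytes, which no machine could supply, so it may be a negative or uninitialised size reinterpreted as an unsigned number rather than a genuine shortage of memory. That is an assumption about the cause, not a confirmed diagnosis. The sketch below only does the arithmetic; the helper name `as_signed_64` is hypothetical.

```python
# Size in bytes, copied from the tcmalloc message above.
requested = 16744049999630761984


def as_signed_64(value):
    """Reinterpret an unsigned 64-bit integer as two's-complement signed."""
    return value - 2**64 if value >= 2**63 else value


# True if the value could only arise from a negative signed 64-bit number.
print("exceeds 2**63:", requested >= 2**63)
print("as signed 64-bit:", as_signed_64(requested))
print("size in EiB:", requested / 2**60)
```

A request of this magnitude would fail however many processors are used, which is why the size itself is worth reporting alongside the error text.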

comment:6 Changed 3 days ago by willie

Hi Sam,

The new Monsoon has 36 cores per node, so you could use 12 EW x 9 NS (108 processors, exactly 3 nodes). That gives an exact multiple and more processors, so more memory is available per processor. See http://collab.metoffice.gov.uk/twiki/bin/view/Support/WhatIsMONSooN for details.

Willie
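
The node arithmetic behind the 12 x 9 suggestion can be sketched as follows. This is an illustration only: the function name `node_usage` is hypothetical, and the only figure taken from the ticket is the 36 cores per node.

```python
CORES_PER_NODE = 36  # from comment:6


def node_usage(ew, ns, cores_per_node=CORES_PER_NODE):
    """Return (total processes, nodes needed, leftover processes on a partial node)."""
    total = ew * ns
    full_nodes, leftover = divmod(total, cores_per_node)
    nodes_needed = full_nodes + (1 if leftover else 0)
    return total, nodes_needed, leftover


print(node_usage(12, 9))  # (108, 3, 0): exactly three full nodes
```

A leftover of 0 means the decomposition fills whole nodes; any other value means the last node is only partly used.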

comment:7 Changed 5 hours ago by sam89

Hi Willie

I tried changing to 12 x 9 and other variations on that, but I am still getting the same error in the output file.

Sam
