Opened 9 years ago

Closed 9 years ago

#725 closed help (fixed)

Failure at the reconfiguration in 4km run of the NCAS standard LAM

Reported by: cplanche
Owned by: um_support
Component: UM Model
Keywords:
Cc:
Platform:
UM Version: 6.1

Description

Hi,

I'm doing nested runs (global, 12 km, 4 km) using the NCAS standard nested LAM (UM v6.1 on HECToR).
The global and 12 km runs completed, but the model now crashes during the reconfiguration at 4 km.
The job is xgocc.

The error message in the .leave file says:
xgocc: Starting run
_pmii_daemon(SIGCHLD): [NID 01825] [c1-1c1s0n1] [Mon Oct 31 18:57:53 2011] PE 42 exit signal Segmentation fault
_pmii_daemon(SIGCHLD): [NID 01412] [c5-0c1s2n2] [Mon Oct 31 18:57:53 2011] PE 5 exit signal Segmentation fault
[NID 01412] 2011-10-31 18:57:53 Apid 1379697: initiated application termination
xgocc: Run failed

I looked at the two files, .pe42 and .pe5, and could not find any error messages apart from this warning:

WARNING in reconfiguration in routine rcf_h_int_init_bl
Warning Code:- -10
Warning Message:- Interpolating ozone from zonal to full field
Warning generated from processor 42

However, this message appears in the output of every PE, not just the two that failed.

On the helpdesk, I found a similar problem, but the response there was that "UM uses zonal ozone for global runs and full field for LAMS" (as in the warning message). In the ancillary directory of the UMUI, ozone is indeed held as a full field.

Many thanks

Celine

P.S. Before anybody tries to read the files on HECToR, could you tell me how to change the permissions on my HECToR directories?

Change History (5)

comment:1 Changed 9 years ago by grenville

Hi Celine,

Can you change the permissions on your HECToR /home and /work directories so that we can see them please?

chmod -R g+rx /home/n02/n02/dmitch

and similarly for /work
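
The effect of the command above can be tried safely on a scratch directory first. This is only a sketch: `TARGET` is a hypothetical stand-in created with `mktemp`, not a real HECToR path, and you would substitute your own /home or /work directory once you are happy with the result.

```shell
# Assumption: TARGET is a hypothetical scratch directory standing in for
# your own /home or /work directory, so the sketch is safe to run as-is.
TARGET="$(mktemp -d)"
mkdir -p "$TARGET/sub"
touch "$TARGET/sub/file.txt"

# Start with no group access at all, purely for illustration.
chmod -R g-rwx "$TARGET"

# Recursively grant the group read and execute permission.
# On directories, execute permission is what allows them to be entered.
chmod -R g+rx "$TARGET"

# Inspect the result: the group triplet should now read r-x.
ls -ld "$TARGET" "$TARGET/sub" "$TARGET/sub/file.txt"
```

Note that `g+rx` also marks ordinary files as group-executable. If that is not wanted, `chmod -R g+rX` (capital X) adds execute permission only to directories and to files that are already executable by someone.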

Grenville

comment:2 Changed 9 years ago by willie

Hi Celine,

This appears to be a copy of the standard job xdgnc. These jobs need to build the model and unfortunately the compiler is not available until Phase 3 is installed. The model is failing because it can't find the executable.

Regards,

Willie

comment:3 Changed 9 years ago by cplanche

Hi,

I have changed the permissions on my home and work directories.
Willie, when exactly will Phase 3 be installed?

Regards,

Celine

comment:4 Changed 9 years ago by willie

Hi Celine,

These jobs were run before the maintenance session and should have worked. The executables are present. The only thing I can think of at the moment is that if you are switching between UM versions, you need to ensure that you keep the version number in your .profile in step.

Phase 3 will be installed next week - see the HECToR user page for status.

Regards,

Willie

comment:5 Changed 9 years ago by willie

  • Resolution set to fixed
  • Status changed from new to closed