Opened 10 years ago

Closed 10 years ago

#557 closed help (fixed)

baffling error messages and crash in 4.5 job on HECToR

Reported by: jonathan Owned by: um_support
Component: UM Model Keywords:
Cc: Platform:

Description

Dear helpdesk

Today my 4.5 FAMOUS-Glimmer job xdnyp is crashing on HECToR. Yesterday it worked, and I haven't found what has caused the difference. As far as I know, all I have done is change the start dumps and recompile, but the error is a nasty segfault (see below), not reported as a UM abort.

Trying to explain this, I have found other things in the output I don't understand:

(1) /bin/sh: mc: line 1: syntax error: unexpected end of file
/bin/sh: error importing function definition for `mc'

appears many times. That is definitely new, and doesn't appear in the outputs of yesterday's jobs. This extra stdout is causing problems in my SCRIPT (which has some stuff to do with Glimmer) that weren't happening before.
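
For reference, bash prints "error importing function definition" when it cannot parse a function passed to it through the environment. The following is only a sketch, assuming that `mc` is a shell function exported by the login environment; the helper name `strip_mc_noise` is hypothetical and is not part of the UM scripts. It removes the function before child shells start and, for scripts that parse job output, filters the two messages out of a stream.

```shell
# Assumption: `mc' is an exported shell function inherited from the login
# environment. Removing it stops it being handed on to child shells.
unset -f mc 2>/dev/null || true

# Hypothetical helper: drop the two `mc' messages from text read on stdin,
# passing every other line through unchanged.
strip_mc_noise() {
    grep -v \
        -e "mc: line 1: syntax error: unexpected end of file" \
        -e "error importing function definition for"
}
```

A script that reads job output could then pipe it through `strip_mc_noise` before parsing it.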

(2) Also I get lots of
module: not found [No such file or directory]
which arise from the calls to module in "loadcomp $TARGET_MC" in my .profile, and
module load netcdf
module load NCO
That is, the module command apparently doesn't exist in the UM jobs, although it works fine in my interactive session, where these errors do not arise. These errors were also happening yesterday, so they are not connected with the crash. But they concern me, as they may imply that the wrong compiler etc. is being used.
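
One way to keep a .profile quiet in shells where `module` is not defined is to guard the calls. This is a minimal sketch, not the UM's own mechanism: `load_if_available` is a hypothetical helper name, and it only avoids the error message. It does not make the modules available in the batch job.

```shell
# Hypothetical helper: call `module load` only when the current shell
# actually defines a `module` command; otherwise report and carry on.
load_if_available() {
    if type module >/dev/null 2>&1; then
        module load "$@"
    else
        echo "module not available; skipping: $*" >&2
        return 0
    fi
}

load_if_available netcdf
load_if_available NCO
```

If the batch environment really does need the modules, the `module` command has to be initialised in that shell first; how that is done is site-specific.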

Any advice welcome on any of these! Thanks

Jonathan

Segmentation fault! Fault address: 0x1e60

Fault address is 4186528 bytes below the first valid
mapping boundary, which is at 0x400000.

This may have been caused by a struct access through a null pointer.
_pmii_daemon(SIGCHLD): PE 2 exit signal Aborted
_pmii_daemon(SIGCHLD): PE 4 exit signal Aborted
[NID 15909] 2011-01-04 14:00:43 Apid 2449163: initiated application termination
_pmii_daemon(SIGCHLD): PE 15 exit signal Aborted
xdnyp: Run failed
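
As a quick check on the segfault report above, the quoted distance is simply the mapping boundary minus the fault address. The variable names here are illustrative only.

```shell
# Values copied from the segfault message.
fault=0x1e60
boundary=0x400000

# Distance of the fault address below the first valid mapping boundary.
gap=$(( $boundary - $fault ))
echo "$gap"   # 4186528, matching the message
```

A fault address this close to zero is consistent with the message's own suggestion of an access through a null pointer.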

Change History (1)

comment:1 Changed 10 years ago by jonathan

  • Resolution set to fixed
  • Status changed from new to closed

Dear Lois

Thank you for your reassurance that the module and mc errors don't matter, though they are disturbing, as we discussed. I have solved the true problem, which was caused by using the wrong dump and by a recon being needed to change the dates. The segfault prevented me from seeing the error messages about this.

Cheers

Jonathan
