Opened 8 years ago

Closed 8 years ago

#1075 closed help (fixed)

HiGEM run crashes, job xgvwe

Reported by: till
Owned by: um_support
Component: UM Model
Keywords: grid point storm, targeted diffusion
Cc:
Platform: HECToR
UM Version: 6.1


I've set up a new HiGEM run with a 1% atmospheric CO2 increase. It's job xgvwe. It runs fine for just over half a year but then crashes. I've tried simply restarting it, but to no avail. In the leave file, the crucial lines seem to be:

BUFFIN: Read Failed: No such file or directory
_pmiu_daemon(SIGCHLD): [NID 02342] [c5-1c2s3n2] [Mon May 13 15:07:10 2013] PE RANK 57 exit signal Segmentation fault
[NID 02342] 2013-05-13 15:07:10 Apid 4580768: initiated application termination

You can find the leave file here:

Many thanks for your help!

Change History (2)

comment:1 Changed 8 years ago by willie

Hi Till,

The last leave file from the continuation run, xgvwe006…, shows a segmentation fault but no read failure. When the model has been running for this length of time, a segmentation fault suggests that it has become unstable. To check this, you could:

  • switch the flush modset back on
  • in scientific sections > section select DIAG_PRN and change the printing frequency to every time step, and the threshold from 0.4 to 10.0.
  • in Output Options, select extra diagnostic messages.
  • restart from the last valid start dump, which I think is xgvwea.dat15l0

If it then has a segmentation fault, we can think about halving the time step, or other measures.
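To tell the two failure modes apart across several leave files, a small script can help. The following is a minimal sketch, not a UM tool: the function name `scan_leave_text` is hypothetical, and the two patterns are taken only from the leave-file excerpt quoted in this ticket, so other messages may need adding.

```python
import re

# Signatures copied from the leave-file excerpt quoted in this ticket.
# Assumption: these strings appear verbatim in other leave files too.
READ_FAILURE = re.compile(r"BUFFIN: Read Failed")
SEGFAULT = re.compile(r"PE RANK (\d+) exit signal Segmentation fault")


def scan_leave_text(text):
    """Report which of the two crash signatures appear in leave-file text."""
    read_failed = False
    segfault_ranks = []
    for line in text.splitlines():
        if READ_FAILURE.search(line):
            read_failed = True
        match = SEGFAULT.search(line)
        if match:
            # Record the PE rank that reported the segmentation fault.
            segfault_ranks.append(int(match.group(1)))
    return {"read_failed": read_failed, "segfault_ranks": segfault_ranks}
```

Reading a leave file into a string and passing it to `scan_leave_text` returns a dictionary saying whether a read failure was logged and which PE ranks reported a segmentation fault.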



comment:2 Changed 8 years ago by willie

  • Keywords grid point storm, targeted diffusion added
  • Resolution set to fixed
  • Status changed from new to closed

The user solved the problem as follows:

I changed the parameter "targeted diffusion: vertical velocity test value" from 0.4 to 0.5 to avoid the grid point storm forming. (This is in section 13, button "TARG".)
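To illustrate what changing a test value does, here is a minimal sketch. It is not the UM's implementation: the function name, the sample velocities, and the simple "greater than" comparison are all assumptions made for illustration. It only shows how a threshold on vertical velocity selects grid points, and says nothing about what the model then does with them.

```python
def points_over_test_value(w, test_value):
    """Return indices of points whose vertical velocity exceeds test_value.

    Assumption: a plain 'greater than' test; the real UM test may differ.
    """
    return [i for i, value in enumerate(w) if value > test_value]


# Made-up vertical velocities, for illustration only.
w_sample = [0.10, 0.45, 0.38, 0.62, 0.49]

flagged_at_04 = points_over_test_value(w_sample, 0.4)  # test value 0.4
flagged_at_05 = points_over_test_value(w_sample, 0.5)  # test value 0.5
```

With these sample values, the 0.4 test value flags three points and the 0.5 test value flags one, so the choice of test value directly controls how many points are selected.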
