qsexecute error at point of job resubmission

Hello Helpdesk,

I have compiled and submitted a UM job (mx020105 - xdwsf) which seemingly ran successfully for its first couple of re-submissions but which now shows Segmentation Fault errors at the qsexecute stage of the re-submission process, so it fails to run and aborts. I am running another, virtually identical job in parallel (xdwse), which doesn't seem to have the same problem. The only difference between the two jobs is one mod (xdwse uses /home/n02/n02/mx020105/am.mf77 and xdwsf uses /home/n02/n02/mx020105/am_high.mf77), but I can't see why the mod should affect this. I'm struggling to work out why the job is failing now when it ran successfully for the first couple of re-submissions.

An example of a .leave file for one of the successful submission times is:

and the subsequent failed resubmission is:

I have tried simply resubmitting the same job without saving or processing it again, but I now get the same error at qsexecute when I do this.

Any help would be much appreciated.

Many thanks,
Amanda Maycock

Hi Amanda,

I've had a quick look. These jobs run for a large number of time steps before failing. The last thing in the .leave file is a warning, "overwriting due to bi_linear_h". Another user had this problem because the land-sea mask had not been configured, but this may be a red herring. Before this there is "error halo_j too small 4" (in both runs), so the first thing to do is to try a run with the halo sizes set to 5 on the UMUI page Atmos > Domain > Horizontal.

Let me know if that works.
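
If it helps when comparing the .leave files from xdwse and xdwsf, here is a rough sketch of a Python script that lists every line containing one of the messages mentioned in this thread. It is only an illustration, not an existing UM tool: the function name scan_leave and the script name are made up, the match is a simple case-insensitive substring search, and the file paths are whatever you pass on the command line.

```python
import sys

# Messages quoted in this thread; matching is case-insensitive.
PATTERNS = (
    "segmentation fault",
    "halo_j too small",
    "overwriting due to bi_linear_h",
)


def scan_leave(lines):
    """Return (line_number, pattern, text) for each line containing a pattern."""
    hits = []
    for number, line in enumerate(lines, start=1):
        lowered = line.lower()
        for pattern in PATTERNS:
            if pattern in lowered:
                hits.append((number, pattern, line.rstrip()))
    return hits


if __name__ == "__main__":
    # Hypothetical usage: python scan_leave.py <leave file> [<leave file> ...]
    for path in sys.argv[1:]:
        # errors="replace" avoids a crash on stray non-UTF-8 bytes in the log
        with open(path, errors="replace") as handle:
            for number, pattern, text in scan_leave(handle):
                print(f"{path}:{number}: [{pattern}] {text}")
```

Running it on a successful and a failed .leave file side by side should show where in each file the halo_j message and the segmentation fault appear.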



