I have attempted to run my model xlell, which is a direct copy of xlelf except that it uses a different SST ancillary file. It runs on ARCHER and outputs the first month of data but nothing thereafter; rather than crashing, the job hangs in the queue. The only details I am getting in the .leave file are MPI_Abort messages from a number of different processes, e.g.:

Rank 35 [Wed Nov 4 13:01:37 2015] [c2-2c2s4n0] application called MPI_Abort(comm=0x84000006, 9) - process 34

I initially thought this might be an intermittent problem, so I tried it again, to no avail. I then thought it might be a problem with the ancillary file or with how I created it in xancil, so I took an old NetCDF SST file, put it through xancil and set the model to run, and this worked fine. I have also looked at the headers of both files and can't see any obvious problems. Any help on this would be greatly appreciated.


Hi Michelle,

The SST file /work/n02/n02/michmcr/data/tropic_sstx2_0312.anc has NaNs in it: it is corrupt. You can see this by looking in xconv > view data on the first entry. Alternatively, you can cumf the file with itself. The file is obviously identical to itself, so there should be a perfect match, but NaNs never compare equal, even to themselves, so they cause the comparison to fail.

The file should be recreated.
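
As a rough illustration of why a self-comparison exposes NaNs, here is a minimal Python sketch. It is not how cumf is implemented, and the function names and SST values are made up for the example; it only shows the general floating-point behaviour that a NaN is not equal to itself.

```python
import math


def count_nans(values):
    """Return how many entries in a list of floats are NaN."""
    return sum(1 for v in values if math.isnan(v))


def matches_itself(values):
    """Compare a field with itself point by point, as a compare tool might."""
    return all(a == b for a, b in zip(values, values))


# Hypothetical SST values in kelvin; the NaN stands in for a corrupt point.
clean_field = [271.35, 285.20, 300.10]
corrupt_field = [271.35, float("nan"), 300.10]

print(count_nans(clean_field), matches_itself(clean_field))      # 0 True
print(count_nans(corrupt_field), matches_itself(corrupt_field))  # 1 False
```

The same kind of check can be run on the source data before it goes through xancil, so that a bad field is caught before the model is submitted.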



Hi Willie,

Thank you for your help. I have rectified this and the model is now running properly and I am getting output files.

Many thanks

