Opened 9 years ago

Closed 9 years ago

#799 closed error (fixed)

Problems with setting a new nested run over Australia

Reported by: cbirch Owned by: willie
Component: UM Model Keywords: optimisation
Cc: Platform:
UM Version: 7.3

Description

Hi,

I'm trying to set up a new nested run over Australia. I have two issues at the moment, which I can't solve:

1) The global run (xgzta000.xgzta.d12060.t034625.leave) fails in the model executable when the 20110921_qwqg00.T+0 start dump is used. It does, however, work if the 20110921_qwqg12.T+0 start dump is used (xgztb). 12Z is, however, too late to initialise the run for science reasons. I've tried recompiling and also using the executables from xgztb, and neither of these worked.

2) The 12km nest (tested using the LBCs generated from xgztb) fails due to an ancillary file problem (xgzth000.xgzth.d12060.t041234.leave). I think it's something to do with the orography ancillary file, but I'm not sure what is wrong with it.

Sorry if I have missed something with these runs, but I am setting this up from Australia and the umui runs so slowly it is almost impossible to work with (it takes up to 5 mins to load a window for a job in the umui). I guess there is nothing that can be done about this?

Cheers,
Cathryn

Change History (8)

comment:1 Changed 9 years ago by willie

Hi Cathryn,

Can you give me read permission on the core file in the directory xgzta, please? Also, can you run

lfs quota -u cbirch /esfs1 | grep "/esfs1" | awk '{printf("WORK %5.2f %%\n",100*$2/$3)}'

and let me know the result.

Regards,

Willie

comment:2 Changed 9 years ago by willie

  • Owner changed from um_support to willie
  • Status changed from new to accepted

comment:3 Changed 9 years ago by cbirch

Hi Willie,

WORK 2000.15% was the result.

I also changed the permissions on that file.

I think point (2) is due to an error in the ancillary file, which Grenville is looking into, so don't worry about that.

Thanks,
Cathryn

comment:4 Changed 9 years ago by willie

Hi Cathryn,

There is a segmentation fault in the dynamics advection code in the routine bi_linear_h, just as it is about to produce the error message "overwriting due to dim_e_out size". There are many possible causes of this error. Things to try:

  • go to output choices and switch on subroutine timer diagnostics
  • in scientific section 13, push DIAG_PRN and select "flush buffer if run fails"
  • switch off "Using user STASH files" (because you have no STASH)
  • on the compile options for the model untick the "fast but non-reproducible code"

Then submit the run again. Could you also do

lfs quota -u cbirch /esfs1

and let me know the result.

Regards.

Willie

comment:5 Changed 9 years ago by cbirch

Hi Willie,

lfs quota -u cbirch /esfs1

gives:

Disk quotas for user cbirch (uid 6138):
Filesystem     kbytes       quota       limit  grace  files  quota  limit  grace
/esfs1      292158096  1024000000  1024000000      -   1166      0      0      -
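
As a cross-check on the percentage one-liner from comment 1, here is a minimal sketch that works directly from the /esfs1 line above. It assumes the single-line layout shown here, with the filesystem, the kbytes used and the block quota as the first three fields. `quota_percent` is a hypothetical helper name, not part of `lfs`. For the figures above it reports about 28.5% of the block quota in use.

```shell
# Hypothetical helper: report usage as a percentage of the block quota.
# Assumes one line per filesystem, laid out as in the output above:
#   field 1 = filesystem, field 2 = kbytes used, field 3 = block quota (kbytes)
quota_percent() {
    # Match the filesystem name exactly on field 1 and skip lines whose
    # quota is zero (no quota set), to avoid dividing by zero.
    awk -v fs="$1" '$1 == fs && $3 > 0 { printf("WORK %5.2f %%\n", 100 * $2 / $3) }'
}

# Sample line copied from the output above.
echo "/esfs1 292158096 1024000000 1024000000 - 1166 0 0 -" | quota_percent /esfs1
# prints: WORK 28.53 %
```

On a live system the same function could be fed from `lfs quota -u cbirch /esfs1`, provided the output keeps the filesystem name and the numbers on one line.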

I made all those changes apart from number 3 and it worked fine, so thanks for your help. I have however been running the model for the wrong month (Sept instead of Nov) - my fault as I asked for the wrong start dumps and I have only just noticed(!). I have put in a request for the correct ones so it would be good if you could keep this open until I have checked that the job works with the new ones.

Thanks,
Cathryn

comment:6 Changed 9 years ago by willie

  • Keywords optimisation added

Hi Cathryn,

I suspect it was unticking the "fast but non-reproducible code" that made it work. I have transferred your November start dumps.

Regards

Willie

comment:7 Changed 9 years ago by cbirch

Hi Willie,

I tested the runs with the November start dumps and they work fine so you can close this now. Thanks for your help.

Cathryn

comment:8 Changed 9 years ago by willie

  • Resolution set to fixed
  • Status changed from accepted to closed