Opened 4 months ago

Closed 2 months ago

#2972 closed help (answered)

two suites failed at the same time

Reported by: xd904476 Owned by: um_support
Component: UM Model Keywords: HadGEM3-GC3.1 N96ORCA1
Cc: Platform: ARCHER
UM Version: 10.7

Description

Hi, I am running two suites, and a couple of hours ago both failed within a minute of each other. I can't find any obvious error in either of them in job.err, job.out or ocean.output.
They are u-bj538 and u-bk404.
Suite u-bj538 was running its second year (coupled) and suite u-bk404 its first year (coupled).
Is this a coincidence, or is there some maintenance going on that I don't know about? Any suggestions?
thanks,
dani

Change History (8)

comment:1 Changed 4 months ago by grenville

Dani

u-bk404 has failed with

????????????????????????????????????????????????????????????????????????????????
???!!!???!!!???!!!???!!!???!!! ERROR ???!!!???!!!???!!!???!!!???!!!
? Error code: 1
? Error from routine: EG_BICGSTAB

on the first timestep — that's frequently associated with bad start data of some kind.

Grenville
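For anyone hunting for the same kind of failure, the two report lines under the ERROR banner quoted above can be picked out of a log mechanically. The following is only a minimal sketch: the function name `find_um_errors` is hypothetical, and it assumes the log uses exactly the "? Error code:" and "? Error from routine:" line layout shown in this comment.

```python
import re

# Patterns for the two report lines that follow the ERROR banner quoted above.
# Assumption: each report line starts with "?" as in the excerpt in this ticket.
CODE_RE = re.compile(r"^\?\s*Error code:\s*(-?\d+)")
ROUTINE_RE = re.compile(r"^\?\s*Error from routine:\s*(\S+)")


def find_um_errors(lines):
    """Return a list of (line_number, error_code, routine) tuples."""
    errors = []
    pending = None  # (line_number, code) still waiting for its routine line
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        code_match = CODE_RE.match(line)
        if code_match:
            pending = (number, int(code_match.group(1)))
            continue
        routine_match = ROUTINE_RE.match(line)
        if routine_match and pending is not None:
            errors.append((pending[0], pending[1], routine_match.group(1)))
            pending = None
    return errors


# The excerpt from this comment, used as sample input.
sample = [
    "???!!!???!!!???!!!???!!!???!!! ERROR ???!!!???!!!???!!!???!!!???!!!",
    "? Error code: 1",
    "? Error from routine: EG_BICGSTAB",
]
print(find_um_errors(sample))
```

Any iterable of lines works, so an open file object for the task's job.err can be passed in place of `sample`.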

comment:2 Changed 4 months ago by xd904476

Hi Grenville, thanks. Following Ros's suggestion, I can see some NaN values in ice_diag.d. I have replaced them with zeros and resubmitted the run. I'm waiting to see what happens now.
Thanks,
dani

ps: in what file was this error?
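The NaN check on ice_diag.d described above can be done with a plain text scan. This is a rough sketch, not the method used in the ticket: the helper name `find_nan_lines` is hypothetical, the sample lines are made up for illustration and are not real CICE output, and it assumes the file is plain text in which NaNs are printed as a standalone "NaN" token (in any letter case).

```python
import re

# "nan" as a standalone token, case-insensitive, so that NaN, nan and -NaN
# match but words that merely contain the letters (e.g. "finance") do not.
NAN_TOKEN = re.compile(r"(?<![A-Za-z0-9_])nan(?![A-Za-z0-9_])", re.IGNORECASE)


def find_nan_lines(lines):
    """Return (line_number, text) for every line containing a NaN token."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if NAN_TOKEN.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits


# Made-up sample lines, for illustration only.
sample_lines = [
    "field_a = 1.25\n",
    "field_b = NaN\n",
    "finance = 3\n",
]
for number, text in find_nan_lines(sample_lines):
    print(number, text)
```

Knowing which diagnostics first turn into NaN can help narrow down which field in the start dump to inspect.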

comment:3 Changed 3 months ago by xd904476

Hi Grenville, I seem to have sorted out the first error, which appeared as a NaN in the initial read of the CICE start dump, but now I get an error in ocean.output that looks more like a timestep error. Should I also adjust the CICE restart file for the timesteps somehow, or is the problem somewhere else?

thanks,
dani

comment:4 Changed 3 months ago by willie

  • Keywords HadGEM3-GC3.1 N96ORCA1 added
  • Platform set to ARCHER
  • UM Version set to 10.7

Hi Dani,

The job u-bj538 is running now. The only difference with u-bk404 is the CICE start dump. This indicates that u-bk404's CICE start dump is faulty: you should look at how it was generated.

Willie

comment:5 Changed 3 months ago by xd904476

Hi, I have just found another inconsistency that I can't explain or solve.
In the CICE namelist I have set a number of variables to produce daily output (e.g. f_fcondtop_ai with value 'd'), but they don't appear in the daily output files. The same is true for a few other variables that I am interested in.
Since I am going to set up another run with some new STASH requests, I'd like to fix this first. Is there anything else I need to switch on for the variables in the CICE namelist to be written out?

Thanks,
Dani
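One thing that may be worth checking for the question above: in the standard CICE namelist convention, a field flag such as f_fcondtop_ai = 'd' only produces output if 'd' is also listed as an active stream in histfreq. The thread does not confirm that this was the cause here, so the following is only a sketch under that assumption. The helper names `quoted_values` and `unwritten_fields` are hypothetical, the sample namelist is made up, and the parsing is a simple regex that expects one assignment per line.

```python
import re


def quoted_values(text, name):
    """Return the quoted strings assigned to `name` in namelist text."""
    match = re.search(
        r"^\s*" + re.escape(name) + r"\s*=\s*(.*)$", text, re.MULTILINE
    )
    return re.findall(r"'([^']*)'", match.group(1)) if match else []


def unwritten_fields(text):
    """List (flag, frequency) pairs whose frequency is not an active stream."""
    # 'x' marks an unused history stream, so it is not an active frequency.
    active = {freq for freq in quoted_values(text, "histfreq") if freq != "x"}
    problems = []
    flags = re.findall(r"^\s*(f_\w+)\s*=\s*'([^']*)'", text, re.MULTILINE)
    for flag, value in flags:
        for freq in value:  # a flag may name several streams, e.g. 'md'
            if freq != "x" and freq not in active:
                problems.append((flag, freq))
    return problems


# Made-up namelist fragment, for illustration only.
sample_namelist = """
&setup_nml
  histfreq = 'm','x','x','x','x'
  histfreq_n = 1,1,1,1,1
/
&icefields_nml
  f_hi = 'm'
  f_fcondtop_ai = 'd'
/
"""
print(unwritten_fields(sample_namelist))
```

In this made-up fragment only the monthly stream is active, so the daily request for f_fcondtop_ai is reported as a flag that would not be written.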

comment:6 Changed 3 months ago by grenville

Dani

I am not familiar with CICE set up; I'd need to consult CICE documentation & run some tests - have you got access to CICE documentation?

Grenville

comment:7 Changed 3 months ago by xd904476

Hi Grenville, sorry for the late reply.
I have found the problem: there is an issue with the padding of the sea ice thickness/volume/concentration fields at the hinges of the Antarctic land.
I can show you properly when I get back, for future reference.
Thanks a lot,
dani

comment:8 Changed 2 months ago by grenville

  • Resolution set to answered
  • Status changed from new to closed