Opened 8 years ago

Closed 7 years ago

#1050 closed help (fixed)

HiGEM job from UM startdump - instability problem

Reported by: m.k.hawcroft
Owned by: willie
Component: UM Model
Keywords:
Cc:
Platform: HECToR
UM Version: 6.1

Description

Hi

I am trying to run HiGEM from some UM startdumps in June 2007. The reconfiguration appears to work, but when the model runs, the solution stops converging after a number of timesteps and the run fails. Runs from two different startdumps fail in the same way (albeit at different timesteps). Len has assisted and we have tried altering the ancillary files and the reconfiguration of the startdump, but we haven't been able to resolve the instability.

This setup has been run from a HiGEM startdump (a job from Margaret Woodage) without any problems, so I assume the issue comes from the reconfiguration or the startdump, though I can't see the source of the problem so far.

The most recent job is xikub (/home/n02/n02/mkh/work/um/xikub) and the error is:

Atm_Step: Timestep 87

==============================================
initial Absolute Norm : 8627168.7284560911
GCR( 2 ) failed to converge in 100 iterations.
Final Absolute Norm : 3295.20047692041
==============================================

(/home/n02/n02/mkh/um/umui_out/xikub000.xikub.d13099.t112238.leave)

If you have any suggestions as to what the issue might be, it would be much appreciated.

Thanks

Matt

Change History (7)

comment:1 Changed 8 years ago by willie

  • Owner changed from um_support to willie
  • Platform changed from <select platform> to HECToR
  • Status changed from new to accepted
  • UM Version changed from <select version> to 6.1

Hi Matt,

These problems can sometimes be solved by halving the time step. You can change this in the UMUI panel Atmos > Scientific parameters > Time Stepping.

Regards

Willie

comment:2 Changed 8 years ago by m.k.hawcroft

Hi Willie

Thanks - I tried halving the timestep and the job crashed much earlier with the following error:

Atm_Step: Timestep 10

==============================================
initial Absolute Norm : 1118.2674488365953
GCR( 2 ) failed to converge in 100 iterations.
Final Absolute Norm : 2.28438692806029776E-2
==============================================

Atm_Step: Timestep 11

==============================================
initial Absolute Norm : 1153.7334280002906
GCR( 2 ) converged in 1 iterations.
Final Absolute Norm : NaN
==============================================

WARNING q_POS : 61920 points were less than 0. and the scaling factor has been reset to 1
WARNING q_POS : VALUES RESET NON CONSERVATIVELY MANNER
WARNING q_POS : 61920 points were less than 0. and the scaling factor has been reset to 1
WARNING q_POS : VALUES RESET NON CONSERVATIVELY MANNER

It was run as xikuc. If you have any other likely candidates for the problem, please let me know.

Thanks

Matt
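
The solver diagnostics quoted in this comment can be picked out of a long .leave file automatically. The following is a minimal Python sketch, assuming the output uses the same line format as the excerpts in this ticket; the function name find_solver_problems and the regular expressions are illustrative and not part of the UM.

```python
import math
import re

# Patterns modelled on the excerpts quoted in this ticket (an assumption:
# other UM versions may format these lines differently).
TIMESTEP_RE = re.compile(r"Atm_Step:\s*Timestep\s+(\d+)")
FAILED_RE = re.compile(r"GCR\(\s*\d+\s*\)\s+failed to converge in\s+(\d+)\s+iterations")
FINAL_NORM_RE = re.compile(r"Final\s+Absolute Norm\s*:\s*(\S+)")


def find_solver_problems(lines):
    """Return (timestep, description) pairs for failed or NaN solver steps."""
    problems = []
    timestep = None  # most recent "Atm_Step: Timestep N" seen so far
    for line in lines:
        match = TIMESTEP_RE.search(line)
        if match:
            timestep = int(match.group(1))
            continue
        match = FAILED_RE.search(line)
        if match:
            problems.append(
                (timestep, f"GCR failed to converge in {match.group(1)} iterations")
            )
            continue
        match = FINAL_NORM_RE.search(line)
        if match:
            try:
                value = float(match.group(1))
            except ValueError:
                continue  # not a number we can interpret; skip it
            if math.isnan(value):
                problems.append((timestep, "final norm is NaN"))
    return problems


if __name__ == "__main__":
    sample = [
        "Atm_Step: Timestep 10",
        "initial Absolute Norm : 1118.2674488365953",
        "GCR( 2 ) failed to converge in 100 iterations.",
        "Final Absolute Norm : 2.28438692806029776E-2",
        "Atm_Step: Timestep 11",
        "initial Absolute Norm : 1153.7334280002906",
        "GCR( 2 ) converged in 1 iterations.",
        "Final Absolute Norm : NaN",
    ]
    for step, description in find_solver_problems(sample):
        print(f"timestep {step}: {description}")
```

To scan a real file, pass an open file handle instead of the sample list, for example `find_solver_problems(open(path))`. The first reported timestep is usually the most useful one, since a NaN norm tends to follow an earlier failure to converge.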

comment:3 Changed 8 years ago by willie

Hi Matt,

I notice that some of your hand edits are not working: when you press Process in the UMUI, these show a question mark in the panel. Perhaps this could be a cause?

Regards,

Willie

comment:4 Changed 8 years ago by m.k.hawcroft

Hi Willie

The hand edits that are not working relate to the ocean, so they aren't relevant to this job. They are a legacy of the original job that this run is based on and shouldn't affect it. Do you have any other suggestions as to the likely cause?

Thanks

Matt

comment:5 Changed 8 years ago by willie

Hi Matt,

Did job xiide work?

Regards,

Willie

comment:6 Changed 7 years ago by willie

Hi Matt,

The job chain for this is

xgdlb → xhtkp → xiide → xikub

xiide fails after 67 time steps in a similar manner to xikub. The job xhtkp works. By doing a job comparison between xhtkp and xiide, you will get a list of likely culprits for the problem.
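
The idea behind such a comparison can be sketched in a few lines of Python. This is only an illustration, assuming each job's settings have been exported to plain text with one `NAME=value` pair per line; that export format, the function names and the setting names in the example are all hypothetical, and the UMUI's own job comparison remains the right tool for real jobs.

```python
def parse_settings(text):
    """Parse 'NAME=value' lines into a dict, skipping blanks and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        settings[name.strip()] = value.strip()
    return settings


def diff_settings(working, failing):
    """Return {name: (working_value, failing_value)} for every setting that differs.

    A value of None means the setting is absent from that job.
    """
    differences = {}
    for name in sorted(set(working) | set(failing)):
        old = working.get(name)
        new = failing.get(name)
        if old != new:
            differences[name] = (old, new)
    return differences


if __name__ == "__main__":
    # Hypothetical setting names and values, for illustration only.
    working_job = parse_settings("START_DUMP=higem.astart\nANCIL_SET=A\n")
    failing_job = parse_settings("START_DUMP=um_200706.astart\nANCIL_SET=A\nEXTRA_EDIT=on\n")
    for name, (old, new) in diff_settings(working_job, failing_job).items():
        print(f"{name}: working={old!r} failing={new!r}")
```

Each setting that differs between the working and the failing job is a candidate to revert and retest, one at a time.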

When you compile and build a job, you should make sure the executable is written to that job's own $RUNID directory. The job xiide writes its executable into another job's directory (xiida). This will cause confusion later on.

I hope that helps.

Regards,

Willie

comment:7 Changed 7 years ago by willie

  • Resolution set to fixed
  • Status changed from accepted to closed