#666 closed error (fixed)

FP overflow

Reported by: a.elvidge
Owned by: willie
Running a 1.5 km LAM on MONSooN, I am getting an error:

  Signal received: SIGFPE - Floating-point exception
    Signal generated for floating-point exception:
      FP overflow

  Instruction that generated the exception:
    fmadd fr00,fr00,fr01,fr15

    Offset 0x00002cc4 in procedure screen_tq_, near line 582 in file /working/swebst/xfmee/ummodel/ppsrc/UM/atmosphere/boundary_layer/screen_tq.f90
    Offset 0x00003678 in procedure sf_impl2_, near line 1109 in file /working/swebst/xfmee/ummodel/ppsrc/UM/atmosphere/boundary_layer/sf_impl2.f90
    Offset 0x000013ac in procedure imp_solver_, near line 1293 in file /working/swebst/xfmee/ummodel/ppsrc/UM/atmosphere/boundary_layer/imp_solver.f90
    Offset 0x00000db0 in procedure imps_intct_, near line 914 in file /working/swebst/xfmee/ummodel/ppsrc/UM/atmosphere/boundary_layer/imps_intct.f90
    Offset 0x00002168 in procedure ni_imp_ctl_, near line 2441 in file /working/swebst/xfmee/ummodel/ppsrc/UM/control/top_level/ni_imp_ctl.f90
    Offset 0x0000b204 in procedure atmos_physics2_, near line 5135 in file /working/swebst/xfmee/ummodel/ppsrc/UM/control/top_level/atmos_physics2.f90
    Offset 0x0001d33c in procedure atm_step_, near line 13405 in file /working/swebst/xfmee/ummodel/ppsrc/UM/control/top_level/atm_step.f90
    Offset 0x00084f44 in procedure u_model_, near line 4852 in file /working/swebst/xfmee/ummodel/ppsrc/UM/control/top_level/u_model.f90
    Offset 0x00002048 in procedure um_shell_, near line 3581 in file /working/swebst/xfmee/ummodel/ppsrc/UM/control/top_level/um_shell.f90
    Offset 0x00000090 in procedure flumemain, near line 48 in file /working/swebst/xfmee/ummodel/ppsrc/UM/control/top_level/flumeMain.f90
    --- End of call chain ---
ERROR: 0031-300  Forcing all remote tasks to exit due to exit code 1 in task 23

I can't see any more details about the error in the output, and none of these error messages appear in previous tickets. Any help would be much appreciated.

Thanks, Andy

Change History (4)

comment:1

Sorry, the job is xfxkc

comment:2

  • Owner changed from um_support to willie
  • Status changed from new to assigned

Hi Andy,

This job runs for 15 time steps before crashing. A floating-point overflow has occurred in the calculation of WeightDcl, which involves an exponentiation operation. This is very worrying. The only thing I can suggest is to repeat the run and see if it crashes at the same place.
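
As a rough illustration of this failure mode (not the actual code in screen_tq, whose formula for WeightDcl is not shown in this ticket), the Python sketch below shows how an exponential overflows in double precision once its argument grows too large, and one way to guard against it. The names `guarded_exp` and `MAX_SAFE_ARG`, and the limit of 700.0, are assumptions made up for this example.

```python
import math

# Hypothetical clamp limit: exp(700) is about 1e304, still below the
# largest finite double (about 1.8e308). exp(x) overflows near x = 709.8.
MAX_SAFE_ARG = 700.0


def guarded_exp(x):
    """Return exp(x), clamping the argument so the result stays finite."""
    return math.exp(min(x, MAX_SAFE_ARG))


def overflows(x):
    """Return True if an unguarded exp(x) overflows in double precision."""
    try:
        math.exp(x)
    except OverflowError:
        return True
    return False


if __name__ == "__main__":
    for arg in (1.0, 700.0, 1000.0):
        print(arg, overflows(arg), guarded_exp(arg))
```

A clamp like this only hides the symptom. If the exponent is that large after 15 time steps, the input fields have probably already gone wrong, which is why checking whether the crash repeats at the same place is a sensible first step.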



comment:3

Hi Andy,

You could increase the diagnostic output level. In Output choices, select extra diagnostic messages. In scientific sections > section by sect > sect 13, push the diag_prn button and then change the printing frequency from every 15 time steps to every time step. Also select operational prints and tick both the "two norms" buttons.
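
The point of printing norms every time step is to see a field start to grow before the overflow actually happens. The Python sketch below illustrates that idea only; it is not the UM's diagnostic code. The function names, the flat-list representation of a field, and the growth threshold of 10 are all assumptions chosen for the example.

```python
import math


def two_norm(field):
    """Two-norm (root of the sum of squares) of a flat sequence of values."""
    return math.sqrt(sum(v * v for v in field))


def first_suspect_step(fields_by_step, growth_limit=10.0):
    """Return the index of the first time step whose norm is non-finite or
    more than growth_limit times the previous step's norm, else None."""
    previous = None
    for step, field in enumerate(fields_by_step):
        norm = two_norm(field)
        if not math.isfinite(norm):
            return step
        if previous is not None and previous > 0.0 and norm > growth_limit * previous:
            return step
        previous = norm
    return None


if __name__ == "__main__":
    # Made-up values for three time steps; the last one jumps sharply.
    history = [[1.0, 1.0], [1.1, 1.0], [50.0, 40.0]]
    print(first_suspect_step(history))
```

With output every time step rather than every 15, a jump like this would show up in the step where it first occurs instead of only at the crash.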



comment:4

  • Resolution set to fixed
  • Status changed from assigned to closed
