Opened 5 years ago

Closed 5 years ago

#1347 closed help (completed)

Not getting the model running anymore (Creating a new branch)

Reported by: fcentoni Owned by: um_support
Component: UM Model Keywords:
Cc: Platform:
UM Version: 7.3

Description

Hi,

I recently created a new Wesely dry dep branch from Luke's (um_br/dev/luke/VN7.3_UKCA_CheM_vn1.1_Wesely@11177) but the job keeps failing.
I have just replaced the original subroutine asad_chem_flux_diags.F90 with a modified version that Luke had given me in order to output 3D fluxes.

With either the original asad_chem_flux_diags.f90 or the modified one, I get similar errors:

_pmiu_daemon(SIGCHLD): [NID 02160] [c3-1c0s12n0] [Sun Aug 17 17:40:19 2014] PE RANK 49 exit signal Segmentation fault
[NID 02160] 2014-08-17 17:40:19 Apid 9820528: initiated application termination
diff: /work/n02/n02/fcentoni/tmp/tmp.mom4.8356/xjypt.xhist: No such file or directory
qsexecute: Copying /work/n02/n02/fcentoni/um/xjypt/xjypt.thist to backup thist file /work/n02/n02/fcentoni/um/xjypt/xjypt.thist_keep
xjypt: Run failed
*****************************************************************
   Ending script   :   qsexecute
   Completion code :   137
   Completion time :   Sun Aug 17 17:40:20 BST 2014
*****************************************************************

or, when activating O3 3D dry dep fluxes:

*********************************************************************************
 UM ERROR (Model aborting) :
 Routine generating error: ASAD_FLUX_PUT_STASH
 Error code:  34321
 Error message:
ASAD_FLUX_PUT_STASH  copydiag ERR: @▒h▒▒
 *********************************************************************************
_pmiu_daemon(SIGCHLD): [NID 02286] [c3-1c2s11n2] [Sun Aug 17 17:30:45 2014] PE RANK 48 exit signal Aborted
[NID 02286] 2014-08-17 17:30:45 Apid 9820503: initiated application termination
diff: /work/n02/n02/fcentoni/tmp/tmp.mom4.6065/xjypt.xhist: No such file or directory
qsexecute: Copying /work/n02/n02/fcentoni/um/xjypt/xjypt.thist to backup thist file /work/n02/n02/fcentoni/um/xjypt/xjypt.thist_keep
xjypt: Run failed
*****************************************************************


Could you help me out with that?

Many thanks.
Kind regards,
Federico

Change History (7)

comment:1 Changed 5 years ago by willie

Hi Federico,

You're getting a segmentation fault. You could try setting the variable ATP_ENABLED to 1 in UMUI > Script Inserts and Modifications. Also, in the compile panel, select debug and try running again. This should give a traceback indicating the problem.
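
For reference, setting ATP_ENABLED through the UMUI amounts to exporting the variable in the job's environment. A minimal sketch, assuming a POSIX shell job script; how the UMUI writes the variable into the generated script is not shown in this ticket:

```shell
# Enable Cray Abnormal Termination Processing (ATP) for this job, so that
# a crashing rank reports a stack backtrace instead of just a signal name.
export ATP_ENABLED=1

# Echo the setting so it appears in the job output for later checking.
echo "ATP_ENABLED=${ATP_ENABLED}"
```

The backtrace is only readable if the executable was built with debugging information, which is why the debug compile option is suggested alongside it.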

Your /work/n02/n02/fcentoni/tmp directory is very large at 45 GB. If you are not running anything, you could delete its contents to free up space.
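
Before deleting anything, it can help to see what is taking up the space. A minimal sketch, assuming a POSIX shell with du, sort and head available; the function name report_scratch_usage is hypothetical and the function only reports, it does not remove files:

```shell
# Hypothetical helper: report the total size of a scratch directory and
# list its largest immediate entries. Nothing is deleted.
report_scratch_usage() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "not a directory: $dir" >&2
        return 1
    fi
    # Total size in human-readable units
    du -sh "$dir"
    # Ten largest immediate children, sizes in KiB, largest first
    du -sk "$dir"/* 2>/dev/null | sort -rn | head -n 10
}

# Example call (path taken from this ticket):
#   report_scratch_usage /work/n02/n02/fcentoni/tmp
```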

Regards,

Willie

comment:2 Changed 5 years ago by fcentoni

Hi Willie,

I have done that but the run failed again.
This is the .leave file that was produced:

xjypt000.xjypt.d14230.t192302.leave

The problem seems to be related to the routines asad_chem_flux.f90 and asad_flux_dat.f90.
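
To pull the relevant lines out of a long .leave file, a search for the UM error banner and the launcher's signal lines is usually enough. A minimal sketch, assuming grep with the -A option (GNU or BSD); the function name show_um_errors is hypothetical and the search strings are taken from the log excerpts in this ticket:

```shell
# Hypothetical helper: print UM error banners and fatal-signal lines
# from a .leave file, with line numbers.
show_um_errors() {
    leave_file="$1"
    # UM error banner plus the four lines after it
    # (routine, error code, "Error message:" and the message itself)
    grep -n -A 4 "UM ERROR" "$leave_file" || true
    # Launcher lines reporting which PE rank died and with what signal
    grep -n "exit signal" "$leave_file" || true
}

# Example call (file name taken from this ticket):
#   show_um_errors xjypt000.xjypt.d14230.t192302.leave
```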

Could you help me out with that?

Many thanks.
Kind regards,
Federico.

comment:3 Changed 5 years ago by fcentoni

Hi,

I keep getting the same error when attempting to output 3D O3 dry dep.

*********************************************************************************
 UM ERROR (Model aborting) :
 Routine generating error: ASAD_FLUX_PUT_STASH
 Error code:  34321
 Error message:
ASAD_FLUX_PUT_STASH  copydiag ERR: @▒h▒▒
 *********************************************************************************
_pmiu_daemon(SIGCHLD): [NID 02721] [c6-1c0s8n1] [Wed Aug 20 11:25:02 2014] PE RANK 7 exit signal Aborted
_pmiu_daemon(SIGCHLD): [NID 02982] [c7-1c1s9n2] [Wed Aug 20 11:25:02 2014] PE RANK 48 exit signal Aborted
[NID 02721] 2014-08-20 11:25:02 Apid 9845054: initiated application termination
diff: /work/n02/n02/fcentoni/tmp/tmp.mom3.1247/xjypu.xhist: No such file or directory
qsexecute: Copying /work/n02/n02/fcentoni/um/xjypu/xjypu.thist to backup thist file /work/n02/n02/fcentoni/um/xjypu/xjypu.thist_keep
xjypu: Run failed

I have replaced the routine asad_chem_flux_diags.F90 with the one modified by Luke in order to deal with 3D fields, but it does not work with this new branch.

I had previously run with 3D O3 dry dep turned off, and the job ran.

Many thanks.
Kind regards,
Federico

comment:4 Changed 5 years ago by fcentoni

Hi,

I commented out these lines at the end of the routine asad_chem_flux_diags.f90:

         IF (icode >  0) THEN 
!            write(cmessage,'(A20,A15,A45)') 'ASAD_FLUX_PUT_STASH ',&
!                 'copydiag ERR: ',&
!                 TRIM(ADJUSTL(cmessage(1:45)))
            icode=(1000*section) + item
            write(out,*) cmessage,icode 
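
The line icode=(1000*section) + item in the excerpt above also explains the error code in the earlier banner: the code packs the section and item numbers into one integer. A small sketch in shell arithmetic, assuming the reported code 34321 was produced by that line:

```shell
# Decode an error code built as 1000*section + item.
code=34321
section=$((code / 1000))   # integer division gives the section number
item=$((code % 1000))      # remainder gives the item number
echo "section=${section} item=${item}"
```

For 34321 this gives section 34 and item 321, which identifies the diagnostic that was being copied when the error was raised.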

I no longer get the same error, but the job is still failing, as you can see in the file xjypu000.xjypu.d14232.t132310.leave:

_pmiu_daemon(SIGCHLD): [NID 01490] [c7-0c2s4n2] [Wed Aug 20 13:37:55 2014] PE RANK 25 exit signal Segmentation fault
_pmiu_daemon(SIGCHLD): [NID 02269] [c3-1c2s7n1] [Wed Aug 20 13:37:55 2014] PE RANK 62 exit signal Segmentation fault
[NID 01490] 2014-08-20 13:37:55 Apid 9846283: initiated application termination
diff: /work/n02/n02/fcentoni/tmp/tmp.mom1.7837/xjypu.xhist: No such file or directory
qsexecute: Copying /work/n02/n02/fcentoni/um/xjypu/xjypu.thist to backup thist file /work/n02/n02/fcentoni/um/xjypu/xjypu.thist_keep
xjypu: Run failed

I urgently need to get a new, clean working copy of my model running.
Could you please help me out with that?

Many thanks.
Kind regards,
Federico.

comment:5 Changed 5 years ago by luke

Hi Federico,

Are you still having problems with this, or can we close this ticket?

Thanks,
Luke

comment:6 Changed 5 years ago by fcentoni

Hi Luke,

Everything works fine.

Thank you,
Federico

comment:7 Changed 5 years ago by luke

  • Resolution set to completed
  • Status changed from new to closed