Opened 3 years ago

Closed 3 years ago

#1997 closed help (answered)

no resubmit, xmrsm. Turning off radiation response to clouds.

Reported by: 21001998 Owned by: um_support
Component: UM Model Keywords: resubmit fcm clouds radiation
Cc: Platform: ARCHER
UM Version: 8.5

Description (last modified by willie)

Hi,

I have tried to set up a model run (xmrsm) in which the radiation components are insensitive to the clouds present. I did this by adding the line sw_control(j)%i_cloud = 5 (and the same for lw) in the sw_rad_input_mod.F90 module. My branch is called vn8.5_removeCLOUDradiationRESPONSE (/home/21001998/remove_cloud_response/vn8.5_removeCLOUDradiationRESPONSE/src).

Unfortunately my model won't resubmit and I don't know where to look next. I've made sure both PUMA and ARCHER have space.

Does the i_cloud option exist in version 8.5? I know it is called i_cloud_representation in Rose and that option 5 can be selected there.

The appropriate leave file is /home/n02/n02/jtalib/output/xmrsm000.xmrsm.d16281.t100202.leave although the reconfiguration also happens very quickly!

Kind regards
Josh.

''
xmrsm: Run failed
*****************************************************************
   Ending script   :   qsatmos
   Completion code :   137
   Completion time :   Fri Oct  7 10:44:51 BST 2016
*****************************************************************


/work/n02/n02/jtalib/um/dataw/xmrsm/bin/qsmaster: Failed in qsatmos in job xmrsm
***************************************************************
   Starting script :   qsfinal
   Starting time   :   Fri Oct  7 10:44:51 BST 2016
***************************************************************

Checking requirement for atmosphere resubmit...
/work/n02/n02/jtalib/um/dataw/xmrsm/bin/qsresubmit: Error: no resubmit details found
*****************************************************************
   Ending script   :   qsfinal
   Completion code :   0
   Completion time :   Fri Oct  7 10:44:51 BST 2016
*****************************************************************


/work/n02/n02/jtalib/um/dataw/xmrsm/bin/qsmaster: Failed in qsfinal in job xmrsm
 <<<< Information about How Many Lines of Output follow >>>>
 50  lines in main OUTPUT file.
 1354 lines of O/P from pe0.
 <<<<         Lines of Output Information ends          >>>>
''
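The banners in the log above show that qsatmos ended with a non-zero completion code before qsfinal looked for resubmit details. As a rough sketch, the script/code pairs can be pulled out of a leave file as below. The function names are made up for illustration, and the reading of 137 as 128 + 9 assumes the completion code is an ordinary shell exit status, which the log itself does not confirm.

```python
import re

# Matches an "Ending script" banner line followed by its "Completion code" line.
BANNER = re.compile(
    r"Ending script\s*:\s*(?P<script>\S+)\s*\n\s*Completion code\s*:\s*(?P<code>\d+)"
)


def completion_codes(leave_text):
    """Return (script, completion code) pairs in the order they appear."""
    return [(m.group("script"), int(m.group("code")))
            for m in BANNER.finditer(leave_text)]


def describe(code):
    """Interpret a code using the usual shell convention (an assumption here)."""
    if code == 0:
        return "ok"
    if code > 128:
        # Shell convention: 128 + N means the process was ended by signal N.
        return f"ended by signal {code - 128}"
    return f"failed with exit status {code}"


# Excerpt copied from the leave file quoted in this ticket.
sample = """\
   Ending script   :   qsatmos
   Completion code :   137
   Completion time :   Fri Oct  7 10:44:51 BST 2016
Checking requirement for atmosphere resubmit...
   Ending script   :   qsfinal
   Completion code :   0
   Completion time :   Fri Oct  7 10:44:51 BST 2016
"""

for script, code in completion_codes(sample):
    print(f"{script}: {code} ({describe(code)})")
```

If that reading is right, the missing resubmit is a consequence of the atmosphere step being stopped, so the place to look is the qsatmos output rather than the resubmit scripts.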

Change History (9)

comment:1 Changed 3 years ago by willie

Hi Josh,

Your job is aborting in MPI because Karthee's hand edit

~karthee/umui_jobs/hand-edits/mpi_rank_order.ed

assumes the work directory is $DATADIR/$RUNID, which doesn't exist, but the rest of your job assumes variously $DATADIR/um/$RUNID/dataw (=DATAM) and $DATADIR/um/dataw/$RUNID (=DATAW).

So either create a modified hand edit or modify DATAM/DATAW and try again.
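A quick way to see which of these layouts actually exists is sketched below. It is only an illustration: the helper names are hypothetical, the three paths are the ones quoted in this comment, and the demo builds a scratch directory rather than touching the real $DATADIR.

```python
import os
import tempfile


def candidate_dirs(datadir, runid):
    """The three layouts mentioned in this comment, keyed by what assumes them."""
    return {
        "hand edit": os.path.join(datadir, runid),
        "DATAM": os.path.join(datadir, "um", runid, "dataw"),
        "DATAW": os.path.join(datadir, "um", "dataw", runid),
    }


def missing_dirs(datadir, runid):
    """Names of the layouts whose directory does not exist."""
    return [name for name, path in candidate_dirs(datadir, runid).items()
            if not os.path.isdir(path)]


# Demo: create only the DATAM and DATAW layouts in a scratch directory.
with tempfile.TemporaryDirectory() as datadir:
    os.makedirs(os.path.join(datadir, "um", "xmrsm", "dataw"))
    os.makedirs(os.path.join(datadir, "um", "dataw", "xmrsm"))
    demo_missing = missing_dirs(datadir, "xmrsm")

print(demo_missing)
```

On the real system one would call missing_dirs with the actual value of $DATADIR and the run ID; anything it reports is a directory that some part of the job expects but cannot find.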

Regards
Willie

comment:2 Changed 3 years ago by willie

  • Description modified (diff)

comment:3 Changed 3 years ago by 21001998

Hi Willie,

The additional hand edit has worked in previous simulations. I don't understand why changing the representation of cloud would make it an issue.

I'm currently trying a change to i_cloud_representation rather than i_cloud.

Kind regards,
Josh.

comment:4 Changed 3 years ago by 21001998

A new .leave file was created after inserting i_cloud_representation = 5 in both sw… and lw_rad_input_mod.F90:

/home/n02/n02/jtalib/output/xmrsm000.xmrsm.d16292.t112050.leave

Now getting many unrecoverable library errors.

Kind regards,
Josh.

comment:5 Changed 3 years ago by willie

Hi Josh,

Did this work without the cloud fix? I took a copy this morning and tried to run it, but it won't compile:

ftn-428 crayftn: ERROR MONOCHROMATIC_RADIANCE_TSEQ, File = ../../../../../../../../home2/n02/n02/wmcginty/um/xmyua/umatmos/ppsrc/UM/atmosphere/radiance_core/monochromatic_radiance_tseq.f90, Line = 308, Column = 16
  An allocate object must be either a pointer or an allocatable array in an ALLOCATE statement.

Regards
Willie

comment:6 Changed 3 years ago by 21001998

We've tried three different experiments to remove the SW and LW impact of clouds in version 8.5 of the UM.

1) sw_rad_input_mod.F90:

+ sw_control(j)%i_cloud = 5

lw_rad_input_mod.F90:

+ lw_control(j)%i_cloud = 5

In this run, mid-level convection reaches the top of the model at the first timestep. The model fails in glue_conv_5a.F90.

leave file = /home/n02/n02/jtalib/output/xmrsm000.xmrsm.d16284.t171426.leave

2) sw_rad_input_mod.F90:

+ sw_control(j)%i_cloud = 5
+ sw_control(j)%l_cloud = .FALSE.
+ sw_control(j)%i_solver = ip_solver_homogen_direct
+ sw_control(j)%l_microphysics = .FALSE.

lw_rad_input_mod.F90:

+ lw_control(j)%i_cloud = 5
+ lw_control(j)%l_cloud = .FALSE.
+ lw_control(j)%l_microphysics = .FALSE.
+ lw_control(j)%i_solver = ip_solver_homogen_direct

In this run a _DEALLOCATE error occurs when radiance_calc is called in r2_swrad3z_.F90.
Two _DEALLOCATE calls occur, causing the following to be printed:

tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*, unsigned long, int).

leave file = /home/n02/n02/jtalib/output/xmrsm000.xmrsm.d16291.t120550.leave

3) sw_rad_input_mod.F90:

+ sw_control(j)%i_cloud_representation = 5

lw_rad_input_mod.F90:

+ lw_control(j)%i_cloud_representation = 5

In this run an unrecoverable library error occurs, for reasons I don't understand. The model crashes at glue_rad_@…:2912, which is where the model calculates the sea ice albedo?!

leave file = /home/n02/n02/jtalib/output/xmrsm000.xmrsm.d16292.t112050.leave

I am going to change the code to match experiment 2 and use that as my working copy. I have changed the permissions on all the .leave files, so they should be readable by everyone on ARCHER.
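For anyone wanting to confirm that a file really is readable by other users before pointing the helpdesk at it, a minimal sketch follows. The function name and the example file are made up for illustration, and it only checks the permission bits on the file itself.

```python
import os
import stat
import tempfile


def world_readable(path):
    """True if the 'other' read bit is set on the file itself."""
    return bool(os.stat(path).st_mode & stat.S_IROTH)


# Demo on a scratch file standing in for a .leave file.
with tempfile.TemporaryDirectory() as scratch:
    example = os.path.join(scratch, "example.leave")
    open(example, "w").close()

    os.chmod(example, 0o600)          # owner only
    before = world_readable(example)

    os.chmod(example, 0o644)          # owner read/write, everyone else read
    after = world_readable(example)

print(before, after)
```

Note that the read bit on the file is not sufficient on its own: every directory on the path also needs search (execute) permission for other users.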

comment:7 Changed 3 years ago by 21001998

Hi all.

The compilation failed because of a working-copy branch that I had applied in the FCM configuration. I have now restarted experiment 2 (see above), and the model builds and (I think) reconfigures successfully. The problems arise at run time.

The appropriate .leave file is /home/n02/n02/jtalib/output/xmrsm000.xmrsm.d16293.t151136.leave.

From what I can gather, the DEALLOCATE statement in src/atmosphere/radiance_core/monochromatic_radiance_tseq.F90 is causing the problem, but I don't know where to go next.

Kind regards,
Josh.

comment:8 Changed 3 years ago by willie

Hi Josh,

I have had a look in the debugger, but can't find an obvious cause. The array phase_fnc_clr_f is not being deallocated, although the other two arrays are. The contents of the array seem sensible enough, so I conclude that this is not the fundamental error and that something else is going wrong. Other processes are failing due to MPI problems (MPI_Barrier). I switched off Karthee's hand edit, but that only removed the error at the beginning and did not change the final result.

It might be an idea to switch off your branch and check that the basic setup works.

Regards,
Willie

comment:9 Changed 3 years ago by willie

  • Resolution set to answered
  • Status changed from new to closed