Opened 5 years ago

Closed 5 years ago

#1461 closed help (answered)

Changing from Nudged mode back to free-running

Reported by: StevenTurnock
Owned by: um_support
Component: UKCA
Keywords:
Cc:
Platform: MONSooN
UM Version: 7.3

Description

Hi,

I have been running a Vn7.3 UKCA job (xkngd) in nudged mode with double-call radiation on Monsoon and it has been working fine for me. I have recently set up a new job (xknge) to run in single-call radiation mode with nudging turned off, by following the instructions on the UKCA website in reverse, but I received the following message in the .leave file:

ERROR!!! in reconfiguration in routine Rcf_Exppx
Error Code:- 2
Error Message:- Cant find required STASH item 1 section 39 model 1 in STASHmaster
Error generated from processor 0

Graham Mann here at Leeds suggested that this is still something to do with the nudging files. Can anyone suggest what the problem might be, and whether there is something I have not turned on or off when changing from nudged to free-running mode?

Thanks
Steven

Change History (12)

comment:1 Changed 5 years ago by luke

Hi Steven,

I think you would need to remove the nudging prognostics from the dump. This can be done using the reconfiguration step.

In your original job you need to locate the user pre-STASHmaster file associated with the nudging, which will define the section 39 prognostics. You should take a copy of this, and then change the SPACE code to 10. The SPACE code is the first entry on the second line of each STASH item.
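
For illustration only, the SPACE-code change could be scripted rather than done by hand. This is a minimal sketch, not something taken from the ticket: it assumes the copied file uses the pipe-delimited STASHmaster layout in which the second line of each record starts with 2| and the SPACE code is the field straight after it. The function name set_space_code and the file names in the usage note are hypothetical.

```shell
# set_space_code FILE CODE   (hypothetical helper, not part of the UM or UMUI)
# Print FILE with the SPACE code, assumed to be the first entry on every
# line starting "2|", replaced by CODE. Records whose model number is
# negative (an end-of-file marker record) are left alone.
set_space_code() {
    awk -v code="$2" '
        BEGIN { FS = "[|]"; OFS = "|" }
        # Line 1 of a record: remember whether this is a marker record
        $1 == "1" { skip = ($2 + 0 < 0) }
        # Line 2 of a record: rewrite the first entry, keeping the column width
        $1 == "2" && !skip { $2 = sprintf(" %4d ", code) }
        { print }
    ' "$1"
}
```

Something like "set_space_code ndg_presma 10 > ndg_presma_space10" would then write a modified copy and leave the original untouched. It is worth diffing the two files before using the copy.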

You should then include this new STASHmaster file in your job and set the job to reconfigure. If you need to take the values of the other fields from the start dump, remember to initialise these to 1 in the initialisation of user prognostics panel.

Can you try this and let me know how you get on?

Many thanks,
Luke

comment:2 Changed 5 years ago by StevenTurnock

Hi Luke,

I have had a look at all the STASHmaster files that have been included within my job and I can't seem to find one relating to nudging. I have checked inside all of the files and can't find a reference to items within section 39 either so I am not really sure what I could change.
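
A search like the one described can be sketched with awk rather than done by eye. This is only a sketch under assumptions: it takes the first line of each record to have the form 1| model | section | item | name |, and list_section is a hypothetical helper name. As the later comments in this ticket show, the section 39 records were being added by a hand-edit rather than through a user STASHmaster file, which is why searching the job's STASHmaster files finds nothing.

```shell
# list_section SECTION FILE...   (hypothetical helper)
# Print "file: item N name" for every record in the given files whose
# first line carries the requested section number.
# At least one FILE must be given, otherwise awk waits on standard input.
list_section() {
    sect="$1"
    shift
    awk -v sect="$sect" '
        BEGIN { FS = "[|]" }
        $1 == "1" && ($3 + 0) == (sect + 0) {
            name = $5
            gsub(/^ +| +$/, "", name)      # trim the padding around the name
            printf "%s: item %d %s\n", FILENAME, $4, name
        }
    ' "$@"
}
```

Running it as "list_section 39" followed by the file names prints nothing when none of the files define section 39 items.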

I am afraid I wasn't involved in the original set-up of this job but have since picked it up, so I am not sure how it was originally set up or what was and wasn't included. However, when I was previously running in nudged mode it worked fine with the hand edits turned on and the relevant branches included. So, following the instructions on the UKCA website, I have turned off the branch and hand edit relating to nudging but kept the 365-day calendar set-up, as that is what all my ancillaries are for.

Do you know of a STASHmaster file I could use for my Vn7.3 job to include and set up as you have suggested above?

Any other help/suggestions would be appreciated.

Many thanks
Steven

comment:3 Changed 5 years ago by luke

Hi Steven,

If you look at the hand-edit

/home/mdalvi/umui_jobs/hand_edits/nudge/nudg_eraint_on_L63.ed

included in the d job (xkngd), this is where all the STASH is coming from. It sets up s39i001-015 to be sent to a tagged stream (99). It also picks up the user pre-STASHmaster file from

/home/mdalvi/umui_jobs/hand_edits/nudge/ndg_presma

You should take a copy of the above file, change the SPACE code from 0 to 10, and include it in your user STASHmaster file section.

Try the reconfiguration step again and see how you get on.

Thanks,
Luke

comment:4 Changed 5 years ago by StevenTurnock

Hi Luke,

I have tried what you have said above but the UMUI doesn't allow me to add the STASHmaster file that you have suggested as it says that selections are only available from sections 0-38 and cannot allow one from section 39.

I am guessing that this is a problem generated from trying to re-initialise this job from one that was previously nudged.

Do you have any other suggestions about what might be the best way to proceed?

Thanks
Steven

comment:5 Changed 5 years ago by luke

Hi Steven,

Is the error in the UMUI or when you run the job? Does it stop you from closing any UMUI windows?

I would suggest using a job just for reconfiguration so that you can make the start dump, then use this dump in another job with the s39 user pre-STASHmaster file removed. It may be that reconfiguration lets you add or remove fields, but it's the UM that has a problem.

If it is reconfiguration that has the issue, you could take a look at

/home/ukca/hand_edits/VN8.4/sect35_on.ed

and change this so it is for section 39. However, you don't want s39 on for the job itself, just for the reconfiguration that makes the start dump. Use this hand-edit and the user pre-STASHmaster file with the SPACE code set to 10 to produce the new start dump, then use this .astart file in a new job without the hand-edit or user PSM file. You can then turn off reconfiguration in this job and just run from the .astart file you have made. I often work like this, especially if I'm doing something complicated to the dump.

I hope this helps.

Thanks,
Luke


comment:6 Changed 5 years ago by StevenTurnock

Hi Luke,

I am afraid the error is in the UMUI. When I add the new STASH master file with the section 39 diagnostics the UMUI does not allow me to close the user_STASHmaster files window as it says that I can only select items from sections 0-38. So at the moment I don't think I can even try out your reconfiguration step as it just won't let me add that STASH master file with section 39 in.

Sorry this is a bit of a pain to sort out.

Steven

comment:7 Changed 5 years ago by luke

Ah - this is why Mohit had to use a hand-edit to include the user pre-STASHmaster file. I assumed that he was just doing it to be self-contained, but there seems to be another reason as well.

I would suggest editing Mohit's

/home/mdalvi/umui_jobs/hand_edits/nudge/nudg_eraint_on_L63.ed

hand-edit, but only include the parts which turn on s39 and include the PSM file. However, change it to point to your user PSM file with SPACE code 10. Don't include the bits which add the STASH requests to tag 99.
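
As a rough sketch of the "point it at your own PSM file" step, the path inside a copy of the hand-edit could be swapped with sed. This assumes the only thing that needs changing is the literal ndg_presma path quoted earlier in this ticket; the helper name repoint_psm and the replacement path in the usage note are hypothetical. Trimming the hand-edit down to the s39 and PSM sections still has to be done by hand.

```shell
# repoint_psm HAND_EDIT NEW_PSM   (hypothetical helper)
# Print HAND_EDIT with every reference to the original ndg_presma path
# replaced by NEW_PSM. NEW_PSM must not contain "|" or "&", which are
# special in the sed expression used here.
repoint_psm() {
    old='/home/mdalvi/umui_jobs/hand_edits/nudge/ndg_presma'
    sed "s|$old|$2|g" "$1"
}
```

For example, "repoint_psm nudg_eraint_on_L63.ed /path/to/my_ndg_presma > my_s39_on.ed", where both the replacement path and the output name are placeholders.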

Does that make sense?

Thanks,
Luke

comment:8 Changed 5 years ago by StevenTurnock

Hi Luke,

I thought I understood what you meant, but maybe not. I changed Mohit's hand-edit file to point to my copy of his STASHmaster file (the one with SPACE code 10 inserted) and also removed the only reference I could find relating to tag 99, which was:

&USE NAME="UPTAG", LOCN=1
IUNT=99,
/

I then included my copy of Mohit's hand-edit file and tried to run the model again. In the .leave file I then received the following message:

"/projects/ukca/sturno/xknge/ummodel/ppsrc/UM/control/top_level/readcntl.f90", line 1679: 1525-088 The NAMELIST READ statement cannot be completed because item L_NUDGING is not a member of the NAMELIST group nlstcatm. The program will recover by discontinuing further processing of the READ statement.

*
UM ERROR (Model aborting) :
Routine generating error: INITIAL
Error code: 102
Error message:

INITDUMP: Wrong no of atmos prognostic fields

*
gc_abort (Processor 79 ): INITDUMP: Wrong no of atmos prognostic fields

Looking at the messages above I think my model is still seeing the wrong number of fields from the dump file. Do you think that I need to do the reconfiguration step first to make a new dump file now instead of just trying to run the model (like you suggested above)?

Thanks,
Steven

comment:9 Changed 5 years ago by luke

Hi Steven,

Sorry, I wasn't clear above. You need to remove all the sections from the hand-edit except the bits that turn on s39 and add the user pre-STASHmaster file. Something like the example below, but with the PSM path changed to point to yours:

################################################################
# Add A39_1A to CPPDEFS & switch on A39 in SIZES & RECONA

ed FCM_UMUI_MODEL_CFG <<\EOF
/ A38
+1
i
   A39_1A=a39_1a \
.
w
q
EOF
e3=$?

# Could be dangerous as atmos_sr contains section defs. of other sections.
ed SIZES<<\EOF
/ATMOS_SR
+3
d
i
 '1A','  ','  ','  ','  ','  ','  ','  ','  ','  ','  ','  ','  ','  ','  ',
.
w
q
EOF
e4=$?

ed RECONA<<\EOF
/ATMOS_SR
+3
d
i
 '1A','  ','  ','  ','  ','  ','  ','  ','  ','  ','  ','  ','  ','  ','  ',
.
w
q
EOF
# Fold the RECONA edit's exit status into e4 so the final check sees it
e4=`expr $e4 + $?`

#########################################################################



################################################################

# Add Section 39 items to PRESM_A

# increment no. of records by 15
onum=`grep RECORDS PRESM_A | awk '{print $1}'`
nnum=`expr $onum + 15`

echo "$nnum RECORDS in this file. " >nrec

# Replace the 1st line & append item list from file ndg_presma
# If existing records ---
if [ $onum -ne 0 ]; then
ed PRESM_A <<\EOF
/RECORDS
d
i

.
. r nrec
-1
d
w
/ end of file
-1
. r /home/mdalvi/umui_jobs/hand_edits/nudge/ndg_presma
w
q
EOF
e5=$?

# If no existing records
else

ed PRESM_A <<\EOF
/RECORDS
d
a
.
. r nrec
a

.
. r /home/mdalvi/umui_jobs/hand_edits/nudge/ndg_presma
.
w
q
EOF
e5=$?

fi
rm -f nrec

########################################################################
# Check for errors
err=`expr $e3 + $e4 + $e5`
if [ $err -gt 0 ]; then
 echo 'ERROR executing Nudging Hand_Edit'
fi
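
The hand-edit above adds a fixed 15 to the record count, matching s39i001-015. If the PSM file it reads is swapped for a copy, a quick count of the records in that copy guards against a mismatch with that number. A minimal sketch, assuming each record starts with a 1| line and that any end-of-file marker record carries a negative model number; count_records is a hypothetical helper name.

```shell
# count_records FILE   (hypothetical helper)
# Print the number of STASH records in FILE, counted as lines starting
# "1|" whose model number (the next field) is positive.
count_records() {
    awk '
        BEGIN { FS = "[|]" }
        $1 == "1" && ($2 + 0) > 0 { n++ }
        END { print n + 0 }
    ' "$1"
}
```

If the number printed for the copied PSM file is not 15, the increment in the hand-edit would need changing to match.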

Let me know if this doesn't work.

Thanks,
Luke


comment:10 Changed 5 years ago by StevenTurnock

Hi Luke,

Thanks for the advice above, and sorry I have been slow in getting back to you; I was away at the end of last week. I have now had a chance to incorporate your latest revisions, and taking out all of the other references did get the model to run as an NRUN, so that is good. However, when I tried to set the model off as a CRUN, I think some low values are now being generated and these are causing the model to crash. For example, I am getting messages like this:


Signal received: SIGFPE - Floating-point exception

Signal generated for floating-point exception:

FP division by zero

Instruction that generated the exception:

fdiv fr20,fr01,fr18
Source Operand values:

fr01 = 5.23078782861096e-06
fr18 = 0.00000000000000e+00


This seems to be a recurring message, which is then followed by references like:


Traceback:

Offset 0x00000940 in procedure deep_grad_stress_, near line 120 in file /projects/ukca/sturno/xknge/ummodel/ppsrc/UM/atmosphere/convection/deep_grad_stress-dpgrstrs4a.f90


Is this to do with the free-running model generating low values and not being able to allow them within the model?

Sorry to create another problem for you but any advice that you have on this would be appreciated.

Many thanks
Steven

comment:11 Changed 5 years ago by StevenTurnock

Hi Luke,

Sorry I didn't get to speak to you more at the ACITES meeting, but it always seems rushed and busy at those kinds of meetings. Anyway, I think you mentioned trying to run my job with a 20 day length to see if that helped with some of the floating-point exceptions I was generating above. I have tried that and have got an NRUN and a CRUN to complete for 2 months, but I am now having trouble restarting from the completed CRUN. I think I am getting similar messages to those above:

Signal received: SIGFPE - Floating-point exception

Signal generated for floating-point exception:

FP invalid operation

Instruction that generated the exception:

fsqrt fr02,fr02
Source Operand values:

fr02 = -3.91611442568355e+04

Traceback:

Offset 0x00008a80 in procedure ukca_mode_ems_um_, near line 2613 in file /projects/ukca/sturno/xknge/ummodel/ppsrc/UM/atmosphere/UKCA/ukca_mode_ems_um.f90

Offset 0x0000a994 in procedure ukca_emission_ctl_, near line 2873 in file /projects/ukca/sturno/xknge/ummodel/ppsrc/UM/atmosphere/UKCA/ukca_emission_ctl.f90

Offset 0x00018434 in procedure ukca_main1_, near line 8338 in file /projects/ukca/sturno/xknge/ummodel/ppsrc/UM/atmosphere/UKCA/ukca_main1-ukca_main1.f90

Offset 0x000ef344 in procedure u_model_, near line 5518 in file /projects/ukca/sturno/xknge/ummodel/ppsrc/UM/control/top_level/u_model.f90

Offset 0x00001eb0 in procedure um_shell_, near line 3823 in file /projects/ukca/sturno/xknge/ummodel/ppsrc/UM/control/top_level/um_shell.f90

Offset 0x00000090 in procedure flumemain, near line 36 in file /projects/ukca/sturno/xknge/ummodel/ppsrc/UM/control/top_level/flumeMain.f90

Do you have any further advice about how to run this job, which keeps generating these messages?

Many thanks,
Steven

comment:12 Changed 5 years ago by annette

  • Resolution set to answered
  • Status changed from new to closed

This ticket is being closed due to lack of activity.

Annette
