Opened 9 years ago

Closed 9 years ago

#664 closed help (fixed)

Diagnostics from 4km resolution LAM are all NaNs

Reported by: cbirch Owned by: um_support
Component: UM Model Keywords: LAM NaN
Cc: Platform:
UM Version: 7.1



I did a whole load of vn7.1 global-12km-4km nested LAM runs over West Africa last year on phase2a. These were really successful and reproduced a mesoscale convective system case study. I want to try one of the runs again but with smoothed soil properties over the domain. The run I set up (xfrde) compiled and ran normally but many of the diagnostics were NaN, especially over land (as compared to over the ocean).

I thought this was because there was something wrong with my new ancillary files so I tried to re-run one of the old 'control' jobs to check that still worked. I get the same issue with this job. The global and 12km job are fine but the data produced by the 4km job are mostly NaN (see xfrdo as an example). The reconfigured data in the .astart file looks fine (I have flicked through all the variables in xconv) but something happens in the first few timesteps because data in the restart dump after 1 hour is not ok.
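The land-versus-ocean pattern described above can be quantified rather than judged by eye. The following is a minimal sketch, assuming a diagnostic field and a land mask have already been read out of the dump into nested Python lists; the function name `nan_fraction_by_surface` and the sample values are hypothetical and are not part of the UM or xconv.

```python
import math


def nan_fraction_by_surface(field, land_mask):
    """Return (land_fraction, sea_fraction) of NaN points in a 2D field.

    field     -- list of rows of floats (hypothetical, already read from a dump)
    land_mask -- list of rows of booleans, True where the point is land
    """
    counts = {True: [0, 0], False: [0, 0]}  # is_land -> [nan_count, total]
    for field_row, mask_row in zip(field, land_mask):
        for value, is_land in zip(field_row, mask_row):
            counts[bool(is_land)][1] += 1
            if math.isnan(value):
                counts[bool(is_land)][0] += 1

    def fraction(nan_count, total):
        # Avoid dividing by zero when a surface type is absent from the domain.
        return nan_count / total if total else 0.0

    return fraction(*counts[True]), fraction(*counts[False])


# Made-up 2x3 field: every land point is NaN, every sea point is valid.
nan = float("nan")
field = [[290.0, nan, nan], [291.0, 292.0, nan]]
land_mask = [[False, True, True], [False, False, True]]

land_frac, sea_frac = nan_fraction_by_surface(field, land_mask)
print(f"NaN fraction over land: {land_frac:.2f}, over sea: {sea_frac:.2f}")
```

Running the same check on the .astart file and on the 1 hour restart dump would show which fields go bad first, and whether the bad points are confined to land.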

This job is the same as the original job that worked many times other than it is run on phase2b instead of 2a. I had to make a few changes to get it to work on phase2b (do a difference between xfrdo and xfrdr). At the time of the switch over I checked my nested runs worked on phase2b (experiment xfyl). These jobs compiled and ran fine although I don't recall whether I actually checked that the data produced by the 4km model was ok.

I set up jobs similar to these using version 7.3 for North Africa. I could use these jobs but I have already done quite a few high resolution runs with 7.1. I really need to use the same version to do some experiments with the soil properties.


Change History (5)

comment:1 Changed 9 years ago by grenville


I note that your job xfylr (the 4km run on phase2b) didn't work properly (the leave files are quite hard to follow, having lots of extra trace), but the 12km seems to be OK, so the problem has been around since the switch to phase2b. A phase2a vn7.1 job won't give the same results as the same job on phase2b (even if it didn't crash), so it's not clear that comparing vn7.1 phase2a results with vn7.1 phase2b results is any more appropriate than comparing them with vn7.3 phase2b results (they would just be different). Do the vn7.1 and 7.3 12km results differ much on phase2b?


comment:2 Changed 9 years ago by cbirch

Hi Grenville,

I haven't yet run the 7.3 version for the mesoscale convective system case studies. I've only used it over North Africa for Fennec (recent field campaign), which has a different domain. This means I can't compare the vn7.1 and vn7.3 12km results.

How different would the same job on phase2a and phase2b be? Why would they be so different? I know that comparing vn7.1 and 7.3 versions could be problematic because there may have been updates to model parameterisations, physics, etc. that would cause significant differences.

I could set up a nested LAM domain over West Africa for vn7.3 and then re-run the control simulation. I could see from this how different the results are compared to the vn7.1 run. I could then run the smoothed soil experiments with the 7.3 version. This is quite a lot of work. I don't mind spending the time doing it but I'm also concerned about how much CPU time I'm going to have to use to repeat stuff. At the very least I would have to do two runs that are 48 hours each over a large 4km domain. If there are big differences between the vn7.1 and 7.3 control results I may have to repeat 3-4 additional 48 hour runs.

I guess the only other option is to fix the vn7.1 4km job and live with the phase 2a/2b differences.

What do you think?


comment:3 Changed 9 years ago by grenville


Jobs on phase2a and phase2b don't necessarily give bit-reproducible results - my 7.1 jobs don't, and clearly neither do yours. It is hard to say why, but phase2a and phase2b are different machines, possibly with different compilers, so perhaps not unexpected.
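One general mechanism behind non-reproducible results across machines or compilers is that floating-point addition is not associative, so any change in the order of operations can change the last bits of a sum. The sketch below only illustrates that effect with made-up numbers; it is an assumption that something of this kind contributes here, not a diagnosis of the phase2a/phase2b difference.

```python
# Four made-up values whose exact sum is 2.0.
values = [1.0e16, 1.0, -1.0e16, 1.0]

# Summed in the order given: the first 1.0 is lost when it is added to 1.0e16,
# because adjacent doubles near 1.0e16 are 2.0 apart.
left_to_right = ((values[0] + values[1]) + values[2]) + values[3]

# Same values, but the two large terms cancel first, so nothing is lost.
reordered = ((values[0] + values[2]) + values[1]) + values[3]

print(left_to_right, reordered)  # 1.0 2.0
```

In a chaotic system such as a convection-permitting model, differences of this size in the early timesteps can grow into visibly different evolution later in the run.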

I am currently looking into why my previously successful 12km 7.1 job fails (at the first timestep) on phase2b and shall let you know if I find anything.

Have you tried the usual trick of writing out a dump at the last successful timestep, reconfiguring it and then rerunning your 4km job?

It probably is worthwhile devoting some effort to fixing the 7.1 job before moving to 7.3.



comment:4 Changed 9 years ago by cbirch

Hi Grenville,

My vn7.1 4km job also fails at the first timestep so I couldn't write out a dump, reconfigure it and then rerun.

I decided to re-run the 2 day control experiment using vn7.3. The development of the MCSs compared to the vn7.1 job is similar on the first day and then it diverges a lot on the second. I have no way of knowing whether this is due to the differences between phase2a/2b or vn7.1/7.3. For this reason I have to use vn7.3 for my soil property experiments (and use the vn7.3 control for comparison).

Unlike the vn7.1 run on 2a, the vn7.3 control does not reproduce my MCS case study unless I start the vn7.3 run at 06Z on the morning of the storm using the restart dump from the vn7.1 job. I can, however, live with this for my soil property experiments.

I guess you can close this ticket now.


comment:5 Changed 9 years ago by willie

  • Resolution set to fixed
  • Status changed from new to closed