Opened 12 years ago

Closed 11 years ago

#94 closed help (fixed)

Any lingering problems with the reconfiguration

Reported by: sws04jc
Owned by: um_support
Component: UM Model
Keywords: reconfiguration
Cc:
Platform:
UM Version: 6.1

Description

(Lois, Jeff… I probably should have posted this here rather than sending you the email earlier… especially if this is a problem being experienced by others).

I realize that there were some problems caused by a recent upgrade to
compiler software on HPCx. Thanks very much for the fix to this problem…
after clearing out some previously compiled files and recon executables,
I've tamed a job that was acting very badly last week.

I wonder if this fix is final, or if there are lingering problems? I ask
because I've been running two other jobs (high res LAM) that are behaving
mysteriously (i.e., works ok on one go, badly after a change to run length
or something else trivial). By "behaving badly" I mean that the job begins
to execute the reconfiguration (creates an ok .start file and begins to
initialize the output dump) and then does nothing else.

Does this sound familiar? Are other problems still being reported?

The job I'm referring to is xbyzo (user sws04jc). It's a 4km job on a large
grid (300 by 360). I have another job that is behaving strangely (xbyzm), but
this is a more complicated job (1km, 76 levels, with tracers) that is
likely to be suffering from some other illness.

The problems with these jobs may be my own to work out, but I thought it
would be wise to report these problems to you in light of the recent problems
with the reconfiguration.

Jeffrey
(U. of Reading)

Change History (3)

comment:1 Changed 12 years ago by lois

We are aware of lingering UM vn 6.1 reconfiguration problems, especially after the 'minor' upgrade to the compiler that was installed without notice during the October 24th maintenance session.
It is not yet clear quite what the problems are, or what the potential solutions might be, but we will post information as and when we get some!
Thanks for pointing out some potential examples.

Lois

comment:2 Changed 12 years ago by sws04jc

Just to follow up, after much tinkering I've discovered that the above-mentioned jobs will only run successfully if the requested "job time limit" is sufficiently long compared to the "run length". If it is not, then the job will "hang" and do nothing until the job time limit is reached. This is a change from what would happen previously, i.e., if not enough time was allocated to complete the entire requested run, then the job would run just fine until the time limit was reached.

comment:3 Changed 11 years ago by lois

  • Resolution set to fixed
  • Status changed from new to closed