Opened 10 months ago

Closed 5 weeks ago

#3125 closed help (worksforme)

WRITHEAD, UM

Reported by: pmcguire
Owned by: um_support
Component: UM Model
Keywords: UM, WRITHEAD
Cc: mtodt
Platform: ARCHER
UM Version:

Description

Hello CMS helpdesk:
I have been able to get my UM AMIP suite (u-bq290) working on ARCHER with UM11.0 (I have also done this for UM11.5). It ran fine up to the Wallclock limit of 15 minutes in the short queue. When I increased the Wallclock limit to 300 minutes in the standard queue, I started getting this error message:

????????????????????????????????????????????????????????????????????????????????
???!!!???!!!???!!!???!!!???!!!       ERROR        ???!!!???!!!???!!!???!!!???!!!
?  Error code: 24
?  Error from routine: WRITHEAD
?  Error message: WRITHEAD: Addressing conflict
?  Error from processor: 52
?  Error number: 99
????????????????????????????????????????????????????????????????????????????????

I am sure I am doing something silly wrong, but I don't immediately know or see what it is.
Can you point me in the right direction?
Patrick

Change History (10)

comment:1 Changed 10 months ago by dcase

If you've only changed the queue settings and the suite submits and starts to run, then I'd just do the basic things first, i.e. check your disk quota and rerun the task. Possibly there was a temporary disk problem?

I'll look at it later if the trivial things above don't fix it, but it's probably worth re-triggering just in case they help.

comment:2 Changed 10 months ago by grenville

Patrick

Please allow group read permission on your ARCHER /home and /work spaces

Grenville
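
For reference, a minimal sketch of checking and granting group read access on a single path, assuming a POSIX filesystem. The function names are hypothetical, the sketch only touches the path it is given (it does not recurse), and the paths to pass in are the user's own /home and /work directories, which are not spelled out here.

```python
import os
import stat


def group_can_read(path):
    """Return True if the group may read `path`.

    For a directory, group search (execute) permission is required as
    well, because without it the group cannot enter the directory.
    """
    mode = os.stat(path).st_mode
    needed = stat.S_IRGRP
    if stat.S_ISDIR(mode):
        needed |= stat.S_IXGRP
    return mode & needed == needed


def grant_group_read(path):
    """Add group read (and, for directories, search) permission to `path`."""
    mode = os.stat(path).st_mode
    extra = stat.S_IRGRP
    if stat.S_ISDIR(mode):
        extra |= stat.S_IXGRP
    # Keep the existing permission bits and only add the group ones.
    os.chmod(path, stat.S_IMODE(mode) | extra)
```

Calling `group_can_read` first shows whether anything needs changing; `grant_group_read` leaves owner and other permissions as they were.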

comment:3 Changed 10 months ago by pmcguire

Thanks, Dave & Grenville
I just did a restart of the u-bq290 suite.
The u-bq296 suite had similar problems, if you want to look at things for that suite instead. (The restart might have erased some files from u-bq290).
I changed the group read permissions for my ARCHER /home and /work spaces.
Patrick

comment:4 Changed 9 months ago by pmcguire

Hi Grenville & Dave

I have done a 'rose suite-run' without the '--restart' option, and I have changed the dump frequency from every 10 days to every 10 time steps. This is the u-bq290 suite. I still get the same WRITHEAD errors. The WRITHEAD error message seems to have multiple possible causes.
See: puma:/home/pmcguire/um/vn11.0/src/control/dump_io/writhead.F90.

I guess one possible next step is to disambiguate these 'WRITHEAD: Addressing conflict' error messages in this routine with better print statements. I can try to do that.

Or maybe there are other possible next steps?
Patrick

comment:5 Changed 9 months ago by grenville

Patrick

I think the simplest thing will be to take a standard 11.5 job and add easyaerosol etc. to that. I can't imagine the problem is with writhead.F90. You may end up going down an endless rabbit hole chasing this.

Grenville

comment:6 Changed 5 months ago by mtodt

Hi Grenville

I've come across this error message again and noticed that it is due to the start dump. In my suite u-bt231 I've tried running with start dumps /work/n02/n02/wmcginty/ai718a.da19820101_00 and /work/y07/y07/umshared/hadgem3/initial/atmos/N216L85/ab680a.da19880901_00 without changing any other settings. The run succeeds when using the 1988 u-ab680 start dump, but the WRITHEAD error occurs when using the 1982 u-ai718 start dump. I don't understand how the choice of a start dump has an effect on writing the next dump. Does it mean anything to you? Can you help us with this?

Cheers
Markus

comment:7 Changed 4 months ago by mtodt

Hi

I'm still trying to figure out this WRITHEAD error that seems to stem from the input dump. I've compared the two aforementioned dumps, and there are lots of differences in their fields, at least in terms of their order. For example, the start dump that works includes aerosol-related and dust-related fields while the other one does not. I run with dust modelling turned off, but could that have an impact on what gets written into dumps or what the UM tries to write into dumps?
Does anyone know what the dependence of the WRITHEAD error on start dumps could signal? Many thanks in advance for your help!

Cheers
Markus
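
The kind of comparison described above can be sketched as follows. This is only an illustration: it assumes the field identifiers (for example STASH codes) have already been extracted from each dump into a plain list in file order, and how that extraction is done is left out. The function name and the integer codes in the usage example are hypothetical placeholders, not values taken from the two dumps.

```python
def compare_field_lists(codes_a, codes_b):
    """Compare two lists of field identifiers taken from two dumps.

    Returns a tuple (only_in_a, only_in_b, same_order), where same_order
    says whether the identifiers present in both lists first appear in
    the same relative order.
    """
    set_a, set_b = set(codes_a), set(codes_b)
    common = set_a & set_b

    def first_appearance(codes):
        # Order in which each shared identifier is first seen.
        seen = set()
        ordered = []
        for code in codes:
            if code in common and code not in seen:
                seen.add(code)
                ordered.append(code)
        return ordered

    only_in_a = sorted(set_a - set_b)
    only_in_b = sorted(set_b - set_a)
    same_order = first_appearance(codes_a) == first_appearance(codes_b)
    return only_in_a, only_in_b, same_order


# Placeholder identifiers, for illustration only.
working_dump = [2, 3, 4, 10, 431, 432]
failing_dump = [2, 4, 3, 10]
only_working, only_failing, same_order = compare_field_lists(working_dump, failing_dump)
```

Separating "fields present in only one dump" from "shared fields in a different order" makes it easier to see which kind of difference is actually there.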

comment:8 Changed 8 weeks ago by pmcguire

Hi Markus:
Did you figure anything else out about the WRITHEAD error?
Patrick

comment:9 Changed 5 weeks ago by mtodt

Hi Patrick

Sorry, I forgot about this ticket! I haven't figured out this issue, but since we settled on using the 1988 start dump, it's not a problem anymore. So I suppose this ticket can be closed.

Cheers
Markus

comment:10 Changed 5 weeks ago by mtodt

  • Resolution set to worksforme
  • Status changed from new to closed