Opened 3 years ago
Closed 3 years ago
#2315 closed help (answered)
pumf: Failed to extract header info from ar481a.pa1999sep.pp
Reported by: | gn907779 | Owned by: | ros |
---|---|---|---|
Component: | UM Model | Keywords: | |
Cc: | | Platform: | Monsoon2 |
UM Version: | 10.6 | | |
Description (last modified by ros)
I am running a UM 10.6 suite, u-ar481, on Monsoon. The ATMOS task is successful, but the postproc task fails.
This is job.err:-
[whdav@exvmsrose:~/cylc-run/u-ar481/log/job/19990901T0000Z/postproc/01]$ more job.err
[WARN] [SUBPROCESS]: Command: /projects/um1/vn10.6/xc40/utilities/um-pumf -h /home/d03/whdav/cylc-run/u-ar481/log/job/19990901T0000Z/postproc/01/job-pumfhead.out /home/d03/whdav/cylc-run/u-ar481/share/data/History_Data/ar481a.pa1999sep.pp
[SUBPROCESS]: Error = 1:
[INFO] File(1): /home/d03/whdav/cylc-run/u-ar481/share/data/History_Data/ar481a.pa1999sep.pp
[INFO] NPRINT: 8
[INFO] XPRINT: 5
[WARN] Using default STASHmaster as none provided "/projects/um1/vn10.6/ctldata/STASHmaster".
[INFO] Using script: /projects/um1/vn10.6/xc40/utilities/um-pumf
[INFO] Using executable: /projects/um1/vn10.6/xc40/utilities/um-pumf.exe
/projects/um1/vn10.6/xc40/utilities/um-pumf: line 198: 17252 Aborted (core dumped) $pumf_exec > $PUMF_OUT 2>&1
[INFO] Header output in: /home/d03/whdav/cylc-run/u-ar481/log/job/19990901T0000Z/postproc/01/job-pumfhead.out
[INFO] Field output in: /home/d03/whdav/cylc-run/u-ar481/work/19990901T0000Z/postproc/pumf_out_Dtez/pumf_field
[FAIL] Problem with PUMF program
[ERROR] pumf: Failed to extract header information from file /home/d03/whdav/cylc-run/u-ar481/share/data/History_Data/ar481a.pa1999sep.pp
[FAIL] Command Terminated
[FAIL] Terminating PostProc...
[FAIL] main_pp.py atmos # return-code=1
2017-11-07T15:11:40Z CRITICAL - Task job script received signal EXIT
This is pumf_out_Dtez/pumf_field
[whdav@exvmsrose:~/cylc-run/u-ar481/log/job/19990901T0000Z/postproc/01]$ more /home/d03/whdav/cylc-run/u-ar481/work/19990901T0000Z/postproc/pumf_out_Dtez/pumf_field
Warning in umPrintMgr: umPrintLoadOptions : Failed to get filename for IO control file from environment
=====================================================
GCOM Version 6.1
MetO_XC40_Serial
Using precision : 64bit INTEGERs and 64bit REALs
Built at Thu Oct 6 08:04:35 BST 2016
=====================================================
UMPRINTSETLEVEL: PrintStatus initialised= 4
[0] exceptions: Setting option 2
[0] exceptions: Registering callback at 0x0042ef40
******************************************
App ID: 5, Name:Print UM File
------------------------------------------
- Data size is 64 bit. Program is 64 bit.
- Program is serial.
******************************************
FILE_MANAGER: Assigned : pseudo-file for UNIX operations
FILE_MANAGER: : id : io_reserved_unit
FILE_MANAGER: : Unit : 10 (portio)
Buffered I/O active. Buffer size set to 524288, 8 byte words
IO: Initialised
IO Host is shared100
FILE STATUS
===========
FILE_MANAGER: Assigned : /home/d03/whdav/cylc-run/u-ar481/share/data/History_Data/ar481a.pa1999sep.pp
FILE_MANAGER: : Unit : 11 (portio)
IO: Switching file mode to local because there is no IO server
IO: Opening unit 11 with collective(broadcast) semantics
IO: Read Only mode
OPEN: File /home/d03/whdav/cylc-run/u-ar481/share/data/History_Data/ar481a.pa1999sep.pp to be Opened on Unit 11 Exists
IO: Open: /home/d03/whdav/cylc-run/u-ar481/share/data/History_Data/ar481a.pa1999sep.pp on unit 11
loadHeader: Model Version: 128849019.87
tcmalloc: large alloc 6734511509115011072 bytes == (nil)
????????????????????????????????????????????????????????????????????????????????
????????????????????????????????????????????????????????????????????????????????
???!!!???!!!???!!!???!!!???!!! ERROR ???!!!???!!!???!!!???!!!???!!!
???!!!???!!!???!!!???!!!???!!! ERROR ???!!!???!!!???!!!???!!!???!!!
? Error code: 4205
? Error code: 4205
? Error from routine: main_compare
? Error from routine: main_compare
? Error message: Failed to allocate 3147656947853068556 words for the integer header of the file
? Error message: Failed to allocate 3147656947853068556 words for the integer header of the file
? Error from processor: 0
? Error from processor: 0
? Error number: 0
? Error number: 0
????????????????????????????????????????????????????????????????????????????????
????????????????????????????????????????????????????????????????????????????????
[0] exceptions: An non-exception application exit occured.
[0] exceptions: whilst in a serial region
[0] exceptions: Task had pid=0 on host
[0] exceptions: Program is "/projects/um1/vn10.6/xc40/utilities/um-pumf.exe"
[0] exceptions: calling registered handler @ 0x0042ef40
Warning in umPrintMgr: umPrintExceptionHandler : Handler Invoked
[0] exceptions: Done callbacks
gc_abort (Processor 0): Job aborted from ereport.
Any ideas on how I can fix this?
Thanks
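One possible reading of the pumf output above, offered only as a sketch: the implausible "Model Version" and the huge allocation sizes look like what you would get if a file made of 32-bit words were read as 64-bit header words. The snippet assumes that pumf prints the version word divided by 100, so the raw integer behind `128849019.87` would be `12884901987`; that assumption and the helper name `split_64bit_word` are illustrative, not taken from the UM source.

```python
def split_64bit_word(value):
    """Split a 64-bit integer into its high and low 32-bit halves."""
    high = (value >> 32) & 0xFFFFFFFF
    low = value & 0xFFFFFFFF
    return high, low


# Assumption: the printed "Model Version: 128849019.87" is the raw
# header word divided by 100, i.e. the raw word is 12884901987.
raw_version_word = 12884901987

high, low = split_64bit_word(raw_version_word)
print(high, low)  # prints "3 99"
```

The raw word splits cleanly into the two small integers 3 and 99, which would be consistent with two adjacent 32-bit values being read as one 64-bit word. It does not prove what is in the file, but it fits the idea that the .pp file is not in the format pumf expects.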
Change History (3)
comment:1 Changed 3 years ago by ros
- Description modified (diff)
- Owner changed from um_support to ros
- Status changed from new to accepted
comment:2 Changed 3 years ago by ros
I have spoken to the owner of the postproc app and it's probably a bug in postproc_2.0: .pp files left behind by an earlier failed run of the task aren't explicitly excluded, so subsequent runs of the task try to pumf them. This is fixed in the next version of postproc, so the advice is to upgrade to postproc_2.1.
Cheers,
Ros.
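To illustrate the kind of exclusion described above, here is a minimal sketch. It is not the postproc code: the function name `files_to_pumf`, the `excluded_suffixes` parameter, and the example file list are all hypothetical, and the real fix in postproc_2.1 may work differently.

```python
def files_to_pumf(filenames, excluded_suffixes=(".pp",)):
    """Return only the files that should have their headers extracted.

    Hypothetical helper: anything ending in one of excluded_suffixes
    (for example a .pp file left over from an earlier failed run) is
    skipped rather than passed on for header extraction.
    """
    return [name for name in filenames if not name.endswith(excluded_suffixes)]


# Illustrative directory listing; the first name is an assumed fieldsfile name.
candidates = ["ar481a.pa1999sep", "ar481a.pa1999sep.pp"]
print(files_to_pumf(candidates))  # prints "['ar481a.pa1999sep']"
```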
comment:3 Changed 3 years ago by ros
- Resolution set to answered
- Status changed from accepted to closed
Hi William,
pumf shouldn't be run on the resulting .pp file. If you've tried to run the postprocessing more than once then perhaps it has got itself confused. I see that you did have problems at one point due to permissions on MASS. I've tried running your suite: it runs the atmos task and then successfully runs the postproc task as well.
I can only suggest clearing out the ~/cylc-run/share/data/History_Data directory and starting the suite again.
Cheers,
Ros.
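If it helps, the clear-out could be done cautiously along these lines. This is only a sketch: the function name `clear_history_data` and its `dry_run` parameter are made up for illustration, and the directory to pass in is whichever History_Data directory your suite actually uses.

```python
from pathlib import Path


def clear_history_data(data_dir, dry_run=True):
    """List, and optionally delete, the regular files in data_dir.

    Returns the sorted names of the files found. With dry_run=True
    (the default) nothing is removed, so the list can be checked first.
    Subdirectories are left untouched.
    """
    directory = Path(data_dir)
    files = sorted(p for p in directory.iterdir() if p.is_file())
    if not dry_run:
        for path in files:
            path.unlink()
    return [p.name for p in files]
```

Calling it once with the default `dry_run=True` shows what would be deleted; calling it again with `dry_run=False` removes those files before the suite is restarted.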