#3004 closed help (fixed)

Failure of fcm_make_pp

Reported by: charlie
Owned by: ros
Component: UM Model
Keywords:
Cc:
Platform: NEXCS
UM Version: 10.7

Description

Hi,

Sorry to bother you with another ticket, but I'm having a problem building a new suite. The suite is u-bm327, and should be a copy of u-bk944 (which built and ran okay). u-bm327 starts from the year 30 restart dumps from u-bk944, so apart from using these dumps, turning off reconfiguration and setting BITCOMP_NRUN=true, the two suites should be identical. However, when trying to build, I get the error below. Please can you advise?

Thanks,

Charlie

/usr/lib64/python2.6/site-packages/requests/packages/urllib3/connection.py:337: SubjectAltNameWarning: Certificate for xcslc0 has no `subjectAltName`, falling back to check for a `commonName` for now. This feature is being removed by major browsers and deprecated by RFC 2818. (See https://github.com/shazow/urllib3/issues/497 for details.)
  SubjectAltNameWarning
[FAIL] mirror.target = : incorrect value in declaration
[FAIL] config-file=/working/d05/cwilliams/cylc-run/u-bm327/work/18800101T0000Z/fcm_make_pp/fcm-make.cfg:4
[FAIL] config-file= - file:///home/d04/fcm/srv/svn/moci.xm/main/trunk/Postprocessing/fcm_make/postproc.cfg@2381:12
[FAIL] config-file= -  - file:///home/d04/fcm/srv/svn/moci.xm/main/trunk/Postprocessing/fcm_make/inc/remote.cfg@2381:6

[FAIL] fcm make -f /working/d05/cwilliams/cylc-run/u-bm327/work/18800101T0000Z/fcm_make_pp/fcm-make.cfg -C /home/d05/cwilliams/cylc-run/u-bm327/share/fcm_make_pp -j 4 # return-code=9
2019-08-30T11:56:19Z CRITICAL - failed/EXIT
/usr/lib64/python2.6/site-packages/requests/packages/urllib3/connection.py:337: SubjectAltNameWarning: Certificate for xcslc0 has no `subjectAltName`, falling back to check for a `commonName` for now. This feature is being removed by major browsers and deprecated by RFC 2818. (See https://github.com/shazow/urllib3/issues/497 for details.)
  SubjectAltNameWarning

Change History (11)

comment:1 Changed 13 months ago by ros

  • Owner changed from um_support to ros
  • Status changed from new to accepted

Hi Charlie,

In site/meto_cray.rc please add the 2 lines indicated below to the [[PPBUILD_RESOURCE]] section:

    [[PPBUILD_RESOURCE]]
        inherit = HPC_SERIAL
        [[[remote]]]                  <--- add this line
            host = xcs-c              <--- and this one

The system does not like extracting and mirroring to itself. You would have had to do some jiggery pokery when you ran the original job u-bk944, but adding the lines above should mean you won't have to keep switching between host = localhost and host = xcs-c to get tasks to run.

Cheers,
Ros.

comment:2 Changed 13 months ago by charlie

Thanks Ros. I have now done that, but it has now failed at the fcm_make2_ocean and fcm_make2_drivers stages:

fcm_make2_ocean

[FAIL] no configuration specified or found

[FAIL] fcm make -C /var/spool/jtmp/6469041.xcs00.saqiNy/fcm_make2_ocean.18800101T0000Z.u-bm32731BZTh -n 2 -j 6 --archive # return-code=2
2019-09-01T14:43:49Z CRITICAL - failed/EXIT

fcm_make2_drivers

[FAIL] no configuration specified or found

[FAIL] fcm make -C /home/d05/cwilliams/cylc-run/u-bm327/share/fcm_make_drivers -n 2 -j 1 # return-code=2
2019-09-01T14:42:45Z CRITICAL - failed/EXIT

comment:3 Changed 13 months ago by ros

Hi Charlie,

In that case put those 2 lines in the [[HPC_SERIAL]] block instead.

Cheers,
Ros.

comment:4 Changed 13 months ago by charlie

Hi Ros,

Okay, I did that. This appears to have fixed the error with fcm_make2_drivers, but not fcm_make2_ocean, which again fails with the same error:

[FAIL] no configuration specified or found

[FAIL] fcm make -C /var/spool/jtmp/6521577.xcs00.rSJO02/fcm_make2_ocean.18800101T0000Z.u-bm327by415g -n 2 -j 6 --archive # return-code=2
2019-09-02T09:19:16Z CRITICAL - failed/EXIT

Charlie

comment:5 Changed 13 months ago by ros

And add to the [[OCEANBUILD_RESOURCE]] block.
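
Putting this together with the earlier replies, the relevant parts of site/meto_cray.rc would end up looking roughly like the sketch below. Only the [[[remote]]] lines come from this ticket; anything else already in those blocks is not shown here and should be left untouched.

    [[HPC_SERIAL]]
        # existing settings in this block stay as they are
        [[[remote]]]
            host = xcs-c

    [[OCEANBUILD_RESOURCE]]
        # existing settings in this block stay as they are
        [[[remote]]]
            host = xcs-c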

Cheers,
Ros.

comment:6 Changed 13 months ago by charlie

Thanks Ros, that seems to have worked. I don't remember making all those changes to get the parent suite (u-bk944) running, so I don't entirely understand why it was a problem here. If I were to copy either this suite or the parent again, would I have to make the changes all over again?

The suite is now running but has failed at the first timestep. That's almost certainly because of one of the changes I have made to my various ancillaries, so if I can't fix it myself I will raise a separate ticket.

Thanks for your help,

Charlie

comment:7 Changed 13 months ago by ros

Hi Charlie,

If you originally ran u-bk944 before exvmsrose went, it would have worked fine. If you ran it after the demise of exvmsrose, you would have had to change the host= line in the HPC section from localhost to xcs-c for the compiles to go through, and then back to localhost for the model to run. That was the 'jiggery-pokery' I referred to in my first response. This time I suggested changing the host in the build tasks only, so that if you need to rebuild the suite you don't have to go through the host-changing palaver again and again. If you commit these changes and then copy this suite, you won't need to make them again.

Cheers,
Ros.

comment:8 Changed 13 months ago by charlie

Great, that makes sense. All understood. If only my next blow up was so easy to fix!

Thanks again, I will close the ticket.

Charlie

comment:9 Changed 13 months ago by charlie

  • Resolution set to fixed
  • Status changed from accepted to closed

comment:10 Changed 13 months ago by charlie

  • Resolution fixed deleted
  • Status changed from closed to reopened

comment:11 Changed 13 months ago by charlie

  • Resolution set to fixed
  • Status changed from reopened to closed