Opened 3 months ago

Closed 2 months ago

#2479 closed help (fixed)

Compiler Issue - 'cray-mpich/7.5.5' conflicts with the currently loaded module(s) 'cray-mpich/7.1.1'

Reported by: pliojop
Owned by: ros
Priority: normal
Component: UM Model
Keywords: Compiler
Cc:
Platform: ARCHER
UM Version: 8.4

Description

This looks to me like a similar error to ticket #2299, where the suggested fix was to resubmit. However, I have tried that with my job and received the same error.

I have job xlayn (vn8.4), which I am trying to run on Archer. It is from a family of jobs that previously ran happily on Archer, though that was about 15 months ago.

The compile for this job dies immediately, with the error message:

cray-mpich/7.5.5(34):ERROR:150: Module 'cray-mpich/7.5.5' conflicts with the currently loaded module(s) 'cray-mpich/7.1.1'
cray-mpich/7.5.5(34):ERROR:102: Tcl command execution failed: conflict cray-mpich

FCM 2016.12.0 (/fs2/y07/y07/umshared/software/fcm-2016.12.0)
COMP_UMSCRIPTS is true - run build command
cray-mpich/7.5.5(34):ERROR:150: Module 'cray-mpich/7.5.5' conflicts with the currently loaded module(s) 'cray-mpich/7.1.1'
cray-mpich/7.5.5(34):ERROR:102: Tcl command execution failed: conflict cray-mpich

[FAIL] /home/n02/n02/japope/um/xlayn/umscripts/cfg/bld.cfg: cannot locate config file, abort at /fs2/y07/y07/umshared/software/fcm-2016.12.0/bin/../lib/FCM1/ConfigSystem.pm line 539

Build command started on Fri Jun 1 14:45:38 2018.
→Parse configuration: start
UMSCRIPTS build failed

Any suggestions on how I can resolve this issue?

Thanks

James

Change History (8)

comment:1 Changed 2 months ago by Leighton_Regayre

I'm having exactly the same problem with a couple (but not all) of my vn8.4 jobs. Advice welcome.

comment:2 Changed 2 months ago by ros

Leighton's issue moved to #2498

comment:3 Changed 2 months ago by ros

  • Owner changed from um_support to ros
  • Status changed from new to accepted

Hi James,

In FCM Configuration → FCM Extract directories and Output levels you need to set the Target machine root extract directory (UM_ROUTDIR) to /home/n02/n02/japope/um. You can't use the $USERID variable here, as it resolves to your PUMA username, which is different from your ARCHER username.
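To illustrate why $USERID gives the wrong directory, here is a minimal Python sketch. The path template and the PUMA username are assumptions made for illustration (the PUMA name is taken to be the ticket reporter's login); only the final ARCHER path comes from the advice above.

```python
from string import Template

# Assumed form of a UM_ROUTDIR setting that relies on $USERID (illustrative only).
ROUTDIR_TEMPLATE = Template("/home/n02/n02/$USERID/um")


def expand_routdir(userid):
    """Expand the template the way a $USERID substitution would."""
    return ROUTDIR_TEMPLATE.substitute(USERID=userid)


puma_user = "pliojop"   # assumption: PUMA username matches the ticket reporter name
archer_user = "japope"  # ARCHER username, as it appears in the path given above

# With $USERID resolving to the PUMA name, the path points at the wrong home directory.
print(expand_routdir(puma_user))
# Hard-coding the ARCHER name gives the directory that actually exists on ARCHER.
print(expand_routdir(archer_user))
```

The two printed paths differ only in the username component, which is enough for the build to look for bld.cfg in a directory that does not exist.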

The mpich errors here can be ignored.
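For anyone scanning a similar log, a rough Python sketch like the one below can separate the module-conflict messages from the FCM [FAIL] line that actually stops the build. The regular expression and the function name are assumptions based only on the messages quoted in this ticket, not on any documented log format.

```python
import re

# Matches lines such as "cray-mpich/7.5.5(34):ERROR:150: ..." as quoted in this ticket.
MODULE_ERROR = re.compile(r"^\S+\(\d+\):ERROR:\d+: ")


def split_build_log(text):
    """Return (module_messages, fcm_failures) found in a build log."""
    module_messages, fcm_failures = [], []
    for raw in text.splitlines():
        line = raw.strip()
        if MODULE_ERROR.match(line):
            module_messages.append(line)
        elif line.startswith("[FAIL]"):
            fcm_failures.append(line)
    return module_messages, fcm_failures


log = """\
cray-mpich/7.5.5(34):ERROR:150: Module 'cray-mpich/7.5.5' conflicts with the currently loaded module(s) 'cray-mpich/7.1.1'
cray-mpich/7.5.5(34):ERROR:102: Tcl command execution failed: conflict cray-mpich
[FAIL] /home/n02/n02/japope/um/xlayn/umscripts/cfg/bld.cfg: cannot locate config file, abort at /fs2/y07/y07/umshared/software/fcm-2016.12.0/bin/../lib/FCM1/ConfigSystem.pm line 539
UMSCRIPTS build failed
"""

module_messages, fcm_failures = split_build_log(log)
for line in fcm_failures:
    print(line)  # the lines worth investigating first
```

In this log the only [FAIL] line is the missing bld.cfg, which points at the extract step rather than at the compiler modules.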

Cheers,
Ros.

comment:4 Changed 2 months ago by pliojop

Hi Ros,

I changed it to my Archer username; however, I still get the same error at the compile stage.

Thanks

James

comment:5 Changed 2 months ago by ros

Hi James,

Could you please submit the job again and copy the contents of the UMUI submission output window here? Looking in your extract directory on PUMA (~/um/um_extracts/xlayn), the extract hasn't run since July 2017, so something very weird is going on.

Cheers,
Ros.

comment:6 Changed 2 months ago by ros

Hi,

I take some of that back. I've just found a part of it that has tried to run. If you look in ~um/um_extracts/xlayn/baserepos/UMATMOS/ext.out you'll see that it failed to connect to ARCHER to copy over the code.

Please make sure your SSH keys are set up properly and that you can log in to ARCHER from PUMA without any prompt for a passphrase or password.
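One way to test this from PUMA is to run ssh in batch mode, which fails immediately instead of prompting. The sketch below wraps that in Python; the helper names are hypothetical, and the hostname in the usage note is an assumption to be replaced with whatever you normally use to reach ARCHER.

```python
import subprocess


def ssh_check_command(user, host):
    """Build an ssh command that fails rather than prompting for input."""
    return [
        "ssh",
        "-o", "BatchMode=yes",       # never ask for a password or passphrase
        "-o", "ConnectTimeout=10",   # give up quickly if the host is unreachable
        f"{user}@{host}",
        "true",                      # trivial remote command; only the exit status matters
    ]


def can_login_without_prompt(user, host):
    """Return True if a non-interactive login succeeds."""
    result = subprocess.run(
        ssh_check_command(user, host),
        capture_output=True,
        text=True,
    )
    return result.returncode == 0
```

For example, calling can_login_without_prompt("japope", "login.archer.ac.uk") on PUMA (hostname assumed here) should return True once the keys are working. A False result suggests the extract step would also fail to copy code across.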

Regards,
Ros.

comment:7 Changed 2 months ago by pliojop

Morning Ros,

All resolved now. Unusually, the files were still passed across from PUMA to Archer; usually an authentication issue stops the job at the submit stage in the UMUI.

Regarding the year gap, that is because I thought these jobs were finished, but it transpires I need to run them a little bit more for some further analysis, hence the sudden decision to run the suite again.

Thanks again,

James

comment:8 Changed 2 months ago by ros

  • Resolution set to fixed
  • Status changed from accepted to closed