Opened 4 years ago

Closed 4 years ago

#1621 closed help (answered)

jobs not compiling

Reported by: Leighton_Regayre
Owned by: ros
Component: UM Model
Keywords:
Cc:
Platform: ARCHER
UM Version: 8.4

Description

Hello,

I'm currently attempting to compile several jobs, none of which are compiling as usual. I've waited some minutes for the compilation to start, but the windows seem to have paused after loading modulefiles.

I was able to compile and run jobs yesterday, so I don't think this is caused by the change in the Cray compiler.

The jobs are xlou.b/d/e.

Thanks,

Leighton.

Change History (7)

comment:1 Changed 4 years ago by Leighton_Regayre

Hi again,

Whatever this compilation problem was, it was temporary. I've deleted all folders and resubmitted these jobs. They now compile.

It would be useful to have an explanation as to why this may have occurred, especially if there's anything I need to do or avoid doing in the future.

Thanks,

Leighton.

comment:2 Changed 4 years ago by Leighton_Regayre

Hello,

Further to the messages above:

Although the compilation starts on these jobs it does not complete because of what look to be compilation errors. Each job fails with different errors, a sample of which is below. It looks like the compilations are competing for resources.

ftn-2210 crayftn: ERROR in command line

Could not fork to run /opt/cray/cce/8.3.7/cftn/x86-64/lib/ftnfe (Resource temporarily unavailable).

fcm_internal compile failed (11)
gmake: *** [sat_opt_mod.o] Error 1

fcm_internal compile failed (256)
fcm_internal compile failed (256)
gmake: *** [rad_pcf.o] Error 1
gmake: *** Waiting for unfinished jobs....
gmake: *** [earth_constants_mod.o] Error 1

Thanks,

Leighton.

comment:3 Changed 4 years ago by ros

  • Owner changed from um_support to ros
  • Status changed from new to accepted

Hi Leighton,

As you are compiling interactively (I assume), we can't see the full compile output, which we need in order to help. Please try compiling again and redirect both the standard output and the error output to a file.

E.g.

your_command > xloub.compile 2>&1
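
To illustrate what the redirection above captures, here is a minimal sketch. The function fake_compile is a hypothetical stand-in for the real compile command (it is not part of the UM or FCM); it simply writes one line to standard output and one to standard error, then fails.

```shell
# Hypothetical stand-in for the real compile command (an assumption for
# illustration only): prints to both streams and returns a failure status.
fake_compile() {
    echo "compiling sat_opt_mod"
    echo "ftn-2210 crayftn: ERROR in command line" >&2
    return 1
}

log="xloub.compile"
status=0

# "> file" sends standard output to the log; "2>&1" then points standard
# error at the same place, so the order of the two redirections matters.
fake_compile > "$log" 2>&1 || status=$?

# The log holds both the normal output and the error messages.
grep -c "ERROR" "$log"
echo "exit status: $status"
```

With the redirections in the other order (2>&1 > file), the error messages would still go to the terminal and only the normal output would reach the file.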

Cheers,
Ros.

comment:4 Changed 4 years ago by Leighton_Regayre

Hi Ros,

Yes I'm compiling interactively. I've followed your advice.

Two of the jobs compiled this morning. The 3rd failed with output in this file:
/home/n02/n02/lre/umui_runs/xloub.compile

Thanks,

Leighton.

comment:5 Changed 4 years ago by Leighton_Regayre

Further to this:

When I compile the job again (when no other jobs are compiling) it compiles without the resource allocation problem arising.

Output in:
xloub.compile_II

It's important to me to be able to compile numerous jobs simultaneously. If I were running a small number of simulations this wouldn't be a major problem, but I'm planning to create a large ensemble of simulations and compiling in series would be laborious.

Thanks again,

Leighton.

comment:6 Changed 4 years ago by ros

Hi Leighton,

If you run lots of compilations interactively, you are very likely to experience some form of contention. The login nodes are not supposed to be used for big compilations, and ARCHER are liable to terminate any process that consumes more than 10 minutes of CPU time, to prevent overloading the nodes. Is there a reason why you are compiling all your jobs interactively? If you compiled these jobs in the serial queues you wouldn't encounter this problem.

Also, I'm wondering whether you need a different executable for each ensemble member. Obviously I don't know exactly what you are trying to achieve, so this might not be possible, but sharing one executable would save a lot of time if it were.
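
As a rough illustration of avoiding the contention described above, the sketch below runs the compilations one after another instead of all at once, with a separate log per job. The function compile_job is a hypothetical placeholder for whatever command actually compiles a job, and the job names are taken from this ticket. It is not a substitute for using the serial queues.

```shell
# Hypothetical placeholder (an assumption): replace the body with the real
# compile command for the given job.
compile_job() {
    echo "compiling $1"
}

failed=""
for job in xloub xloud xloue; do
    # One compilation at a time, so the jobs do not compete for processes.
    # Each job gets its own log containing both output and errors.
    if compile_job "$job" > "$job.compile" 2>&1; then
        echo "$job: ok"
    else
        echo "$job: FAILED, see $job.compile"
        failed="$failed $job"
    fi
done

# Summarise any failures once every job has been attempted.
if [ -n "$failed" ]; then
    echo "failed jobs:$failed"
fi
```

Because the loop continues after a failure, one broken job does not stop the others from being compiled, and the per-job logs can be attached to a ticket.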

Regards,
Ros.

comment:7 Changed 4 years ago by annette

  • Resolution set to answered
  • Status changed from accepted to closed