Opened 9 years ago

Closed 9 years ago

#708 closed help (fixed)

aprun on hector phase 2b

Reported by: jonathan
Owned by: um_support
Component: UM Model
Keywords:
Cc:
Platform:
UM Version: 4.5

Description

Dear NCAS-CMS

I am trying to continue a HadCM3 run on HECToR, job xcwxg, which I ran a couple of years ago. I have reprocessed the job for phase 2b, and when I try to run it as a CRUN I get

xcwxg: Starting run
address space limit (kbytes)   (-M)  13230240
core file size (blocks)        (-c)  unlimited
cpu time (seconds)             (-t)  unlimited
data size (kbytes)             (-d)  unlimited
file size (blocks)             (-f)  unlimited
locks                          (-L)  unlimited
locked address space (kbytes)  (-l)  64
nofile                         (-n)  1024
nproc                          (-u)  129138
pipe buffer size (bytes)       (-p)  4096
resident set size (kbytes)     (-m)  unlimited
socket buffer size (bytes)     (-b)  4096
stack size (kbytes)            (-s)  8192
threads                        (-T)  not supported
process size (kbytes)          (-v)  unlimited
rm: cannot remove `/work/n02/n02/gregoryj/xcwxg/xcwxg.requests': No such file or directory
aprun: -N cannot exceed -n
xcwxg: Run failed

in /home/n02/n02/gregoryj/um/umui_out/xcwxg000.xcwxg.d11284.t081133.leave. It appears not to have run at all. What have I forgotten?

Thanks

Jonathan

Change History (2)

comment:1 Changed 9 years ago by ros

  • UM Version changed from <select version> to 4.5

Hi Jonathan,

It's because you are trying to run on only 16 cores (-n) but have still selected "use default number of cores per node", which is now 24 (-N). You need either to increase the number of cores you are running on to a multiple of 24, or to select "use non-default number of cores per node" in the subindep → Job submission window and enter 16, so that the job runs on an underpopulated node. If you choose the latter, you will still be charged for all 24 cores.

Regards,
Ros.

P.S. You also need to change the target machine to phase2b.hector.ac.uk instead of login.hector.ac.uk.
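
As an illustration of the -n / -N constraint described in this comment, here is a minimal Python sketch. It is not part of the UM scripts or of aprun: the function name check_layout and the constant CORES_PER_NODE are hypothetical, the figure of 24 cores per node is taken from the reply above, and charging by whole nodes is assumed from the remark that an underpopulated node is still charged as 24 cores.

```python
import math

# Cores per node on phase 2b, as stated in the reply above.
CORES_PER_NODE = 24


def check_layout(total_pes, pes_per_node, cores_per_node=CORES_PER_NODE):
    """Check a (-n, -N) pair and return (nodes, charged_cores).

    total_pes    corresponds to aprun -n (total number of processes)
    pes_per_node corresponds to aprun -N (processes placed on each node)

    Assumption: whole nodes are charged, so each node counts as
    cores_per_node cores even if it is underpopulated.
    """
    if pes_per_node > total_pes:
        # Mirrors the message in the job output above.
        raise ValueError("-N cannot exceed -n")
    nodes = math.ceil(total_pes / pes_per_node)
    return nodes, nodes * cores_per_node


if __name__ == "__main__":
    # The failing case: 16 processes with the default 24 per node.
    try:
        check_layout(16, 24)
    except ValueError as err:
        print("rejected:", err)

    # Option 1: run on a multiple of 24 cores.
    print(check_layout(48, 24))

    # Option 2: keep 16 processes on one underpopulated node.
    print(check_layout(16, 16))
```

Under these assumptions, the 16-process job on an underpopulated node occupies one node and is counted as 24 cores, while a 48-process job fills two nodes exactly.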

comment:2 Changed 9 years ago by ros

  • Resolution set to fixed
  • Status changed from new to closed