Custom Query (3269 matches)

Results (43 - 45 of 3269)

Ticket Resolution Summary Owner Reporter
#3053 fixed Suite failing after decades um_support ChrisWells
Description

Hi,

A suite, u-bm505, has failed mysteriously after running for >100 years and I'm unsure why; this is the error at the end of job.err:

Rank 602 [Thu Oct 24 17:06:34 2019] [c5-1c1s10n2] application called MPI_Abort(comm=0xC4000003, 1) - process 602
atpAppSigHandler: Back-end never delivered its pid. Re-raising signal.
atpAppSigHandler: Back-end never delivered its pid. Re-raising signal.
_pmiu_daemon(SIGCHLD): [NID 03754] [c5-1c1s10n2] [Thu Oct 24 17:10:32 2019] PE RANK 589 exit signal Aborted
atpAppSigHandler: Back-end never delivered its pid. Re-raising signal.
[NID 03754] 2019-10-24 17:10:32 Apid 84031174: initiated application termination
[FAIL] run_model # return-code=137

I have similar runs which haven't failed, and restarting the suite gave the same error.

Sorry that I haven't got very far in figuring this out - do you know what I should do?

Cheers, Chris
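
A note on the `return-code=137` line in the log above: by common shell convention, a return code above 128 means the process was terminated by signal number (code - 128), so 137 would correspond to signal 9 (SIGKILL). The sketch below decodes such a code. It assumes that `run_model` reports its status in this shell convention, which the ticket does not confirm, and the helper name `describe_return_code` is hypothetical.

```python
import signal


def describe_return_code(code: int) -> str:
    """Interpret a shell-style return code.

    Assumption: codes above 128 follow the shell convention
    "terminated by signal (code - 128)".
    """
    if code > 128:
        signum = code - 128
        try:
            # Look up the symbolic name for this platform, e.g. SIGKILL for 9 on Linux.
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not defined on this platform.
            name = f"signal {signum}"
        return f"terminated by {name}"
    return f"exited with status {code}"


print(describe_return_code(137))
```

If the code really does indicate SIGKILL, it only says that the process was killed from outside; it does not by itself say why, so the earlier `MPI_Abort` message from rank 602 is likely the more useful starting point.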

#3054 fixed Suite stopped ros ChrisWells
Description

(Originally posted on #3029)

Hi Ros,

I thought I'd put this on here as it might be related - it's about one of the suites affected here (u-bl918); the others are running ok.

The suite stopped on 23460401, with no warning - just stopped with submitted. When I run rose suite-run --restart, the gui appears with the tasks, including some 2345 postproc tasks on submitted, and then immediately goes blank on stopped.

I just tried it again - gcylc u-bl918 showed stopped with submitted again, and restarting opened the gui showing 23460401 coupled is running! qstat then shows the task coupled.2346040 is Running, and gcylc shows "stopped with running" now.

I also got this in /var/mail/chwel after this most recent restart:

suite event: aborted reason: 23450101T0000Z suite: u-bl918 host: xcslc0 port: 43044 owner: chwel

But I can't see any tasks for that time period in the gui.

So I'm confused as to what this suite is doing - do you know what I should do with it?

Cheers, Chris

#1059 fixed ssh_exchange_identification: Connection closed by remote host um_support D.Hamilton
Description

Hi

I could log on to my account a moment ago, but after logging out I can no longer get back in.

Error message:

ssh_exchange_identification: Connection closed by remote host

username:

D.Hamilton

Thanks
