Opened 2 years ago

Closed 2 years ago

#2254 closed help (duplicate)

UM parallel running problem

Reported by: jfgu Owned by: um_support
Component: UM Model Keywords:
Cc: Platform:
UM Version: <select version>

Description

Dear CMS helpdesk,

I have a problem running the UM. My suite u-an747 previously ran without problems. Today, when I tried to run it again, it failed at the beginning of the integration. I didn't make any changes to the suite; all I did was run "rose suite-run --new" to clear all the prebuilt files. However, the job failed with error messages pointing to code lines like these:
[17] exceptions: [backtrace]: ( 9) : mpi_waitall (* Cannot Locate *)
[17] exceptions: [backtrace]: ( 10) : Address: [0x02666623]
[31] exceptions: [backtrace]: ( 10) : mpl_waitall_ in file /home/n02/n02/annette/cylc-run/vn6.1_gcom_trunk/share/archer_xc30_cce_mpp/preprocess/src/gcom/mpl/mpl_waitall.F90 line 46
[31] exceptions: [backtrace]: ( 11) : Address: [0x0065e050]

This seems to be a problem with the parallel communication, which I don't know much about. When I tried to read the Fortran source file named in the backtrace, I could not find it: the directory "vn6.1_gcom_trunk" does not exist.

Have there been any changes to the parallel computing setup? How can I work out what is causing this problem?

Could someone help me with this? I have been stuck on this problem for a week. Thanks a lot!

Jian-Feng

Change History (1)

comment:1 Changed 2 years ago by annette

  • Resolution set to duplicate
  • Status changed from new to closed

Jian-Feng,

This looks like the same issue as #2251 with your suite u-ap518, which is a copy of u-an747. Please add a comment to that ticket rather than creating a new one.

Annette
