Opened 5 months ago

Closed 4 months ago

#2121 closed help (answered)

Restarting job, need modify_CICE_header for UM8.2 job

Reported by: dilshadshawki
Owned by: um_support
Priority: normal
Component: UM Model
Keywords: restart, seaice
Cc:
Platform: Monsoon2
UM Version: 8.2

Description

Hello helpdesk,

Since moving from xcm to xcs-c, I noticed that the modify_CICE_header program no longer exists under the directory /home/aospre/bin/modify_CICE_header

I need to restart a couple of jobs, xlzck and xlzci, and need to use modify_CICE_header to prepare the sea ice restart files.

Can anyone help me find the right directory for this?

Or is there anything else that needs to be done to make jobs work on xcs-c?
http://collab.metoffice.gov.uk/twiki/bin/view/Support/Monsoon2MigrationUserAction
Is there another source of information aside from the link above describing any changes needed to make a job work on xcs-c?

Many thanks,

Dill

Change History (15)

comment:1 Changed 5 months ago by grenville

Dill

Please look in

/home/d01/aosprey/bin

It'd be worth taking a little time to become familiar with the new file structure.

Grenville

comment:2 Changed 5 months ago by dilshadshawki

Many thanks Grenville, the modify_CICE_header is there!

Yep, I will do, and I'm going to try and see if the job works; it might take a bit of trial and error.

Thanks again,

Dill

comment:3 Changed 5 months ago by dilshadshawki

Hi Grenville,

I was hoping you could help me with another issue. I ran the job xlzck and got the following error in the .comp.leave file:

/home/d02/dshawk/output/xlzck000.xlzck.d17080.t161012.comp.leave

Towards the end of the file it says:

ftn-868 crayftn: ERROR ASYNC_MPI_ERROR_HANDLER, File = ../../../../../projects/ukca-imp/dshawk/xlzck/umatmos/ppsrc/UM/io_services/common/ios_mpi_error_handlers.f90, Line = 76, Column = 18
  "MPL_INT_KIND" is used in a constant expression, therefore it must be a constant.

I'm not sure what this means or how it relates to the move from xcml00 to xcs-c, which is the only change that has been made.

Please could you help me with this issue?

Many thanks,
Dill
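
A .comp.leave file can be long, so it may help to list every compiler error with its line number before reading the details. The following is a minimal sketch, assuming the `ftn-NNN crayftn: ERROR` prefix shown in the message above; the helper name `list_ftn_errors` is hypothetical and not part of the UM tooling.

```shell
# List every Cray Fortran error line in a compile log, prefixed with its
# line number in the log. The pattern is taken from the message format
# quoted in this ticket (assumption: other logs use the same prefix).
# Usage: list_ftn_errors LOGFILE
list_ftn_errors() {
    grep -n 'ftn-[0-9][0-9]* crayftn: ERROR' "$1"
}
```

For example, `list_ftn_errors /home/d02/dshawk/output/xlzck000.xlzck.d17080.t161012.comp.leave` would show where the errors start, after which the explanatory text under each one can be read in the log itself.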

comment:4 Changed 5 months ago by grenville

The problem is

"MPL" is specified as the module name on a USE statement, but the compiler cannot find it.

Your config file refers to user ksival - we'll need to get the MO to shift his data, since he no longer works here.

Grenville
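
One way to see which settings still refer to another user's area is to search the job's files for that username as a path component. This is only a sketch: the helper name `find_user_refs` is hypothetical, and the directory to search (wherever the job's processed configuration files live) is an assumption that depends on the setup.

```shell
# Print file:line:text for every line under DIR that contains /USER/
# as a path component. DIR is whatever directory holds the job's
# configuration files (assumption: they are plain text).
# Usage: find_user_refs USER DIR
find_user_refs() {
    grep -rn -- "/$1/" "$2"
}
```

Calling `find_user_refs ksival DIR` would then list each setting that needs attention once the data has been moved.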

comment:5 Changed 5 months ago by dilshadshawki

OK, please could you let me know when the data has been shifted? Who should I inform regarding this issue?

I am also having a similar problem with another job, xlqqg, which is a UKCA job. I am getting an error at the reconfiguration stage:

/home/d02/dshawk/output/xlqqg000.xlqqg.d17080.t191658.rcf.leave

Near the beginning of the .rcf.leave file the first error that occurs is:

/projects/ukca-imp/dshawk/xlqqg/bin/qsrecon[123]: cd: /scratch/jtmp/pbs.9078084.xcs00.x8z: [No such file or directory]

????????????????????????????????????????????????????????????????????????????????
???!!!???!!!???!!!???!!!???!!!???!!! ERROR ???!!!???!!!???!!!???!!!???!!!???!!!?
? Error in routine: Calc_nlookups
? Error Code:    10
? Error Message: Ancillary files have not been found - Check output for details
? Error generated from processor:     0
? This run generated   1 warnings
????????????????????????????????????????????????????????????????????????????????

The same error message is repeated towards the end. Could this also be because of the username changes that occurred when switching from xcml00 to xcs-c?

Many thanks,
Dill

comment:6 Changed 5 months ago by ros

Hi Dill,

A few lines up from the error message at the end, it tells you that the missing file is /home/mkasoa/ancil/cmip5_cp/qrclim.sulpvolc85. If you cd ~mkasoa and do a pwd, you will find that Matthew's home directory is now under /home/d01/mkasoa. If you can't find a user's home directory this way, they have probably changed username, which you can look up on the Twiki here: http://collab.metoffice.gov.uk/twiki/bin/viewfile/Static/Monsoon/usermigration.txt

Regards,
Ros.
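
The lookup described above can be turned into a small helper that rewrites an old home-directory prefix to the new one. This is a minimal sketch; the name `remap_prefix` is hypothetical, and it assumes the file kept the same relative location under the user's new home directory, which should be checked before relying on it.

```shell
# Replace a stale leading directory prefix in a path.
# If PATH does not start with OLD_PREFIX/, it is printed unchanged.
# Usage: remap_prefix PATH OLD_PREFIX NEW_PREFIX
remap_prefix() {
    case "$1" in
        "$2"/*)
            rest=${1#"$2"}          # part of the path after the old prefix
            printf '%s\n' "$3$rest"
            ;;
        *)
            printf '%s\n' "$1"
            ;;
    esac
}

# Example use with the new home found via tilde expansion:
#   new_home=$(cd ~mkasoa && pwd)
#   remap_prefix /home/mkasoa/ancil/cmip5_cp/qrclim.sulpvolc85 /home/mkasoa "$new_home"
```

The result is only a candidate path; an `ls` on it confirms whether the ancillary file is really there before the job is updated.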

comment:7 Changed 5 months ago by grenville

Dill

I have moved ksival's /projects/umadmin/ksival to /projects/umadmin/gmslis/ksival

You'll need to change the user path overrides and $OASIS_BLDS to point correctly.

Grenville

comment:8 Changed 5 months ago by dilshadshawki

Hi Grenville,

Thank you, I have now changed the paths as you have suggested.

I ran the job xlzck and got the following compilation error:

/home/dshawk/output/xlzck000.xlzck.d17081.t201714.comp.leave
ERROR: /projects/ukca-imp/dshawk/xlzck/umatmos/fcm.bld.lock: lock file exists,
       /projects/ukca-imp/dshawk/xlzck/umatmos: destination is busy.

I went to the directory and found the fcm.bld.lock file, but the file size is 0 bytes. Is it meant to be that way? Is this file the reason for the crash?

Thanks,
Dill

comment:9 Changed 5 months ago by ros

Hi Dill,

The fcm.bld.lock file is there to prevent multiple compiles occurring at the same time and gets deleted when the compile finishes. Sometimes the fcm.bld.lock file is left behind when the compile stops; e.g. if you kill the compile process or it runs out of wallclock. Just remove the file and resubmit to compile.

Regards,
ros.
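
The clean-up described above can be sketched as a small helper. The name `clear_stale_lock` is hypothetical, and the sketch assumes you have already confirmed that no compile for this job is still running, since the lock exists to prevent simultaneous compiles.

```shell
# Remove a leftover fcm.bld.lock from a build directory.
# Only use this when no compile for the job is still running.
# Usage: clear_stale_lock BUILD_DIR
clear_stale_lock() {
    lock="$1/fcm.bld.lock"
    if [ -e "$lock" ]; then
        rm -f "$lock" && echo "removed $lock"
    else
        echo "no lock file in $1"
    fi
}
```

For the job in this ticket the call would be `clear_stale_lock /projects/ukca-imp/dshawk/xlzck/umatmos`, followed by resubmitting the compile.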

comment:10 Changed 5 months ago by dilshadshawki

Hi Ros,

I removed the fcm.bld.lock file and when I resubmitted, a different error appears:

/home/d02/dshawk/output/xlzck000.xlzck.d17082.t160641.comp.leave

Towards the end, where the errors begin, it says:

ftn-855 crayftn: ERROR OASIS3_ATMOS_INIT, File = ../../../../../projects/ukca-imp/dshawk/xlzck/umatmos/ppsrc/UM/control/coupling/oasis3_atmos_init.f90, Line = 9, Column = 8
  The compiler has detected errors in module "OASIS3_ATMOS_INIT".  No module information file will be created for this module.


ftn-292 crayftn: ERROR OASIS3_ATMOS_INIT, File = ../../../../../projects/ukca-imp/dshawk/xlzck/umatmos/ppsrc/UM/control/coupling/oasis3_atmos_init.f90, Line = 23, Column = 7
  "MOD_PRISM_PROTO" is specified as the module name on a USE statement, but the compiler cannot find it.

Seems to be an error with the file oasis3_atmos_init.f90?

Any ideas why this is happening? Is there something else that I need to change in the UMUI besides those mentioned by Grenville in the comments above?

Thanks,
Dill
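
When the compiler reports that it cannot find a module named on a USE statement, one quick check is whether a compiled module file of that name exists anywhere under the directory the build is pointed at. The sketch below assumes module files are named NAME.mod (the case may differ between compilers, so the match ignores case) and that they sit somewhere beneath the directory given; the helper name `find_module` is hypothetical.

```shell
# Search DIR recursively for a compiled module file called NAME.mod,
# ignoring case in the file name. Prints nothing if none is found.
# Usage: find_module NAME DIR
find_module() {
    find "$2" -type f -iname "$1.mod"
}
```

For example, `find_module mod_prism_proto "$OASIS_BLDS"` printing nothing would suggest that the variable does not point at a directory containing the OASIS build.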

comment:11 Changed 5 months ago by grenville

Dill

You appear to have changed $OASIS_BLDS ? (input/output..→time convention…)

Grenville

comment:12 Changed 5 months ago by grenville

That should read "not changed".

comment:13 Changed 5 months ago by dilshadshawki

Hi Grenville,

Thanks for pointing that out, I'm sorry I missed that. I changed the directory for OASIS_BLDS as instructed, but I still get the same error in:

/home/d02/dshawk/output/xlzck000.xlzck.d17087.t132430.comp.leave

I just want to check something. I have also changed the directory under:

→ Compilation and Run Options → UM User Override Files → User path overrides, changing the 'gcom_path' to:

/projects/umadmin/gmslis/ksival/gcom/cce/gcom4.7/xc40_cce_mpp

I checked and that path does indeed exist. So what else could be causing the OASIS-related errors shown in the .comp.leave file above?

Many thanks,
Dill

comment:14 Changed 5 months ago by ros

Dill

You needed only change the value of $OASIS_BLDS in one place (as directed previously). By changing "Location of OASIS3 build" in sub-model..→OASIS Coupling Switches… you have introduced an error.

Grenville

comment:15 Changed 4 months ago by ros

  • Resolution set to answered
  • Status changed from new to closed