Opened 9 months ago

Closed 9 months ago

#3199 closed help (fixed)

JULES on JASMIN

Reported by: NoelClancy
Owned by: pmcguire
Component: JULES
Keywords:
Cc:
Platform:
UM Version:

Description

Patrick,

In historic FLUXNET-JULES suites that I have run on JASMIN, the following locations are specified in rose-suite.conf:

JULES_FCM:'fcm:jules.x_br/dev/karinawilliam/r9227_add_gpp_unstressed_diagnostic'
SUITE_DATA:'/group_workspaces/jasmin2/jules/pmcguire/fluxnet/kwilliam/suite_data'
OUTPUT_FOLDER:'/work/scratch/nmc/fluxnet/u-br916/jules_output'
PLOT_FOLDER:'/work/scratch/nmc/fluxnet/u-br916/peg/plots'

I've been running FLUXNET-JULES suites on MONSOON for a very long time and now I want to run a suite on JASMIN again. Is everything above still ok for JASMIN?

Noel

Change History (25)

comment:1 Changed 9 months ago by NoelClancy

I've tried to run the suite on JASMIN but it failed:

[FAIL] cylc validate -o /tmp/tmpkcK9Fp --strict u-br916 # return-code=1, stderr=
[FAIL] Jinja2Error:
[FAIL] File "<unknown>", line 4, in template
[FAIL] TemplateSyntaxError: Expected an expression, got 'end of statement block'
[FAIL] Context lines:
[FAIL] {# Rose Configuration Insertion: Init #}
[FAIL] {% set CYLC_VERSION="7.8.1" %}
[FAIL] {% set INCLUDE_SPINUP=True %}
[FAIL] {% set JULES_FCM= %} <-- Jinja2Error
(base) [nmc@jasmin-cylc u-br916]$

This suite is a copy of u-bm066 (nothing changed) and it runs without any problems on MONSOON.

However, on MONSOON, JULES_FCM is specified in 'rose-suite.conf' as

JULES_FCM='fcm:jules.xm_tr'

rather than

JULES_FCM:'fcm:jules.x_br/dev/karinawilliam/r9227_add_gpp_unstressed_diagnostic'

But what is the difference?

comment:2 Changed 9 months ago by NoelClancy

[FAIL] file:bin/make_time_coord.py=source=fcm:jules.x_br/pkg/karinawilliams/r6715_python_packages/share/make_time_coord.py@15046: bad or missing value

Having checked out a copy of this suite, I did a "rose edit" and changed the following settings because I am running on JASMIN and not MONSOON.

LOCATION: 'CEDA_JASMIN'
OUTPUT_FOLDER: '/work/scratch/nmc/fluxnet/u-br916/jules_output'
PLOT_FOLDER: '/work/scratch/nmc/fluxnet/u-br916/jules_plot'
SUITE_DATA: '/group_workspaces/jasmin2/jules/pmcguire/fluxnet/kwilliam/suite_data/vn1.1'

comment:3 Changed 9 months ago by grenville

Is this not the same problem as in ticket #3092?

comment:4 Changed 9 months ago by NoelClancy

Hi,

I did log out and back in, which forced me to re-enter the MOSRS password, so that problem must have been solved. Thanks. However, I have a new error message.

[FAIL] cylc validate -o /tmp/tmpaDcqq6 --strict u-br916 # return-code=1, stderr=
[FAIL] Jinja2Error:
[FAIL] File "<unknown>", line 4, in template
[FAIL] TemplateSyntaxError: Expected an expression, got 'end of statement block'
[FAIL] Context lines:
[FAIL] {# Rose Configuration Insertion: Init #}
[FAIL] {% set CYLC_VERSION="7.8.1" %}
[FAIL] {% set INCLUDE_SPINUP=True %}
[FAIL] {% set JULES_FCM= %} <-- Jinja2Error

comment:5 Changed 9 months ago by pmcguire

Hi Noel:
It looks like it is failing since you didn't define JULES_FCM in rose edit, or you didn't save changes in rose edit.
Your rose-suite.conf file has JULES_FCM undefined:
JULES_FCM=
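
A blank setting like this can be spotted before the suite is run by scanning rose-suite.conf for keys with nothing after the equals sign. The following is a minimal sketch, not a Rose command: the function name find_empty_settings is hypothetical, and the sample file is a made-up stand-in assembled from values quoted in this ticket.

```shell
#!/bin/sh
# Hypothetical helper (not part of Rose): list settings in a Rose
# configuration file that have a key but no value, e.g. "JULES_FCM=".
# Prints "<line number>:<line>" for each empty setting, nothing otherwise.
find_empty_settings() {
    grep -nE '^[A-Za-z_][A-Za-z0-9_]*=[[:space:]]*$' "$1"
}

# Illustration on a throwaway file that mimics the failing rose-suite.conf.
demo_conf=$(mktemp)
cat > "$demo_conf" <<'EOF'
[jinja2:suite.rc]
INCLUDE_SPINUP=True
JULES_FCM=
LOCATION='CEDA_JASMIN'
EOF

find_empty_settings "$demo_conf"   # prints: 3:JULES_FCM=
```

Run against the real file with find_empty_settings ~/roses/u-br916/rose-suite.conf (the path is an assumption about where the suite was checked out).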

On JASMIN, you can't use:
JULES_FCM='fcm:jules.xm_tr'
This is the mirror link, which isn't available.

On JASMIN, you can use:
JULES_FCM='fcm:jules.x_tr'

If you want to use the branch instead of the trunk, you can use:
JULES_FCM='fcm:jules.x_br/dev/karinawilliam/r9227_add_gpp_unstressed_diagnostic'

If you want to see the differences between the branch and the trunk, one way to do this is to use the fcm co command at the command line to check out the branch and then the trunk, and then to use diff -r to compare the two.
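
The comparison described above can be sketched as follows. This is only an illustration under stated assumptions: the working-copy directory names jules_trunk and jules_branch and the helper compare_trees are hypothetical, BRANCH_URL must be supplied by the user, and the checkout step only runs where fcm is installed and MOSRS access works. The .svn directories are excluded so that Subversion metadata does not clutter the output.

```shell
#!/bin/sh
# Sketch of a trunk-versus-branch comparison; names below are illustrative.
TRUNK_URL='fcm:jules.x_tr'
BRANCH_URL=${BRANCH_URL:-}   # set to the FCM keyword of the branch to compare

# Recursively compare two working copies, reporting only which files differ
# and skipping Subversion metadata directories.
compare_trees() {
    diff -r -q -x .svn "$1" "$2"
}

# Check out both trees and compare them, but only if fcm is available
# and a branch has been named.
if command -v fcm >/dev/null 2>&1 && [ -n "$BRANCH_URL" ]; then
    fcm co "$TRUNK_URL"  jules_trunk  &&
    fcm co "$BRANCH_URL" jules_branch &&
    compare_trees jules_trunk jules_branch
fi
```

Dropping -q from the diff call shows the line-by-line changes instead of just the list of differing files.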

Patrick

comment:6 Changed 9 months ago by pmcguire

  • Status changed from new to accepted

comment:7 Changed 9 months ago by pmcguire

  • Keywords FLUXNET, JULES, JASMIN, FCM added
  • Platform set to JASMIN

comment:8 Changed 9 months ago by NoelClancy

  • Keywords FLUXNET, JULES, JASMIN, FCM removed
  • Platform JASMIN deleted

Thanks very much Patrick,

I had left that field blank. I've populated that field as follows:

JULES_FCM='fcm:jules.x_tr'

and the suite is running now.

I will also try to check out the branch and see what the differences between the branch and the trunk are.

Is there a major difference, or any advantage to using one over the other?

JULES_FCM='fcm:jules.x_tr'
JULES_FCM='fcm:jules.x_br/dev/karinawilliam/r9227_add_gpp_unstressed_diagnostic'

comment:9 Changed 9 months ago by pmcguire

Hi Noel

I am glad it's working.

Can you do the comparison that I suggested to find out the differences?
One difference from the branch name might be that it adds a GPP unstressed diagnostic.
Another difference from the branch name is that it branched off at revision 9227, which I think is JULES4.9. There might be other differences too.
Patrick

comment:10 Changed 9 months ago by NoelClancy

Hi Patrick,

I used
JULES_FCM='fcm:jules.x_tr'
in suite u-br916.
This suite ran and completed successfully, so I did an "fcm ci" on that suite.

I then checked out a copy of u-br916 as u-bs019 and did a "rose edit" on that second suite.
I then changed the suite number in the output and plot folders to u-bs019 and made the following change:
JULES_FCM='fcm:jules.x_br/dev/karinawilliam/r9227_add_gpp_unstressed_diagnostic'

However, this is failing immediately at the fcm_make stage.

(base) [nmc@jasmin-sci1 roses]$ diff -r u-br916 u-bs019
Only in u-br916/.svn/pristine: 13
Only in u-bs019/.svn/pristine/33: 33feb67727a91b5fc9df218dbb89d352f05c986f.svn-base
Only in u-br916/.svn/pristine/56: 56bc64d2ad0bf9e9a27c77805bdb52d4182c9924.svn-base
Only in u-br916/.svn/pristine: fc
Binary files u-br916/.svn/wc.db and u-bs019/.svn/wc.db differ
diff -r u-br916/rose-suite.conf u-bs019/rose-suite.conf
15c15
< JULES_FCM='fcm:jules.x_tr'
---
> JULES_FCM='fcm:jules.x_br/dev/karinawilliam/r9227_add_gpp_unstressed_diagnostic'
19,20c19,20
< OUTPUT_FOLDER='/work/scratch/nmc/fluxnet/u-br916/jules_output'
< PLOT_FOLDER='/work/scratch/nmc/fluxnet/u-br916/jules_plot'
---
> OUTPUT_FOLDER='/work/scratch/nmc/fluxnet/u-bs019/jules_output'
> PLOT_FOLDER='/work/scratch/nmc/fluxnet/u-bs019/jules_plot'
diff -r u-br916/rose-suite.info u-bs019/rose-suite.info
1c1
< description=Copy of u-bm066/trunk@129112
---
> description=Copy of u-br916/trunk@149809

comment:11 Changed 9 months ago by pmcguire

Hi Noel
What do the fcm_make .err and .out log files say?
Patrick

comment:12 Changed 9 months ago by NoelClancy

job.err

Traceback (most recent call last):
  File "/apps/contrib/metomi/cylc-7.8.1/bin/cylc-cat-log", line 439, in <module>
    main()
  File "/apps/contrib/metomi/cylc-7.8.1/bin/cylc-cat-log", line 435, in main
    tmpfile_edit(out, options.geditor)
  File "/apps/contrib/metomi/cylc-7.8.1/bin/cylc-cat-log", line 268, in tmpfile_edit
    proc = Popen(cmd, stderr=PIPE)
  File "/usr/lib64/python2.6/subprocess.py", line 642, in __init__
    errread, errwrite)
  File "/usr/lib64/python2.6/subprocess.py", line 1238, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory

comment:13 Changed 9 months ago by NoelClancy

job.out

Traceback (most recent call last):
  File "/apps/contrib/metomi/cylc-7.8.1/bin/cylc-cat-log", line 439, in <module>
    main()
  File "/apps/contrib/metomi/cylc-7.8.1/bin/cylc-cat-log", line 435, in main
    tmpfile_edit(out, options.geditor)
  File "/apps/contrib/metomi/cylc-7.8.1/bin/cylc-cat-log", line 268, in tmpfile_edit
    proc = Popen(cmd, stderr=PIPE)
  File "/usr/lib64/python2.6/subprocess.py", line 642, in __init__
    errread, errwrite)
  File "/usr/lib64/python2.6/subprocess.py", line 1238, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory

comment:14 Changed 9 months ago by NoelClancy

Also, I just logged out and back in and was asked to re-enter the MOSRS password, but I get the same error messages.

comment:15 Changed 9 months ago by ros

Those tracebacks are not the contents of the .err or .out files; this is cylc saying it can't find the log files for some reason. On the command line, go to the ~/cylc-run/<suiteid>/log/job/<cycle>/fcm_make directory and look at the log files that way.
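
Locating the log files by hand, as suggested above, can be sketched as follows. This is an illustrative helper, not a cylc command: the function name latest_job_log is hypothetical, and it assumes the layout <suite run directory>/log/job/<cycle>/<task>/<submit number>/ with numbered submit directories such as 01. It returns the matching path that sorts last, which for a single cycle is the highest submit number.

```shell
#!/bin/sh
# Hypothetical helper: print the path of a job log (job.err or job.out)
# for one task, choosing the match that sorts last.
latest_job_log() {
    suite_dir=$1   # e.g. "$HOME/cylc-run/u-bs019"
    task=$2        # e.g. fcm_make
    log_name=$3    # job.err or job.out
    ls "$suite_dir"/log/job/*/"$task"/[0-9]*/"$log_name" 2>/dev/null | sort | tail -n 1
}

# Example: show the end of the fcm_make error log, if one exists.
log=$(latest_job_log "$HOME/cylc-run/u-bs019" fcm_make job.err)
if [ -n "$log" ]; then
    tail -n 20 "$log"
fi
```

The same call with job.out in place of job.err finds the standard-output log.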

comment:16 Changed 9 months ago by pmcguire

Hi Noel
Can you use fcm co to make sure the JULES branch exists and that you can check it out on JASMIN?
Patrick

comment:17 Changed 9 months ago by NoelClancy

cd /home/users/nmc/cylc-run/u-bs019/log/job/1/fcm_make/01/fcm_make

vi job.err

Environment variables set for netCDF Fortran bindings in

/apps/libs/netCDF/intel14/fortran/4.2/

You will also need to link your code to a compatible netCDF C library in

/apps/libs/netCDF/intel14/4.3.2/

[FAIL] config-file=/work/scratch/nmc/cylc-run/u-bs019/work/1/fcm_make/fcm-make.cfg:2
[FAIL] config-file= - https://code.metoffice.gov.uk/svn/jules/main/branches/dev/karinawilliam/r9227_add_gpp_unstressed_diagnostic/etc/fcm-make/make.cfg@14107
[FAIL] https://code.metoffice.gov.uk/svn/jules/main/branches/dev/karinawilliam/r9227_add_gpp_unstressed_diagnostic/etc/fcm-make/make.cfg@14107: cannot load config file
[FAIL] https://code.metoffice.gov.uk/svn/jules/main/branches/dev/karinawilliam/r9227_add_gpp_unstressed_diagnostic/etc/fcm-make/make.cfg@14107: not found
[FAIL] svn: warning: W170000: URL 'https://code.metoffice.gov.uk/svn/jules/main/branches/dev/karinawilliam/r9227_add_gpp_unstressed_diagnostic/etc/fcm-make/make.cfg' non-existent in revision 14107
[FAIL]
[FAIL] svn: E200009: Could not display info for all targets because some targets don't exist

[FAIL] fcm make -f /work/scratch/nmc/cylc-run/u-bs019/work/1/fcm_make/fcm-make.cfg -C /home/users/nmc/cylc-run/u-bs019/share/fcm_make -j 4 # return-code=1
2020-02-25T10:11:54Z CRITICAL - failed/EXIT

comment:18 Changed 9 months ago by pmcguire

Hi Noel
What happens when you try to paste the code.metoffice links above in your browser? Can you figure out what is wrong with the link?
Patrick McGuire

comment:19 Changed 9 months ago by NoelClancy

cd /home/users/nmc/cylc-run/u-bs019/log/job/1/fcm_make/01

vi job.out

Suite : u-bs019
Task Job : 1/fcm_make/01 (try 1)
User@Host: nmc@…

Currently Loaded Modulefiles:

  1) lsfmodules/10.1               7) libpnetcdf/intel/14.0/1.5.0
  2) lotus-mpi/8.2                 8) netcdf/intel/14.0/4.3.2
  3) intel/cce/14.0.1.106          9) netcdff/intel/14.0/4.2
  4) intel/fce/14.0.1.106         10) intel/fce/14.0.2.144
  5) intel/14.0                   11) intel/cce/14.0.2.144
  6) libPHDF5/intel/14.0/1.8.12   12) parallel-netcdf/intel/20141122

LD_LIBRARY_PATH=/apps/intel/2013_sp1.2.144/composer_xe_2013_sp1.2.144/compiler/lib/intel64:/apps/intel/2013_sp1.2.144/composer_xe_2013_sp1.2.144/ipp/lib/intel64:/apps/libs/netCDF/intel14/fortran/4.2/lib:/apps/libs/netCDF/intel14/4.3.2/lib:/apps/intel/2013_sp1.1.106/composer_xe_2013_sp1.1.106/ipp/lib/intel64:/apps/intel/2013_sp1.1.106/composer_xe_2013_sp1.1.106/compiler/lib/intel64:/opt/platform_mpi/lib/linux_amd64:/apps/lsf/10.1/linux2.6-glibc2.3-x86_64/lib
LD_LIBRARY_PATH=/apps/intel/2013_sp1.2.144/composer_xe_2013_sp1.2.144/compiler/lib/intel64:/apps/intel/2013_sp1.2.144/composer_xe_2013_sp1.2.144/ipp/lib/intel64:/apps/libs/netCDF/intel14/fortran/4.2/lib:/apps/libs/netCDF/intel14/4.3.2/lib:/apps/intel/2013_sp1.1.106/composer_xe_2013_sp1.1.106/ipp/lib/intel64:/apps/intel/2013_sp1.1.106/composer_xe_2013_sp1.1.106/compiler/lib/intel64:/opt/platform_mpi/lib/linux_amd64:/apps/lsf/10.1/linux2.6-glibc2.3-x86_64/lib:/apps/libs/PHDF5/intel14/1.8.12/lib
2020-02-25T10:11:46Z INFO - started
[INFO] create: /home/users/nmc/cylc-run/u-bs019/share/data
"job.out" 42L, 4160C

comment:20 Changed 9 months ago by NoelClancy

When I paste the link into my browser, I am asked to enter a username and password.
Once the username and password are accepted, I get the following message:

Not Found
The requested URL /svn/jules/main/branches/dev/karinawilliam/r9227_add_gpp_unstressed_diagnostic/etc/fcm-make/make.cfg@14107 was not found on this server.

comment:21 Changed 9 months ago by pmcguire

Hi Noel
Excellent work.
So that path doesn't exist.
Can you figure out what is wrong with the path, and fix it?
Patrick McGuire

comment:22 Changed 9 months ago by NoelClancy

I just bumped into Karina here at the Met Office and she explained some minor differences but unfortunately I didn't fully understand.

However, I don't need the branch version for now, I only need the trunk version.

She said that the trunk should be working, and it is, but that some minor updates from the branch that is not working would need to be implemented. However, as I said, I didn't fully understand, so maybe you need to investigate further.

comment:23 Changed 9 months ago by NoelClancy

Thanks very much for solving my problem of running on JASMIN.

Thanks, ticket closed

comment:24 Changed 9 months ago by pmcguire

Hi Noel
I am glad you were able to talk with Karina Williams about the branch versus the trunk.

If at some point, you want to use the branch, you just have to use the correct path to the branch. You have a typo in it. It should point to the karinawilliams subdirectory and not the karinawilliam subdirectory.
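
The correction described above can be sketched as a one-line substitution, assuming the setting is edited directly in rose-suite.conf rather than through rose edit. The helper name fix_branch_owner and the throwaway sample file are hypothetical; the helper writes the corrected text to standard output and leaves the input file untouched.

```shell
#!/bin/sh
# Hypothetical helper: print a copy of a rose-suite.conf with the branch
# owner directory corrected from "karinawilliam" to "karinawilliams".
# The trailing slash in the pattern means an already-correct path is
# left unchanged.
fix_branch_owner() {
    sed 's#/dev/karinawilliam/#/dev/karinawilliams/#' "$1"
}

# Illustration on a throwaway file holding the setting quoted in this ticket.
demo_conf=$(mktemp)
echo "JULES_FCM='fcm:jules.x_br/dev/karinawilliam/r9227_add_gpp_unstressed_diagnostic'" > "$demo_conf"
fix_branch_owner "$demo_conf"
```

Redirect the output to a new file and review it before replacing the original configuration.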

Patrick McGuire

comment:25 Changed 9 months ago by pmcguire

  • Resolution set to fixed
  • Status changed from accepted to closed