Opened 3 months ago

Closed 8 weeks ago

#2870 closed help (fixed)

fail of run due to bad or missing value

Reported by: awright Owned by: pmcguire
Component: JULES Keywords: FAIL list_sites_fluxnet2015.txt@13496
Cc: Platform: JASMIN
UM Version:

Description

Hello Patrick,

I tried to run u-bh383 suite with 'med' fluxnet sites by changing the SUBSET to 'med' in rose-suite.conf. The error is:

[INFO] symlink: rose-conf/20190410T205919-run.conf ⇐ log/rose-suite-run.conf
[INFO] symlink: rose-conf/20190410T205919-run.version ⇐ log/rose-suite-run.version
[FAIL] file:bin/list_sites_fluxnet2015.txt=source=fcm:jules.x_br/pkg/karinawilliams/r6715_python_packages/share/list_sites_fluxnet2015.txt@13496: bad or missing value

That is what I have in my rose-suite.conf:
[file:bin/list_sites_fluxnet2015.txt]
source=fcm:jules.x_br/pkg/karinawilliams/r6715_python_packages/share/list_sites_fluxnet2015.txt@13496

I think I should check whether the fluxnet sites in 'med' (which are 'FR_Pue', 'IT_Noe', 'IT_SRo', 'ES_Amo', 'IT_Cp2', 'IT_Cpz'; I know this from roses/u-bh383/bin/make_plots.py) exist in list_sites_fluxnet2015.txt, which should be in a directory like this:
pkg/karinawilliams/r6715_python_packages/share/
but I don't know how to get to that directory.
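
A check of this kind could be scripted roughly as follows once a copy of list_sites_fluxnet2015.txt is available locally. This is only a sketch: the layout of the list file is not shown in this ticket, so the code does a plain substring search rather than assuming any column format, and the function name missing_sites and the example path are illustrative assumptions.

```python
from pathlib import Path

# The 'med' sites quoted above from roses/u-bh383/bin/make_plots.py.
MED_SITES = ["FR_Pue", "IT_Noe", "IT_SRo", "ES_Amo", "IT_Cp2", "IT_Cpz"]


def missing_sites(list_file, sites):
    """Return the site names that do not appear anywhere in list_file.

    A site counts as present if its name occurs anywhere in the text,
    because the real file layout is not known here.
    """
    text = Path(list_file).read_text()
    return [site for site in sites if site not in text]


if __name__ == "__main__":
    # Hypothetical location; point this at wherever your copy lives.
    path = Path.home() / "roses" / "u-bh383" / "bin" / "list_sites_fluxnet2015.txt"
    if path.exists():
        print("Sites not found:", missing_sites(path, MED_SITES))
    else:
        print("No list file at", path)
```

An empty result would mean every 'med' site name occurs somewhere in the file; it would not by itself show that the suite knows how to select those sites.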

All the best,

Azin

Change History (10)

comment:1 Changed 2 months ago by pmcguire

Hi Azin
Your rose-suite.conf file tells Rose/Cylc to retrieve the list_sites_fluxnet2015.txt and list_sites_lba.txt files from the jules archive at MOSRS, from the pkg/karinawilliams/r6715_python_packages/share directory there. This directory does not exist on JASMIN; it is only on the jules archive at MOSRS.

I tried to see what files are there at the jules archive at MOSRS, with:
fcm list fcm:jules.x_br/pkg/karinawilliams/r6715_python_packages/share
and those two files (list_sites_fluxnet2015.txt and list_sites_lba.txt) do not exist currently at the MOSRS jules archive.
You can try that for yourself.

But there are versions of these two files in your ~azin/roses/u-bh383/bin already.
I compared your version with the version that is in ~pmcguire/roses/u-al752/bin (the version of u-al752 from a few months ago), with:
diff ~pmcguire/roses/u-al752/bin/list_sites_fluxnet2015.txt ~azin/roses/u-bh383/bin/list_sites_fluxnet2015.txt
and
diff ~pmcguire/roses/u-al752/bin/list_sites_lba.txt ~azin/roses/u-bh383/bin/list_sites_lba.txt
and there are no differences.
You can try that for yourself too.
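
The same comparison can also be done from Python, for example when several files need checking in one go. The sketch below uses the standard-library filecmp module; the helper name same_contents is an illustrative assumption, and the directories are the ones quoted in the diff commands above.

```python
import filecmp
import os


def same_contents(path_a, path_b):
    """Return True if the two files have byte-for-byte identical contents."""
    # shallow=False forces a real content comparison instead of
    # trusting matching os.stat() signatures.
    return filecmp.cmp(path_a, path_b, shallow=False)


if __name__ == "__main__":
    old_dir = os.path.expanduser("~pmcguire/roses/u-al752/bin")
    new_dir = os.path.expanduser("~azin/roses/u-bh383/bin")
    for name in ("list_sites_fluxnet2015.txt", "list_sites_lba.txt"):
        a = os.path.join(old_dir, name)
        b = os.path.join(new_dir, name)
        if os.path.exists(a) and os.path.exists(b):
            print(name, "identical" if same_contents(a, b) else "differs")
        else:
            print(name, "not found in one of the directories")
```

Unlike diff, this only reports whether the files match, not where they differ.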

So these two files were/are part of u-al752. They are in the bin directory already. You don't need to check those two files out from MOSRS every time you run the suite. So, when I suggested recently that you could uncomment all the fcm lines in your rose-suite.conf file, I was mistaken. You should probably comment out or delete the lines:

[file:bin/list_sites_fluxnet2015.txt]
source=fcm:jules.x_br/pkg/karinawilliams/r6715_python_packages/share/list_sites_fluxnet2015.txt@13496

[file:bin/list_sites_lba.txt]
source=fcm:jules.x_br/pkg/karinawilliams/r6715_python_packages/share/list_sites_lba.txt@13496

Does this help?
Patrick

comment:2 Changed 2 months ago by pmcguire

  • Status changed from new to accepted

comment:3 Changed 2 months ago by awright

Hello Patrick,

As mentioned, we have 5 files in fcm which are used by jules, and all of them are named in rose-suite.conf. The files are: jules.py, make_time_coord.py, fluxnet_evaluation.py, list_sites_fluxnet2015.txt, list_sites_lba.txt. I have now commented out the entries for the latter two files in rose-suite.conf as below:

# [file:bin/list_sites_fluxnet2015.txt]
# source=fcm:jules.x_br/pkg/karinawilliams/r6715_python_packages/share/list_sites_fluxnet2015.txt@13496

# [file:bin/list_sites_lba.txt]
# source=fcm:jules.x_br/pkg/karinawilliams/r6715_python_packages/share/list_sites_lba.txt@13496

When I run u-bh383 for 'med' fluxnet sites, I get this new error:

[INFO] source: https://code.metoffice.gov.uk/svn/jules/main/branches/pkg/karinawilliams/r6715_python_packages/share/make_time_coord.py@13496 (fcm:jules.x_br/pkg/karinawilliams/r6715_python_packages/share/make_time_coord.py@13496)
[INFO] delete: suite.rc
[INFO] install: suite.rc
[INFO] REGISTERED u-bh383 → /home/users/azin/cylc-run/u-bh383
[FAIL] cylc validate -o /tmp/tmpbzuSmE --strict u-bh383 # return-code=1, stderr=
[FAIL] ERROR, family trigger on non-family namespace JULES:finish-all

I have noticed that jules.py, make_time_coord.py and fluxnet_evaluation.py exist in the output of fcm list fcm:jules.x_br/pkg/karinawilliams/r6715_python_packages/share and in /home/users/azin/cylc-run/u-bh383/bin/, but they do not exist in /home/users/azin/roses/u-bh383/bin/. I am not sure how much that helps. Shall I copy them into /home/users/azin/roses/u-bh383/bin/, since they are absent there?

I changed the SUBSET in rose-suite.conf from 'med' to 'lba' (with list_sites_fluxnet2015.txt and list_sites_lba.txt commented out in rose-suite.conf) and it seems to be running OK. It also ran with 'mead' sites previously with no problem. So the problem seems to be with 'med' sites. Do you have any suggestions?

Best wishes,

Azin

comment:4 Changed 2 months ago by pmcguire

Hi Azin
Right now, I do not know why the suite is crashing for you with the 'med' sites. I am glad it works for other SUBSET values.

I can advise that you should not directly copy the Python files into ~/roses/u-bh383/bin.
Those files should never exist there. Your Rose/Cylc suite instead checks those Python files out directly from the MOSRS jules repository. That is what the fcm commands are for in your rose-suite.conf file. If you for some reason were to copy the files into ~/roses/u-bh383/bin then either (i) you or others would get confused later or (ii) there might be some possibility that the suite will run with the version that is there instead of running with the version that is in the MOSRS jules repository.

If you for some reason do not want to use the MOSRS jules repository version of those files, then you would comment out the fcm commands in the rose-suite.conf file and then you would also need to copy the files from somewhere to ~/roses/u-bh383/bin.
But for now, I don't recommend that you do this. If you want to modify the Python files and run with those versions instead of the MOSRS versions, then you might later want to do this. But let's work with the software that we currently have for the moment, if possible, ok?
Patrick

comment:5 Changed 2 months ago by pmcguire

Hi Azin
Can you study the log files for your suite, to see if you can figure out the cause of it crashing for 'med'?
The messages on the screen don't always tell the whole story.
Patrick
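
A small script can help with scanning many log files at once. The following is a minimal sketch, assuming the suite's logs sit under a directory such as ~/cylc-run/u-bh383/log and that failures are marked with the [FAIL] prefix seen in the messages quoted in this ticket; the function name find_fail_lines is illustrative.

```python
from pathlib import Path


def find_fail_lines(log_dir, marker="[FAIL]"):
    """Yield (file, line_number, text) for every line containing marker."""
    for path in sorted(Path(log_dir).rglob("*")):
        if not path.is_file():
            continue
        try:
            lines = path.read_text(errors="replace").splitlines()
        except OSError:
            continue  # unreadable file: skip it
        for number, line in enumerate(lines, start=1):
            if marker in line:
                yield path, number, line.strip()


if __name__ == "__main__":
    # Assumed log location; adjust to the suite being examined.
    log_dir = Path.home() / "cylc-run" / "u-bh383" / "log"
    if log_dir.is_dir():
        for path, number, text in find_fail_lines(log_dir):
            print(f"{path}:{number}: {text}")
    else:
        print("No log directory at", log_dir)
```

The marker argument can be changed (for example to "ERROR") if a different string is of interest.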

comment:6 Changed 2 months ago by awright

Hello Patrick,

OK, I will just carry on running it with 'lba' for now.

Many thanks for your help. I appreciate it very much!

All the best,

Azin

comment:7 Changed 2 months ago by awright

Hello Patrick,

Just to let you know, I tried running the suite for 'imprex' and it gives an error similar to the one for 'med':

[INFO] delete: suite.rc
[INFO] install: suite.rc
[INFO] REGISTERED u-bh383 → /home/users/azin/cylc-run/u-bh383
[FAIL] cylc validate -o /tmp/tmpnyW0ai --strict u-bh383 # return-code=1, stderr=
[FAIL] ERROR, family trigger on non-family namespace JULES:finish-all

I will look at the log files next.

All the best,

Azin

comment:8 Changed 2 months ago by pmcguire

Hi Azin
I can reproduce your error with these steps:
1) cp -pr /home/users/azin/cylc-run/u-bh383 ~/u-bh383test
2) edit your ~/u-bh383test/suite.rc file so that it has the line:
{% set SUBSET='med' %}
3) run the command on jasmin-cylc:
cylc validate -o /tmp/tmpnyW0ai --strict ~/u-bh383test
4) you will get the same error message that you did.
5) But I don't get that error if I use in that file:
{% set SUBSET='lba' %}

You will note that there are no options defined for SUBSET == 'med' in the ~azin/roses/u-bh383/suite.rc file in these lines:
{%- if SUBSET == 'all' or SUBSET == site or ( SUBSET == 'lba' and 'LBA' in site ) or ( SUBSET == 'peg' and not 'LBA' in site ) or ( SUBSET == 'mead' and 'US_Ne' in site ) or ( SUBSET == 'china' and site[0:2] == 'CN') or ( SUBSET == 'wetland' and site in wetland_sites ) or ( SUBSET == 'selection' and site in selected_sites )%}.

This probably means that Rose/Cylc doesn't know what to do when you select the option SUBSET = 'med' in your rose-suite.conf file. Maybe you can add that option to the ~azin/roses/u-bh383/suite.rc file?
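
To illustrate what adding the option might involve, here is the suite.rc condition rendered as a Python function, with one extra branch for 'med'. This is a sketch under stated assumptions, not the change that was actually made (the ticket does not record it): the med_sites list is hypothetical, taken from the site names quoted at the top of this ticket, and a corresponding Jinja2 clause such as ( SUBSET == 'med' and site in med_sites ) would need med_sites to be defined in suite.rc.

```python
# 'med' sites as listed earlier in this ticket (from make_plots.py).
MED_SITES = ["FR_Pue", "IT_Noe", "IT_SRo", "ES_Amo", "IT_Cp2", "IT_Cpz"]


def site_selected(subset, site, wetland_sites=(), selected_sites=(),
                  med_sites=MED_SITES):
    """Python rendering of the suite.rc condition, plus a 'med' branch."""
    return (
        subset == "all"
        or subset == site
        or (subset == "lba" and "LBA" in site)
        or (subset == "peg" and "LBA" not in site)
        or (subset == "mead" and "US_Ne" in site)
        or (subset == "china" and site[0:2] == "CN")
        or (subset == "wetland" and site in wetland_sites)
        or (subset == "selection" and site in selected_sites)
        # Assumed addition; not part of the original suite.rc line.
        or (subset == "med" and site in med_sites)
    )
```

Without the last branch, no site name satisfies the condition when SUBSET is 'med', so the suite would end up with no sites selected for that subset.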

I have worked with SUBSET='china', for example, and that works. The SUBSET='lba' option also partly works for you.
Patrick

comment:9 Changed 2 months ago by awright

Hello Patrick,

Thank you very much for your help. It is running perfectly now.

All the best,

Azin

comment:10 Changed 8 weeks ago by pmcguire

  • Resolution set to fixed
  • Status changed from accepted to closed