Opened 12 months ago
Closed 7 months ago
#3146 closed help (fixed)
Archiving data from Monsoon to JASMIN
Reported by: | simon.tett | Owned by: | david |
---|---|---|---|
Component: | UM Model | Keywords: | |
Cc: | | Platform: | JASMIN |
UM Version: | 10.0 | | |
Description
Hi,
is there a guide to what needs to be done to set things up to use the automatic archiving system that moves data from monsoon to JASMIN?
ta
(model version is a guess as I can't log into monsoon right now…)
Simon
Change History (24)
comment:1 Changed 12 months ago by grenville
Simon
These should reveal all:
https://code.metoffice.gov.uk/trac/moci/wiki/app_postproc
http://cms.ncas.ac.uk/wiki/Docs/PostProcessingApp
http://cms.ncas.ac.uk/wiki/Docs/PostProcessingAppNexcsSetup
Grenville
comment:2 Changed 12 months ago by simon.tett
Hi Grenville,
thanks a lot. After some quick reading I think I need some ssh magic…
"Set up ssh-key to connect to JASMIN
Some setup is required to enable non-interactive authentication from NEXCS to JASMIN. Please contact cms_support@… for details. "
I inherited the job from Christoph who, I assume, had done the ssh magic needed… So what is the incantation needed…
Simon
comment:3 Changed 12 months ago by simon.tett
Hi,
with help from Ros I am set up. To use xfer2, JASMIN needs an IP address. What IP address should I give them?
Simon
comment:4 Changed 12 months ago by ros
Hi Simon,
IP sent by email
Cheers,
Ros.
comment:5 Changed 12 months ago by simon.tett
All set up and the transfer is now working, and once the ice-impact workspace is created I will move to using that rather than edin_cesd. But pp data is being transferred. Is there a way of making and transferring netcdf files rather than pp files?
ta
Simon
comment:6 Changed 12 months ago by ros
- Owner changed from um_support to ros
- Status changed from new to accepted
Hi Simon,
It is possible to configure the suite to output streams in NetCDF direct from the UM rather than pp (except for the climate meaning).
I also have a branch of the post-processing which will archive and transfer the NetCDF files; however, to use this in your suite you would need to upgrade the post-processing app.
If you want to try this out I can point you to instructions; otherwise you will need to convert the transferred pp files to NetCDF manually.
Cheers,
Ros
comment:7 Changed 12 months ago by simon.tett
Hi Ros,
"except the climate meaning"… That is a pain, because the climate-meaned output is what I normally use… Though I don't understand UM10.X well enough to know if climate meaning is what I think it is. I'd normally want monthly, seasonal & annual means…
How come the climate meaning doesn't produce netcdf???
Simon
comment:8 Changed 12 months ago by ros
Hi Simon,
Climate meaning at 10.x is exactly what it was at older UM versions.
There was never a plan to modify the UM's climate meaning to output NetCDF, for several reasons; not least the Met Office's plan to remove climate meaning from the UM and put it into postproc, which only happened at the latest postproc release. We have modified postproc to allow archiving of NetCDF files, but due to higher priority work, like ARCHER2 preparation, we just have not yet had the resources to implement climate meaning of NetCDF files within post-processing. This will be done as soon as is practicable.
Regards,
Ros.
comment:9 Changed 12 months ago by simon.tett
Hi Ros,
thanks - I guess I will have to use PP data then; yes, I can understand the complexity of this and why you prioritise the ARCHER2 transition. Are there tools on JASMIN to convert from pp to netcdf? I guess I could just read it using iris and write it out again as NetCDF.
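For reference, a minimal sketch of that iris route (this assumes iris is available in your Python environment on JASMIN; the file name is just one of the bo595 pp files mentioned later in this thread):

import iris

# read every field in the pp file and write the lot back out as NetCDF-4
cubes = iris.load("bo595a.py19891201.pp")
iris.save(cubes, "bo595a.py19891201.nc", netcdf_format="NETCDF4")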
Simon
comment:10 Changed 12 months ago by simon.tett
I have it all working with data going to my new groupspace on JASMIN! But I changed the run to run in 1 year chunks and the archiving ran out of time… The wallclock time looks to be set to 3600 seconds = 1 hour. How do I increase the wall clock time?
Though 1 hour to convert all data to PP and transfer 6.1 Gbytes to Jasmin seems rather slow. Or should I reduce my cycle time to 3 months?
Job is bo595
Simon
comment:11 Changed 12 months ago by ros
Hi Simon,
In the site/MONSooN.rc file change the execution time limit in the [[PPTRANSFER_RESOURCE]] section.
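For example, after the change that section might look something like this (the exact nesting differs between suites; PT2H is just an illustrative 2-hour limit in ISO 8601 duration form):

[[PPTRANSFER_RESOURCE]]
    [[[job]]]
        execution time limit = PT2H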
Then reload the suite (rose suite-run --reload)
Cheers,
Ros.
comment:12 Changed 12 months ago by simon.tett
Hi Ros,
thanks - back to editing files I see… And once I have done the --reload do I then do rosie go and resubmit? How do I put the suite back into revision control so that when I copy it the change will persist - fcm commit?
I am rerunning the pp_transfer. Before doing anything else. Will report back if have trouble.
Simon
comment:13 Changed 12 months ago by ros
Hi Simon,
rose suite-run --reload loads the changes into the already running suite so no need to do anything else to it apart from retriggering the failed pptransfer task.
Yes, you commit changes to a suite in exactly the same way as for UM branches, i.e. fcm commit.
Cheers,
Ros.
comment:14 Changed 12 months ago by grenville
Hi Simon (re comment 9),
cfa for pp→netcdf conversion is available on jasmin. Just set up your environment as follows:
export PATH=/home/users/ajh/anaconda3/bin:$PATH
ln -s /home/users/ajh/cfplot_data ~
Note that this uses cf-python version 3.
Hope that helps.
comment:15 Changed 12 months ago by simon.tett
Hi Ros,
thanks - transfers seem to be happening, though one seems to have stopped sending after pushing across 11 ppa files…
On to the next stage: converting to netcdf. I grabbed my old ARCHER way of doing it:
cfa --reference_datetime='1750-1-1' --unsqueeze --single -f 'NETCDF4' --no_aggregation --outfile=bo595a.py19891201.nc bo595a.py19891201.pp
and got an assertion error (see below)
Anything I should be doing first? Some conda magic??
[tetts@jasmin-sci4 19890901T0000Z]$ cfa --reference_datetime='1750-1-1' --unsqueeze --single -f 'NETCDF4' --no_aggregation --outfile=bo595a.py19891201.nc bo595a.py19891201.pp
Traceback (most recent call last):
File "/home/users/ajh/anaconda3/bin/cfa", line 9, in <module>
import cf
File "/home/users/ajh/anaconda3/lib/python3.7/site-packages/cf/init.py", line 134, in <module>
import cfunits
File "/home/users/ajh/anaconda3/lib/python3.7/site-packages/cfunits/init.py", line 36, in <module>
from .units import Units
File "/home/users/ajh/anaconda3/lib/python3.7/site-packages/cfunits/units.py", line 212, in <module>
assert(0 == _ut_unmap_symbol_to_unit(_ut_system, _c_char_p(b'Sv'), _UT_ASCII))
AssertionError
Simon
comment:16 Changed 12 months ago by ros
- Owner changed from ros to david
- Status changed from accepted to assigned
comment:17 Changed 12 months ago by simon.tett
Back to jasmin transfer…
Even after having increased the time for the transfer to 2 hours it is still failing. Looking at the log file I think it used 189 seconds, and from what has arrived on JASMIN I think the transfer is hanging.
I started a rerun of the transfer which failed with an error:
rsync: writefd_unbuffered failed to write 4 bytes [sender]: Broken pipe (32)
rsync: close failed on "/gws/nopw/j04/iceimpact/stett2/u-bo595/19900901T0000Z/.bo595a.pd1991jul.pp.ksK4A4": Input/output error (5)
rsync error: error in file IO (code 11) at receiver.c(730) [receiver=3.0.6]
rsync: connection unexpectedly closed (392 bytes received so far) [sender]
rsync error: error in rsync protocol data stream (code 12) at io.c(641) [sender=3.0.4]
transfer is going to /gws/nopw/j04/iceimpact/stett2/u-bo595 and using xfer2.
Simon
comment:18 Changed 12 months ago by simon.tett
I am now running the rsync interactively to get the files over.
Simon
comment:19 Changed 12 months ago by simon.tett
Which ran for a while, transferring 2 Gbytes, then hung… Suggests some problem with the Monsoon → JASMIN system rather than the UM…
Simon
comment:20 Changed 12 months ago by simon.tett
And I wonder if the solution is to add the following to the rsync command:
--timeout=10 # timeout after 10 seconds of no I/O
and perhaps add an option to the job to automatically resubmit if there was a failure…
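As a rough sketch of that idea, run by hand outside the suite (the source path, retry count, sleep and the xfer2 hostname form are placeholder assumptions, not values from this job; only the destination groupspace path comes from this thread):

import subprocess
import time

src = "/path/on/monsoon/u-bo595/19900901T0000Z/"                     # placeholder source directory
dest = "xfer2.jasmin.ac.uk:/gws/nopw/j04/iceimpact/stett2/u-bo595/"  # destination groupspace from this thread

for attempt in range(5):
    # --timeout=10 makes rsync give up after 10 s of no I/O instead of hanging,
    # and --partial keeps partly-transferred files so a later attempt does not start them from scratch
    result = subprocess.run(["rsync", "-av", "--partial", "--timeout=10", src, dest])
    if result.returncode == 0:
        break          # transfer finished cleanly
    time.sleep(60)     # wait a minute before resubmitting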
S
comment:21 Changed 12 months ago by grenville
Simon
The problem is overloading of JASMIN - users are trying to wring out every last AU on ARCHER while streaming to JASMIN, those who switched to NEXCS to avoid the ARCHER hiatus are transferring data to JASMIN, and users are getting data off the RDF.
The pptransfer app will retry automatically - saving you the bother.
Grenville
comment:22 Changed 12 months ago by david
Hi Simon,
The cfa problem is sporadic and environment based. I'm not sure how to eradicate it properly, but setting the UDUNITS2_XML_PATH environment variable ought to sort it. See https://ncas-cms.github.io/cf-python/installation.html#unidata-udunits-2-library for details.
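(If the generic /home/user/anaconda3 path in those instructions does not exist for you, then for the shared install mentioned in comment:14 the file is most likely under that installation instead, i.e. something like export UDUNITS2_XML_PATH=/home/users/ajh/anaconda3/share/udunits/udunits2.xml in your ~/.bashrc; that exact path is an assumption here and worth checking with ls before relying on it.)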
Thanks, David
comment:23 Changed 12 months ago by simon.tett
Hi David,
thanks for that - the document you point me to suggests setting UDUNITS2_XML_PATH to /home/user/anaconda3/share/udunits/udunits2.xml. That path is not correct as /home/user/anaconda3/ does not exist. Can you advise me what I should set it to?
Presumably if that works I should add this to my .bashrc file
Simon
comment:24 Changed 7 months ago by ros
- Resolution set to fixed
- Status changed from assigned to closed