Version 13 (modified by ros, 13 months ago)

ARCHER specific setup instructions for data transfer to JASMIN

Adding the PP Transfer task to a suite

You only need to follow these instructions if your suite doesn't already have the "PP Transfer" option available.

Search in the rose edit GUI for the variable PPTRANSFER. If it is not found, follow the instructions below.
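As an alternative to the GUI search, the same check can be sketched on the command line. This is a minimal sketch: the function name has_pptransfer is hypothetical, and it assumes that the variable, if present, is set in the file you pass in (typically rose-suite.conf in the suite directory); optional configuration files are not examined.

```shell
# Report whether a suite configuration file already mentions PPTRANSFER.
# Returns 0 if the variable name is found, 1 otherwise.
has_pptransfer() {
    conf_file=$1
    if grep -q 'PPTRANSFER' "$conf_file"; then
        echo "PPTRANSFER found in $conf_file"
    else
        echo "PPTRANSFER not found in $conf_file"
        return 1
    fi
}
```

For example, run has_pptransfer rose-suite.conf from the suite directory. A "not found" result suggests the steps below are needed.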

rose-suite.conf

  1. Add PPTRANSFER=true
  1. The PPTRANSFER variable will, by default, appear under "suite conf → jinja2". To have Rose place it with the other suite control switches (e.g. "Build UM" and "Run Reconfiguration"), which are usually found in a panel in the suite conf section under "Build and Run" or "Tasks", edit the meta/rose-meta.conf file and add the following metadata for the PPTRANSFER variable. Place it under the definition for POSTPROC. (This step is optional.)
    [jinja2:suite.rc=PPTRANSFER]
    compulsory=true
    description=Transfer files archived with PostProc to a remote machine
    help=
    ns=<panel_namespace>
    sort-key=runPostproc1
    title=PP Transfer
    type=boolean
    

Where <panel_namespace> is the same value as for the POSTPROC entry in this file; e.g. ns=Build and Run

suite.rc

Note: Depending on the suite setup you may find the appropriate sections to modify in the site/archer.rc file rather than the suite.rc.

  1. Add the pptransfer build and run tasks to the cylc graph for the initial cycle. Add the line:
    {{ 'fcm_make_pptransfer => fcm_make2_pptransfer' + (' => pptransfer' if RUN else '') if PPTRANSFER else '' }}
    
    to the cylc graph for the initial cycle, indicated by [[[ R1 ]]]. For example (insertion indicated by "<== Add line here"):
        [[dependencies]]
            [[[ R1 ]]]
                graph = """
    {{ 'fcm_make_pp => fcm_make2_pp' + (' => postproc' if RUN else '') if POSTPROC else '' }}
    {{ 'fcm_make_pptransfer => fcm_make2_pptransfer' + (' => pptransfer' if RUN else '') if PPTRANSFER else '' }}    <== Add line here
    {{ 'fcm_make_ocean => fcm_make2_ocean' + (' => recon' if RECON else ' => coupled' if RUN else '') if BUILD_OCEAN else '' }}
    {{ 'fcm_make_um => fcm_make2_um' + (' => recon' if RECON else ' => coupled' if RUN else '') if BUILD_UM else '' }}
    {{ 'install_ancil => recon ' if RECON else ('install_ancil => coupled' if RUN else '')}}
    {{ 'recon' + (' => coupled' if RUN else '') if RECON else '' }}
    {{ 'clearout' + (' => coupled' if RUN else '') if CLEAROUT else '' }}
    """
    
  1. Add the pptransfer task to the graph for all subsequent cycles, such that it runs after the postproc task and also waits for the previous pptransfer task to complete. As an example for a coupled suite (all added or changed lines are marked with "<="):
            [[[ {{FMT}} ]]]
                graph = """
    {% if RUN %}
    coupled[-{{FMT}}] => coupled {{ '=> \\' if POSTPROC or HOUSEKEEP else '' }}
      {% if POSTPROC %}
    postproc {{ '=> \\' if PPTRANSFER or HOUSEKEEP else '' }}     <= "PPTRANSFER or" added here
      {% endif %}
      {% if PPTRANSFER %}                                         <=
    pptransfer {{ '=> \\' if HOUSEKEEP else '' }}                 <=
      {% endif %}                                                 <=
      {% if HOUSEKEEP %}
    housekeeping
      {% endif %}
      {% if POSTPROC %}
    postproc[-{{FMT}}] => postproc
      {% endif %}
      {% if PPTRANSFER %}                                         <=
    pptransfer[-{{FMT}}] => pptransfer                            <=
      {% endif %}                                                 <=
    {% endif %}
    """
    

Take care to ensure there is no trailing whitespace at the end of each added line.
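One way to check for stray trailing whitespace after editing is a small grep wrapper. This is only a sketch: the function name check_trailing_whitespace is hypothetical, and it inspects the whole file rather than only the lines you added.

```shell
# List every line that ends in a space or tab, prefixed with its line number.
# Returns 0 when the file is clean, 1 when trailing whitespace is present.
check_trailing_whitespace() {
    if grep -n '[[:space:]]$' "$1"; then
        return 1
    fi
    return 0
}
```

For example, run check_trailing_whitespace suite.rc (or site/archer.rc, if that is the file you edited). Any line it prints still has whitespace at its end.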

  1. In the [[postproc]] task check that pre-script is set as follows:
        [[postproc]]
            inherit = ...
            pre-script = "module load nco/4.5.0; module load anaconda/2.2.0-python2; module list 2>&1; export PYTHONPATH=$PYTHONPATH:$UMDIR/lib/python2.7; ulimit -s unlimited"
            ...
    

Note: If you are running the coupled model and using the netCDF module cray-netcdf/4.4.1.1, then the pre-script line will need to be:

        pre-script = "module load nco/4.6.8; module load anaconda; module list 2>&1; export PYTHONPATH=$PYTHONPATH:$UMDIR/lib/python2.7; ulimit -s unlimited"
  1. At the end of the file add the pptransfer task definitions:
        [[PPTRANSFER]]
            [[[remote]]]
                host = dtn02.rdf.ac.uk
            [[[environment]]]
                UMDIR=~um
    
        [[PPTRANSFER_BUILD]]
            [[[environment]]]
                ROSE_TASK_APP=fcm_make_pp
    
        [[fcm_make_pptransfer]]
            inherit = None, LINUX_UM, PPTRANSFER_BUILD
    
        [[fcm_make2_pptransfer]]
            inherit = None, PPTRANSFER, PPTRANSFER_BUILD
       
        [[pptransfer]]
            inherit = PPTRANSFER
            pre-script = "module load anaconda"
            [[[environment]]]
                CYCLEPERIOD = $( rose date $CYLC_TASK_CYCLE_POINT $CYLC_TASK_CYCLE_POINT --calendar {{CALENDAR}} --offset2 {{FMT}} -f y,m,d,h,M,s )
                ROSE_TASK_APP=postproc
                PLATFORM = linux
    

Note: LINUX_UM may be called something different (e.g. EXTRACT_RESOURCE) depending on the suite.

Set up ssh-key to connect from PUMA to Data Transfer Node

These instructions assume you already have an ssh-agent running on PUMA, which you would have needed in order to run a UM suite on ARCHER. If this is not the case, please set up ssh-agent from PUMA to ARCHER before continuing.

  1. Log in to dtn02.rdf.ac.uk. If you have not already set up ssh-agent to access the Data Transfer Nodes, you will be prompted for your ARCHER password. If you are logged in with no password/passphrase prompt, then you are already set up and can move on to the next section, "Set up ssh-key to connect from Data Transfer Node to JASMIN", below.
  1. Create .ssh directory:
    $ mkdir ~/.ssh
    
  1. Log out of dtn02.
  1. Log in to ARCHER: ssh <username>@login.archer.ac.uk
  1. Append your authorized_keys file from your ARCHER home disk to the one on the /nerc disk:
    $ cat ~/.ssh/authorized_keys >> /nerc/n02/n02/<username>/.ssh/authorized_keys
    
    where <username> is your ARCHER username.
  1. Log out of ARCHER.
  1. Log in to dtn02 from PUMA: ssh <username>@dtn02.rdf.ac.uk
    You should now be logged straight in without being prompted for a password/passphrase.
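To confirm that the login really is prompt-free, rather than relying on a cached password, a non-interactive test can be sketched as follows. The function name check_passwordless_login is hypothetical. BatchMode and ConnectTimeout are standard OpenSSH client options: with BatchMode=yes, ssh fails instead of asking for a password or passphrase.

```shell
# Succeeds only if key-based login to the given host works without any prompt.
# The remote command "true" does nothing, so only the login itself is tested.
check_passwordless_login() {
    ssh -o BatchMode=yes -o ConnectTimeout=10 "$1" true
}
```

For example, on PUMA: check_passwordless_login <username>@dtn02.rdf.ac.uk && echo "key login OK". A non-zero exit status means ssh would have needed to prompt, or could not connect at all.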

Set up ssh-key to connect from Data Transfer Node to JASMIN

  1. Log in to dtn02.rdf.ac.uk.
  1. Add the following lines to your ~/.profile (create one if it doesn't already exist):
    export PATH=/general/y07/umshared/software/bin:$PATH
    
    # ssh-agent setup
    . ~/.ssh/ssh-setup
    
  1. Copy the ssh-setup script to your ~/.ssh directory:
    $ cp /nerc/n02/n02/ros/software/bin/ssh-setup ~/.ssh
    
  1. Copy the ssh-key you use to access JASMIN to your ~/.ssh directory.
  1. Log out of dtn02 and then log back in again to start up your ssh-agent.
  1. Run ssh-add ~/.ssh/<jasmin_key> where <jasmin_key> is the name of your JASMIN ssh-key, e.g. id_rsa_jasmin. (This is the key you generated when you applied for access to JASMIN.) Type in your passphrase when prompted to do so.
  1. You should now be able to log in to JASMIN without being prompted for a passphrase/password.
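If the JASMIN login still prompts for a passphrase, it can help to check what state the agent on dtn02 is in. The sketch below relies on the exit status of ssh-add -l as documented in the OpenSSH manual (0 on success, 1 if the command fails, e.g. when no keys are loaded, 2 if the agent cannot be contacted). The function name agent_status and the messages it prints are hypothetical.

```shell
# Summarise the state of the ssh-agent visible to the current shell.
agent_status() {
    ssh-add -l >/dev/null 2>&1
    case $? in
        0) echo "agent running, key(s) loaded" ;;
        1) echo "agent running, no keys loaded: run ssh-add" ;;
        *) echo "no agent reachable: log out and back in" ;;
    esac
}
```

Run agent_status on dtn02 after logging back in. Running ssh-add -l directly additionally lists the fingerprints of the loaded keys.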