Version 25 (modified by pmcguire, 7 months ago)

changed host in suite

Configuring pptransfer app and UM suite to push to JASMIN from ARCHER serial nodes

The pptransfer task used to run on the ARCHER Data Transfer Node (dtn02), which we no longer have access to. These instructions show how to modify a suite to run the pptransfer task on the ARCHER serial nodes and push the data to JASMIN from the ARCHER /work disk.

Note: These instructions cannot cover every possible suite setup, so you may need to adapt them. For example, tasks may be named slightly differently or inherit differently.

Suite Changes

In the rose suite editor go to "postproc → Post processing - common settings":

  • In panel "Archer Archiving" change archive_root_path to be a directory in your $DATADIR on /work disk (e.g. /work/n02/n02/<username>/archive where <username> is your ARCHER username). This will be a temporary area to stage your data before transfer to JASMIN.
  • In panel "JASMIN Transfer":
    • Set gridftp to false
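
After saving the changes in the editor, the two settings can be checked from the command line. The following is a minimal sketch, not part of the original instructions: the function name show_postproc_settings is hypothetical, and it assumes the settings are saved as key=value lines in the suite's app/postproc/rose-app.conf file.

```shell
# Print the archive_root_path and gridftp lines from a postproc rose-app.conf.
# Assumption: each setting is stored as "key=value" at the start of a line.
# Usage (path is an assumption): show_postproc_settings ~/roses/<SUITEID>/app/postproc/rose-app.conf
show_postproc_settings() {
    grep -E '^(archive_root_path|gridftp)=' "$1"
}
```

If the function prints nothing, the settings are probably stored under different names or in a different file in your suite.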

In the ~/roses/<SUITEID>/site/archer.rc file edit the section [[PPTRANSFER_RESOURCE]]:

  • Replace the line host = dtn02.rdf.ac.uk with host = login.archer.ac.uk
  • Add the line inherit = HPC_SERIAL directly above the line beginning pre-script =
  • Add these 2 lines:
    [[[job]]]
        execution time limit = PT12H
    

The section should now look something like the following (remember that it may not be identical due to suite differences):

    [[PPTRANSFER_RESOURCE]]
        inherit = HPC_SERIAL
        pre-script = """module load anaconda"""
        [[[remote]]]
            host = login.archer.ac.uk
        [[[job]]]
            execution time limit = PT12H
        [[[environment]]]
            UMDIR = ~um
            PLATFORM = linux
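
To compare your edited file against the example above, it can help to print just that section. The sketch below is an illustration only: show_pptransfer_resource is a hypothetical helper, and it assumes the section header sits alone on its line and that the section ends at the next double-bracket [[...]] header.

```shell
# Print the [[PPTRANSFER_RESOURCE]] section of a suite .rc file.
# Assumption: the section runs until the next [[...]] header; nested
# [[[...]]] sub-sections (remote, job, environment) are kept.
# Usage: show_pptransfer_resource ~/roses/<SUITEID>/site/archer.rc
show_pptransfer_resource() {
    awk '
        /^[[:space:]]*\[\[PPTRANSFER_RESOURCE\]\][[:space:]]*$/ { in_section = 1; print; next }
        in_section && /^[[:space:]]*\[\[[^[]/ { exit }
        in_section { print }
    ' "$1"
}
```

The output should contain the inherit = HPC_SERIAL line, the new host, and the execution time limit.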

Setup required on ARCHER

You now need to set up ssh-agent on both ARCHER post-processing nodes (espp1 & espp2) so that you can log in to xfer2.jasmin.ac.uk.

  1. Log in to ARCHER (Note: this needs to be from somewhere other than PUMA)
  1. From the ARCHER login node, log in to espp1 (you will be prompted for your ARCHER password only)
  1. Add the following lines to your ~/.bashrc or ~/.profile (this may instead be ~/.bash_profile):
    # ssh-agent setup on pp nodes
    if [[ `hostname` = esPP00* ]]
    then
      . ~/.ssh/ssh-setup
    fi
    
  1. Copy the ssh-setup script into your ~/.ssh directory:
    $ cp /work/y07/y07/umshared/um-training/ssh-setup ~/.ssh
    
  1. Copy the ssh-key you use to access JASMIN to your ~/.ssh directory (e.g. id_rsa_jasmin)
  1. Add the following to your ~/.ssh/config file (create one if it doesn't already exist):
    Host xfer?.jasmin.ac.uk
    User <jasmin_username>
    IdentityFile ~/.ssh/<jasmin_key>
    ForwardAgent no
    

Where <jasmin_username> is your JASMIN username and <jasmin_key> is the name of your ssh-key.

Note: in order to use xfer2.jasmin.ac.uk you need to have requested access to the High Performance Data Transfer service via the JASMIN accounts portal.

  1. Log out of espp1 and then log back in again to start up your ssh-agent.
  1. Run ssh-add ~/.ssh/<jasmin_key> where <jasmin_key> is the name of your JASMIN ssh-key (e.g. id_rsa_jasmin). This is the key you generated when you applied for access to JASMIN. Type in your passphrase when prompted.
  1. You should now be able to log in to the required JASMIN transfer node (either xfer1.jasmin.ac.uk or the high performance node xfer2.jasmin.ac.uk) without being prompted for a passphrase or password.
  1. Log in to espp2 to start up ssh-agent on the other post-processing node.
  1. Run ssh-add ~/.ssh/<jasmin_key> where <jasmin_key> is the name of your JASMIN ssh-key
  1. Again, you should now be able to log in to the JASMIN transfer node without being prompted for a passphrase or password.
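
The passwordless login can also be tested non-interactively, which is closer to how the pptransfer task will connect. The following is a sketch rather than part of the original setup: check_jasmin_login is a hypothetical helper name. It relies on the standard OpenSSH BatchMode option, which makes ssh fail instead of prompting for a passphrase or password.

```shell
# Try a non-interactive login to a JASMIN transfer node.
# $1 = host to test (defaults to xfer2.jasmin.ac.uk).
# BatchMode=yes disables prompts, so a missing agent key shows up as a failure.
check_jasmin_login() {
    jasmin_host="${1:-xfer2.jasmin.ac.uk}"
    if ssh -o BatchMode=yes -o ConnectTimeout=10 "$jasmin_host" true; then
        echo "OK: passwordless login to $jasmin_host works"
    else
        echo "FAILED: could not log in to $jasmin_host without a prompt" >&2
        return 1
    fi
}
```

Run it on both espp1 and espp2. If it fails, ssh-add -l shows which keys the agent currently holds.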

Updating a Running Suite

  1. Due to recent changes in hostnames, make sure that the file app/postproc/rose-app.conf has remote_host=xfer2.jasmin.ac.uk instead of remote_host=jasmin-xfer2.ceda.ac.uk
  2. Reload the suite: rose suite-run --reload
  3. Hold the whole suite, or just the next pptransfer task
  4. In the Cylc GUI: Control → Insert Task(s)…
  5. Set TASK-NAME.CYCLE-POINT=fcm_make_pptransfer.<YYYYMMDDT0000Z>, where <YYYYMMDDT0000Z> is an active cycle point
  6. Leave stop-point=POINT blank
  7. Check the "Do not check if a cycle point is valid or not" box
  8. Insert, and wait for the task to complete.
  9. If nothing happens, check the task name and cycle point for typos and try again.
  10. In the Cylc GUI: Control → Insert Task(s)…
  11. Set TASK-NAME.CYCLE-POINT=fcm_make2_pptransfer.<YYYYMMDDT0000Z>, where <YYYYMMDDT0000Z> is an active cycle point
  12. Leave stop-point=POINT blank
  13. Check the "Do not check if a cycle point is valid or not" box
  14. Insert, and wait for the task to complete.
  15. Release the held suite/pptransfer task
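
The hold, insert and release steps can probably also be done from the command line. The sketch below only prints the commands so that you can review them first. It assumes the Cylc 7 command-line interface, where cylc insert accepts a --no-check option corresponding to the GUI check box; confirm this with cylc insert --help on your system. The function name and the example suite name and cycle point are hypothetical.

```shell
# Print (but do not run) Cylc commands mirroring the GUI steps above.
# $1 = suite name, $2 = an active cycle point, e.g. 20200101T0000Z.
# Assumption: Cylc 7 syntax "cylc insert [--no-check] SUITE TASK.CYCLE".
print_pptransfer_rebuild_cmds() {
    suite_name="$1"
    cycle_point="$2"
    echo "cylc hold $suite_name"
    echo "cylc insert --no-check $suite_name fcm_make_pptransfer.$cycle_point"
    echo "# wait for fcm_make_pptransfer to complete before continuing"
    echo "cylc insert --no-check $suite_name fcm_make2_pptransfer.$cycle_point"
    echo "# wait for fcm_make2_pptransfer to complete before continuing"
    echo "cylc release $suite_name"
}

# Example: print_pptransfer_rebuild_cmds u-ab123 20200101T0000Z
```

Printing rather than executing keeps the waiting steps under your control, since the second task should only be inserted once the first has finished.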