Configuring pptransfer app and UM suite to push to JASMIN from ARCHER serial nodes

last updated 16 September 2020

The pptransfer task used to run on the ARCHER Data Transfer Node (dtn02), which we no longer have access to. These instructions show how to modify a suite so that the pptransfer task runs on the ARCHER serial nodes and pushes the data to JASMIN from the ARCHER /work disk.

Note 1: These instructions cannot cover every possible suite setup, so you may need to adapt them. For example, tasks may be named slightly differently or inherit differently.

Note 2: In order to use hpxfer1.jasmin.ac.uk you need to have requested access to the High Performance Data Transfer service via the JASMIN accounts portal.

Suite Changes

In the rose suite editor go to "postproc → Post processing - common settings":

  • In panel "Archer Archiving", change archive_root_path to a directory in your $DATADIR on the /work disk (e.g. /work/n02/n02/<username>/archive, where <username> is your ARCHER username). This is a temporary area in which your data is staged before transfer to JASMIN.
  • In panel "JASMIN Transfer":
    • Set gridftp to false
    • Set remote_host to hpxfer1.jasmin.ac.uk
      (If you do not have access to hpxfer1.jasmin.ac.uk use either xfer1.jasmin.ac.uk or xfer2.jasmin.ac.uk instead.)
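
After saving, it can be worth confirming that the edits reached the app configuration on disk. The sketch below assumes the standard Rose layout, in which the postproc app's settings are stored in ~/roses/<SUITEID>/app/postproc/rose-app.conf; that path and the helper name show_transfer_settings are assumptions for illustration. It only prints the three settings changed above and does not judge their values.

```shell
# Print the lines that set the three options edited above.
# show_transfer_settings is an illustrative helper, not part of Rose.
show_transfer_settings() {
    grep -E '^(archive_root_path|gridftp|remote_host)[[:space:]]*=' "$1"
}

# Example (path assumes the standard Rose app layout):
#   show_transfer_settings ~/roses/<SUITEID>/app/postproc/rose-app.conf
```

If a setting is missing from the output, it may be stored in a different app or optional configuration file in your suite.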

In the ~/roses/<SUITEID>/site/archer.rc file edit the section [[PPTRANSFER_RESOURCE]]:

  • Replace the line host = dtn02.rdf.ac.uk with host = login.archer.ac.uk
  • Add the line inherit = HPC_SERIAL directly above the line beginning pre-script =
  • Add these 2 lines:
    [[[job]]]
        execution time limit = PT12H
    

The section should now look something like the following (it may not be identical because of suite differences):

    [[PPTRANSFER_RESOURCE]]
        inherit = HPC_SERIAL
        pre-script = """module load anaconda"""
        [[[remote]]]
            host = login.archer.ac.uk
        [[[job]]]
            execution time limit = PT12H
        [[[environment]]]
            UMDIR = ~um
            PLATFORM = linux
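
As a quick sanity check on the edited file, the following sketch searches archer.rc for the old host and for the three lines added above. The helper name check_pptransfer_resource is hypothetical. It searches the whole file rather than only the [[PPTRANSFER_RESOURCE]] section and expects the exact spacing shown above, so treat a pass as a hint, not as proof that the section is correct.

```shell
# Report whether a suite's archer.rc carries the edits described above.
# check_pptransfer_resource is an illustrative helper name.
check_pptransfer_resource() {
    rc_file="$1"
    rc_status=0
    # The old data transfer node should no longer be mentioned.
    if grep -q 'dtn02' "$rc_file"; then
        echo "still refers to dtn02"
        rc_status=1
    fi
    # Each new line should appear somewhere in the file.
    for wanted in 'host = login.archer.ac.uk' \
                  'inherit = HPC_SERIAL' \
                  'execution time limit = PT12H'; do
        if ! grep -qF "$wanted" "$rc_file"; then
            echo "missing: $wanted"
            rc_status=1
        fi
    done
    return "$rc_status"
}

# Example:
#   check_pptransfer_resource ~/roses/<SUITEID>/site/archer.rc
```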

Setup required on ARCHER

You now need to set up ssh-agent on both ARCHER post-processing nodes (espp1 and espp2) so that you can log in to hpxfer1.jasmin.ac.uk.

  1. Log in to ARCHER (Note: this needs to be from somewhere other than PUMA).
  2. From the ARCHER login node, log in to espp1 (you will be prompted for your ARCHER password only).
  3. Add the following lines to your ~/.bashrc or ~/.profile (or possibly ~/.bash_profile):
    # ssh-agent setup on pp nodes
    if [[ `hostname` = esPP00* ]]
    then
      . ~/.ssh/ssh-setup
    fi

  4. Copy the ssh-setup script into your ~/.ssh directory:
    $ cp /work/y07/y07/umshared/um-training/ssh-setup ~/.ssh

  5. Copy the ssh-key you use to access JASMIN to your ~/.ssh directory (e.g. id_rsa_jasmin).
  6. Add the following to your ~/.ssh/config file (create one if it doesn't already exist):
    Host xfer?.jasmin.ac.uk hpxfer?.jasmin.ac.uk
    User <jasmin_username>
    IdentityFile ~/.ssh/<jasmin_key>
    ForwardAgent no

Here, <jasmin_username> is your JASMIN username and <jasmin_key> is the name of your ssh-key.

Note: in order to use hpxfer1.jasmin.ac.uk you need to have requested access to the High Performance Data Transfer service via the JASMIN accounts portal.

  7. Log out of espp1 and then log back in again to start up your ssh-agent.
  8. Run ssh-add ~/.ssh/<jasmin_key>, where <jasmin_key> is the name of your JASMIN ssh-key, e.g. id_rsa_jasmin (this is the key you generated when you applied for access to JASMIN). Type in your passphrase when prompted.
  9. You should now be able to log in to the required JASMIN transfer node (either xfer[1-2].jasmin.ac.uk or the high performance node hpxfer1.jasmin.ac.uk) without being prompted for a passphrase or password.
  10. Log in to espp2, the other post-processing node, to start up its ssh-agent.
  11. Run ssh-add ~/.ssh/<jasmin_key>, where <jasmin_key> is the name of your JASMIN ssh-key.
  12. Again, you should now be able to log in to the JASMIN transfer node without being prompted for a passphrase or password.
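
The contents of the ssh-setup script are not reproduced on this page. For orientation only, the sketch below shows one common way such a script can reuse a single ssh-agent across logins on a node: it caches the agent's environment in a per-host file and starts a new agent only when none can be contacted. The file name agent-env.<host> and the function names are assumptions, and the real ssh-setup may work differently.

```shell
# Hypothetical sketch of an agent-reuse script; NOT the real ssh-setup.

# File caching the agent's environment, one per host (name is an assumption).
AGENT_ENV="${AGENT_ENV:-$HOME/.ssh/agent-env.$(uname -n)}"

start_agent() {
    # "ssh-agent -s" prints Bourne-shell commands that export
    # SSH_AUTH_SOCK and SSH_AGENT_PID; drop its trailing "echo" line.
    ssh-agent -s | grep -v '^echo' > "$AGENT_ENV"
    chmod 600 "$AGENT_ENV"
    . "$AGENT_ENV"
}

ensure_agent() {
    # Pick up the settings of a previously started agent, if any.
    if [ -f "$AGENT_ENV" ]; then
        . "$AGENT_ENV"
    fi
    # ssh-add exits with status 2 when it cannot contact an agent.
    agent_status=0
    ssh-add -l > /dev/null 2>&1 || agent_status=$?
    if [ "$agent_status" -eq 2 ]; then
        start_agent
    fi
}

# A script sourced at login would then call:
#   ensure_agent
```

With this pattern the key only has to be added with ssh-add once per agent, which is why the steps above are repeated on each post-processing node.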

Updating a Running Suite

  1. Reload the suite: rose suite-run --reload
  2. Hold the whole suite, or just the next pptransfer task
  3. In the Cylc GUI: Control → Insert Task(s)…
  4. Set TASK-NAME.CYCLE-POINT=fcm_make_pptransfer.<YYYYMMDDT0000Z>, where <YYYYMMDDT0000Z> is an active cycle point
  5. Leave stop-point=POINT blank
  6. Check the "Do not check if a cycle point is valid or not" box
  7. Insert, and wait for the task to complete.
  8. If nothing happens, you probably mistyped the task name or cycle point; check them and try again.
  9. In the Cylc GUI: Control → Insert Task(s)…
  10. Set TASK-NAME.CYCLE-POINT=fcm_make2_pptransfer.<YYYYMMDDT0000Z>, where <YYYYMMDDT0000Z> is an active cycle point
  11. Leave stop-point=POINT blank
  12. Check the "Do not check if a cycle point is valid or not" box
  13. Insert, and wait for the task to complete.
  14. Release the held suite/pptransfer task
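
For those who prefer the command line, the steps above can be sketched roughly as follows. This assumes Cylc 7, where cylc insert accepts a --no-check option corresponding to the "Do not check if a cycle point is valid or not" box; check cylc insert --help on your installation before relying on it. The helper name print_reinstall_commands is hypothetical, and it only prints the commands so they can be reviewed first.

```shell
# Print the command-line equivalents of the GUI steps above for review.
# Nothing is executed; print_reinstall_commands is an illustrative name.
print_reinstall_commands() {
    suite="$1"   # suite name, i.e. the <SUITEID>
    cycle="$2"   # an active cycle point, e.g. 20200101T0000Z
    echo "rose suite-run --reload"
    echo "cylc hold $suite"
    # Insert the two build tasks in order.
    for task in fcm_make_pptransfer fcm_make2_pptransfer; do
        echo "cylc insert --no-check $suite $task.$cycle"
    done
    echo "cylc release $suite"
}

# Example:
#   print_reinstall_commands <SUITEID> <YYYYMMDDT0000Z>
```

As in the GUI procedure, wait for fcm_make_pptransfer to complete before inserting fcm_make2_pptransfer, and release the suite only after both have finished.
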
Last modified on 16/09/20 14:11:02