Tips for NASA Pleiades#

This section provides general tips for setting up a NASA NAS account and running GCHP on Pleiades.

Account setup#

NASA provides a detailed account setup walkthrough at NASA Account Setup:

  • The differences among the LDAP and Launchpad passwords, the PIN, and the passcode:

    • The LDAP password is used for logging in to the sfe and pfe/lfe nodes

    • The Launchpad password is used for logging in through NASA's Launchpad authentication service

    • The PIN is the personal identification number you set for RSA SecurID

    • The passcode is the one-time password generated by RSA SecurID

  • Setting up a public key and SSH passthrough makes subsequent logins easier:

    • Instructions: NASA SSH Passthrough

    • Setting up SSH passthrough requires a Linux-style terminal; Windows users may need a terminal emulator such as Cygwin

    • Tip: keep the Cygwin installer for future package installations, such as vim (Cygwin does not install vim by default)

    • Compute1 may lose the SSH passthrough setup to NASA after logging out and back in. It can be restored automatically by adding the following to .bash_profile:

      # add for nasa
      eval `ssh-agent -s`
      ssh-add ~/.ssh/id_rsa
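A minimal sketch of what the passthrough configuration in ~/.ssh/config can look like (the host names and the <nas_username> placeholder here are assumptions; follow the NASA SSH Passthrough instructions for the authoritative setup):

```
# ~/.ssh/config (illustrative). Replace <nas_username> with your NAS username
# and sfe[N] with an actual secure front-end host per the NAS instructions.
Host sfe
    HostName sfe[N].nas.nasa.gov
    User <nas_username>

# pfe and lfe are reached by hopping through the secure front end
Host pfe lfe
    ProxyJump sfe
    User <nas_username>
```

With this in place, `ssh pfe` tunnels through the front end in one step.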
  • Differences between sfe, pfe, and lfe

    • sfe is used only for logging in to the NASA NAS system

    • pfe is usually where you land, and where GCHP is compiled and jobs are submitted

    • lfe is usually where large datasets are stored, such as restart files and outputs from GCHP simulations


The /nobackup filesystem is also mounted on lfe, so GCHP jobs can be submitted from lfe as well.

Shiftc data transfer tool#

  • Instructions for local transfer (within NASA NAS system): shiftc local transfer

  • Instructions for installing shiftc on other clusters (e.g. Compute1): shiftc remote transfer

    • Add sup to your $PATH. For example, if sup is located at $HOME/bin/sup, add export PATH=$PATH:$HOME/bin to .bash_profile and lsf-conf.rc

    • sup shiftc credentials expire every 604800 s (one week). You can check their status with, for example, sup shiftc --status --state=run on the Compute1 home node

    • Transfers outside the NASA system must be initiated from the remote cluster, i.e., run sup shiftc on the remote cluster to transfer files to or from the NASA system
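As a sketch of an interactive remote transfer (the source and destination paths are hypothetical examples), the commands run on the remote cluster look like:

```
# On the remote cluster (e.g. Compute1): pull files from pfe's /nobackup
sup shiftc pfe:/nobackup/<user>/OutputDir/*.nc4 /my-projects/<user>/

# Check the state of currently running transfers
sup shiftc --status --state=run
```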

  • Transferring between Compute1 and NASA with batch jobs

    1. shiftc must also be installed on the Compute1 home node for batch jobs

    2. A Docker container is also available (requested through the #BSUB -a 'docker(' directive shown in the example below)

    3. Add tail -f /dev/null to batch transfer scripts on Compute1 to keep the job alive, avoiding a lost connection to the cluster while the transfer runs.

      Then manually kill the Compute1 job when the transfer has finished.

      An example:

      #BSUB -n 1
      #BSUB -R "rusage[mem=50G] span[ptile=1] select[mem < 500GB] order[-slots]"
      #BSUB -q rvmartin
      #BSUB -a 'docker('
      #BSUB -N
      #BSUB -u <your_wustl_key>
      #BSUB -o transfer-%J.txt
      #BSUB -J "transfer"
      cd /my-projects
      sup shiftc pfe:/nobackup/dzhang8/GEOSChem.ACAG.20180101*.nc4 .
      tail -f /dev/null


Transferring data (restarts and outputs from GCHP) from pfe to lfe helps reduce the amount of storage needed on pfe.
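A hedged sketch of such a local transfer, run on pfe (the lfe: host prefix and both paths are assumptions; see the shiftc local transfer instructions for the exact syntax):

```
# On pfe: archive GCHP restarts/outputs to Lou storage (paths illustrative)
shiftc /nobackup/<user>/OutputDir/GEOSChem.*.nc4 lfe:/u/<user>/gchp-archive/
```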

Running GCHP on Pleiades#

  • GCHP environment: source the environment script with source /u/dzhang8/gchp-intel.202202.env before compiling or running GCHP

  • An example run script can be found at /u/dzhang8/run.pbs

  • The NASA Pleiades system uses PBS for job scheduling. Commonly used PBS commands can be found at PBS Commands
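For quick reference, the most frequently used PBS commands look like this (the script name and job ID are examples):

```
qsub run.pbs      # submit a job script
qstat -u $USER    # list your queued and running jobs
qdel 1234567      # delete a job by its ID
```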

  • Real-time usage of the different clusters on NASA Pleiades can be monitored at NASA System Status (note that the page takes several minutes to update)

  • Model inputs /ExtData

    • There is no shared /ExtData directory like the one on Compute1, but some required inputs have already been downloaded:

      Sebastian has downloaded multiple required inputs at /nobackup/seastham/ExtData/

      Dandan has downloaded required inputs for simulations in 2018 and 2019 at /nobackup/dzhang8/ExtData/

Processing outputs on Pleiades#

  • Dedicated data analysis nodes: the Lou Data Analysis Nodes (LDAN) can be used for postprocessing data (e.g. GCHP diagnostics)

  • Python environment: source the environment script by source /u/dzhang8/python-gchp.env

  • Bring data to disk before processing it on lfe, to avoid unpredictable stalls in I/O processes (see bring data to disk)
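Assuming Lou's tape archive uses the DMF utilities described in the linked NAS page (the path below is a hypothetical example), staging files back to disk can look like:

```
# On an lfe/LDAN node: stage tape-migrated files back to disk before reading
dmget /u/<user>/gchp-archive/GEOSChem.*.nc4 &

# dmls -l shows each file's state, e.g. (REG) on disk vs. (OFL) offline on tape
dmls -l /u/<user>/gchp-archive/
```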


To save space on Pleiades, first check whether the inputs you need are already available, and download only those that are missing.
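That availability check can be sketched as a small shell helper (the directory layout and file names here are hypothetical stand-ins for an ExtData mirror):

```shell
#!/usr/bin/env bash
# Sketch: print which required input files are missing from a local ExtData
# mirror, so only those need to be downloaded. Paths are illustrative.
missing_inputs() {
    local extdata="$1"; shift
    local f
    for f in "$@"; do
        # Print only the files that do not exist under the mirror
        [ -e "$extdata/$f" ] || echo "$f"
    done
}

# Demo against a scratch directory standing in for /nobackup/<user>/ExtData
demo=$(mktemp -d)
mkdir -p "$demo/HEMCO"
touch "$demo/HEMCO/already_here.nc"
missing_inputs "$demo" HEMCO/already_here.nc HEMCO/still_needed.nc
```

Feeding the printed list to a download script avoids re-fetching inputs that Sebastian or Dandan have already placed under /nobackup.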