Running Applications

Python

First, copy your Python script to your Aeolus home directory, and then ssh into Aeolus:

scp [localfile].py [username]@aeolus.wsu.edu:~
ssh [username]@aeolus.wsu.edu

Next, create a job submission script. At the bottom of it, load the necessary modules (gcc/7.3.0 is needed for this version of Python) and add a line to run your script.

Example job script:

# ...
# qsub stuff. See job submission docs.
# ...

module load gcc/7.3.0
module load python/3.7.1/gcc/7.3.0
python3 ~/[filename].py

Example [filename].py:

tacos = 12

for i in range(tacos):
    print('{} taco'.format(i))

Note: if you’re using Python 2.7, you’ll need to run your script with python2.7 explicitly. If in doubt, check ls /share/el6.x86_64/opt/[module]/bin for the name of the binary, and then run which [binary name] to confirm that it points to that directory. For example, ls /share/el6.x86_64/opt/python/2.7.15/gcc/7.3.0/bin and then which python2.7.
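To confirm from inside a job which interpreter actually ran your script, a short script can report it. This is a minimal sketch that should work on both Python 2.7 and Python 3; the file name check_python.py and the function name describe_interpreter are illustrative and not part of the Aeolus setup.

```python
# check_python.py (hypothetical file name): report which interpreter ran this script.
import sys


def describe_interpreter():
    """Return the path and version of the running Python interpreter."""
    return {
        "executable": sys.executable,
        "version": "{}.{}.{}".format(*sys.version_info[:3]),
    }


if __name__ == "__main__":
    info = describe_interpreter()
    print("Interpreter: {}".format(info["executable"]))
    print("Version:     {}".format(info["version"]))
```

Running it with the same command your job script uses (for example python3 or python2.7) shows whether the loaded module's binary is the one being picked up.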

Installing Python Modules

Python modules only need to be installed once. If you follow these instructions, the Python modules you install will live under your ~/.local/lib directory and remain available in future sessions.

Python modules can be installed with pip2.7 or pip3, depending on your target version. If in doubt, check ls /share/el6.x86_64/opt/[module]/bin for the name of the binary, and then run which [binary name] to confirm that it points to that directory.

Per General Rules of Using Aeolus, software should not be compiled on an ssh/login node. Since some Python modules compile during install, you’ll need to start an interactive session on a compute node to install your modules. Check your job script to make sure that you will be requesting enough resources and time for the necessary modules to compile.

Start an interactive session by adding the -I flag to your qsub call:

qsub -I [job script].sh

This will start an interactive session, which you can confirm by checking your shell prompt: [user@compute-x-x-xx ~]$

Since you are in an interactive session, the modules you specified at the bottom of your job script will not have loaded, so you will need to load them now by running module load gcc/7.3.0; module load python/3.7.1/gcc/7.3.0 (or applicable Python version).

You can now install the modules on the compute node. Make sure to add the --user flag. This installs the modules under ~/.local/lib so you can use them in future sessions without re-installing.

pip3 install --user twisted

You can now exit the compute node and then run your Python job normally: qsub [jobscript].sh. Modules imported within your script will be loaded from the ones you installed under the ~/.local/lib directory.
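To check where --user installs end up for a given interpreter, Python's standard site module can report the per-user site-packages directory. This is a minimal sketch; the function name user_site_info is illustrative.

```python
# Show where `pip install --user` places packages for this interpreter.
import site
import sys


def user_site_info():
    """Return the per-user site-packages path and whether it is on sys.path."""
    path = site.getusersitepackages()
    return path, path in sys.path


if __name__ == "__main__":
    path, active = user_site_info()
    print("User site-packages: {}".format(path))
    print("On sys.path: {}".format(active))
```

On a Linux system the path is normally under ~/.local/lib/python[version]/site-packages. It only appears on sys.path once the directory exists, so "On sys.path: False" before your first --user install is expected.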

Jupyter Notebooks and Jupyter Lab

It is very easy to run a Jupyter Notebook or Lab on Aeolus and access it from your own computer. First SSH into Aeolus:

ssh [your user name]@aeolus.wsu.edu

Once you have logged in, you will need to load your desired version of Python:

module load python/[python version]/gcc/[gcc version]

Once you have done this, you will have access to a tool called newjupyter, which helps you create a new Jupyter Lab or Notebook. Run newjupyter --help to see the available options and their defaults. In this case, I will use the defaults and send the job status to my email:

$ newjupyter [my email address]@wsu.edu

# Output:
Waiting for notebook to start...
Waiting for notebook to start...

Job running. Job ID: xxxxxx
Check job status: qstat xxxxxx
End job: qdel xxxxxx

Your Jupyter notebook is ready.
You can view Jupyter's logs in ~/.jupyter/logs

To use, open a new terminal on the computer you want to access your Jupyter Notebook from (not Aeolus) and run:
ssh -NL [port]:[ip address]:[port] [user]@aeolus.wsu.edu

Then, paste the following in your browser:
http://localhost:[port]?token=[the token]

In this example, specific job IDs, port numbers, IPs, and tokens have been removed from the output, but they will be there when you run the command. Simply follow the directions, copy-pasting the ssh -NL line into a terminal on the system you want to access Jupyter from (not Aeolus, but your own shell), and then copying the http://localhost line and pasting it in your browser on that same system.
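If the browser cannot reach the notebook, it can help to check whether the SSH tunnel is actually listening on your own machine. The following is a minimal sketch using only the Python standard library; the function name port_is_open is illustrative, and the port you pass in is whatever [port] value newjupyter printed.

```python
# Check whether something is listening on a local port (e.g. the SSH tunnel).
import socket


def port_is_open(port, host="localhost", timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except (OSError, socket.error):
        # Connection refused, timed out, or host not resolvable.
        return False
    sock.close()
    return True
```

Run this on your own computer (not Aeolus) while the ssh -NL command is still running. A result of False suggests the tunnel is not up, or that a different port was used.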

Jupyter Labs

The above example created a Jupyter Notebook, but you can have it create a Jupyter Lab instead by simply specifying the --lab flag:

newjupyter --lab [my email address]@wsu.edu

Here are some example commands:

# Create a jupyter notebook which runs for
# 30 minutes with 1 processor and 1gb of ram
# and sends the job status to my-email
newjupyter my-email@wsu.edu

# Create a jupyter lab which runs for
# 1 hour with 1 processor, 512 mb of ram,
# and has jobname "j-job-3"
newjupyter --lab -n j-job-3 -t 01:00:00 -m 512mb my-email@wsu.edu

All of these commands will output instructions to access your newly created lab from a browser.

R scripts

First, copy your R script to your Aeolus home directory, and then ssh into Aeolus:

scp [localfile].R [username]@aeolus.wsu.edu:~
ssh [username]@aeolus.wsu.edu

Next, create a job submission script. At the bottom of it, load the necessary modules (gcc/7.3.0 is needed for this version of R) and add a line to run your script.

# ...
# qsub stuff. See job submission docs.
# ...

module load gcc/7.3.0
module load r/3.5.1/gcc/7.3.0
Rscript ~/[filename].R

Example [filename].R:

# Find factorial of num
num = 8
factorial = 1
for(i in 1:num) {
    factorial = factorial * i
}
print(paste("The factorial of", num, "is", factorial))

To run the script on the cluster, queue your job: qsub [jobscript].sh

Installing CRAN packages

CRAN packages only need to be installed once. After that, library(packagename) in your R script will load the package.

To install CRAN packages, you’ll need to create a place for them to live locally in your Aeolus user’s home directory (ssh into your Aeolus user first):

mkdir -p ~/.local/lib/R3.5.1

Per General Rules of Using Aeolus, software should not be compiled on an ssh/login node. Since many R packages compile during install, you’ll need to start an interactive session on a compute node to install your packages. Check your job script to make sure that you will be requesting enough resources and time for the necessary modules to compile.

Start an interactive session by adding the -I flag to your qsub call:

qsub -I [job script].sh

This will start an interactive session, which you can confirm by checking your shell prompt: [user@compute-x-x-xx ~]$

Since you are in an interactive session, the modules you specified at the bottom of your job script will not have loaded, so you will need to load them now by running module load gcc/7.3.0; module load r/3.5.1/gcc/7.3.0. Next, start the R interpreter (simply run R) and install your packages, specifying the local directory you created as the lib:

R> install.packages(c("lattice", "abctools"), lib="~/.local/lib/R3.5.1", repos="http://mirrors.vcea.wsu.edu/r-cran/")

Note: In the line above, repos= specifies the CRAN mirror hosted at mirrors.vcea.wsu.edu. If you’d like to select a different mirror, simply remove that parameter and R will prompt you to select one from a list.

When the packages have finished installing, leave the R interpreter (Ctrl-d) and close the interactive session (exit). This will return you to the Aeolus login node.

Finally, you’ll need to create a file to tell R where your packages live. Create a file called .Renviron in your home directory, and specify your library directory:

R_LIBS_USER=~/.local/lib/R3.5.1

You will now be able to run your job script normally, importing your desired packages within your script using library(name).

Golang

Getting set up

To set up your Go workspace on Aeolus, you will first need to load the module:

module load go/1.11.5

This will set your GOPATH environment variable to /home/$USER/go, which is the recommended location for your Go workspace. If you want a different location, you can override it (export GOPATH=/path/to/workspace), but you will need to do this every time you load the module, since loading the module resets GOPATH to ~/go.

Next, create the folder structure for your workspace:

mkdir -p ~/go/src/[organization]/[unique name]

Your organization and unique name are up to you. Many people use github.com and their github username respectively, especially if the project will be hosted on GitHub. You could also use wsu.edu and your wsu ID. Whatever it is, you should make it unique so that people can potentially import your projects in the future.

All the personal projects you work on should live in $GOPATH/src/[organization]/[unique name]/. For instance, if you wanted to build an FTP server in Go, and it was checked out in your personal GitHub, it might live here: ~/go/src/github.com/my-git-user/my-go-ftp. The my-go-ftp folder will be the project root, and will contain .go source files and sub-folders. When you want to start working on a new project, you should create a new folder in ~/go/src/github.com/my-git-user/ or whatever your chosen personal directory is.

Similarly, when importing other people’s code, you should put it in $GOPATH/src/[organization]/[their personal identifier]/[the project]. go get will do this for you. For example, go get github.com/goftp/server will install this project to $GOPATH/src/github.com/goftp/server. But in the case that you need to install external code yourself, simply mkdir -p $GOPATH/src/[their organization]/[their identifier] and put the project in that folder.

In addition to setting your GOPATH, the module loader also adds $GOPATH/bin to your PATH so that after running go install, you can run the binaries anywhere. However, if you would like to access these binaries every time you log in to Aeolus without first loading Go, you’ll need to modify your .bash_profile:

export PATH=$PATH:$HOME/go/bin

Go CLI tool

The Go CLI tool has several useful functions:

$ go get [repository]/[user]/[project]
# Gets the specified code and puts it in
# $GOPATH/src. Can be called from anywhere
# because it depends on the GOPATH not the
# directory it was called from.

$ go build github.com/my-git-user/my-fun-project
# Builds my-fun-project in $GOPATH/src/github.com/my-git-user/.
# Binaries and libraries that are built
# will not yet be installed. This command can
# be run from anywhere, but can be shortened
# to `go build` if you are in the project folder.

$ go install github.com/my-git-user/my-fun-project
# Builds the project if necessary, or simply
# installs the libraries and binaries. Binaries
# will be installed to $GOPATH/bin, and libraries
# will be installed to:
# $GOPATH/pkg/linux_amd64/[org]/[repo]/[package].
# This can be called anywhere, but can be shortened
# to `go install` if you are in the project folder.

Note: Remember that per the General Rules of Using Aeolus, all compiling should be done in a submitted job rather than on your login node. A recommended way to use Go would be to request an interactive session (qsub -I jobscript.sh), install and compile your code, and exit the session. Then, to run your compiled binary as part of a job, simply add $GOPATH/bin to your PATH (module load go/1.11.5 will do this for you) and execute the binary. For example, put this at the end of a job script:

# jobscript.sh

#PBS stuff

module load go/1.11.5
mygobinary --option1 --logfile /fastscratch/$USER/my-go-log

And queue the job: qsub jobscript.sh.

Singularity (Docker)

Singularity is the HPC community’s answer to Docker. It was built to solve the problem of allowing non-trusted users to run containers. Singularity can run most Docker containers easily, and there are many Singularity-specific containers in the Singularity Library. For general information about Singularity, see the Singularity docs.

Setup

Per General Rules of Using Aeolus, computation and compilation should be done on a compute node. You should create a job submission script to request the proper resources to install and run your containers.

Run your job interactively: qsub -I [job script].sh. This will give you a shell on the compute node with the resources you requested in your job script.

You will need to load go and singularity by running the following on the compute node shell:

module load go/1.11.5
module load singularity/3.0.3/go/1.11.5

You can now pull, build, and run images. You may want to make a directory in your home directory to store images that you pull.

Pulling images

You can pull images from several sources. To find Singularity images, you can use singularity search [term]. For instance:

$ singularity search ubun
No users found for 'ubun'

No collections found for 'ubun'

Found 10 containers for 'ubun'
        library://jialipassion/official/ubuntu
                Tags: latest
        library://dtrudg/linux/ubuntu
                Tags: 14.04 16.04 17.10 18.04 artful bionic devel latest rolling trusty xenial
        library://sylabs-adam/linux/ubuntu
                Tags: latest
        library://sylabs-jms/testing/ubuntu-armhf.sif
                Tags: latest
        library://library/default/ubuntu
                Tags: 14.04 16.04 18.04 18.10 latest
        library://jialigithub/default/ubuntu
                Tags: 18.05
        library://sylabs-andre/default/ubuntu
                Tags:
        library://mroche/baseline/ubuntu
                Tags: 16.04 18.04 bionic latest xenial
        library://ynop/default/siubuntu
                Tags:
        library://westleyk/official/ubuntu
                Tags: 16.04 18.04 latest

You can then pull images like:

singularity pull library://dtrudg/linux/ubuntu:trusty
# OR, for the default source:
singularity pull library://ubuntu:trusty
# OR, to pull a Docker image:
singularity pull docker://ubuntu:trusty
# OR, to build an image and name it:
singularity build ubuntu.sif library://ubuntu:trusty

Note: you will have to search Docker Hub independently from a browser to find images, as singularity search does not search Docker Hub.

All of these commands will pull an image and build a .sif file, Singularity’s equivalent of a .img. Saving these to a folder in your home directory (e.g. /home/$USER/singularity) will allow you to run them in the future without re-pulling.

Running images

You can run an image directly by doing:

singularity run [image name].sif
# OR
singularity shell [image name].sif

Note: if you run an image, the image’s default command will be run (e.g. start a web server) and you will be given a shell, but if you shell an image you will only be given a shell and the default command will not be run.

To confirm that you have entered the container’s shell, check whether the shell prompt has changed, or try typing which singularity (it should return nothing inside the container). To exit the shell and end the container, type exit.

Alternatively, if you simply want to run a command within the container without entering a shell, you can use the exec command:

# singularity exec [image name].sif [bash commands]
$ singularity exec ubuntu.sif cat /etc/os-release
NAME="Ubuntu"
...

When using exec, the lifetime of the container is the lifetime of the command you pass it, so when the command completes (in this case, cat /etc/os-release), the container ends.

Writing to images

Appending --writable to exec, shell, or run commands will make your changes persist in the image. You can use this if you want to install pip modules or any software with yum.

Building custom images

Building a .sif image from a .def file requires root privileges in Singularity, which are not available on Aeolus. However, you can create a .def file on an external machine and then host its image in the Singularity library. You will then be able to pull this custom image on Aeolus.

For in-depth information about building your own Singularity images, see their documentation on it.

Using Singularity with qsub

Once you have pulled images on a compute node, you can run images without an interactive session. For instance, if you wanted to run an iPerf server on a compute node, you could pull an iPerf3 image to your singularity image folder (singularity pull docker://iperf3) and then add the following to the bottom of your job script:

# ...
#PBS stuff, see job submission docs
# ...

module load go/1.11.5
module load singularity/3.0.3/go/1.11.5

singularity run image_folder/iperf3_latest.sif -s

You can then submit your job with qsub [job script name].sh, and an iPerf server will be started on the compute node.

Limitations

NOTE: Due to the kernel version of the HPC cluster (2.6.32), Singularity is unable to run some recent Linux distributions. If this is the case, you will get this:

$ singularity run ubuntu_latest.sif
FATAL: kernel too old

Solution: Try using an older distribution.

Example

Here is an example usage of Singularity. In this example, we’ll be running a simple python script in Alpine Linux on Aeolus.

First, load the Python script you want to run onto your Aeolus home directory. Here is the script I will use; it calculates prime numbers up to the one specified as a command line argument:

# Usage: python prime.py [limit]
import sys
max_range = int(sys.argv[1])

primes = []

for i in range(2, max_range + 1):
    isPrime = True
    for num in range(2, int(i ** 0.5) + 1):
        if i % num == 0:
            isPrime = False
            break

    if isPrime:
        primes.append(i)

print('\n'.join(map(str, primes)))

Once the Python script is saved in your Aeolus home directory as prime.py, you must create a job script to allocate resources for running your script on a compute node. For more information about this, see the job submission docs. Here is our job script, saved as torquescript.sh:

#!/bin/bash

#PBS -V

#PBS -N my_serial_01

#PBS -l nodes=1:dev:ppn=2
#PBS -l mem=2048mb
#PBS -l walltime=00:05:00
#PBS -q batch

## Define path for output & error logs
#PBS -k o

#PBS -e /fastscratch/[your user name]/my_serial_01.e
#PBS -o /fastscratch/[your user name]/my_serial_01.o

## Define path for reporting
#PBS -M [your email address]@wsu.edu
#PBS -m abe

This will create a new job called my_serial_01 with 1 node, 2 processors per node (PPN), and 2048mb of RAM to work with. Additionally, it allocates 5 mins of wall time. Make sure to replace [your user name] with your user name, and [your email address] with the email you want to receive alerts about job status.
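When a job script grows, it can be handy to list just its #PBS directives and review what is being requested. The following is a minimal sketch of such a reading aid; the function name pbs_directives is illustrative, and Torque itself remains the authority on how the directives are interpreted.

```python
# List the #PBS directives in a job script so the requested resources can be reviewed.
def pbs_directives(script_text):
    """Return the option string of every line that starts with '#PBS'.

    Ordinary comments (such as lines starting with '##') are skipped.
    """
    directives = []
    for line in script_text.splitlines():
        line = line.strip()
        if line.startswith("#PBS"):
            directives.append(line[len("#PBS"):].strip())
    return directives


# A shortened copy of the job script above, used as sample input.
example = """#!/bin/bash
#PBS -N my_serial_01
## Define path for output & error logs
#PBS -l nodes=1:dev:ppn=2
#PBS -l mem=2048mb
#PBS -l walltime=00:05:00
"""

for directive in pbs_directives(example):
    print(directive)
```

To inspect a real script, read the file's contents (for example torquescript.sh) and pass the text to pbs_directives in place of the sample string.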

Before running our script for the first time, we want to pull and build an image on a compute node that will live in our Aeolus home directory. Before entering the compute node, create a new directory called “containers” (run mkdir containers). Then, enter a compute node by running your job script in interactive mode:

qsub -I torquescript.sh

This will give you a shell to the compute node once your job is processed, with the resources you requested. Type ls to make sure you see your containers folder, and then cd containers.

Now, you will want to pull your Alpine Linux + Python Docker image to run your Python script. By searching for alpine python on Docker Hub, I found that the official Python repo has a tag for Alpine. Let’s pull it:

singularity pull docker://python:alpine

This will download the Docker image and create a .sif file from it. Once it has downloaded, run ls to see what it was saved as, for example python_alpine.sif. You can now run your Python script in an interactive job using Singularity:

$ singularity exec python_alpine.sif python ../prime.py 1000
2
3
5
7
...
983
991
997

Congratulations! You’ve run a program on Alpine Linux, even though you are on CentOS, and without yum installing anything!

If you need to install modules, you can do that now. You can either get a --writable shell, or just run --writable exec commands.

Note: without --writable, your pip install changes will not be saved for the next time you run the .sif image.

# This will give you a shell to run pip install from.
# When you have finished installing, type exit.
singularity shell --writable python_alpine.sif

# This will simply run your install commands directly.
singularity exec --writable python_alpine.sif pip install numpy

However, our script does not need any external modules.

Type exit to end the Torque job and return to your Aeolus login node. To run the script within a non-interactive job and get the output in fastscratch, you’ll want to append the following to your torquescript.sh:

module load go/1.11.5
module load singularity/3.0.3/go/1.11.5
singularity exec containers/python_alpine.sif python prime.py 1000

Then, queue the job: qsub torquescript.sh.

Once the script has completed, you will find the output in /fastscratch/[your user name]/my_serial_01.o.
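As a quick sanity check of the job output, you can verify that every line is a prime and that the values are increasing. The following is a minimal sketch; the function names is_prime and check_output are illustrative, and it assumes the output file contains only the numbers printed by prime.py.

```python
# Sanity-check the job output: every line should be a prime, in increasing order.
def is_prime(n):
    """Trial division, the same approach prime.py uses."""
    if n < 2:
        return False
    for d in range(2, int(n ** 0.5) + 1):
        if n % d == 0:
            return False
    return True


def check_output(text):
    """Return the number of primes listed; raise ValueError on a bad line."""
    values = [int(token) for token in text.split()]
    for previous, current in zip(values, values[1:]):
        if current <= previous:
            raise ValueError("not increasing at {}".format(current))
    for value in values:
        if not is_prime(value):
            raise ValueError("{} is not prime".format(value))
    return len(values)


sample = "2\n3\n5\n7\n"
print(check_output(sample))  # prints 4
```

To check the real output, pass the contents of /fastscratch/[your user name]/my_serial_01.o to check_output. For a limit of 1000 it should return 168, the number of primes up to 1000.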

Matlab

Matlab can be run on the cluster either as a GUI using X11 over SSH, or through the command line.

Command Line

To use Matlab on the cluster, first load the module:

module load matlab/R2018b/gcc/7.3.0

Then, simply run newmatlab. This will run the matlab command with the options -nodisplay, -nodesktop, and -nosplash to make it more lightweight, since you will not be utilizing any graphics.

Matlab GUI

If you want to use the Matlab GUI, please do so inside a scheduled interactive job, not on the login node. Matlab uses X11, which can consume enough memory to quickly exhaust the login node. See the job submission docs for more information about scheduling a job. You should request at least 4GB of memory.

To start an interactive session:

qsub -I jobscript.sh

This will give you a session on the compute node. Next, get the host name of the node that you are on:

hostname

Save the output of that command and keep the ssh window open, but put it aside. Next, open a new terminal on your machine and issue the following commands:

ssh -Y aeolus.wsu.edu
ssh -Y [hostname]
module load matlab/R2018b/gcc/7.3.0
matlab

Important: the first command will get you into the login node with X11 forwarding (that’s what the -Y option does). The second command will get you into the compute node that your interactive session is running on, also forwarding X11. The third command will load the Matlab module, and the final command will start Matlab (with graphics). The X11 Matlab windows will be forwarded from the compute node to your machine.

When you are done using Matlab, remember to close your interactive session.
