Python
First, copy your Python script to your Aeolus home directory, and then ssh into Aeolus:
scp [localfile].py [username]@aeolus.wsu.edu:~
ssh [username]@aeolus.wsu.edu
Next, create a job submission script. At the bottom of it, load the necessary modules (gcc/7.3.0 is needed for this version of Python) and add a line to run your script.
Example job script:
# ...
# qsub stuff. See job submission docs.
# ...
module load gcc/7.3.0
module load python/3.7.1/gcc/7.3.0
python3 ~/[filename].py
Example [filename].py:
tacos = 12
for i in range(tacos):
    print('{} taco'.format(i))
Note: if you’re using Python 2.7, you’ll need to run your script with python2.7 explicitly. If in doubt, check ls /share/el6.x86_64/opt/[module]/bin for the name of the binary, and then run which [binary name] to confirm that it points to that directory. For example, ls /share/el6.x86_64/opt/python/2.7.15/gcc/7.3.0/bin and then which python2.7.
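One way to confirm which interpreter a job actually used is to have the script report it. The following is a minimal sketch, not part of the Aeolus tooling; the function name describe_interpreter is made up for illustration.

```python
import sys


def describe_interpreter():
    """Return a one-line summary of the running Python interpreter."""
    version = "{}.{}.{}".format(*sys.version_info[:3])
    # sys.executable is the path of the binary running this script; it should
    # sit under the bin directory of the Python module you loaded.
    return "Python {} at {}".format(version, sys.executable)


print(describe_interpreter())
```

Placing this call at the top of a job's script puts the interpreter version and path in the job's output file, which makes a python2.7 versus python3 mix-up easy to spot.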
Installing Python Modules
Python modules only need to be installed once. If you follow these instructions, the Python modules you install will live in your ~/.local/lib directory for use in future sessions.
Python modules can be installed with pip2.7 or pip3, depending on your target version. If in doubt, check ls /share/el6.x86_64/opt/[module]/bin for the name of the binary, and then run which [binary name] to confirm that it points to that directory.
Per General Rules of Using Aeolus, software should not be compiled on an ssh/login node. Since some Python modules compile during install, you’ll need to start an interactive session on a compute node to install your modules. Check your job script to make sure that you will be requesting enough resources and time for the necessary modules to compile.
Start an interactive session by adding the -I
flag to your qsub call:
qsub -I [job script].sh
This will start an interactive session, which you can confirm by checking your shell signature: [user@compute-x-x-xx ~]$
Since you are in an interactive session, the modules you specified at the bottom of your job script will not have loaded, so you will need to load them now by running module load gcc/7.3.0; module load python/3.7.1/gcc/7.3.0
(or applicable Python version).
You can now install the modules on the compute node. Make sure to add the --user flag. This makes the modules install under ~/.local/lib, so you can use them in future sessions without re-installing.
pip3 install --user twisted
You can now exit the compute node and then run your Python job normally: qsub [job script].sh. Modules imported within your script will be loaded from the ones you installed in the ~/.local/lib directory.
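To check that a module installed with --user is visible to the interpreter, you can ask Python where it would import the module from. This is a minimal sketch for Python 3; find_module_location is a hypothetical helper name, and twisted is simply the example package installed above.

```python
import importlib.util
import site


def find_module_location(name):
    """Return where a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    # For ordinary packages this is a file path; built-in modules report
    # a label such as "built-in" instead.
    return spec.origin


# Directory where `pip3 install --user` places packages for this interpreter.
print("User site-packages:", site.getusersitepackages())
print("twisted:", find_module_location("twisted"))
```

If the second line prints None, the module is not installed for the Python version you currently have loaded.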
Jupyter Notebooks and Jupyter Lab
It is very easy to run a Jupyter Notebook or Lab on Aeolus and access it from your own computer. First SSH into Aeolus:
ssh [your user name]@aeolus.wsu.edu
Once you have logged in, you will need to load your desired version of Python:
module load python/[python version]/gcc/[gcc version]
Once you have done this, you will have access to a tool called newjupyter, which will help you create a new Jupyter Lab or Notebook. Try typing newjupyter --help to see all of your options, as well as the defaults. In this case, I will use the defaults and send the job status to my email:
$ newjupyter [my email address]@wsu.edu
# Output:
Waiting for notebook to start...
Waiting for notebook to start...
Job running.
Job ID: xxxxxx
Check job status: qstat xxxxxx
End job: qdel xxxxxx
Your Jupyter notebook is ready. You can view Jupyter's logs in ~/.jupyter/logs
To use, open a new terminal on the computer you want to access your Jupyter Notebook from (not Aeolus) and run:
ssh -NL [port]:[ip address]:[port] [user]@aeolus.wsu.edu
Then, paste the following in your browser:
http://localhost:[port]?token=[the token]
In this example, specific job IDs, port numbers, IPs, and tokens have been removed from the output, but they will be there when you run the command. Simply follow the directions: copy and paste the ssh -NL line into a terminal on the system you want to access Jupyter from (your own machine, not Aeolus), then copy the http://localhost line and paste it into your browser on that same system.
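The two lines printed by newjupyter follow a fixed pattern, which the sketch below reproduces. The port, IP address, user name, and token used here are made-up placeholders; always use the values from your own newjupyter output.

```python
def tunnel_command(port, ip_address, user, host="aeolus.wsu.edu"):
    """Build the ssh port-forwarding command to run on your own computer."""
    # -N: do not run a remote command; -L: forward a local port to ip:port.
    return "ssh -NL {0}:{1}:{0} {2}@{3}".format(port, ip_address, user, host)


def notebook_url(port, token):
    """Build the URL to open in a browser once the tunnel is running."""
    return "http://localhost:{}?token={}".format(port, token)


# Placeholder values for illustration only.
print(tunnel_command(8888, "10.0.0.5", "jdoe"))
print(notebook_url(8888, "abc123"))
```

The same port number appears on both sides of the forward, which is why the browser URL uses localhost with that port.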
Jupyter Labs
The above example created a Jupyter Notebook, but you can have it create a Jupyter Lab instead by simply specifying the --lab
flag:
newjupyter --lab [my email address]@wsu.edu
Here are some example commands:
# Create a jupyter notebook which runs for
# 30 minutes with 1 processor and 1gb of ram
# and sends the job status to my-email
newjupyter my-email@wsu.edu
# Create a jupyter lab which runs for
# 1 hour with 1 processor, 512 mb of ram,
# and has jobname "j-job-3"
newjupyter --lab -n j-job-3 -t 01:00:00 -m 512mb my-email@wsu.edu
All of these commands will output instructions to access your newly created lab from a browser.
R scripts
First, copy your R script to your Aeolus home directory, and then ssh into Aeolus:
scp [localfile].R [username]@aeolus.wsu.edu:~
ssh [username]@aeolus.wsu.edu
Next, create a job submission script. At the bottom of it, load the necessary modules (gcc/7.3.0 is needed for this version of R) and add a line to run your script.
# ...
# qsub stuff. See job submission docs.
# ...
module load gcc/7.3.0
module load r/3.5.1/gcc/7.3.0
Rscript ~/[filename].R
Example [filename].R:
# Find factorial of num
num = 8
factorial = 1
for(i in 1:num) {
  factorial = factorial * i
}
print(paste("The factorial of", num, "is", factorial))
To run the script on the cluster, queue your job: qsub [jobscript].sh
Installing CRAN packages
CRAN packages only need to be installed once. After that, library(modulename)
in your R script will import it.
To install CRAN packages, you’ll need to create a place for them to live locally in your Aeolus user’s home directory (ssh into your Aeolus user first):
mkdir -p ~/.local/lib/R3.5.1
Per General Rules of Using Aeolus, software should not be compiled on an ssh/login node. Since many R packages compile during install, you’ll need to start an interactive session on a compute node to install your packages. Check your job script to make sure that you will be requesting enough resources and time for the necessary modules to compile.
Start an interactive session by adding the -I
flag to your qsub call:
qsub -I [job script].sh
This will start an interactive session, which you can confirm by checking your shell signature: [user@compute-x-x-xx ~]$
Since you are in an interactive session, the modules you specified at the bottom of your job script will not have loaded, so you will need to load them now by running module load gcc/7.3.0; module load r/3.5.1/gcc/7.3.0. Next, start the R interpreter (simply run R) and install your packages, specifying the local directory you created as the lib:
R> install.packages(c("lattice", "abctools"), lib="~/.local/lib/R3.5.1", repos="http://mirrors.vcea.wsu.edu/r-cran/")
Note: In the line above, repos= specifies the CRAN mirror to use (here, the one hosted at mirrors.vcea.wsu.edu). If you’d like to select a different mirror, simply remove that parameter and R will prompt you to select one from a list.
When the packages have finished installing, leave the R interpreter (Ctrl-d) and close the interactive session (exit). This will return you to your Aeolus login user.
Finally, you’ll need to create a file to tell R where your packages live. Create a file called .Renviron
in your home directory, and specify your library directory:
R_LIBS_USER=~/.local/lib/R3.5.1
You will now be able to run your job script normally, importing your desired packages within your script using library(name).
Golang
Getting set up
To set up your Go workspace on Aeolus, you will first need to load the module:
module load go/1.11.5
This will set your GOPATH environment variable to /home/$USER/go, which is the recommended location for your Go workspace. If you want a different location, you can override it (export GOPATH=/path/to/workspace), but you will need to do this every time you load the module, since loading the module resets GOPATH to ~/go.
Next, create the folder structure for your workspace:
mkdir -p ~/go/src/[organization]/[unique name]
Your organization and unique name are up to you. Many people use github.com and their GitHub username respectively, especially if the project will be hosted on GitHub. You could also use wsu.edu and your WSU ID. Whatever you choose, make it unique so that other people can import your projects in the future.
All the personal projects you work on should live in $GOPATH/src/[organization]/[unique name]/. For instance, if you wanted to build an FTP server in Go, and it was hosted on your personal GitHub, it might live here: ~/go/src/github.com/my-git-user/my-go-ftp. The my-go-ftp folder will be the project root, and will contain .go source files and sub-folders. When you want to start working on a new project, create a new folder in ~/go/src/github.com/my-git-user/ or whatever your chosen personal directory is.
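The workspace layout described above boils down to a simple path rule, sketched here in Python. The function name project_dir and the user name jdoe are made up for illustration; the fallback to ~/go mirrors the default that the go module sets.

```python
import os


def project_dir(organization, unique_name, project, gopath=None):
    """Return the directory a project occupies inside the Go workspace."""
    if gopath is None:
        # Use $GOPATH if it is set, otherwise the default ~/go location.
        gopath = os.environ.get("GOPATH", os.path.expanduser("~/go"))
    return os.path.join(gopath, "src", organization, unique_name, project)


# Illustrative values only.
print(project_dir("github.com", "my-git-user", "my-go-ftp", gopath="/home/jdoe/go"))
```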
Similarly, when importing other people’s code, you should put it in $GOPATH/src/[organization]/[their personal identifier]/[the project]. go get will do this for you. For example, go get github.com/goftp/server will install this project to $GOPATH/src/github.com/goftp/server. If you need to install external code yourself, simply mkdir -p $GOPATH/src/[their organization]/[their identifier] and put the project in that folder.
In addition to setting your GOPATH, the module loader also adds $GOPATH/bin to your PATH so that after running go install, you can run the binaries anywhere. However, if you would like to access these binaries every time you log in to Aeolus without first loading Go, you’ll need to add this line to your .bash_profile:
export PATH=$PATH:$HOME/go/bin
Go CLI tool
The Go CLI tool has several useful functions:
$ go get [repository]/[user]/[project]
# Gets the specified code and puts it in
# $GOPATH/src. Can be called from anywhere
# because it depends on the GOPATH not the
# directory it was called from.
$ go build github.com/my-git-user/my-fun-project
# Builds my-fun-project in $GOPATH/src/github.com/my-git-user/.
# Binaries and libraries that are built
# will not yet be installed. This command can
# be run from anywhere, but can be shortened
# to `go build` if you are in the project folder.
$ go install github.com/my-git-user/my-fun-project
# Builds the project if necessary, or simply
# installs the libraries and binaries. Binaries
# will be installed to $GOPATH/bin, and libraries
# will be installed to:
# $GOPATH/pkg/linux_amd64/[org]/[repo]/[package].
# This can be called anywhere, but can be shortened
# to `go install` if you are in the project folder.
Note: Remember that per the General Rules of Using Aeolus, all compiling should be done in a submitted job rather than on your login node. A recommended way to use Go is to request an interactive session (qsub -I jobscript.sh), install and compile your code, and exit the session. Then, to run your compiled binary as part of a job, simply add $GOPATH/bin to your PATH (module load go/1.11.5 will do this for you) and execute the binary. For example, put this at the end of a job script:
# jobscript.sh
#PBS stuff
module load go/1.11.5
mygobinary --option1 --logfile /fastscratch/$USER/my-go-log
And queue the job: qsub jobscript.sh.
Singularity (Docker)
Singularity is the HPC community’s answer to Docker. It was built to solve the problem of allowing non-trusted users to run containers. Singularity can run most Docker containers easily, and there are many Singularity-specific containers in the Singularity Library. For general information about Singularity, see the Singularity docs.
Setup
Per General Rules of Using Aeolus, computation and compilation should be done on a compute node. You should create a job submission script to request the proper resources to install and run your containers.
Run your job interactively: qsub -I [job script].sh. This will give you a shell on the resources you requested in your job script.
You will need to load go and singularity by running the following in the compute node shell:
module load go/1.11.5
module load singularity/3.0.3/go/1.11.5
You can now pull, build, and run images. You may want to make a directory in your home directory to store images that you pull.
Pulling images
You can pull images from several sources. To find Singularity images, you can use singularity search [term]. For instance:
$ singularity search ubun
No users found for 'ubun'
No collections found for 'ubun'
Found 10 containers for 'ubun'
library://jialipassion/official/ubuntu
Tags: latest
library://dtrudg/linux/ubuntu
Tags: 14.04 16.04 17.10 18.04 artful bionic devel latest rolling trusty xenial
library://sylabs-adam/linux/ubuntu
Tags: latest
library://sylabs-jms/testing/ubuntu-armhf.sif
Tags: latest
library://library/default/ubuntu
Tags: 14.04 16.04 18.04 18.10 latest
library://jialigithub/default/ubuntu
Tags: 18.05
library://sylabs-andre/default/ubuntu
Tags:
library://mroche/baseline/ubuntu
Tags: 16.04 18.04 bionic latest xenial
library://ynop/default/siubuntu
Tags:
library://westleyk/official/ubuntu
Tags: 16.04 18.04 latest
You can then pull images like:
singularity pull library://dtrudg/linux/ubuntu:trusty
# OR, for the default source:
singularity pull library://ubuntu:trusty
# OR, to pull a Docker image:
singularity pull docker://ubuntu:trusty
# OR, to build an image and name it:
singularity build ubuntu.sif library://ubuntu:trusty
Note: you will have to search Docker Hub independently from a browser to find images, as singularity search does not search Docker Hub.
All of these commands will pull an image and build a .sif file, Singularity’s equivalent of a .img. Saving these to a folder in your home directory (e.g. /home/$USER/singularity) will allow you to run them in the future without re-pulling.
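When you pull without naming the output, the resulting file name appears to follow a [name]_[tag].sif pattern. The sketch below encodes that pattern as an assumption inferred from the file names used in this document (ubuntu_latest.sif, python_alpine.sif); default_sif_name is a hypothetical helper, so run ls after pulling to confirm the actual name.

```python
def default_sif_name(uri):
    """Guess the file name `singularity pull` gives an image by default."""
    # Drop the scheme ("docker://", "library://") and any collection path.
    reference = uri.split("://", 1)[-1].rsplit("/", 1)[-1]
    # A reference without an explicit tag is treated as the "latest" tag.
    name, _, tag = reference.partition(":")
    return "{}_{}.sif".format(name, tag or "latest")


print(default_sif_name("docker://python:alpine"))
print(default_sif_name("library://dtrudg/linux/ubuntu:trusty"))
```

Note the underscore between the name and the tag; a hyphen in its place is an easy typo when referring to the image later in a job script.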
Running images
You can run an image directly by doing:
singularity run [image name].sif
# OR
singularity shell [image name].sif
Note: if you run an image, the image’s default command will be run (e.g. start a web server) and you will be given a shell, but if you shell an image you will only be given a shell and the default command will not be run.
To confirm that you have entered the container’s shell, check to see if the shell signature has changed or try typing which singularity (it should return nothing). To exit the shell and end the container, type exit.
Alternatively, if you simply want to run a command within the container without entering a shell, you can use the exec command:
# singularity exec [image name].sif [bash commands]
$ singularity exec ubuntu.sif cat /etc/os-release
NAME="Ubuntu"
...
When using exec, the lifetime of the container is the lifetime of the command you pass it, so when the command completes (in this case, cat /etc/os-release), the container ends.
Writing to images
Adding the --writable flag to exec, shell, or run commands will make your changes persist in the image. You can use this if you want to install pip modules or any software with the container’s package manager (e.g. yum).
Building custom images
Building a .sif image from a .def file requires root privileges in Singularity, which are not available on Aeolus. However, you can create a .def file on an external machine and then host its image in the Singularity library. You will then be able to pull this custom image on Aeolus.
For in-depth information about building your own Singularity images, see their documentation on it.
Using Singularity with qsub
Once you have pulled images on a compute node, you can run images without an interactive session. For instance, if you wanted to run an iPerf server on a compute node, you could pull an iPerf3 image to your singularity image folder (singularity pull docker://iperf3) and then add the following to the bottom of your job script:
# ...
#PBS stuff, see job submission docs
# ...
module load go/1.11.5
module load singularity/3.0.3/go/1.11.5
singularity run image_folder/iperf3_latest.sif -s
You can then submit your job with qsub [job script name].sh, and an iPerf server will be started on the compute node.
Limitations
NOTE: Due to the kernel version of the HPC cluster (2.6.32), Singularity is unable to run some recent Linux distributions. If this is the case, you will get this:
$ singularity run ubuntu_latest.sif
FATAL: kernel too old
Solution: Try using an older distribution.
Example
Here is an example usage of Singularity. In this example, we’ll be running a simple python script in Alpine Linux on Aeolus.
First, load the Python script you want to run onto your Aeolus home directory. Here is the script I will use; it calculates prime numbers up to the one specified as a command line argument:
# Usage: python prime.py [limit]
import sys

max_range = int(sys.argv[1])
primes = []
for i in range(2, max_range + 1):
    isPrime = True
    for num in range(2, int(i ** 0.5) + 1):
        if i % num == 0:
            isPrime = False
            break
    if isPrime:
        primes.append(i)
print('\n'.join(map(str, primes)))
Once the Python script is saved in your Aeolus home directory as prime.py, you must create a job script to allocate resources for running your script on a compute node. For more information about this, see the job submission script docs. Here is our job script, saved as torquescript.sh:
#!/bin/bash
#PBS -V
#PBS -N my_serial_01
#PBS -l nodes=1:dev:ppn=2
#PBS -l mem=2048mb
#PBS -l walltime=00:05:00
#PBS -q batch
## Define path for output & error logs
#PBS -k o
#PBS -e /fastscratch/[your user name]/my_serial_01.e
#PBS -o /fastscratch/[your user name]/my_serial_01.o
## Define path for reporting
#PBS -M [your email address]@wsu.edu
#PBS -m abe
This will create a new job called my_serial_01 with 1 node, 2 processors per node (PPN), and 2048mb of RAM to work with. Additionally, it allocates 5 minutes of wall time. Make sure to replace [your user name] with your user name, and [your email address] with the email address at which you want to receive alerts about job status.
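To make the mapping between the requested resources and the #PBS directives explicit, here is a small sketch that renders the resource lines from parameters. The function pbs_header and its argument names are invented for this illustration; only the directives themselves come from the job script above.

```python
def pbs_header(job_name, nodes, ppn, mem, walltime, queue="batch", node_property=None):
    """Render the resource-request lines of a Torque/PBS job script."""
    node_spec = "nodes={}".format(nodes)
    if node_property:
        # Optional node property, such as "dev" in the example job script.
        node_spec += ":{}".format(node_property)
    node_spec += ":ppn={}".format(ppn)
    lines = [
        "#!/bin/bash",
        "#PBS -N {}".format(job_name),
        "#PBS -l {}".format(node_spec),
        "#PBS -l mem={}".format(mem),
        "#PBS -l walltime={}".format(walltime),
        "#PBS -q {}".format(queue),
    ]
    return "\n".join(lines)


print(pbs_header("my_serial_01", 1, 2, "2048mb", "00:05:00", node_property="dev"))
```

Called with the values shown, it reproduces the name, node, memory, wall time, and queue lines of torquescript.sh; the logging and email directives are left out to keep the sketch short.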
Before running our script for the first time, we want to pull and build an image on a compute node that will live in our Aeolus home directory. Before entering the compute node, create a new directory called “containers” (run mkdir containers). Then, enter a compute node by running your job script in interactive mode:
qsub -I torquescript.sh
This will give you a shell on the compute node once your job is processed, with the resources you requested. Type ls to make sure you see your containers folder, and then cd containers.
Now, you will want to pull an Alpine Linux + Python Docker image to run your Python script. Load the go and singularity modules first, as described under Setup. By searching for alpine python on Docker Hub, I found that the official Python repo has a tag for Alpine. Let’s pull it:
singularity pull docker://python:alpine
This will download the Docker image and create a .sif file from it. Once it has downloaded, run ls to see what it was saved as, for example python_alpine.sif. You can now run your Python script in an interactive job using Singularity:
$ singularity exec python_alpine.sif python ../prime.py 1000
2
3
5
7
...
983
991
997
Congratulations! You’ve run a program on Alpine Linux, even though you are on CentOS, and without yum installing anything!
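If you want to sanity-check the container's output, the same trial-division logic can be wrapped in a function and inspected directly. This is a sketch for checking results, not a replacement for prime.py; the name primes_up_to is made up here.

```python
def primes_up_to(limit):
    """Return all primes <= limit using trial division, like prime.py."""
    primes = []
    for candidate in range(2, limit + 1):
        # A composite number always has a divisor no larger than its square root.
        if all(candidate % d != 0 for d in range(2, int(candidate ** 0.5) + 1)):
            primes.append(candidate)
    return primes


result = primes_up_to(1000)
print(len(result), result[:4], result[-3:])
# prints: 168 [2, 3, 5, 7] [983, 991, 997]
```

The first and last values match the head and tail of the output shown above, and the count gives a quick check that no lines were lost.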
If you need to install modules, you can do that now. You can either get a --writable shell, or just run --writable exec commands.
Note: without --writable, your pip install changes will not be saved for the next time you run the .sif image.
# This will give you a shell to run pip install from.
# When you have finished installing, type exit.
singularity shell --writable python_alpine.sif
# This will simply run your install commands directly.
singularity exec --writable python_alpine.sif pip install numpy
However, our script does not need any external modules.
Type exit to end the Torque job and return to your Aeolus login node. To run the script within a non-interactive job and get the output in fastscratch, append the following to your torquescript.sh:
module load go/1.11.5
module load singularity/3.0.3/go/1.11.5
singularity exec containers/python_alpine.sif python prime.py 1000
Then, queue the job: qsub torquescript.sh.
Once the script has completed, you will find the output in /fastscratch/[your user name]/my_serial_01.o.
Matlab
Matlab can be run on the cluster either as a GUI using X11 over SSH, or through the command line.
Command Line
To use Matlab on the cluster, first load the module:
module load matlab/R2018b/gcc/7.3.0
Then, simply run newmatlab. This will run the matlab command with the options -nodisplay, -nodesktop, and -nosplash to make it more lightweight, since you will not be using any graphics.
Matlab GUI
If you want to use the Matlab GUI, please do so inside a scheduled interactive job, not on the login node. Matlab uses X11, which can consume enough memory to quickly fill up the login node. See the job submission script for more information about scheduling a job. You should request at least 4GB of memory.
To start an interactive session:
qsub -I jobscript.sh
This will give you a session on the compute node. Next, get the host name of the node that you are on:
hostname
Save the output of that command and keep the ssh window open, but put it aside. Next, open a new terminal on your machine and issue the following commands:
ssh -Y aeolus.wsu.edu
ssh -Y [hostname]
module load matlab/R2018b/gcc/7.3.0
matlab
Important: the first command will get you into the login node with X11 forwarding (that’s what the -Y option does). The second command will get you into the compute node that your interactive session is running on, also forwarding X11. The third command will load the Matlab module, and the final command will start Matlab (with graphics). The X11 Matlab windows will be forwarded from the compute node to your machine.
When you are done using Matlab, remember to close your interactive session.