Usage
All usage of compute resources is managed through the Slurm[^1] Workload Manager.
In addition, the server is equipped with a number of modules that can be loaded through the module
command.
Modules
Modules are pre-compiled software that you can load into your shell environment.
To see the available modules, you can run the following command:
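```bash
# List all available modules
module avail
```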
And to load a module:
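```bash
# For example, the miniconda module used in the batch script below
module load miniconda/4.10.4
```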
And unload again:
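```bash
module unload miniconda/4.10.4
```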
Slurm
Slurm is a job scheduler and resource manager for the compute resources available.
Status
To see the attached resources, you can run the following command:
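```bash
# Show partitions and nodes with their current state
sinfo
```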
To see the jobs that are currently running, you can run the following command:
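```bash
squeue
```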
for a specific user:
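```bash
squeue -u <username>
```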
Submitting jobs
Batch jobs
A typical SLURM script consists of three main sections:
- SLURM directives (#SBATCH)
- Environment setup
- Execution commands
Here's a detailed example:
#!/bin/bash
#----------------------------------------
# SLURM Directives
#----------------------------------------
#SBATCH --chdir=/projects/main_compute-AUDIT/ # Working directory
#SBATCH --job-name alphafoldtestjobname # Job name
#SBATCH --mem=50G # Memory requirement
#SBATCH --ntasks=1 # Number of tasks
#SBATCH --cpus-per-task=1 # CPU cores per task
#SBATCH --nodes=1 # Number of nodes
#SBATCH --mail-type=begin # Email at job start
#SBATCH --mail-type=end # Email at job end
#SBATCH --mail-user=abc123@ku.dk # Email address
#SBATCH --gres=gpu:1 # GPU requirement
#SBATCH --time=10:00:00 # Maximum runtime of 10 hours
#----------------------------------------
# Environment Setup
#----------------------------------------
# Load required modules
module load miniconda/4.10.4
conda activate alphafold
#----------------------------------------
# Job Execution
#----------------------------------------
# Change to working directory
cd /projects/main_compute-AUDIT/data/alphafold
# Run the main script
bash run_alphafold.sh \
-d /projects/testproject1/data/genetic_databases/ \
-o /projects/testproject1/people/btj820/ \
-m model_1 \
-f example/query.fasta \
-t 2020-05-14
Run the script with:
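```bash
# Replace <script-name>.sh with the file containing the script above
sbatch <script-name>.sh
```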
Once the job is submitted you will receive a job-id, which you can use to check the status of the job with:
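```bash
squeue -j <job-id>
```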
Also, the output of the job will be saved in a file named slurm-<job-id>.out in the working directory specified.
Get the node information:
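```bash
scontrol show node <node-name>   # e.g. scontrol show node sodasgpun01fl
```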
To stop a running job:
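```bash
scancel <job-id>
```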
Interactive jobs
To start a simple interactive shell with 2 CPU cores, 50 GB RAM, and 1 V100 GPU, you can run the following command:
Tip
If copy-pasting doesn't work for the multi-line code snippets, try switching between selecting the text manually and using the copy button in the top-right corner.
srun -w sodasgpun01fl --partition=gpuqueue \ #(1)!
--ntasks-per-node=2 \ #(2)!
--mem=50GB \ #(3)!
--gres=gpu:v100:1 \ #(4)!
--time=240 \ #(5)!
--pty /bin/bash -i #(6)!
- Standard node and partition configuration
- Number of CPU cores
- Amount of memory (RAM)
- Number of GPUs
- Maximum time to run the task in minutes
- Run the task in a pseudo-terminal. Change to ~/bin/zsh if you installed zsh and wish to use that instead.
This will start a new shell session with the allocated resources. This means that exiting the shell (e.g. when logging out of the server) will release the resources. To prevent this, you can start a persistent session with tmux.
Check that you have access to the GPU by running:
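```bash
nvidia-smi
```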
You will need to reload modules and/or activate environments in the new shell.
Jupyter Notebook
To start a Jupyter Notebook, you need to first allocate resources on the server:
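For example, reusing the interactive command from above:
```bash
srun -w sodasgpun01fl --partition=gpuqueue \
    --ntasks-per-node=2 \
    --mem=50GB \
    --gres=gpu:v100:1 \
    --time=240 \
    --pty /bin/bash -i
```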
Then, within the newly created interactive Slurm session, change to a folder containing a Python uv project and run:
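A minimal sketch, assuming the project declares jupyter as a dependency in its pyproject.toml:
```bash
# Assumes jupyter is listed in the project's pyproject.toml
# Create/update the project's virtual environment and install its dependencies
uv sync
```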
Activate the virtual environment:
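Assuming uv placed the environment in the default .venv folder:
```bash
source .venv/bin/activate
```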
Now, you can start the notebook server:
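A sketch of the notebook command; the exact flags may vary on your setup, but keep port 8800 (see the Info box below):
```bash
# --no-browser: don't try to open a browser on the server
# --ip=0.0.0.0 (assumption): listen on all interfaces so the link works from your machine
jupyter notebook --no-browser --ip=0.0.0.0 --port=8800
```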
Then copy the generated link and paste it into your local computer's browser, e.g. http://10.84.10.216:8800/?token=abcd1234...
Info
The above code works when you have entered an interactive Slurm session. Don't change the port or the URL, since they are required for access to the server.
To start a Jupyter Notebook on the head node instead, you have to specify a port when you access the server via ssh, and then also refer to that port in the jupyter notebook command.
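For example (a sketch; port 8888 and the placeholders are illustrative):
```bash
# On your local machine: connect with local port forwarding
ssh -L 8888:localhost:8888 <username>@<server-address>
# On the head node: start the notebook on that same port, then open
# http://localhost:8888 in your local browser
jupyter notebook --no-browser --port=8888
```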
VSCode
In order to make the resources from slurm available to VSCode, follow the steps above and start a jupyter session.
Then, in VSCode, when you open a notebook, press Ctrl+Shift+P and search for Notebook: Select Notebook Kernel. If a kernel is already suggested, click Select Another Kernel..., then Existing Jupyter Server..., and paste the link with the token from above into the field.
Jupyter Kernels and Virtual Environments
TBD: When do you need to do this?
To register a virtual environment with Jupyter, you can run the following command from within your environment (that is, after activating it and making sure that ipykernel is installed, e.g. with uv pip install ipykernel):
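```bash
# Register the active environment as a Jupyter kernel
python -m ipykernel install --user --name=<env-name> --display-name="<display name>"
```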
Persistent sessions
Use tmux to create and manage persistent sessions on the server.
Start a new tmux session
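```bash
tmux new -s <session-name>
```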
List tmux sessions
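```bash
tmux ls
```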
Attach tmux session
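```bash
tmux attach -t <session-name>
```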
To detach from the session (when you are inside it), leaving everything running in the background, press:
Ctrl+B D
Docker
The server is equipped with udocker, a user-space tool for running Docker containers without root privileges. It requires the anaconda3 module to be loaded first.
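For example (the image name is illustrative):
```bash
module load anaconda3
# Pull an image from Docker Hub, create a container from it and run a shell inside
udocker pull ubuntu:22.04
udocker create --name=myubuntu ubuntu:22.04
udocker run myubuntu /bin/bash
```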
Resources
- The UCPH guide to HPC systems
- Five part video series introducing Slurm
- The official Slurm cheatsheet
[^1]: Simple Linux Utility for Resource Management.