SLURM Usage

SLURM is used to submit jobs on the different partitions from the Dalek frontend node. The available partitions are listed on the description page.

Info

SLURM version 24.11 is currently installed on the cluster.

Frontend Connection

It is recommended to add a few lines to your ~/.ssh/config file, as explained in the SSH Access section. Then, to connect to the frontend from your computer, simply run:

ssh front.dalek.lip6
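
For reference, a minimal sketch of what such an entry could look like is shown below. This is hypothetical: my_login is a placeholder, and the exact settings to use are documented in the SSH Access section.

~/.ssh/config
# Hypothetical sketch: my_login is a placeholder,
# see the SSH Access section for the exact entries.
Host front.dalek.lip6
    User my_login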

Basic SLURM Commands

This page presents some useful commands to start using SLURM. It is not exhaustive, and we strongly advise users to browse the full SLURM documentation available online.

sinfo

sinfo lists the available partitions:

$ sinfo -l
Sat Sep 20 16:39:03 2025
PARTITION  AVAIL  TIMELIMIT    JOB_SIZE  ROOT  OVERSUBS  GROUPS  NODES  STATE  NODELIST
az4-n4090     up   infinite  1-infinite    no        NO     all      4  idle~  az4-n4090-[0-3]
az4-a7900     up   infinite  1-infinite    no        NO     all      4  idle~  az4-a7900-[0-3]
az4-mixed     up   infinite  1-infinite    no        NO     all      8  idle~  az4-n4090-[0-3],az4-a7900-[0-3]
iml-ia770     up   infinite  1-infinite    no        NO     all      4  idle~  iml-ia770-[0-3]
az5-a890m     up   infinite  1-infinite    no        NO     all      4  idle~  az5-a890m-[0-3]
dalek*        up   infinite  1-infinite    no        NO     all     16  idle~  az4-a7900-[0-3],az4-n4090-[0-3],az5-a890m-[0-3],iml-ia770-[0-3]

The special character after the state indicates:

  • ~: The node is presently powered off
  • #: The node is presently being powered up or configured
  • !: The node is pending power down
  • %: The node is presently being powered down
  • @: The node is pending reboot

If there is no special character it means that the node is powered on.
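
To check the individual state of the nodes of a single partition, sinfo can also be restricted to that partition and switched to a node-oriented view (the partition name here is taken from the listing above):

$ sinfo -p az4-n4090 -N -l

This prints one line per node, with its own state and resources.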

srun

srun -p [partition] [command] runs a command on a given partition:

$ srun -p az4-n4090 -N 2 hostname
az4-n4090-1
az4-n4090-3

Submission of a job that executes the hostname command on two nodes of the az4-n4090 partition.
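
srun also accepts the usual resource selection options. For instance, the following variant (a sketch reusing the same partition) launches two tasks per node on two nodes, printing four hostnames in total:

$ srun -p az4-n4090 -N 2 --ntasks-per-node=2 hostname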

srun -w [node] --interactive runs an interactive session on a specific node:

srun -w az4-a7900-1 --interactive --pty /bin/bash
Interactive job on the az4-a7900-1 node. You can add --exclusive to prevent other users from connecting to this node at the same time.

Warning

Depending on your SLURM QOS configuration you might not be able to reserve a node exclusively.

Info

An easier way to connect interactively to the nodes is to use a custom ~/.ssh/config file as detailed in the SSH Access page.
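
As an illustration only, such a configuration could rely on the frontend as a jump host. This is a hypothetical sketch: my_login is a placeholder and the exact entries are given on the SSH Access page.

~/.ssh/config
# Hypothetical sketch: jump through the frontend to reach a compute node;
# the exact configuration is detailed on the SSH Access page.
Host az4-a7900-1
    ProxyJump front.dalek.lip6
    User my_login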

sbatch

sbatch [script file] submits a SLURM batch script to the cluster. This is the preferred way to use the Dalek cluster.

Here is a basic example of SLURM batch file:

example.sb
#!/bin/bash

# SLURM options:
# --------------

#SBATCH --job-name=my_job_name      # Name of the job
#SBATCH --output=my_job_name_%j.log # Logs of the job (stdout and stderr)
#SBATCH --partition=az4-n4090       # Select the partition (dalek by default)
#SBATCH --nodes=2                   # Number of nodes to reserve
#SBATCH --ntasks=4                  # Maximum number of tasks (processes) to run
#SBATCH --cpus-per-task=3           # Number of CPUs (or cores here) per task
#SBATCH --time=01:00:00             # Maximum duration of the job (hh:mm:ss)

# Commands to submit:
# (executed on one node)
# ----------------------

echo "Start of the job"
date

pwd # where are we
hostname # node(s) hostname

# basic environment variable that can be used in a batch script
echo "The id of this job is: $SLURM_JOB_ID"
echo "The name of this job is: $SLURM_JOB_NAME"
echo "This job is executed on the following node(s): $SLURM_JOB_NODELIST"
echo "The number of tasks of the job is: $SLURM_NTASKS"
echo "The number of CPUs per task is: $SLURM_CPUS_PER_TASK"

# sleeps 60 seconds
sleep 60

echo "End of the job"
date

And to submit the batch script:

sbatch example.sb

When the job is executed, it produces a my_job_name_[JID].log file in the directory from which it was submitted.
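
Note that the script body itself runs once, on the first allocated node. To spread a command over the reserved tasks, it can be launched with srun from inside the script, as in this short sketch:

# Added at the end of example.sb:
# srun launches the command once per task ($SLURM_NTASKS times here),
# spread across the allocated nodes.
srun hostname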

squeue

squeue -l allows you to view the jobs currently submitted on the cluster:

$ squeue -l
Sat Sep 20 17:32:08 2025
JOBID  PARTITION      NAME      USER    STATE  TIME  TIME_LIMI  NODES  NODELIST(REASON)
 1819  az4-n4090  my_job_n  cassagne  RUNNING  0:07    1:00:00      2  az4-n4090-[0-1]

For instance, here one job from the cassagne user is running on the az4-n4090 partition, using two nodes (az4-n4090-0 and az4-n4090-1). Note that each job has a unique identifier (see the JOBID column).
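
To list only your own jobs, the output can be filtered by user:

$ squeue -u $USER -l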

scancel

  • scancel [JOBID] cancels a job with its job id.

  • scancel -u [USER] cancels all the jobs for a given user.
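
For example, to cancel the job shown in the squeue output above:

scancel 1819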