SLURM Usage
SLURM is used to submit jobs on the different partitions from the Dalek frontend. The available partitions are listed in the description page (see the "SLURM Partition" column in the summary table).
Frontend Connection
It is recommended to add some lines to your ~/.ssh/config file, as explained in the SSH access section. Once that is done, you can connect to the frontend from your computer with a single ssh command.
Basic SLURM Commands
Here are some useful commands to start using SLURM:
- sinfo -l lists the available partitions:

    $ sinfo -l
    Wed Jan 01 00:00:00 2025
    PARTITION  AVAIL  TIMELIMIT  JOB_SIZE    ROOT  OVERSUBS  GROUPS  NODES  STATE  NODELIST
    az4-n4090  up     infinite   1-infinite  no    NO        all     4      idle   az4-n4090-0,az4-n4090-1,az4-n4090-2,az4-n4090-3
    az4-a7900  up     infinite   1-infinite  no    NO        all     4      idle   az4-a7900-0,az4-a7900-1,az4-a7900-2,az4-a7900-3
    az4-mixed  up     infinite   1-infinite  no    NO        all     8      idle   az4-n4090-0,az4-n4090-1,az4-n4090-2,az4-n4090-3,az4-a7900-0,az4-a7900-1,az4-a7900-2,az4-a7900-3
    iml-ia770  up     infinite   1-infinite  no    NO        all     4      idle   iml-ia770-0,iml-ia770-1,iml-ia770-2,iml-ia770-3
    az5-a890m  up     infinite   1-infinite  no    NO        all     4      idle   az5-a890m-0,az5-a890m-1,az5-a890m-2,az5-a890m-3
- srun -p [partition] [command] runs a command on a partition. For example, with az4-n4090 as the partition and hostname as the command, this submits a job that executes the hostname command on the nodes of the az4-n4090 partition.
- sbatch [script] runs a SLURM script on the cluster
- squeue -l allows you to view the jobs currently submitted on the cluster:

    $ squeue -l
    Wed Jan 01 00:00:00 2025
    JOBID   PARTITION  NAME  USER     STATE    TIME   TIME_LIMI  NODES  NODELIST(REASON)
    713705  iml-ia770  bash  galveze  RUNNING  12:12  UNLIMITED  2      iml-ia770-1,iml-ia770-3

  For instance, here one job from the galveze user is running on the iml-ia770 partition and occupies two nodes (iml-ia770-1 and iml-ia770-3).
- scancel [jobid] cancels a job
- scancel -u [user] cancels all the jobs for a given user
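
The commands listed above can be combined into a small submit-and-monitor workflow. The following is a minimal sketch, not an official template for this cluster: the file name job.sh, the job name and the output file pattern are hypothetical, and the az4-n4090 partition is used only because it appears in the sinfo listing. The script guards the SLURM calls so that it does nothing harmful on a machine where SLURM is not installed.

```shell
#!/bin/bash
# Sketch only: job.sh, the job name and the output pattern are hypothetical.

# 1. Write a minimal batch script that runs hostname on one node.
cat > job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=hostname-test
#SBATCH --partition=az4-n4090
#SBATCH --nodes=1
#SBATCH --output=hostname-%j.out

srun hostname
EOF

# 2. Submit it, but only where SLURM is installed (e.g. on the frontend).
if command -v sbatch >/dev/null 2>&1; then
    # --parsable prints just the job ID (plus ";cluster" on multi-cluster setups).
    jobid=$(sbatch --parsable job.sh)
    jobid=${jobid%%;*}

    # 3. Show the state of this job in the queue.
    squeue -l -j "$jobid"

    # 4. Uncomment to cancel the job if it is no longer needed.
    # scancel "$jobid"
else
    echo "sbatch not found: run this script on the frontend" >&2
fi
```

Capturing the job ID at submission time avoids having to look it up in the squeue output before calling scancel. In the output file name, %j is replaced by SLURM with the job ID, so repeated submissions do not overwrite each other's output.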