Partitions
Partitions group nodes with similar characteristics like resources, priorities or run-time limits.
Configuration
The sinfo command [1] lists partitions and their states:
Option | Description |
---|---|
-s, --summarize | Lists only a partition state summary with no node state details. |
-o, --format | Specifies the output columns to print. Refer to the manual page for more details. |
The following example shows the overall resource allocation on partitions. The column “NODES(A/I/O/T)” reports node counts by state; the capital letters are abbreviations for Allocated, Idle, Other and Total:
>>> sinfo -s
PARTITION AVAIL TIMELIMIT NODES(A/I/O/T) NODELIST
debug up 30:00 0/7/3/10 lxbk[0719-0722,1130-1135]
main* up 8:00:00 254/129/57/440 lxbk[0724-1033,1136-1265]
grid up 3-00:00:00 148/106/56/310 lxbk[0724-1033]
high_mem up 7-00:00:00 24/3/19/46 lxbk[1034-1079]
gpu up 7-00:00:00 14/3/33/50 lxbk[1080-1129]
long up 7-00:00:00 197/89/56/342 lxbk[0717-0718,0824-1033,1136-1265]
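The A/I/O/T column is easy to post-process in a script. The following is only a minimal sketch: the function name idle_nodes is hypothetical, and it assumes the five-column sinfo -s layout shown above. It reads the summary from standard input and prints the idle node count for each partition.

```shell
# idle_nodes: hypothetical helper that reads `sinfo -s` output on stdin
# and prints "<partition> <idle-node-count>" for each partition.
idle_nodes() {
    awk 'NR > 1 {               # skip the header line
        split($4, n, "/")       # $4 is the NODES(A/I/O/T) column
        sub(/\*$/, "", $1)      # drop the default-partition marker
        print $1, n[2]          # n[2] is the idle count
    }'
}

# On a cluster this would be: sinfo -s | idle_nodes
# Here two sample lines from the output above stand in for live data:
idle_nodes <<'EOF'
PARTITION AVAIL TIMELIMIT NODES(A/I/O/T) NODELIST
debug up 30:00 0/7/3/10 lxbk[0719-0722,1130-1135]
main* up 8:00:00 254/129/57/440 lxbk[0724-1033,1136-1265]
EOF
```

With the sample input this prints one line per partition, "debug 7" and "main 129".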
Show the default runtime alongside the time limit of each partition:
>>> sinfo -o "%9P %6g %11L %10l %5D %20C"
PARTITION GROUPS DEFAULTTIME TIMELIMIT NODES CPUS(A/I/O/T)
debug all 5:00 30:00 10 0/1664/384/2048
main* all 2:00:00 8:00:00 440 23058/33838/6144/630
grid all 1:00:00 3-00:00:00 310 10380/14004/5376/297
high_mem all 1:00:00 7-00:00:00 46 2296/4616/4864/11776
gpu all 2:00:00 7-00:00:00 50 1202/430/3168/4800
long all 2:00:00 7-00:00:00 342 19074/28574/6048/536
List the CPU configuration and memory per node. A trailing “+” marks the minimum value when the nodes of a partition differ:
>>> sinfo -o "%9P %6g %4c %10z %8m %5D %20C"
PARTITION GROUPS CPUS S:C:T MEMORY NODES CPUS(A/I/O/T)
debug all 128+ 2:32+:2 257500+ 10 0/1664/384/2048
main* all 96+ 2:24+:2 191388+ 440 23056/33840/6144/630
grid all 96 2:24:2 191388 310 10378/14006/5376/297
high_mem all 256 8:16:2 1031342 46 2296/4616/4864/11776
gpu all 96 2:24:2 515451 50 1202/430/3168/4800
long all 96+ 2:24+:2 191388+ 342 19072/28576/6048/536
Print a comprehensive list of idle nodes, including their available resources:
sinfo -Nel -t idle
An asterisk suffix (‘*’) marks the default partition. Compute jobs are sent to the default partition unless a specific partition is selected by option.
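A script can pick out the default partition by looking for that asterisk. The sketch below is an illustration under assumptions: the function name default_partition is hypothetical, and it expects one partition name per line, as printed by sinfo with the %P format field.

```shell
# default_partition: hypothetical helper that reads partition names on stdin
# and prints the one carrying the "*" suffix, without the asterisk.
default_partition() {
    awk '$1 ~ /\*$/ { sub(/\*$/, "", $1); print $1 }'
}

# On a cluster this would be: sinfo -h -o "%P" | default_partition
# Sample names taken from the listings above:
printf 'debug\nmain*\ngrid\n' | default_partition
```

For the sample names this prints "main".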
It is recommended to test your application launch in the debug partition first. This partition has a very short run-time limit and therefore allows very quick resource allocation, which avoids long waiting times in the scheduler queue.
Allocation
salloc, srun, and sbatch support the following command option to select a partition, which is typically used in conjunction with other options related to resource allocation:
Option | Description |
---|---|
-p, --partition | Request a specific partition for the resource allocation. |
For example, request resources from the debug partition:
sbatch --partition=debug ...
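The partition can also be requested from inside a batch script with an #SBATCH directive. The following is a minimal sketch rather than a recommended template: the job name and the five-minute time limit are assumed values, chosen only to fit the debug partition shown earlier.

```shell
#!/bin/bash
# Request the debug partition, same effect as `sbatch --partition=debug`.
#SBATCH --partition=debug
# Assumed run time; it has to stay within the partition time limit.
#SBATCH --time=00:05:00
# Hypothetical job name.
#SBATCH --job-name=launch-test

# Slurm sets SLURM_JOB_PARTITION inside a job; the fallback keeps this
# script runnable outside of Slurm as well.
partition="${SLURM_JOB_PARTITION:-unknown}"
echo "Running in partition: ${partition}"
```

Saved under a name of your choice and submitted with sbatch, the script reports the partition it was actually placed in, which is a quick way to confirm the request took effect.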
Override the default partition configuration with the following environment variables:
Variable | Description |
---|---|
SLURM_PARTITION | Interpreted by the srun command |
SALLOC_PARTITION | Interpreted by the salloc command |
SBATCH_PARTITION | Interpreted by the sbatch command |
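As a short sketch of how these variables might be used in a shell session (job.sh is a placeholder script name, not a file from this documentation): the variable sets the partition for later sbatch calls, and according to the sbatch manual page an explicit command-line option takes precedence over it.

```shell
# Make debug the partition for all sbatch calls in this shell session.
export SBATCH_PARTITION=debug

# Hypothetical submissions (job.sh is a placeholder script name):
#   sbatch job.sh                    # submitted to partition "debug"
#   sbatch --partition=main job.sh   # the command-line option wins

# Show what is currently exported for each command.
for var in SLURM_PARTITION SALLOC_PARTITION SBATCH_PARTITION; do
    printf '%s=%s\n' "$var" "$(printenv "$var" || echo '<unset>')"
done
```

Running unset SBATCH_PARTITION restores the cluster default for the session.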
Footnotes
[1] sinfo manual page, SchedMD, https://slurm.schedmd.com/sinfo.html