Wiki source code of Slurm

Version 5.1 by Thomas Coelho on 2022/12/08 11:05

SLURM (Simple Linux Utility for Resource Management) is an open source, fault-tolerant, and highly scalable cluster management and job scheduling system for large and small Linux clusters.

Slurm is fully integrated into our system. You do not need to set any environment variables.

{{toc/}}

== Partitions ==

A partition is a subset of the cluster: a bundle of compute nodes with the same characteristics.

Based on access restrictions, our cluster is divided into different partitions. 'sinfo' shows only the partitions you are allowed to use. 'sinfo -a' shows all partitions.

A partition is selected with '-p PARTITIONNAME'.

|=**Partition** |=**No. Nodes** |=**Cores/Node** |=**Tot. Cores**|=**RAM/GB** |=**CPU** |=**Remark/Restriction**
|itp |10|20 |200|64 |Intel(R) Xeon(R) CPU E5-2640 v4 @ 2.40GHz|Common Usage
|dfg-big|3|32|96|128|8-Core AMD Opteron(tm) Processor 6128|Group Valenti
|dfg-big|3|48|144|128/256|12-Core AMD Opteron(tm) Processor 6168|Group Valenti
|dfg-big|4|64|256|128/256|16-Core AMD Opteron(tm) Processor 6272|Group Valenti
|dfg-big|4|48|192|128/256|12-Core AMD Opteron(tm) Processor 6344|Group Valenti
|fplo|2|12|24|256|Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz|Group Valenti
|fplo|4|16|32|256|Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz|Group Valenti
|dfg-xeon|5|16|32|128|Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz|Group Valenti
|dfg-xeon|7|20|140|128|Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz|Group Valenti
|iboga|34|20|880|64|Intel(R) Xeon(R) CPU E5-2640 v4 @ 2.40GHz|Group Rezzolla
|dreama|1|40|40|1024|Intel(R) Xeon(R) CPU E7-4820 v3 @ 1.90GHz|Group Rezzolla
|barcelona|8|40|320|192|Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz|Group Valenti
|barcelona|1|40|40|512| |Group Valenti
|mallorca|4|48|192|256|AMD EPYC 7352 24-Core Processor|Group Valenti
|calea|36|64|2304|256|Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz|Group Rezzolla
|majortom|1|64|64|256|AMD EPYC 7513 32-Core Processor|Group Bleicher

Most nodes are for exclusive use by their corresponding owners. The itp nodes are for common usage. Except for the 'fplo' and 'dfg-big' nodes, all machines are connected with Infiniband for all traffic (IP and internode communication via MPI).

== Submitting Jobs ==

In most cases you want to submit a non-interactive job to be executed on our cluster.

This is very simple for serial (1 CPU) jobs:

{{{ sbatch -p PARTITION jobscript.sh}}}

where jobscript.sh is a shell script with your job commands.
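
As an illustration, a minimal jobscript.sh for a serial job might look like the sketch below. The banner function and the placeholder command are hypothetical; 'SLURM_JOB_ID' is an environment variable that Slurm sets inside a running job, so the script falls back to "none" when started by hand.

```shell
#!/bin/bash
# jobscript.sh: minimal serial job (hypothetical example, not a site template).

# Print a short banner so the job output records the job ID and the node.
# SLURM_JOB_ID is set by Slurm inside a job; outside a job we print "none".
job_banner() {
    echo "Job ${SLURM_JOB_ID:-none} on $(uname -n)"
}

job_banner

# Replace the next line with your real job commands,
# e.g. ./my_program input.dat (hypothetical names).
echo "job commands would run here"
```

Submitted with 'sbatch -p itp jobscript.sh', the output of such a script normally ends up in a file named slurm-<jobid>.out in the directory from which the job was submitted, unless another output file is requested.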

Running **OpenMPI** jobs is not much more complicated:

{{{ sbatch -p PARTITION -n X jobscript.sh}}}

where X is the number of desired MPI processes. Launch the job in the jobscript with:

{{{ mpirun YOUREXECUTABLE}}}

You don't have to worry about the number of processes or specific nodes. Slurm and OpenMPI know about each other.

If you want **Infiniband** for your MPI job (usually a good idea unless all processes run on the same node), you have to request the feature 'infiniband':

{{{ sbatch -p dfg -C infiniband -n X jobscript.sh}}}

Note: Infiniband is not available for 'fplo' and 'dfg-big'.

Running **SMP jobs** (multiple threads, not necessarily MPI): running MPI jobs on a single node is recommended for the dfg-big nodes. These are big hosts with up to 64 CPUs per node, but a 'slow' gigabit network connection. Launch SMP jobs with:

{{{ sbatch -p PARTITION -N 1 -n X jobscript.sh}}}
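
For a threaded (non-MPI) program, the jobscript has to pass the number of allocated CPUs on to the program itself. The following is only a sketch under the assumption that the program is OpenMP-based and reads 'OMP_NUM_THREADS'; the helper function and the executable name are hypothetical. 'SLURM_NTASKS' is set by Slurm to the value given with '-n'.

```shell
#!/bin/bash
# Hypothetical SMP jobscript, submitted e.g. with:
#   sbatch -p PARTITION -N 1 -n 8 jobscript.sh

# Number of threads to use: the value of -n inside a job, 1 outside of Slurm.
thread_count() {
    echo "${SLURM_NTASKS:-1}"
}

# Assumption: the program honours OMP_NUM_THREADS (typical for OpenMP codes).
export OMP_NUM_THREADS="$(thread_count)"

echo "Using $OMP_NUM_THREADS threads"
# ./my_threaded_program   # hypothetical executable, replace with your own
```

Programs that take the thread count as a command line option can use the same value directly instead of the environment variable.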

=== Differences in the network connection ===

The new v3 dfg-xeon nodes are equipped with a 10 Gbit network. It has higher throughput and lower latency than gigabit Ethernet, but is not as fast as the DDR Infiniband network. The 10 Gbit network is used for MPI and I/O. Infiniband is only used for MPI.

== Defining Resource limits ==

By default, each job is allocated 2 GB of memory and a run time of 3 days. More memory can be requested with:

{{{ --mem-per-cpu=<MB>}}}

where <MB> is the memory in megabytes. The virtual memory limit is 2.5 times the requested real memory limit.

The memory limit is not a hard limit. When you exceed it, your memory is swapped out. Your job is killed only when it uses more than 150% of the limit. So be conservative and leave enough room for other jobs: requested memory is blocked from use by other jobs.

The run time limit is set with:

{{{ -t or --time=<time>}}}

where <time> can be given in the format "days-hours". See the sbatch man page for more formats.
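
To make the numbers concrete, the sketch below works out the limits that follow from a request of 4000 MB per CPU, using the factors stated above, and prints a matching submission line. The request value and the run time of 1 day and 12 hours are example values only.

```shell
#!/bin/bash
# Sketch: limits implied by a --mem-per-cpu request (example values).

requested_mb=4000   # example value passed to --mem-per-cpu

# Virtual memory limit: 2.5 times the request (integer arithmetic: x * 5 / 2).
virtual_mb=$(( requested_mb * 5 / 2 ))   # 10000

# Kill threshold: 150% of the request (x * 3 / 2).
kill_mb=$(( requested_mb * 3 / 2 ))      # 6000

echo "requested=${requested_mb} MB, virtual limit=${virtual_mb} MB, killed above=${kill_mb} MB"

# Matching submission: 4000 MB per CPU, run time 1 day and 12 hours ("days-hours").
echo "sbatch -p PARTITION --mem-per-cpu=${requested_mb} -t 1-12 jobscript.sh"
```

The last line only prints the command. Copy it to the shell and replace PARTITION to submit a real job.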

== Memory Management ==

In Slurm you specify only one parameter: the limit for your real memory usage, which also drives the decision where your job is started. The virtual memory of your job may be up to 2.5 times your requested memory. You can exceed your memory limit by 20%, but the excess will be swap space instead of real memory. This prevents a crash if your memory limit is a little too tight.

== Inline Arguments ==

sbatch arguments can be written in the jobfile:

{{{#! /bin/bash
#
# Choosing a partition:
#SBATCH -p housewives

YOUR JOB COMMANDS....}}}
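
Several options can be combined in the same way. The jobfile below is a sketch that puts the options described on this page into '#SBATCH' lines; the partition 'itp', the resource values, and the helper function are example choices, not recommendations. 'SLURM_JOB_PARTITION' and 'SLURM_NTASKS' are environment variables that Slurm sets inside a job.

```shell
#!/bin/bash
#
# Hypothetical jobfile combining several sbatch options.
# Partition:
#SBATCH -p itp
# Number of tasks:
#SBATCH -n 1
# Memory per CPU in MB:
#SBATCH --mem-per-cpu=4000
# Run time limit (1 day, 0 hours):
#SBATCH -t 1-0

# Report what Slurm actually assigned; prints "unset" outside of a job.
describe_job() {
    echo "partition=${SLURM_JOB_PARTITION:-unset} tasks=${SLURM_NTASKS:-unset}"
}

describe_job

# YOUR JOB COMMANDS....
```

All '#SBATCH' lines have to come before the first command of the script, because sbatch stops reading them at the first line that is neither a comment nor blank. Options given on the sbatch command line take precedence over the ones in the jobfile.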

= Links =

* SLURM-Homepage [[url:http://slurm.schedmd.com/slurm.html]]