SLURM (Simple Linux Utility for Resource Management) is an open-source, fault-tolerant, and highly scalable cluster management and job scheduling system for large and small Linux clusters.

Slurm is fully integrated into our system. You do not need to set any environment variables.

Partitions

A partition is a subset of the cluster, a bundle of compute nodes with the same characteristics.

Our cluster is divided into partitions according to access restrictions. 'sinfo' shows only the partitions you are allowed to use; 'sinfo -a' shows all partitions.

A partition is selected by '-p PARTITIONNAME'.
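The two commands above can be combined into a small check before submitting. This is only a sketch: partition_visible is a hypothetical helper, "itp" and jobscript.sh are example names, and it assumes that sinfo prints nothing for a partition that is hidden from you.

```shell
# Return success if the named partition is visible to the current user.
partition_visible() {
    # -h drops the header, -p restricts the listing to one partition,
    # -o '%P' prints only the partition name.
    sinfo -h -p "$1" -o '%P' 2>/dev/null | grep -q .
}

# Submit only when the partition is usable.
if partition_visible itp; then
    sbatch -p itp jobscript.sh
else
    echo "Partition itp is not available to you (compare with 'sinfo -a')." >&2
fi
```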

Partition	No. Nodes	Cores/Node	Tot. Cores	RAM/GB	CPU	Remark
itp	10	12	120	32	Six-Core AMD Opteron(tm) Processor 2427
itpbig	3	12	48	128	AMD Opteron(tm) Processor 6172
dfg-big	3	32	96	128	8-Core AMD Opteron(tm) Processor 6128	Restricted access
dfg-big	3	48	144	128/256	12-Core AMD Opteron(tm) Processor 6168	Restricted access
dfg-big	4	64	256	128/256	16-Core AMD Opteron(tm) Processor 6272	Restricted access
dfg-big	4	48	192	128/256	12-Core AMD Opteron(tm) Processor 6344	Restricted access
dfg-big	3	24	72	64	12-Core AMD Opteron(tm) Processor 6344	Restricted access
fplo	2	12	24	256	Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz	Restricted access
fplo	4	16	32	256	Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz	Restricted access
dfg-xeon	5	16	32	128	Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz	Restricted access
iboga	44	20	880	64	Intel(R) Xeon(R) CPU E5-2640 v4 @ 2.40GHz	Group Rezzolla

Access to the DFG nodes (dfg and dfg-big) is restricted to members of the SFB/TR49. If you do not belong to that group but want to test and develop programs for the InfiniBand network, please talk to the administrator. The queue 'fplo' has the same restrictions as the dfg queue and is intended for large-memory, single-threaded jobs from the program 'fplo'. Access to the queue 'quantum' is restricted to the group of Prof. Hofstetter.

Submitting Jobs

In most cases you want to submit a non-interactive job to be executed on our cluster.

This is very simple for serial (1 CPU) jobs:

  sbatch -p PARTITION jobscript.sh

where jobscript.sh is a shell script with your job commands.

Running Open MPI jobs is not much more complicated:

  sbatch -p PARTITION -n X jobscript.sh

where X is the number of desired MPI processes. Launch the job in the jobscript with:

  mpirun YOUREXECUTABLE

You don't have to worry about the number of processes or specific nodes. Slurm and Open MPI are aware of each other, so mpirun takes both from the job allocation.
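A complete jobscript.sh for an MPI job might look like the following sketch. The executable name ./my_mpi_program is a placeholder, and the script assumes it is submitted with something like 'sbatch -p PARTITION -n 16 jobscript.sh'.

```shell
#!/bin/bash
# jobscript.sh: minimal MPI job script (sketch).

# Hypothetical executable name; replace it with your own program.
EXECUTABLE="./my_mpi_program"

# Slurm exports these variables inside a job; outside a job they are unset.
echo "Starting ${SLURM_NTASKS:-unknown} MPI processes on ${SLURM_JOB_NUM_NODES:-unknown} node(s)"

# No -np and no hostfile: mpirun takes both from the Slurm allocation.
if [ -x "$EXECUTABLE" ]; then
    mpirun "$EXECUTABLE"
else
    echo "Executable $EXECUTABLE not found" >&2
fi
```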

If you want InfiniBand for your MPI job (usually a good idea unless all processes run on the same node), you have to request the feature infiniband:

  sbatch -p dfg -C infiniband -n X jobscript.sh

Note: InfiniBand is only available for the partitions dfg and quantum.
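To see which partitions advertise a feature before requesting it with -C, the node features reported by sinfo can be filtered. This is a sketch: partitions_with_feature is a hypothetical helper, and it assumes the feature appears literally as "infiniband" in sinfo's feature column.

```shell
# Print the partitions whose nodes advertise the given feature.
partitions_with_feature() {
    # -a lists all partitions, -h drops the header;
    # %P is the partition name, %f the comma-separated node features.
    sinfo -h -a -o '%P %f' 2>/dev/null |
        awk -v want="$1" '{
            n = split($2, feats, ",")
            for (i = 1; i <= n; i++)
                if (feats[i] == want) { print $1; break }
        }' | sort -u
}

partitions_with_feature infiniband
```

Note that sinfo marks the default partition with a trailing "*", which this helper passes through unchanged.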

SMP jobs (multiple threads, not necessarily MPI) and MPI jobs confined to a single node are recommended for the dfg-big nodes. These are big hosts with up to 64 CPUs per node but a 'slow' gigabit network connection. Launch SMP jobs with

  sbatch -p PARTITION -N 1 -n X jobscript.sh
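Inside the job script, a threaded program usually has to be told how many threads to start. The following sketch assumes an OpenMP program (hypothetical name ./my_threaded_program) and uses SLURM_NTASKS, which Slurm sets to the value given with -n.

```shell
#!/bin/bash
# jobscript.sh for an SMP job, submitted for example with
#   sbatch -p PARTITION -N 1 -n 8 jobscript.sh

# Use as many threads as tasks were requested; fall back to 1 outside Slurm.
export OMP_NUM_THREADS="${SLURM_NTASKS:-1}"
echo "Using ${OMP_NUM_THREADS} threads"

# Hypothetical executable name; replace it with your own program.
if [ -x ./my_threaded_program ]; then
    ./my_threaded_program
fi
```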

Differences in the network connection

The new v3 dfg-xeon nodes are equipped with a 10 Gbit network. It offers higher throughput and lower latency than gigabit Ethernet, but is not as fast as the DDR InfiniBand network. The 10 Gbit network is used for MPI and I/O; InfiniBand is used only for MPI.

Defining Resource limits

By default each job is allocated 2 GB of memory and a run time of 3 days. More resources can be requested with

  --mem-per-cpu=<MB>

where <MB> is the memory in megabytes. The virtual memory limit is 2.5 times the requested real memory limit.

The memory limit is not a hard limit. When you exceed it, your memory is swapped out; your job is killed only when it uses more than 150% of the limit. Still, request no more than you need, because requested memory is blocked from use by other jobs.

  -t <time> or --time=<time>

where <time> can be given as "days-hours". See the sbatch man page for more formats.
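Both options can be combined on one command line. The sketch below uses a hypothetical helper, resource_args, that converts a memory request in GB to the megabytes expected by --mem-per-cpu (assuming 1 GB = 1024 MB). It only prints the resulting sbatch command instead of running it.

```shell
# Build sbatch resource options from memory per CPU (in GB) and a time limit.
resource_args() {
    mem_gb="$1"      # memory per CPU in GB
    time_limit="$2"  # for example 1-12 for one day and twelve hours
    # --mem-per-cpu expects megabytes.
    echo "--mem-per-cpu=$(( mem_gb * 1024 )) --time=${time_limit}"
}

# Example: 4 GB per CPU for at most 1 day and 12 hours (dry run).
echo sbatch -p PARTITION $(resource_args 4 1-12) jobscript.sh
# prints: sbatch -p PARTITION --mem-per-cpu=4096 --time=1-12 jobscript.sh
```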

Memory Management

In Slurm you specify only one parameter, the limit for your real memory usage, which also determines where your job is started. The virtual memory of your job may be up to 2.5 times the requested memory. You can exceed your memory limit by 50%, but the excess will be swap space instead of real memory. This prevents a crash if your memory limit is a little too tight.
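As a worked example of these factors, the following sketch computes the derived limits for a given request. memory_limits is a hypothetical helper, and the example value assumes that the 2 GB default corresponds to 2048 MB.

```shell
# Print the limits that follow from a --mem-per-cpu request (in MB).
memory_limits() {
    requested_mb="$1"
    virtual_mb=$(( requested_mb * 5 / 2 ))   # virtual memory: 2.5 times the request
    kill_mb=$(( requested_mb * 3 / 2 ))      # job is killed above 150% of the request
    echo "requested=${requested_mb} virtual=${virtual_mb} kill_above=${kill_mb}"
}

memory_limits 2048
# prints: requested=2048 virtual=5120 kill_above=3072
```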

Inline Arguments

sbatch arguments can be written in the jobfile:

#! /bin/bash
#
# Choosing a partition:
#SBATCH -p housewives

YOUR JOB COMMANDS....
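A jobfile that sets several of the options described on this page might look like the following sketch. PARTITION, the job name, and the output file name are placeholders. The #SBATCH lines have to appear before the first command of the script, and options given on the sbatch command line take precedence over them.

```shell
#!/bin/bash
#
# Partition, as with 'sbatch -p':
#SBATCH -p PARTITION
# Number of tasks:
#SBATCH -n 4
# Memory per CPU in MB:
#SBATCH --mem-per-cpu=4000
# Time limit in days-hours (here 1 day):
#SBATCH -t 1-0
# Job name and output file (%j is replaced by the job ID):
#SBATCH -J example
#SBATCH -o example-%j.out

# SLURM_JOB_ID is set by Slurm inside a job; outside a job it is unset.
job_id="${SLURM_JOB_ID:-none}"
echo "Job ${job_id} started on $(hostname)"
```

With all options in the file, the job is submitted simply with 'sbatch jobscript.sh'.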

Links

SLURM-Homepage [1]
