SLURM
SLURM (Simple Linux Utility for Resource Management) is an open-source, fault-tolerant, and highly scalable cluster management and job scheduling system for large and small Linux clusters.
Partitions
A partition is a subset of the cluster, a bundle of compute nodes with the same characteristics.
Our cluster is divided into different partitions based on access restrictions. 'sinfo' shows only the partitions you are allowed to use; 'sinfo -a' shows all partitions.
Partition | No. Nodes | Cores | Tot. Slots | RAM/GB | CPU | Remark |
---|---|---|---|---|---|---|
housewifes | 15 | 4 | 72 | 16 | Dual Core AMD Opteron(tm) Processor 270 2.0 GHz | |
dfg | 12 | 8 | 96 | 32 | Quad-Core AMD Opteron(tm) Processor 2346 HE | Restricted access |
dfg | 8 | 8 | 64 | 32/64 | Quad-Core AMD Opteron(tm) Processor 2376 | Infiniband, Restricted access |
dfg | 8 | 12 | 96 | 32/64 | Six-Core AMD Opteron(tm) Processor 2427 | Infiniband, Restricted access |
quantum | 8 | 12 | 96 | 32/64 | Six-Core AMD Opteron(tm) Processor 2427 | Infiniband, Restricted access |
dfg-big | 3 | 32 | 96 | 128 | 8-Core AMD Opteron(tm) Processor 6128 | Restricted access |
dfg-big | 3 | 48 | 144 | 128 | 12-Core AMD Opteron(tm) Processor 6168 | Restricted access |
Access to the DFG nodes (dfg, dfg-ib and dfg-big) is restricted to members of the SFB/TR49. If you do not belong to that group but want to test and develop programs for the Infiniband network, please talk to the administrator. Access to the partition 'quantum' is restricted to the group of Prof. Hofstetter.
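To check which of the partitions above are visible to your account, something like the following sketch can be used. The function name `show_partitions` is hypothetical, and the guard is there only so that the snippet does nothing harmful on a machine without Slurm installed.

```shell
# Hypothetical helper: list partitions with sinfo if Slurm is available.
show_partitions() {
    if command -v sinfo >/dev/null 2>&1; then
        sinfo                 # only the partitions you are allowed to use
        sinfo -a              # all partitions, including restricted ones
        sinfo -p housewifes   # limit the listing to a single partition
    else
        echo "sinfo not found: run this on a cluster login node"
    fi
}

show_partitions
```

If a partition from the table is missing in the plain 'sinfo' output but appears with 'sinfo -a', your account most likely lacks access to it.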
SLURM vs. SGE
This chapter compares the new batch system, SLURM, with the old one, SGE (Sun Grid Engine).
Partitions
Slurm has a slightly different view of the cluster: the nodes of a cluster are organized in partitions. To submit a job, you have to choose the partition in which the job should run.
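Choosing a partition at submission time might look like the sketch below. The file name `hello_job.sh`, the job name, and the time limit are assumptions for illustration; only the partition name is taken from the table on this page. The script is written to disk first and submitted only if 'sbatch' is available.

```shell
# Write a minimal job script; '#SBATCH' lines are read by sbatch as options.
cat > hello_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=hello          # hypothetical job name
#SBATCH --partition=housewifes    # partition the job should run in
#SBATCH --ntasks=1                # a single task
#SBATCH --time=00:05:00           # wall-clock limit (assumed value)
#SBATCH --output=hello_%j.out     # %j expands to the job ID

hostname
EOF

# Submit only where Slurm is installed.
if command -v sbatch >/dev/null 2>&1; then
    sbatch hello_job.sh
else
    echo "sbatch not found: submit hello_job.sh from a cluster login node"
fi
```

The partition can also be given on the command line, for example 'sbatch --partition=housewifes hello_job.sh', which overrides the value in the script.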
Comparison of commands
The following table shows the most important Slurm commands next to their Grid Engine counterparts.
SGE | Slurm | Description |
---|---|---|
qstat | squeue | Show running jobs |
qsub | sbatch | Submit a batch job |
qlogin | srun | Run interactive commands |
qdel | scancel | Delete a batch job |
qhost | sinfo | Get info about nodes |
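For users who still type the old command names out of habit, the mapping can be kept at hand as a small lookup function. This is only an illustrative sketch: the function name `sge2slurm` is hypothetical and not part of Slurm or SGE, and it uses 'srun' as the Slurm command for interactive work.

```shell
# Hypothetical helper: print the Slurm command that replaces an SGE command.
sge2slurm() {
    case "$1" in
        qstat)  echo squeue ;;   # show jobs in the queue
        qsub)   echo sbatch ;;   # submit a batch job
        qlogin) echo srun ;;     # run interactive commands
        qdel)   echo scancel ;;  # delete a batch job
        qhost)  echo sinfo ;;    # get info about nodes
        *)      echo "no Slurm equivalent known for: $1" >&2; return 1 ;;
    esac
}

sge2slurm qsub    # prints: sbatch
```

Typical invocations on the cluster would then be, for example, 'squeue -u $USER' to list your own jobs or 'srun --partition=housewifes --pty bash' to get an interactive shell on a compute node.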
Links
- Homepage [1]