Changes for page Slurm
Last modified by Thomas Coelho (local) on 2025/03/18 13:17
From version 14.1
edited by Thomas Coelho (local)
on 2025/03/18 13:17
Change comment:
There is no comment for this version
To version 4.1
edited by Thomas Coelho
on 2022/12/08 10:56
Change comment:
There is no comment for this version
Summary
Page properties (2 modified, 0 added, 0 removed)
Details
- Page properties
- Author
... ... @@ -1,1 +1,1 @@
1 -XWiki.coelho
1 +XWiki.thw
- Content
... ... @@ -14,29 +14,21 @@
14 14
15 15 A partition is selected by '-p PARTITIONNAME'.
16 16
17 -|=**Partition** |=**No. Nodes** |=**Cores/M** |=**Tot. Cores**|=**RAM/GB/;** |=**CPU** |=**Remark/Restriction**
18 -|itp |12|20 |240|64 |Intel(R) Xeon(R) CPU E5-2640 v4 @ 2.40GHz|Common Usage
17 +|=(% scope="col" %)**Partition** |=(% scope="col" %)**No. Nodes** |=(% scope="col" %)**Cores/M** |=(% scope="col" %)**Tot. Cores**|=(% scope="col" %)**RAM/GB** |=(% scope="col" %)**CPU** |=(% scope="col" %)**Remark/Restriction**
18 +|itp |10|20 |200|64 |Intel(R) Xeon(R) CPU E5-2640 v4 @ 2.40GHz|Common Usage
19 +|dfg-big|3|32|96|128|8-Core AMD Opteron(tm) Processor 6128|Group Valenti
20 +|dfg-big|3|48|144|128/256|12-Core AMD Opteron(tm) Processor 6168|Group Valenti
21 +|dfg-big|4|64|256|128/256|16-Core AMD Opteron(tm) Processor 6272|Group Valenti
22 +|dfg-big|4|48|192|128/256|12-Core AMD Opteron(tm) Processor 6344|Group Valenti
19 19 |fplo|2|12|24|256|Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz|Group Valenti
20 20 |fplo|4|16|32|256|Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz|Group Valenti
21 21 |dfg-xeon|5|16|32|128|Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz|Group Valenti
22 22 |dfg-xeon|7|20|140|128|Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz|Group Valenti
23 -|iboga| 34|20|880|64|Intel(R) Xeon(R) CPU E5-2640 v4 @ 2.40GHz|Group Rezzolla
27 +|iboga|44|20|880|64|Intel(R) Xeon(R) CPU E5-2640 v4 @ 2.40GHz|Group Rezzolla
24 24 |dreama|1|40|40|1024|Intel(R) Xeon(R) CPU E7-4820 v3 @ 1.90GHz|Group Rezzolla
25 -|barcelona|8|40|320|192|Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz|(((
26 -Group Valenti
27 -)))
28 -|barcelona|1|40|40|512|(((
29 -Intel(R) Xeon(R) Silver 4316 CPU @ 2.30GHz
30 -)))|Group Valenti
31 -|mallorca|4|48|192|256|AMD EPYC 7352 24-Core Processor|Group Valenti
32 -|calea|36|64|2304|512|Intel(R) Xeon(R) Platinum 8358 CPU @ 2.10GHz|Group Rezzolla
33 -|bilbao|7|64|448|512|Intel Xeon(R) Gold 6540 @ 2.20GHz(((
34 -Group Valenti
35 -)))
36 -|majortom|1|64|64|256|AMD EPYC 7513 32-Core Processor|Group Bleicher
29 +|barcelona|8|40|320|192|Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz|Group Valenti\\
37 37
38 -Most nodes are for exclusive use by their corresponding owners. The itp nodes are for common usage. Except for 'fplo' and '
39 -majortom', all machines are connected with Infiniband for all traffic (IP and internode communitcation - MPI)
31 +Most nodes are for exclusive use by their corresponding owners. The itp nodes are for common usage. Except for 'fplo' and 'dfg-big' nodes, all machines are connected with Infiniband for all traffic (IP and internode communitcation - MPI)
40 40
41 41 == Submitting Jobs ==
42 42
... ... @@ -59,14 +59,23 @@
59 59 You don't have to worry about the number of processes or specific nodes. Both slurm and openmpi know
60 60 about each other.
61 61
62 - Running **SMP jobs** (multiple threads, not necessary mpi). Running MPI jobs on a single node is recommended for the
54 +If you want **infiniband** for your MPI job (which is usually a good idea, if not running on the same node), you have to request the feature infiniband:
63 63
56 +{{{ sbatch -p dfg -C infiniband -n X jobscript.sh}}}
57 +
58 +Note: Infiniband is not available for 'fplo' and 'dfg-big'.
59 +
60 +Running **SMP jobs** (multiple threads, not necessary mpi). Running MPI jobs on a single node is recommended for the
64 64 dfg-big nodes. This are big host with up to 64 cpu's per node, but 'slow' gigabit network connection. Launch SMP jobs with
65 65
66 66 {{{ sbatch -p PARTITION -N 1 -n X jobscript.sh}}}
67 67
65 +=== Differences in network the network connection ===
68 68
69 69
68 +
69 +The new v3 dfg-xeon nodes are equipped with 10 GB network. This is faster (trough put) and has lower latency then gigabit ethernet, but is not is not as fast as the DDR infinband network. The 10 GB network is used for MPI and I/O. Infiniband is only use for MPI.
70 +
70 70 == Defining Resource limits ==
71 71
72 72 By default each job allocates 2 GB memory and a run time of 3 days. More resources can be requested by
... ... @@ -75,7 +75,7 @@
75 75
76 76 where <MB> is the memory in megabytes. The virtual memory limit is 2.5 times of the requested real memory limit.
77 77
78 -The memory limit is not a hard limit. When exceeding the limit, your memory will be swapped out. Only when using more the 110% of the limit your job will be killed. So be conservative, to keep enough room for other jobs. Requested memory is blocked from the use by other jobs.
79 +The memory limit is not a hard limit. When exceeding the limit, your memory will be swapped out. Only when using more the 150% of the limit your job will be killed. So be conservative, to keep enough room for other jobs. Requested memory is blocked from the use by other jobs.
79 79
80 80 {{{ -t or --time=<time>}}}
81 81