Changes for page GPU Server

Last modified by Thomas Coelho (local) on 2024/10/01 13:47

From version 7.1
edited by Thomas Coelho (local)
on 2023/11/06 11:08
Change comment: There is no comment for this version
To version 12.1
edited by Thomas Coelho (local)
on 2024/06/07 10:12
Change comment: There is no comment for this version

Summary

Details

Page properties
Content
... ... @@ -4,11 +4,11 @@
4 4  
5 5  = The GPU Server =
6 6  
7 -The GPU machine is a two socket server with AMD EPYC 7313 processors. The processors have 16 Cores, actually with SMT enabled (32 Threads). It comes with 512 GB of memory and 2 x 4 TB U.3 (NVMe) SSDs as fast storage. There are** 8 AMD Instinct Mi 50** GPU cards for computing.
7 +The GPU machine is a two-socket server with AMD EPYC 7313 processors. Each processor has 16 cores, currently with SMT enabled (32 threads). It comes with 512 GB of memory and 2 x 4 TB U.3 (NVMe) SSDs as fast storage. There are **8 AMD Instinct MI50** GPU cards for computing.
8 8  
9 9  Access is given by SLURM and the separate partition "gpu".
10 10  
11 -As software stack AMD ROCm is installed. This supports the ROCm and openCL interface.
11 +AMD ROCm is installed as the software stack. It supports the ROCm and OpenCL interfaces. The current ROCm stack is version 6.1. This is also packaged in Ubuntu 6.1.
12 12  
13 13  (% class="box infomessage" %)
14 14  (((
... ... @@ -15,6 +15,10 @@
15 15  Because GPU computing is a new discipline, we can only provide limited information here. If you have something to share, please feel free to edit this page.
16 16  )))
17 17  
18 +{{warning}}
19 +To get built-in ROCm support in Slurm, this machine has already been updated to Ubuntu 24.04. The distribution includes some parts of the ROCm stack, which are a mixture of versions 5.7 and 6.0. Official support from AMD for Ubuntu 24.04 is not yet available. PyTorch has been successfully tested with this setup.
20 +{{/warning}}
21 +
18 18  == Submitting ==
19 19  
20 20  GPUs are handled as generic resources in Slurm (gres).
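
As a rough illustration of requesting GPUs as a generic resource, the sketch below assembles an sbatch command line for the "gpu" partition. The gres name `gpu`, the script name `job.sh`, and the helper `build_sbatch_command` are assumptions made for this example and are not taken from this page; check the actual gres configuration before relying on it.

```python
import shlex


def build_sbatch_command(script, gpus=1, partition="gpu"):
    """Return an sbatch argument list that requests `gpus` GPUs via gres.

    Assumes the GPUs are registered in Slurm under the gres name "gpu".
    """
    if gpus < 1:
        raise ValueError("gpus must be at least 1")
    return [
        "sbatch",
        f"--partition={partition}",
        f"--gres=gpu:{gpus}",  # generic resource request: <name>:<count>
        script,
    ]


if __name__ == "__main__":
    # "job.sh" is a placeholder for your own batch script.
    print(shlex.join(build_sbatch_command("job.sh", gpus=2)))
```

The list form can be passed directly to `subprocess.run` if the job should be submitted from Python rather than printed.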
... ... @@ -41,9 +41,13 @@
41 41  Install PyTorch:
42 42  
43 43  {{code language="bash"}}
44 -pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm5.6
48 +pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.0
49 +
50 +
45 45  {{/code}}
46 46  
53 +At the time of writing, PyTorch is not available for ROCm 6.1. Please check the PyTorch website for updates.
54 +
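Because the wheel index changes with the ROCm version, a small helper can keep the URL and the installed build in view. This is a minimal sketch: the helper names `rocm_index_url` and `installed_hip_version` are made up for this example, and the URL pattern is only inferred from the install command above. It does not check whether an index for a given version actually exists.

```python
import importlib.util

# Base URL taken from the pip command above; the "rocm<version>" suffix
# pattern is an assumption inferred from that command.
PYTORCH_WHEEL_BASE = "https://download.pytorch.org/whl"


def rocm_index_url(rocm_version):
    """Build the pip index URL for a ROCm version string such as "6.0"."""
    return f"{PYTORCH_WHEEL_BASE}/rocm{rocm_version}"


def installed_hip_version():
    """Return the HIP version of the installed torch build, or None.

    None means torch is not installed or is not a ROCm build.
    """
    if importlib.util.find_spec("torch") is None:
        return None
    import torch

    return getattr(torch.version, "hip", None)


if __name__ == "__main__":
    print("index URL:", rocm_index_url("6.0"))
    print("installed HIP version:", installed_hip_version())
```

If the second line prints `None` after installation, the installed wheel is probably not a ROCm build.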
47 47  You can test the installation with
48 48  
49 49  {{code language="python"}}
... ... @@ -62,5 +62,4 @@
62 62  Pytorch: [[https:~~/~~/pytorch.org/>>https://pytorch.org/]]
63 63  
64 64  
65 -
66 66