Changes for page GPU Server
Last modified by Thomas Coelho (local) on 2024/10/01 13:47
From version 12.1
edited by Thomas Coelho (local)
on 2024/06/07 10:12
Change comment:
There is no comment for this version
To version 7.1
edited by Thomas Coelho (local)
on 2023/11/06 11:08
Change comment:
There is no comment for this version
Summary
Page properties (1 modified, 0 added, 0 removed)
Details
- Page properties
- Content
@@ -4,11 +4,11 @@
 
 = The GPU Server =
 
-The GPU machine is a two socket server with AMD EPYC 7313 processors. One processor a 16 Cores, actually with SMT enabled (32 Threads). It comes with 512 GB of memory and 2 x 4 TB U.3 (NVMe) SSDs as fast storage. There are** 8 AMD Instinct Mi 50** GPU cards for computing.
+The GPU machine is a two socket server with AMD EPYC 7313 processors. The processors have 16 Cores, actually with SMT enabled (32 Threads). It comes with 512 GB of memory and 2 x 4 TB U.3 (NVMe) SSDs as fast storage. There are** 8 AMD Instinct Mi 50** GPU cards for computing.
 
 Access is given by SLURM and the separate partition "gpu".
 
-As software stack AMD ROCm is installed. This supports the ROCm and openCL interface. Current ROCm Stack is version 6.1. This is also packaged in Ubuntu 6.1.
+As software stack AMD ROCm is installed. This supports the ROCm and openCL interface.
 
 (% class="box infomessage" %)
 (((
@@ -15,10 +15,6 @@
 Because GPU computing is a new discipline, we can only provide limited information here. If you have something to share, please fell free to edit this page.
 )))
 
-{{warning}}
-To have built in ROCm support in slurm, this machine has already been updated to Ubuntu 24.04. There are some parts of the ROCm stack included in the distribution which is a mixture of 5.7 and 6.0. Official Support from AMD for Ubuntu 24.04 is not yet available. Pytorch has succesfully tested with this setup.
-{{/warning}}
-
 == Submitting ==
 
 GPUs are handled as generic resources in Slurm (gres).
@@ -45,13 +45,9 @@
 Install Pytorch:
 
 {{code language="bash"}}
-pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.0
-
-
+pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm5.6
 {{/code}}
 
-At time of writing it's not available for 6.1. Please check the pytorch Website for updates.
-
 You can test the installation with
 
 {{code language="python"}}
@@ -70,4 +70,5 @@
 Pytorch: [[https:~~/~~/pytorch.org/>>https://pytorch.org/]]
 
 
+
 
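The page's own test snippet is cut off by the diff context, so it is not reproduced here. As a separate illustration, the following is a minimal sketch of how a PyTorch installation on a ROCm machine might be checked. The helper name `describe_gpus` is hypothetical and not taken from the page. The sketch assumes that ROCm builds of PyTorch expose their GPUs through the `torch.cuda` namespace and set `torch.version.hip`, and it returns an empty report instead of failing when PyTorch is not installed.

```python
import importlib.util


def describe_gpus():
    """Report what PyTorch can see. Hypothetical helper, not from the wiki page."""
    report = {"torch_installed": False, "hip_version": None, "devices": []}

    # Return the empty report if PyTorch is not installed in this environment.
    if importlib.util.find_spec("torch") is None:
        return report

    import torch

    report["torch_installed"] = True
    # torch.version.hip is a version string on ROCm builds and None otherwise.
    report["hip_version"] = getattr(torch.version, "hip", None)

    # ROCm builds reuse the torch.cuda API for AMD GPUs.
    if torch.cuda.is_available():
        report["devices"] = [
            torch.cuda.get_device_name(i)
            for i in range(torch.cuda.device_count())
        ]
    return report


if __name__ == "__main__":
    for key, value in describe_gpus().items():
        print(f"{key}: {value}")
```

Run inside a Slurm job on the "gpu" partition, the `devices` list should contain one entry per GPU the job was granted. On a login node without GPUs it would be expected to stay empty.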