Intel Compiler
Revision as of 12:09, 15 September 2010
This page describes how to set up your environment for using our temporary Intel Compiler installation. The instructions are for bash shell users. If you use a different shell, you presumably know what you are doing and will be able to translate the configuration.
We provide the recent version 11 of the Intel compiler suite for Fortran and C/C++, together with the full set of performance libraries such as MKL, TBB and Cluster_OMP.
.bashrc
In order to use the installation you have to modify your .bashrc. Open the file with your favorite editor and append the following lines.
# Intel Compiler Setup
export LM_LICENSE_FILE="16287@th.physik.uni-frankfurt.de"
# Default to the 64 bit environment, switch to 32 bit on i686 machines
intel_arch="intel64"
if [[ "$(uname -m)" == "i686" ]]; then
    intel_arch="ia32"
fi
# C/C++ compiler environment
iccvars=/opt/intel/Compiler/11.0/069/bin/iccvars.sh
if [ -f "$iccvars" ]; then
    . "$iccvars" "$intel_arch"
fi
# Fortran compiler environment
ifortvars=/opt/intel/Compiler/11.0/069/bin/ifortvars.sh
if [ -f "$ifortvars" ]; then
    . "$ifortvars" "$intel_arch"
fi
There are similar files with .csh extension for C-Shell users.
After sourcing your .bashrc or opening a new shell, you can validate your setup with 'which ifort'. The result should point to a location under /opt.
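The check described above can be sketched as a small helper that reports where each compiler driver is found. This is only an illustration: the function name check_tool is hypothetical, and it uses the shell builtin 'command -v' in place of 'which'.

```shell
# Hypothetical helper: print the location of a tool, or say it is missing.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: $(command -v "$1")"
    else
        echo "$1: not found"
    fi
}

# Both drivers should resolve to a path under /opt after a correct setup.
check_tool ifort
check_tool icc
```

If either line reports "not found", the variables script was probably not sourced, for example because the file does not exist on that machine.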
32/64 bit Issues
If you only work on 64 bit machines you can skip this section. If you work on a 32 bit machine, things can get more complicated. If you work exclusively on a 32 bit machine and don't want to run your programs on our HPC nodes, you can also skip this. 32 bit machines are mainly the white FSC boxes and, less importantly, the pool. Running "uname -m" in a shell will tell you your architecture: i686 for 32 bit or x86_64 for 64 bit. The setup above will choose the appropriate environment.
If you are using a 32 bit machine and want to use the HPC nodes, it is recommended that you log in to our login server th.physik.uni-frankfurt.de and compile your code there. 32 bit programs will run on 64 bit machines but have some limitations: a small performance impact and a limited address space (4 GB), with much less usable RAM.
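As an illustration of how the "uname -m" output maps to the Intel environment name, here is a minimal sketch. The function name intel_arch_for is hypothetical; the mapping itself (ia32 for i686, intel64 otherwise) follows the .bashrc setup on this page.

```shell
# Hypothetical helper: map a machine name as printed by "uname -m"
# to the architecture argument expected by the Intel setup scripts.
intel_arch_for() {
    if [ "$1" = "i686" ]; then
        echo "ia32"      # 32 bit environment
    else
        echo "intel64"   # default: 64 bit environment
    fi
}

# Show the result for the machine you are logged in to.
echo "machine: $(uname -m), Intel environment: $(intel_arch_for "$(uname -m)")"
```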
Implicit parallelization
This version of MKL contains internally parallelized code (based on OpenMP) that uses all CPUs available in the system. This is fine on the desktop, as it will speed up your calculation. On the compute nodes, however, it will conflict with other jobs running on the same machine. Therefore you have to restrict the number of CPUs used to the number of slots allocated in SGE. You can do this by defining the following environment variable:
export OMP_NUM_THREADS=x
where x is the number of allocated slots. Using 1 means traditional serial processing. You should benchmark your program to find out how many parallel threads give reasonable performance per CPU. In SGE you have to use the PE 'smp' to ensure that all parallel slots are located on the same machine.
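A job script combining these points might look like the following sketch. It assumes that SGE exports the number of granted slots in the NSLOTS variable, which is standard Grid Engine behavior. The slot count of 4 and the program name my_program are placeholders and not part of our setup.

```shell
#!/bin/bash
# Request 4 slots in the 'smp' parallel environment (4 is only an example).
#$ -pe smp 4
# Run the job in the directory it was submitted from.
#$ -cwd

# NSLOTS is set by SGE to the number of allocated slots.
# Outside of SGE it is unset, so fall back to serial processing.
export OMP_NUM_THREADS="${NSLOTS:-1}"
echo "Running with OMP_NUM_THREADS=$OMP_NUM_THREADS"

# Start your program here (placeholder name):
# ./my_program
```

Taking the thread count from NSLOTS avoids having to keep the number in the 'smp' request and the export line in sync by hand.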
Documentation
The comprehensive set of Intel documentation can be found under /opt/intel/Compiler/11.0/069/Documentation/.