How to: Usage of software in AG Valentí

To reduce redundant installations across users and to provide a common, hassle-free basis, commonly used software and libraries are installed under the username "ag-valenti".

Recent updates

Date	Changes
19.02.2021	Added `w2k_clean` utility.
08.02.2021	Boost 1.75.0 and TEMA 1.0 (Total Energy Mapping Analysis by A. Razpopov) are now available. Updated VESTA to 3.5.7.
04.11.2020	FFTW 3.3.8 and Armadillo 10.1.1 are now available.
03.11.2020	Eigen 3.3.8 is now available. This is automatically added to the `CPATH`, so it should be found by the compiler.
19.10.2020	VASP input files can now be copied from the documentation directory.
22.09.2020	Added Wannier90 (version 3.1.0) installation.
14.09.2020	Added FPLO installation for Ubuntu 20.04. The correct executables should be loaded automatically. At the moment xfplo and xfbp are not working.
03.08.2020	Added VESTA 3.5.2, please test `vesta-3.5.2`. Added VASP. Version 6.1.1 is currently only available on Barcelona. Added clean_fplo and w2k_machines_setup.

The following software is available (not necessarily a complete listing):

Software

	Availability
WIEN2k 19.1	OK
FPLO 18.00-52	XFPLO and XFBP only on Ubuntu 18.04
VASP 5.4.4	Only on Ubuntu 18.04 (clusters)
VASP 6.1.1	Only on Ubuntu 18.04 (clusters)
VESTA	OK
Wannier90 3.1.0	OK
Wannier90 2.0.1	OK
TEMA 1.0	OK

Developer tools and libraries

	Availability
Intel Compiler 2019.3.199	OK
Intel MKL 2019.3.199	OK
Intel MPI 2020.1.217	Only on Ubuntu 18.04 (clusters)
HPCX MPI 2.6	Only on Ubuntu 18.04 (clusters)
Eigen 3.3.8	OK
FFTW 3.3.8	OK
Armadillo 10.1.1	OK
Boost 1.75.0	OK
GCC 10.1	planned

Please email requests for installation of additional software and libraries, questions or contributions to this website to support.

Getting Started

Prerequisites

To access the installation you have to be in the group ag-valenti, otherwise you will get "permission denied" errors. Check this with the terminal command id: ag-valenti should appear in the groups=... part of its output.

For running jobs on the group-only compute clusters (barcelona, dfg-big, dfg-xeon, fplo) you need to be in the group slurm-dfg.

Please contact the ITP system administrator to be added to these groups.
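
The membership check can also be scripted. A minimal sketch, assuming only the standard id utility; the helper name in_group is made up for this example and is not part of the ag-valenti installation:

```shell
#!/bin/bash
# in_group is a hypothetical helper.
# It succeeds if the current user is a member of the group named in $1.
in_group() {
    local g
    # id -nG prints the names of all groups of the current user
    for g in $(id -nG); do
        [ "$g" = "$1" ] && return 0
    done
    return 1
}

for group in ag-valenti slurm-dfg; do
    if in_group "$group"; then
        echo "$group: ok"
    else
        echo "$group: missing, contact the ITP system administrator"
    fi
done
```

Note that a newly granted group membership only shows up after logging in again.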

Software activation

To avoid confusion and errors, there is no single path containing all binaries. Each user must request the activation of a desired program via a loader script that lies in the user home of ag-valenti. The script should be called like this:

source /home/ag-valenti/activate <option>

This is the preferred and only supported method. Setting environment variables manually is possible, but it is error-prone, so the activate script is also the most convenient option.

Loading different programs can lead to problems in the dynamic linking stage. Please try to avoid unnecessary loading of software and do this in your job script or in separate terminal profiles.

Currently supported options: wien, fplo, vasp, default. For an up-to-date listing of available options please use the -h or --help flags.

Please do not add this to your .bashrc file as this may have consequences for the stability of other software. Only the default option is considered safe for loading in your .bashrc.
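
If a program is only needed for a single command, a subshell keeps its environment from leaking into the rest of the session, which is one way to follow the advice above on avoiding unnecessary loading. A minimal sketch; run_with is a made-up helper name, not part of the installation:

```shell
#!/bin/bash
# Path taken from the text above; can be overridden for testing.
ACTIVATE=${ACTIVATE:-/home/ag-valenti/activate}

# run_with is a hypothetical helper: run_with <option> <command> [args...]
# The parentheses start a subshell, so everything the activate script sets
# is gone again once the command has finished.
run_with() {
    local option=$1
    shift
    ( source "$ACTIVATE" "$option" && "$@" )
}
```

For example, run_with wien command -v lapw0 would print where lapw0 was found, while the calling shell stays untouched.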

Description of options (activate help menu may be more up to date):

Option	Description	Version information
default	Loads several useful common programs: VESTA, w2k_machines_setup, cif2fplo, clean_fplo, ssubmit, sterminate, asbatch
wien	Loads WIEN2k executables init_lapw, lapw0, etc.	19.1
fplo	Loads FPLO executables fedit, fplo, dirac, xfbp, xfplo	18.00-52
vasp	Loads VASP executables vasp_std, vasp_ncl, vasp_gam	5.4.4, 6.1.1

Versions

The ag-valenti program loader is designed to host different versions for each program. It is intended to maintain old versions and add new versions rather than performing in-place upgrades. Each user has the freedom to choose from the installed versions.

Usually, the newest installed version is loaded automatically. To load a specific version, append it to the program name:

source /home/ag-valenti/activate <name>@<version>

Versions typically have the format a.b or a.b.c. Available versions for a specific program are listed by -h,--help.

Some programs will be available in multiple platform optimized binaries. In this case the correct binaries will be selected by the loader automatically based on the platform the request is coming from. There is no way to request binaries for a specific platform.

Developer tools

In addition to numerical tools we also host commonly used developer tools and libraries that may be more up-to-date than the system default.

To use these, type

source /home/ag-valenti/activate_dev <option>

Example Eigen

For example, to use Eigen do

source /home/ag-valenti/activate_dev eigen

g++ your_code.cpp -o your_executable
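
To confirm that the activation took effect before compiling, one can inspect CPATH, which according to the changelog above is extended for Eigen. A small sketch; the helper name list_cpath is invented here, and what the entries look like depends on the installation:

```shell
#!/bin/bash
# list_cpath is a hypothetical helper: print each CPATH entry on its own line.
# GCC searches these directories for headers in addition to its defaults.
list_cpath() {
    local IFS=:
    local entry
    for entry in ${CPATH:-}; do
        [ -n "$entry" ] && echo "$entry"
    done
    return 0
}
```

After source /home/ag-valenti/activate_dev eigen, the output of list_cpath should include the directory that holds the Eigen headers.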

Documentation

Some programs come with documentation embedded in the source tree. In this case we provide a symbolic link to the respective directory under /home/ag-valenti/docs to abstract the installation details from the user. Use ls /home/ag-valenti/docs to list all available documentation. Note that it is provided "as is".

Tips and Tricks

Custom terminal profiles

In case one uses e.g. WIEN2k and VASP very often in an interactive terminal session one might be tempted to load both programs in the default .bashrc file. This can lead to problems if the programs use different versions of the same dynamic library. In this case it can be helpful to define custom terminal profiles.

Here are a few ways to achieve this:

Define aliases for each program, e.g. alias loadvasp="source /home/ag-valenti/activate vasp"

Create a launcher script that opens a terminal whose shell reads .vaspbashrc via the bash option --rcfile (this is an option of bash, not of gnome-terminal)

File: vaspterm

#!/bin/bash
gnome-terminal -- bash --rcfile "$HOME/.vaspbashrc"

Create separate profiles
1. Go to Edit -> Preferences
2. Next to Profiles click on + and type a name
3. Go to Command and check Run a custom command instead of my shell
4. Under Custom command add e.g. source .vaspbashrc
5. Choose to Hold the terminal open under When command exits.
The profile can always be changed under Terminal -> Change Profile. To run a terminal with a specific profile use gnome-terminal --window-with-profile=<profile_name>.
An executable can be created like this:
File: abcterm

```
#!/bin/bash
gnome-terminal --window-with-profile=<profile_name>
```
Don't forget chmod u+x abcterm.
Any of the above can be made more convenient by defining a custom keyboard shortcut for your terminal (see below).
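
The options above refer to a file .vaspbashrc whose contents are not shown. A minimal sketch of what it could contain is given below; everything in it is an assumption to adapt, and only the activate call comes from this page. Because bash reads this file instead of ~/.bashrc when started with --rcfile, the sketch sources ~/.bashrc itself.

```shell
# ~/.vaspbashrc: hypothetical contents, adapt to your needs

# Keep the usual interactive settings
if [ -r "$HOME/.bashrc" ]; then
    . "$HOME/.bashrc"
fi

# Activate VASP for this terminal only (guarded so that the file is harmless
# on machines where the installation is not available)
if [ -r /home/ag-valenti/activate ]; then
    . /home/ag-valenti/activate vasp
fi

# Mark the prompt so it is obvious which profile is active
PS1="(vasp) ${PS1:-\$ }"
```

This only helps if ~/.bashrc itself does not activate a conflicting program.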

Custom keyboard shortcuts

Having keyboard shortcuts for often used tasks is very handy. This is how it is done:

Open the system settings menu and navigate to: Settings > Devices > Keyboard.
Scroll to the bottom and click on +
Enter a name of your choice and the command for which you want to create the shortcut.
Click on Set Shortcut and press whatever key combination seems convenient to you.

Slurm batch script essentials

Slurm is a workload manager that distributes work over the available resources. A single work entity is called a "job". To obtain an allocation we typically use the batch submission system, where a number of jobs are pushed onto a queue and then executed as resources become available.

sbatch

To submit a job use the command


sbatch job.sh

A typical job script looks like this:


#!/bin/bash

#SBATCH --partition=PARTITION
#SBATCH --ntasks=NTASKS
#SBATCH --time=dd-hh:mm:ss
#SBATCH --job-name=JOBNAME

# your command to be executed

Description of options:

partition	name of the partition: itp, itp-big, barcelona, dfg-xeon, dfg-big, fplo
ntasks	Number of tasks. This will allocate ntasks processors unless `--cpus-per-task` is specified
cpus-per-task	Request a number of CPU cores per task. The total number of processor cores allocated is then ntasks*cpus-per-task. Useful for shared memory programs, because a single task is guaranteed to run on one node. Groups of processors belonging to the same task will also sit on the same node.
time	Time limit for the allocation in the format `dd-hh:mm:ss`. After this has run out the job will be canceled.
job-name	Name of your job
mem	Requests a specific amount of memory (in MB unless a unit is given). Exceeding this limit may be tolerated temporarily, but will eventually cause the job to fail. Use 5120 or 5G for 5 GB.
mail-type	Control which status mails will be sent to your email address. NONE, BEGIN, END, FAIL, ALL are common values.
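
As an illustration of how ntasks and cpus-per-task combine, the following sketch requests one task with four cores for a shared memory (OpenMP) program. The partition, resource values, job name and the program name my_openmp_program are placeholders; SLURM_CPUS_PER_TASK is the variable Slurm sets when --cpus-per-task is given.

```shell
#!/bin/bash
#SBATCH --partition=itp
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=5G
#SBATCH --time=00-02:00:00
#SBATCH --job-name=omp_example
#SBATCH --mail-type=END,FAIL

# Use exactly as many threads as cores were allocated to the task.
# Falls back to 1 when the script is run outside of Slurm.
export OMP_NUM_THREADS="${SLURM_CPUS_PER_TASK:-1}"

echo "Starting with $OMP_NUM_THREADS thread(s)"
# ./my_openmp_program    # placeholder for your own executable
```

Here the total allocation is ntasks*cpus-per-task = 1*4 = 4 cores, all on the same node.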

squeue

Use the command squeue to display the current job queue. A few notable options:

`-u,--user`	show only the jobs belonging to the given user
`-p,--partition`	show only the jobs running and queueing on the given partition
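
The two options can be combined. A small convenience sketch, assuming nothing beyond the squeue options listed above; the function name my_jobs is made up:

```shell
#!/bin/bash
# my_jobs is a hypothetical helper: list your own jobs, optionally
# narrowed down further by any extra squeue options.
my_jobs() {
    squeue --user="${USER:-$(id -un)}" "$@"
}
```

For example, my_jobs --partition=barcelona shows only your jobs on the barcelona partition.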

Software Help

WIEN2k

The most recent user guide can be found over at TU Wien. For the one shipping with any installed version visit the documentation archive.

Saving, restoring and cleaning a calculation

WIEN2k is notorious for creating large files that are essentially a complete waste of disk space. However, the program comes with tools that alleviate that problem.

Use save_lapw -d <directory> to store all input files and all necessary files for restoring a calculation to ./<directory>. For more information use the option -h.

After changing input files or after another calculation the previous run can be restored with restore_lapw -d <directory> if save_lapw has been run before.

Once your calculation is completed clean up the directory with clean_lapw or w2k_clean. This keeps only the important files and deletes the large files. Should those be needed later on they can be recovered quickly. For this, check which subprogram creates which file in the user guide.

WIEN2k Utility Suite

Tools that help with common WIEN2k related tasks are collected in this suite. Currently it contains:

w2k_clean	Cleans large files (like `clean_lapw`) but keeps `:log` file.
w2k_machines_setup	Generates a valid `.machines` file based on the resources requested from Slurm. This works for an arbitrary number of (fully allocated) nodes, but is limited to k-parallelization (no atom-parallelization via MPI at the moment).
fix_wannier90_hopping	Converts Wannier90 hopping file to a format without degeneracies.

Example script

A valid job script for parallel jobs looks like this:


#!/bin/bash
#SBATCH --partition=barcelona
#SBATCH --ntasks=40
#SBATCH --time=00-10:00:00
#SBATCH --job-name=my_job_name

. /home/ag-valenti/activate wien

w2k_machines_setup

run_lapw -p -e 0.0001 -c 0.0001

k-parallelization uses process communication via SSH. Be sure to set up an SSH public-private key pair without a passphrase via ssh-keygen and add the public key to your ~/.ssh/authorized_keys file (not known_hosts, which only stores the host keys of remote machines). Otherwise you will face "Permission denied" errors.
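
The key setup can be sketched as follows. The public key has to end up in ~/.ssh/authorized_keys, which is the file sshd consults on login. The helper name setup_ssh_key and the choice of an ed25519 key are assumptions of this example, as is the premise that your home directory is shared between the compute nodes.

```shell
#!/bin/bash
# setup_ssh_key is a hypothetical helper. It assumes that your home directory
# is shared between the compute nodes, so one key pair serves all of them.
# Optional argument: the SSH directory (default: ~/.ssh).
setup_ssh_key() {
    local dir=${1:-$HOME/.ssh}
    local key=$dir/id_ed25519
    local auth=$dir/authorized_keys

    mkdir -p "$dir" && chmod 700 "$dir" || return 1

    # Create a key pair with an empty passphrase (-N "") unless one exists
    if [ ! -f "$key" ]; then
        ssh-keygen -q -t ed25519 -N "" -f "$key" || return 1
    fi

    # Authorize the public key, but only once
    touch "$auth" && chmod 600 "$auth" || return 1
    if ! grep -qxF -- "$(cat "$key.pub")" "$auth"; then
        cat "$key.pub" >> "$auth"
    fi
}
```

Calling setup_ssh_key once is enough; running it again leaves an existing key and an existing authorized_keys entry untouched.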

Unfortunately, due to the way k-point parallelization works, it is necessary to have the activate line in the .bashrc file if you are using the -p option.

FPLO

FPLO executables ship with the default naming convention <name>a.b-c-x86_64, where a.b-c is the version number. This allows different versions to be installed in parallel. Since our version management loads one specific version at a time, this tedious naming scheme is unnecessary. Therefore, in addition to the default binaries, we provide convenient symbolic links fplo, fedit, dirac, which can be used independently of the FPLO version that is loaded.

A sample job script is provided below:


#!/bin/bash

#SBATCH --partition=dfg-fplo
#SBATCH --ntasks=1
#SBATCH --mem=6G
#SBATCH --time=00-01:00:00
#SBATCH --job-name=my_job_name

. /home/ag-valenti/activate fplo@18.00-52

fplo

Note that the version can be changed in the argument to the activate script.

Since FPLO binaries are statically linked, FPLO can safely be loaded along with any other program, even other versions of FPLO. When several FPLO versions are loaded, the shortened executable names become ambiguous and the longer standard names should be used instead.

Using CIF files as FPLO input files

The FPLO input file =.in is created by the editor fedit. The tool cif2fplo, originally written by Milan Tomić and provided as part of the default set of programs, automatically converts a CIF file to FPLO input, which can then be edited with fedit.

Usage is kept rather simple: cif2fplo filename, where filename is the name or path to an existing CIF file. The =.in file will be created in the current working directory of the shell.
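
Because the output is always named =.in and lands in the current directory, converting several CIF files in one place would overwrite earlier results. One way around this is a separate directory per structure. A sketch under that assumption; convert_cifs is a made-up helper name, and cif2fplo must be available, e.g. after activating the default option:

```shell
#!/bin/bash
# convert_cifs is a hypothetical helper: convert_cifs a.cif b.cif ...
# Each CIF file gets its own directory, named after the file, so that the
# resulting =.in files do not overwrite each other.
convert_cifs() {
    local cif abs name
    for cif in "$@"; do
        # Resolve to an absolute path before changing directory
        abs="$(cd "$(dirname "$cif")" && pwd)/$(basename "$cif")" || return 1
        name=$(basename "$cif" .cif)
        mkdir -p "$name" || return 1
        ( cd "$name" && cif2fplo "$abs" ) || return 1
    done
}
```

For example, convert_cifs NaCl.cif creates the directory NaCl containing the corresponding =.in.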

Cleaning up FPLO files

FPLO files follow a very annoying naming convention. Annoying because they start with = or + and therefore have to be typed in quotes. We provide a very simple tool to delete all or a selection of files: clean_fplo. Use -h for a list of available options. This is especially useful if you've accidentally opened fedit in e.g. your home directory and want to get rid of the files it created.

Using PyFPLO

FPLO ships with a very powerful and handy Python library pyfplo. When loading FPLO you will also have access to pyfplo.

PyFPLO in version 18.00-52 only supports Python 2.7. Please make sure that you execute your scripts with python2.

VASP

VASP requires specific input files that are distributed along with the software. You can find the files at /home/ag-valenti/docs/VASP/input_files.

A sample job script is provided below:


#!/bin/bash

#SBATCH --partition=dfg-xeon
#SBATCH --ntasks=16
#SBATCH --mem=100G
#SBATCH --time=00-01:00:00
#SBATCH --job-name=my_job_name

. /home/ag-valenti/activate vasp

mpirun -np 16 vasp_std
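
Before submitting, it can save queue time to check that the run directory is complete. The sketch below assumes the four standard VASP inputs INCAR, POSCAR, KPOINTS and POTCAR; the helper name check_vasp_inputs is invented, and setups that define the k-mesh in INCAR would not need KPOINTS.

```shell
#!/bin/bash
# check_vasp_inputs is a hypothetical helper: check_vasp_inputs [directory]
# Returns non-zero and names the offending files if the directory is incomplete.
check_vasp_inputs() {
    local dir=${1:-.}
    local file missing=0
    for file in INCAR POSCAR KPOINTS POTCAR; do
        # -s is true only for files that exist and are not empty
        if [ ! -s "$dir/$file" ]; then
            echo "missing or empty: $dir/$file" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

In a job script, a line such as check_vasp_inputs || exit 1 placed before the mpirun call stops the job early instead of letting VASP fail.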

Pearson Crystal Data

Access the Pearson Crystal Database, which is set up on a virtual machine:

Open Remote Desktop Connection (Windows) or Remmina (Ubuntu)
Enter Computer Name: PCD
Ask someone for the login credentials

A connection can only be established from within the ITP network. Use a VPN otherwise.

Since this is a Windows virtual machine only one login is allowed at a time. Please log out after you are done!

Additional help regarding the usage of the software can be found on the official website.

Shared Folder

There is a shared Owncloud folder group_seminar_slides, where slides from the group seminar can be shared (only) with other members of the group. Files or folders therein should follow the naming convention:

YYYY-MM-DD_TITLE

This way it is easier to keep on top of everything. If you have more than one file to share make a folder, otherwise just upload the file.
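
A name in this format can be generated rather than typed. A minimal sketch; the helper name slide_name is made up, and replacing spaces by underscores is a choice of this example, not part of the convention:

```shell
#!/bin/bash
# slide_name is a hypothetical helper: slide_name <title> [date]
# Prints a name in the YYYY-MM-DD_TITLE format; the date defaults to today.
slide_name() {
    local title=$1
    local day=${2:-$(date +%Y-%m-%d)}
    # Replace spaces so the name is easy to handle on the command line
    echo "${day}_${title// /_}"
}
```

For example, slide_name "Kitaev materials" 2021-02-19 prints 2021-02-19_Kitaev_materials.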

Access

To access this folder your ITP account needs to be in the Linux group ag-valenti.

Open Owncloud
Login with your ITP account
You should see a shared folder named group_seminar_slides

Data Management

This section is still up for discussion! Input is still welcome until these measures are put in place.

Here we provide some guidelines on how data is supposed to be handled. In the following, data refers to published work only.

To ensure that data is reproducible and available independently of the current staff, all information has to be collected in a general repository.

A data set consists of the following elements:

	Details
Source code	Source code of the program used to produce the data. In case of licensed software the version number is sufficient. Add a `Makefile` (or comparable build instructions) and `configuration.txt` file that contains compiler and library versions used if applicable.
Input files	All input files corresponding to the particular calculation. This includes a script to run the program. Add metadata (parameters used).
Output files	Files produced by the program. Only those important for the discussion in the paper are necessary, e.g. files containing the band structure. If a script produced the final data file from a program output file add that here. Please try to combine post processing scripts into one or add an explanatory `README` file.
Plot scripts	Scripts that generate the exact plot found in the paper using data from a specific data set. Make sure that it is clear which data is used. If raw plots have been post processed add a note containing a short explanation of what was changed.
LaTeX source	The complete LaTeX source for the reproduction of the manuscript. This includes the figures. You can add additional notes that did not end up in the paper in a directory separated from the paper.

Repository Structure

Example File Names


root

2101.1234-short-name-of-the-paper
- data
  - values.dat
- figure1
  - source_code_info.txt
  - input
    - script.sh
    - parameters.in
  - plot
    - fig1.gnu
- figure2
  - ...
- source_code
  - name-v1.0
    - Makefile
    - configuration.txt
    - ...
- latex
  - ...
2102.1234-short-name-of-the-paper
- figure1
  - source_code_info.txt
- figure2
- band_data
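
Since the layout is still under discussion, the following is only a sketch of how the skeleton of the first example above could be created in one step. The helper name new_dataset is made up and not an agreed tool.

```shell
#!/bin/bash
# new_dataset is a hypothetical helper: new_dataset <id-short-name> [n_figures]
# Creates the directory skeleton of the first example in the tree above.
new_dataset() {
    local root=$1 nfig=${2:-1} i
    mkdir -p "$root/data" "$root/source_code" "$root/latex" || return 1
    for ((i = 1; i <= nfig; i++)); do
        mkdir -p "$root/figure$i/input" "$root/figure$i/plot" || return 1
        # Placeholder, to be filled with the program name and version
        touch "$root/figure$i/source_code_info.txt" || return 1
    done
}
```

For example, new_dataset 2101.1234-short-name-of-the-paper 2 creates the directories data, figure1, figure2, source_code and latex below the paper directory.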

Arxiv ID?

Goethe-HLR

On the Goethe-HLR things are a bit different. Currently no DFT codes are set up. Users who run their own codes are asked to follow these rules:

Only modify files in your user home!
Keep your .bashrc as clean as possible and try to avoid modifying PATH.
Add module loading to your job scripts.

Module Loading

Modules are managed via the Modules utility. You only need the command module. Below we list common usage:

`module avail`	Shows all available modules
`module load <modulename>`	Load the specified module into the environment
`module unload <modulename>`	Unload the specified module from the environment

By default only global modules will be shown. This may be enough. However, it is recommended that you add the following to your .bashrc:


. /home/compmatsc/public/spack/share/spack/setup-env.sh

This will add more group-wide installed modules to your list. Typically you will only need to load the Intel compiler (if preferred) and an MPI implementation. We recommend mpi/intel/2019.5 with the Intel compiler comp/intel/2019.5 and mpi/openmpi/3.1.2-gcc-8.2.0 with the GNU compiler. If in doubt feel free to ask.
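
In line with the rule to load modules in the job script rather than in .bashrc, the recommended combinations can be wrapped in a small function. A sketch; the function name load_toolchain is made up, while the module names are the ones recommended above:

```shell
#!/bin/bash
# load_toolchain is a hypothetical helper for job scripts:
#   load_toolchain intel|gnu
load_toolchain() {
    case "$1" in
        intel) module load comp/intel/2019.5 mpi/intel/2019.5 ;;
        gnu)   module load mpi/openmpi/3.1.2-gcc-8.2.0 ;;
        *)     echo "unknown toolchain: $1 (expected intel or gnu)" >&2
               return 1 ;;
    esac
}
```

A job script would then call load_toolchain intel before starting the program.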

Please try not to modify anything in /home/compmatsc/public. This will affect all users of the group and you may involuntarily break something. Email support instead.