Overview
Software on Deucalion
The software is available as loadable modules. The module system provides a set of combinations of different versions of each package. Users can request additional modules.
Info
Browsing and selecting available software is only possible through the command line, by default via SSH access; please check Connecting with SSH. If you are using HTTP access, you need to start a Shell Access session to run the commands below.
List of available software modules:
module avail
Search for the desired software module:
module spider OpenMPI
Where OpenMPI is the name of the software to search for.
Loading the default version of a software module:
module load OpenMPI
Loading a specific version of a software module:
module load OpenMPI/4.1.5-GCC-12.3.0
Display information about the selected software module:
module whatis OpenMPI/4.1.5-GCC-12.3.0
Switch to a different version of a loaded module:
module switch OpenMPI OpenMPI/<new-version>
List of loaded or currently active software modules:
module list
Remove all loaded software modules:
module purge
Reload (remove and then load again) all currently loaded software modules:
module reload
Remove selected modules:
module unload OpenMPI
Help for the selected software:
module help OpenMPI/4.1.5-GCC-12.3.0
Additional commands can be found in the help section:
module help
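Taken together, a typical session combines these commands; the sketch below is illustrative (the module name and version are examples — check `module avail` for what is actually installed):

```shell
# Search for the package and list its available versions
module spider OpenMPI

# Start from a clean environment, then load a specific version
module purge
module load OpenMPI/4.1.5-GCC-12.3.0

# Confirm what is active before compiling or submitting jobs
module list
```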
System Environments
Basic Information
| Item | Value |
|---|---|
| Compiler Name | FUJITSU Software Compiler Package |
| Compiler Version | V1.0L21 (cp-1.0.21.02a) |
| MPI Interconnect | InfiniBand (HDR100) |
| Job Scheduler Name | Slurm |
| Job Scheduler Version | 20.02.5 |
Available compilers
Compile commands
Non-MPI
| Kind | Language | Compile command |
|---|---|---|
| Cross | Fortran | frtpx |
| Cross | C | fccpx |
| Cross | C++ | FCCpx |
| Native | Fortran | frt |
| Native | C | fcc |
| Native | C++ | FCC |
The Cross compilers can only be used on the login nodes. The Native compilers can only be used on the compute nodes.
MPI
| Kind | Language | Compile command |
|---|---|---|
| Cross | Fortran | mpifrtpx |
| Cross | C | mpifccpx |
| Cross | C++ | mpiFCCpx |
| Native | Fortran | mpifrt |
| Native | C | mpifcc |
| Native | C++ | mpiFCC |
The Cross compilers can only be used on the login nodes. The Native compilers can only be used on the compute nodes.
Compile Information
- Fortran compiler
- Creates object programs from Fortran source programs.
- C/C++ compiler
- Creates object programs from C/C++ source programs.
- The C/C++ compiler has two modes with different user interfaces; the mode to be used is selected by an option of the compile command.
- The default mode is Trad Mode.
| Mode | Description |
|---|---|
| Trad Mode (default) | This mode uses an enhanced compiler based on compilers for K computer and PRIMEHPC FX100 or earlier system. This mode is suitable for maintaining compatibility with the past Fujitsu compiler. [C] Supported specifications are C89/C99/C11 and OpenMP 3.1/Part of OpenMP 4.5. [C++] Supported specifications are C++03/C++11/C++14/Part of C++17 and OpenMP 3.1/Part of OpenMP 4.5. |
| Clang Mode (-Nclang) | This mode uses an enhanced compiler based on Clang/LLVM. This mode is suitable for compiling programs using the latest language specification and open source software. [C] Supported specifications are C89/C99/C11 and OpenMP 4.5/Part of OpenMP 5.0. [C++] Supported specifications are C++03/C++11/C++14/C++17 and OpenMP 4.5/Part of OpenMP 5.0. |
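The mode only changes the command line, not the command itself: Trad Mode is the default, and Clang Mode is requested with `-Nclang`. A minimal sketch (file names are illustrative):

```shell
# Trad Mode (default): Fujitsu's traditional interface
(ln0x)$ fccpx -Kfast sample.c

# Clang Mode: Clang/LLVM-based interface, needed e.g. for full C++17 support
(ln0x)$ FCCpx -Nclang -Ofast sample.cpp
```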
Recommended Options
| Language/Mode | Focus | Recommended options | Induced options |
|---|---|---|---|
| Fortran | Performance | -Kfast,openmp[,parallel] | -O3 -Keval,fp_contract,fp_relaxed,fz,ilfunc,mfunc,omitfp,simd_packed_promotion |
| Fortran | Precision | -Kfast,openmp[,parallel],fp_precision | -Knoeval,nofp_contract,nofp_relaxed,nofz,noilfunc,nomfunc,parallel_fp_precision |
| C/C++ Trad Mode | Performance | -Kfast,openmp[,parallel] | -O3 -Keval,fast_matmul,fp_contract,fp_relaxed,fz,ilfunc,mfunc,omitfp,simd_packed_promotion |
| C/C++ Trad Mode | Precision | -Kfast,openmp[,parallel],fp_precision | -Knoeval,nofast_matmul,nofp_contract,nofp_relaxed,nofz,noilfunc,nomfunc,parallel_fp_precision |
| C/C++ Clang Mode | Performance | -Nclang -Ofast | -O3 -ffj-fast-matmul -ffast-math -ffp-contract=fast -ffj-fp-relaxed -ffj-ilfunc -fbuiltin -fomit-frame-pointer -finline-functions |
Compile Examples
Fortran
- Sequential program

  (ln0x)$ frtpx -Kfast sample.f

- Thread-parallel program (using automatic parallelization)

  (ln0x)$ frtpx -Kfast,parallel sample.f

- Thread-parallel program (OpenMP)

  (ln0x)$ frtpx -Kfast,openmp sample.f

- Thread-parallel program (OpenMP + automatic parallelization)

  (ln0x)$ frtpx -Kfast,openmp,parallel sample.f

- MPI program

  (ln0x)$ mpifrtpx -Kfast sample.f

- Hybrid program (OpenMP + automatic parallelization + MPI)

  (ln0x)$ mpifrtpx -Kfast,openmp,parallel sample.f
C
- Sequential program

  (ln0x)$ fccpx -Kfast sample.c

- Thread-parallel program (using automatic parallelization)

  (ln0x)$ fccpx -Kfast,parallel sample.c

- Thread-parallel program (OpenMP)

  (ln0x)$ fccpx -Kfast,openmp sample.c

- Thread-parallel program (OpenMP + automatic parallelization)

  (ln0x)$ fccpx -Kfast,openmp,parallel sample.c

- MPI program

  (ln0x)$ mpifccpx -Kfast sample.c

- Hybrid program (OpenMP + automatic parallelization + MPI)

  (ln0x)$ mpifccpx -Kfast,openmp,parallel sample.c
C++
- Sequential program

  (ln0x)$ FCCpx -Kfast sample.cpp

- Thread-parallel program (using automatic parallelization)

  (ln0x)$ FCCpx -Kfast,parallel sample.cpp

- Thread-parallel program (OpenMP)

  (ln0x)$ FCCpx -Kfast,openmp sample.cpp

- Thread-parallel program (OpenMP + automatic parallelization)

  (ln0x)$ FCCpx -Kfast,openmp,parallel sample.cpp

- MPI program

  (ln0x)$ mpiFCCpx -Kfast sample.cpp

- Hybrid program (OpenMP + automatic parallelization + MPI)

  (ln0x)$ mpiFCCpx -Kfast,openmp,parallel sample.cpp
System Environments
Basic Information
| Item | Value |
|---|---|
| Compilers (CPU) | GCC 12.3.0, Intel oneAPI HPC Toolkit 2023.1.0 |
| Compilers (GPU) | CUDA 11.8: GCC 11.3.0, NVIDIA HPC SDK 22.9 |
| MPI Interconnect (CPU) | InfiniBand (HDR100) |
| MPI Interconnect (GPU) | InfiniBand (HDR200) |
| Job Scheduler Name | Slurm |
| Job Scheduler Version | 20.02.5 |
Available compilers
Compile commands
Non-MPI
| Kind | Language | Compile command |
|---|---|---|
| Native | Fortran | gfortran/ifort |
| Native | C | gcc/icc |
| Native | C++ | g++/icpc |
The Native compilers are used on both the login nodes and compute nodes.
MPI
| Kind | Language | Compile command |
|---|---|---|
| Native | Fortran | mpifort/mpiifort |
| Native | C | mpicc/mpiicc |
| Native | C++ | mpic++/mpiicpc |
The Native compilers are used on both the login nodes and compute nodes.
Compile Information
- Fortran compiler
- Creates object programs from Fortran source programs.
- C/C++ compiler
- Creates object programs from C/C++ source programs.
| Compiler | Documentation |
|---|---|
| GCC | https://gcc.gnu.org/gcc-12 |
| Intel oneAPI HPC Toolkit | https://www.intel.com/content/www/us/en/developer/tools/oneapi/hpc-toolkit.html |
CPU: Recommended Options
| Language/Mode | Focus | Recommended options |
|---|---|---|
| Fortran | GCC | -O2 -ftree-vectorize -march=native -fno-math-errno |
| Fortran | Intel oneAPI | -O2 -march=core-avx2 -ftz -fp-speculation=safe -fp-model precise |
| C/C++ | GCC | -O2 -ftree-vectorize -march=native -fno-math-errno |
| C/C++ | Intel oneAPI | -O2 -march=core-avx2 -ftz -fp-speculation=safe -fp-model precise |
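Before invoking any of these compilers, the corresponding toolchain module must be loaded. The module names below are assumptions based on the versions listed in Basic Information — verify the exact names with `module avail`:

```shell
# GCC toolchain (assumed module name; matches GCC 12.3.0 above)
(ln0x)$ ml GCC/12.3.0

# or the Intel oneAPI toolchain (assumed module name)
(ln0x)$ ml intel/2023.1.0
```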
Compile Examples
Fortran
- Sequential program
  Intel oneAPI: (ln0x)$ ifort -O2 -march=core-avx2 -ftz -fp-speculation=safe -fp-model precise sample.f
  GCC: (ln0x)$ gfortran -O2 -ftree-vectorize -march=native -fno-math-errno sample.f

- Thread-parallel program (OpenMP)

  Intel oneAPI: (ln0x)$ ifort -qopenmp -O2 -march=core-avx2 -ftz -fp-speculation=safe -fp-model precise sample.f
  GCC: (ln0x)$ gfortran -fopenmp -O2 -ftree-vectorize -march=native -fno-math-errno sample.f

- MPI program

  Intel oneAPI: (ln0x)$ mpiifort -O2 -march=core-avx2 -ftz -fp-speculation=safe -fp-model precise sample.f
  GCC: (ln0x)$ mpif90 -O2 -ftree-vectorize -march=native -fno-math-errno sample.f

- Hybrid program (OpenMP + MPI)

  Intel oneAPI: (ln0x)$ mpiifort -qopenmp -O2 -march=core-avx2 -ftz -fp-speculation=safe -fp-model precise sample.f
  GCC: (ln0x)$ mpif90 -fopenmp -O2 -ftree-vectorize -march=native -fno-math-errno sample.f
C
- Sequential program

  Intel oneAPI: (ln0x)$ icx -O2 -march=core-avx2 -ftz -fp-speculation=safe -fp-model precise sample.c
  GCC: (ln0x)$ gcc -O2 -ftree-vectorize -march=native -fno-math-errno sample.c

- Thread-parallel program (OpenMP)

  Intel oneAPI: (ln0x)$ icx -qopenmp -O2 -march=core-avx2 -ftz -fp-speculation=safe -fp-model precise sample.c
  GCC: (ln0x)$ gcc -fopenmp -O2 -ftree-vectorize -march=native -fno-math-errno sample.c

- MPI program

  Intel oneAPI: (ln0x)$ mpiicx -O2 -march=core-avx2 -ftz -fp-speculation=safe -fp-model precise sample.c
  GCC: (ln0x)$ mpicc -O2 -ftree-vectorize -march=native -fno-math-errno sample.c

- Hybrid program (OpenMP + MPI)

  Intel oneAPI: (ln0x)$ mpiicx -qopenmp -O2 -march=core-avx2 -ftz -fp-speculation=safe -fp-model precise sample.c
  GCC: (ln0x)$ mpicc -fopenmp -O2 -ftree-vectorize -march=native -fno-math-errno sample.c
C++
- Sequential program

  Intel oneAPI: (ln0x)$ icpx -O2 -march=core-avx2 -ftz -fp-speculation=safe -fp-model precise sample.cpp
  GCC: (ln0x)$ g++ -O2 -ftree-vectorize -march=native -fno-math-errno sample.cpp

- Thread-parallel program (OpenMP)

  Intel oneAPI: (ln0x)$ icpx -qopenmp -O2 -march=core-avx2 -ftz -fp-speculation=safe -fp-model precise sample.cpp
  GCC: (ln0x)$ g++ -fopenmp -O2 -ftree-vectorize -march=native -fno-math-errno sample.cpp

- MPI program

  Intel oneAPI: (ln0x)$ mpiicpx -O2 -march=core-avx2 -ftz -fp-speculation=safe -fp-model precise sample.cpp
  GCC: (ln0x)$ mpic++ -O2 -ftree-vectorize -march=native -fno-math-errno sample.cpp

- Hybrid program (OpenMP + MPI)

  Intel oneAPI: (ln0x)$ mpiicpx -qopenmp -O2 -march=core-avx2 -ftz -fp-speculation=safe -fp-model precise sample.cpp
  GCC: (ln0x)$ mpic++ -fopenmp -O2 -ftree-vectorize -march=native -fno-math-errno sample.cpp
GPU: Recommended Options
- NVIDIA's recommendations and related resources are available at:
| Compiler | Documentation |
|---|---|
| CUDA/GCC | https://docs.nvidia.com/cuda/archive/11.8.0/index.html |
| NVIDIA HPC SDK | https://docs.nvidia.com/hpc-sdk/archive/22.9/index.html |
Compile Examples
Fortran
- Fortran GPU support is only available through OpenACC directives or CUDA Fortran.

  OpenACC:
  (ln0x)$ ml NVHPC/22.9-CUDA-11.8.0
  (ln0x)$ nvfortran -acc -gpu=cc80 -Minfo=accel -Mpreprocess -o sample_acc sample_acc.f90

  CUDA Fortran:
  (ln0x)$ ml NVHPC/22.9-CUDA-11.8.0
  (ln0x)$ nvfortran -gpu=cc80 -Minfo=accel -Mpreprocess -o sample sample.cuf
C
- CUDA

  nvcc:
  (ln0x)$ ml CUDA/11.8.0 GCC/11.3.0
  (ln0x)$ nvcc --generate-code arch=compute_80,code=sm_80 -o sample sample.c

- OpenACC

  nvhpc:
  (ln0x)$ ml NVHPC/22.9-CUDA-11.8.0
  (ln0x)$ nvc -acc -gpu=cc80 -Minfo=accel -Mpreprocess -o sample_acc sample_acc.c
C++
- CUDA

  nvcc:
  (ln0x)$ ml CUDA/11.8.0 GCC/11.3.0
  (ln0x)$ nvcc --generate-code arch=compute_80,code=sm_80 -o sample sample.cpp

- OpenACC

  nvhpc:
  (ln0x)$ ml NVHPC/22.9-CUDA-11.8.0
  (ln0x)$ nvc++ -acc -gpu=cc80 -Minfo=accel -Mpreprocess -o sample_acc sample_acc.cpp
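Compiled binaries are run through the Slurm scheduler listed in Basic Information. A minimal batch-script sketch follows; the partition name, resource counts, and module are illustrative assumptions — adapt them to the actual site configuration:

```shell
#!/bin/bash
#SBATCH --job-name=sample
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4
# Partition name is an assumption -- check the site's Slurm documentation
#SBATCH --partition=normal

# Recreate the compile-time environment inside the job
module purge
module load OpenMPI/4.1.5-GCC-12.3.0

# Launch the MPI program across the allocated tasks
srun ./sample
```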