Installation cheat sheet for Kokkos
- Requirements
- How to build Kokkos
- Kokkos compile options
https://kokkos.org/kokkos-core-wiki/ProgrammingGuide/Compiling.html
https://kokkos.org/kokkos-core-wiki/building.html
https://github.com/kokkos/kokkos-tutorials/blob/main/LectureSeries/KokkosTutorial_01_Introduction.pdf
Compiler | Minimum version | Notes |
---|---|---|
ARM Clang | 20.1 | |
Clang | 10.0.0 | For CUDA |
Clang | 8.0.0 | For CPU |
GCC | 8.2.0 | |
Intel Classic | 19.0.5 | |
Intel LLVM | 2022.0.0 | For SYCL |
Intel LLVM | 2021.1.1 | For CPU |
MSVC | 19.29 | |
NVCC | 11.0 | |
NVHPC/PGI | 22.3 | |
ROCm | 5.2.0 | |
Build system | Minimum version | Notes |
---|---|---|
CMake | 3.25.2 | For Intel LLVM full support |
CMake | 3.21.1 | For NVHPC support |
CMake | 3.18 | For better Fortran linking |
CMake | 3.16 | |
https://kokkos.org/kokkos-core-wiki/requirements.html
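To compare an installed tool against the minimum versions above, a version-aware sort is usually enough. The sketch below is not part of Kokkos: `version_ge` is a hypothetical helper name, and it assumes `sort -V` (version sort) is available, as it is in GNU coreutils.

```shell
#!/bin/sh
# Hypothetical helper (not part of Kokkos): succeeds when version $1 >= version $2.
# Assumes `sort -V` (version sort) is available on this system.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Example: compare the local CMake against the 3.16 minimum from the table.
if command -v cmake >/dev/null 2>&1; then
    # The first line of `cmake --version` reads "cmake version X.Y.Z".
    cmake_version=$(cmake --version | head -n 1 | awk '{print $3}')
    if version_ge "$cmake_version" 3.16; then
        echo "CMake $cmake_version is recent enough"
    else
        echo "CMake $cmake_version is older than 3.16"
    fi
fi
```

The same function can be reused for compiler versions, for example `version_ge "$(gcc -dumpfullversion)" 8.2.0` with GCC.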
add_subdirectory(path/to/kokkos)
target_link_libraries(
    my-app
    Kokkos::kokkos
)
cd path/to/your/code
cmake -B build \
    -DCMAKE_CXX_COMPILER=<your C++ compiler> \
    <Kokkos compile options>
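The `add_subdirectory` fragment above shows only the Kokkos-specific lines. As a rough sketch of how they might fit into a complete project, the script below writes a minimal `CMakeLists.txt` into a scratch directory. The project name `my-app`, the source file `main.cpp`, and the `cmake_minimum_required` value are assumptions for illustration, and `path/to/kokkos` remains a placeholder.

```shell
#!/bin/sh
# Sketch only: writes a minimal CMakeLists.txt that embeds Kokkos as a subdirectory.
# `my-app`, `main.cpp` and `path/to/kokkos` are placeholders, not real paths.
project_dir=$(mktemp -d)

cat > "$project_dir/CMakeLists.txt" <<'EOF'
cmake_minimum_required(VERSION 3.16)
project(my-app LANGUAGES CXX)

# Build Kokkos together with the application.
add_subdirectory(path/to/kokkos)

add_executable(my-app main.cpp)
target_link_libraries(my-app PRIVATE Kokkos::kokkos)
EOF

echo "Wrote $project_dir/CMakeLists.txt"
```

With this layout the Kokkos compile options are passed when configuring your own project, since Kokkos is built as part of it.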
cd path/to/kokkos
cmake -B build \
    -DCMAKE_CXX_COMPILER=<your C++ compiler> \
    -DCMAKE_INSTALL_PREFIX=path/to/kokkos/install \
    <Kokkos compile options>
cmake --build build
cmake --install build
https://kokkos.org/kokkos-core-wiki/building.html
find_package(Kokkos REQUIRED)
target_link_libraries(
    my-app
    Kokkos::kokkos
)
cd path/to/your/code
cmake -B build \
    -DCMAKE_CXX_COMPILER=<your C++ compiler> \
    -DKokkos_ROOT=path/to/kokkos/install
https://cmake.org/cmake/help/latest/guide/tutorial/index.html
TODO finish this part
See https://kokkos.org/kokkos-core-wiki/building.html#spack
Option | Backend |
---|---|
-DKokkos_ENABLE_SERIAL=ON | Serial |
-DKokkos_ENABLE_OPENMP=ON | OpenMP |
-DKokkos_ENABLE_THREADS=ON | Threads |
The serial backend is enabled by default.
Option | Backend | Notes |
---|---|---|
-DKokkos_ENABLE_CUDA=ON | CUDA | |
-DKokkos_ENABLE_HIP=ON | HIP | |
-DKokkos_ENABLE_SYCL=ON | SYCL | Experimental |
-DKokkos_ENABLE_OPENMPTARGET=ON | OpenMP target | Experimental |
You can enable at most three backends at a time: the serial backend, one other host backend (OpenMP or Threads), and one device backend.
See architecture-specific options.
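The backend selection rule can be checked mechanically before calling CMake. The sketch below is a hypothetical helper, not something Kokkos provides: `check_backends` takes backend names as they appear after `Kokkos_ENABLE_` in the tables above, rejects selections with more than one non-serial host backend or more than one device backend, and prints the matching CMake options otherwise.

```shell
#!/bin/sh
# Hypothetical checker (not part of Kokkos): validates backend names against the
# rule "serial, plus at most one other host backend and at most one device backend".
check_backends() {
    host=0
    device=0
    for backend in "$@"; do
        case "$backend" in
            SERIAL) ;;                                            # always allowed
            OPENMP|THREADS) host=$((host + 1)) ;;                 # other host backends
            CUDA|HIP|SYCL|OPENMPTARGET) device=$((device + 1)) ;; # device backends
            *) echo "unknown backend: $backend" >&2; return 2 ;;
        esac
    done
    if [ "$host" -gt 1 ] || [ "$device" -gt 1 ]; then
        echo "invalid selection: $*" >&2
        return 1
    fi
    # Print the matching CMake options, one per line.
    for backend in "$@"; do
        printf '%s\n' "-DKokkos_ENABLE_${backend}=ON"
    done
}

check_backends SERIAL OPENMP CUDA
```

A call such as `check_backends OPENMP THREADS` fails with a non-zero status, so the function can guard a configure script.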
Option | Description |
---|---|
-DKokkos_ENABLE_BENCHMARKS=ON | Build benchmarks |
-DKokkos_ENABLE_COMPILER_WARNINGS=ON | Print all compiler warnings |
-DKokkos_ENABLE_DEBUG=ON | Activate extra debug features, may increase compile times |
-DKokkos_ENABLE_DEBUG_BOUNDS_CHECK=ON | Use bounds checking, will increase runtime |
-DKokkos_ENABLE_EXAMPLES=ON | Build examples |
-DKokkos_ENABLE_TESTS=ON | Build tests |
-DKokkos_ENABLE_TUNING=ON | Create bindings for tuning tools |
Extra options
Option | Description |
---|---|
-DKokkos_ENABLE_AGGRESSIVE_VECTORIZATION=ON | Aggressively vectorize loops |
-DKokkos_ENABLE_DEBUG_DUALVIEW_MODIFY_CHECK=ON | Debug check on dual views |
-DKokkos_ENABLE_DEPRECATED_CODE=ON | Enable deprecated code |
-DKokkos_ENABLE_LARGE_MEM_TESTS=ON | Perform extra large memory tests |
For more, see https://kokkos.org/kokkos-core-wiki/keywords.html
Host options are used for controlling optimization and are optional.
Option | Architecture |
---|---|
-DKokkos_ARCH_NATIVE=ON | Local host |
Option | Architecture |
---|---|
-DKokkos_ARCH_ZEN3=ON | Zen3 |
-DKokkos_ARCH_ZEN2=ON | Zen2 |
-DKokkos_ARCH_ZEN=ON | Zen |
Option | Architecture |
---|---|
-DKokkos_ARCH_A64FX=ON | ARMv8.2 with SVE support |
-DKokkos_ARCH_ARMV81=ON | ARMv8.1 |
-DKokkos_ARCH_ARMV80=ON | ARMv8.0 |
Option | Architecture |
---|---|
-DKokkos_ARCH_SPR=ON | Sapphire Rapids |
-DKokkos_ARCH_SKX=ON | Skylake |
-DKokkos_ARCH_BDW=ON | Intel Broadwell |
-DKokkos_ARCH_HSW=ON | Intel Haswell |
-DKokkos_ARCH_KNL=ON | Intel Knights Landing |
-DKokkos_ARCH_SNB=ON | Sandy Bridge |
A device architecture option is mandatory when a device backend is enabled. If the device is present at CMake configuration time, the architecture can be deduced from it.
Option | Architecture | Associated cards |
---|---|---|
-DKokkos_ARCH_AMD_GFX942=ON | GFX942 | MI300A, MI300X |
-DKokkos_ARCH_AMD_GFX90A=ON | GFX90A | MI210, MI250, MI250X |
-DKokkos_ARCH_AMD_GFX908=ON | GFX908 | MI100 |
-DKokkos_ARCH_AMD_GFX906=ON | GFX906 | MI50, MI60 |
-DKokkos_ARCH_AMD_GFX1100=ON | GFX1100 | 7900xt |
-DKokkos_ARCH_AMD_GFX1030=ON | GFX1030 | V620, W6800 |
Option | Description |
---|---|
-DKokkos_ENABLE_HIP_MULTIPLE_KERNEL_INSTANTIATIONS=ON | Instantiate multiple kernels at compile time, improves performance but increases compile time |
-DKokkos_ENABLE_HIP_RELOCATABLE_DEVICE_CODE=ON | Enable Relocatable Device Code (RDC) for HIP |
Option | Architecture |
---|---|
-DKokkos_ARCH_INTEL_GEN=ON | Generic JIT |
-DKokkos_ARCH_INTEL_XEHP=ON | Xe-HP |
-DKokkos_ARCH_INTEL_PVC=ON | GPU Max/Ponte Vecchio |
-DKokkos_ARCH_INTEL_DG1=ON | Iris XeMAX |
-DKokkos_ARCH_INTEL_GEN12=ON | Gen12 |
-DKokkos_ARCH_INTEL_GEN11=ON | Gen11 |
Option | Architecture | CC | Associated cards |
---|---|---|---|
-DKokkos_ARCH_HOPPER90=ON | Hopper | 9.0 | H200, H100 |
-DKokkos_ARCH_ADA89=ON | Ada | 8.9 | GeForce RTX 40 series, RTX 6000/5000 series, L4, L40 |
-DKokkos_ARCH_AMPERE86=ON | Ampere | 8.6 | GeForce RTX 30 series, RTX A series, A40, A10, A16, A2 |
-DKokkos_ARCH_AMPERE80=ON | Ampere | 8.0 | A100, A30 |
-DKokkos_ARCH_TURING75=ON | Turing | 7.5 | T4 |
-DKokkos_ARCH_VOLTA72=ON | Volta | 7.2 | |
-DKokkos_ARCH_VOLTA70=ON | Volta | 7.0 | V100 |
-DKokkos_ARCH_PASCAL61=ON | Pascal | 6.1 | P6, P40, P4 |
-DKokkos_ARCH_PASCAL60=ON | Pascal | 6.0 | P100 |
-DKokkos_ARCH_MAXWELL53=ON | Maxwell | 5.3 | |
-DKokkos_ARCH_MAXWELL52=ON | Maxwell | 5.2 | M6, M60, M4, M40 |
-DKokkos_ARCH_MAXWELL50=ON | Maxwell | 5.0 | M10 |
-DKokkos_ARCH_KEPLER37=ON | Kepler | 3.7 | K80 |
-DKokkos_ARCH_KEPLER35=ON | Kepler | 3.5 | K40, K20 |
-DKokkos_ARCH_KEPLER32=ON | Kepler | 3.2 | |
-DKokkos_ARCH_KEPLER30=ON | Kepler | 3.0 | K10 |
See NVIDIA documentation on Compute Capability (CC): https://developer.nvidia.com/cuda-gpus
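When scripting builds for several NVIDIA machines, the table above can be turned into a lookup from compute capability to architecture option. The sketch below is a hypothetical helper, not part of Kokkos: `cc_to_kokkos_arch` is an invented name, and the mapping simply restates the rows of the table.

```shell
#!/bin/sh
# Hypothetical helper (not part of Kokkos): maps a compute capability such as
# "8.0" to the architecture option listed in the NVIDIA table.
cc_to_kokkos_arch() {
    case "$1" in
        9.0) arch=HOPPER90 ;;
        8.9) arch=ADA89 ;;
        8.6) arch=AMPERE86 ;;
        8.0) arch=AMPERE80 ;;
        7.5) arch=TURING75 ;;
        7.2) arch=VOLTA72 ;;
        7.0) arch=VOLTA70 ;;
        6.1) arch=PASCAL61 ;;
        6.0) arch=PASCAL60 ;;
        5.3) arch=MAXWELL53 ;;
        5.2) arch=MAXWELL52 ;;
        5.0) arch=MAXWELL50 ;;
        3.7) arch=KEPLER37 ;;
        3.5) arch=KEPLER35 ;;
        3.2) arch=KEPLER32 ;;
        3.0) arch=KEPLER30 ;;
        *) echo "no Kokkos option known for compute capability $1" >&2; return 1 ;;
    esac
    printf '%s\n' "-DKokkos_ARCH_${arch}=ON"
}

# Compute capability 8.0 corresponds to an A100 or A30 in the table.
cc_to_kokkos_arch 8.0
```

On a machine with a GPU, the argument could come from `nvidia-smi --query-gpu=compute_cap --format=csv,noheader`, assuming the installed driver's `nvidia-smi` supports the `compute_cap` query field.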
Option | Description |
---|---|
-DKokkos_ENABLE_CUDA_CONSTEXPR=ON | Activate experimental relaxed constexpr functions |
-DKokkos_ENABLE_CUDA_LAMBDA=ON | Activate experimental lambda features |
-DKokkos_ENABLE_CUDA_LDG_INTRINSIC=ON | Use CUDA LDG intrinsics |
-DKokkos_ENABLE_CUDA_RELOCATABLE_DEVICE_CODE=ON | Enable relocatable device code (RDC) for CUDA |
See https://kokkos.org/kokkos-core-wiki/keywords.html#third-party-libraries-tpls
# Local host CPU with OpenMP
cmake \
    -B build \
    -DCMAKE_BUILD_TYPE=Release \
    -DKokkos_ARCH_NATIVE=ON \
    -DKokkos_ENABLE_OPENMP=ON

# AMD GFX90A GPU (MI210, MI250, MI250X) with HIP, plus OpenMP on the host
cmake \
    -B build \
    -DCMAKE_CXX_COMPILER=hipcc \
    -DCMAKE_BUILD_TYPE=Release \
    -DKokkos_ENABLE_HIP=ON \
    -DKokkos_ARCH_AMD_GFX90A=ON \
    -DKokkos_ENABLE_OPENMP=ON

# NVIDIA Ampere 8.0 GPU (A100, A30) with CUDA, plus OpenMP on the host
cmake \
    -B build \
    -DCMAKE_BUILD_TYPE=Release \
    -DKokkos_ENABLE_CUDA=ON \
    -DKokkos_ARCH_AMPERE80=ON \
    -DKokkos_ENABLE_OPENMP=ON

# NVIDIA Volta 7.0 GPU (V100) with CUDA, plus OpenMP on the host
cmake \
    -B build \
    -DCMAKE_BUILD_TYPE=Release \
    -DKokkos_ENABLE_CUDA=ON \
    -DKokkos_ARCH_VOLTA70=ON \
    -DKokkos_ENABLE_OPENMP=ON

# Intel GPU Max/Ponte Vecchio with SYCL, plus OpenMP on the host
cmake \
    -B build \
    -DCMAKE_CXX_COMPILER=icpx \
    -DCMAKE_BUILD_TYPE=Release \
    -DKokkos_ENABLE_SYCL=ON \
    -DKokkos_ARCH_INTEL_PVC=ON \
    -DKokkos_ENABLE_OPENMP=ON \
    -DCMAKE_CXX_FLAGS="-fp-model=precise" # for math precision