This ticket serves as documentation on how to enable the individual GPU features, and collects related issues.
So far, the following features exist:
GPU support should be detected and enabled automatically. If you just want to reproduce the GPU build locally without running it, it might be easiest to use the GPU CI container (see below). The provisioning script of the container also demonstrates which patches need to be applied such that everything works correctly.
In a nutshell, everything is steered via CMake variables, and the `ALIBUILD_O2_FORCE_GPU...` environment variables steer what alidist/o2.sh sets as CMake defaults. We try to run the same CMake GPU detection as in O2 (FindO2GPU.cmake) during the aliBuild `prefer_system_check` (gpu-system.sh), so that all GPU features / versions / architectures become part of the `gpu-system` version, which avoids inconsistencies between the different packages we build.
All of this is steered via environment variables, which go into the version and thus the hash:
- `ALIBUILD_O2_FORCE_GPU=...` sets the mode.
- `ALIBUILD_O2_FORCE_GPU_CUDA=1` can force-enable (`=1`) or disable (`=0`) backends, even if they were not detected. Same for `..._HIP` and `..._OPENCL`.
- `ALIBUILD_O2_FORCE_GPU_CUDA_ARCH=...` can override the architecture to cross-compile, e.g. `ALIBUILD_O2_FORCE_GPU_CUDA_ARCH="86;89"`. Same for `..._HIP_ARCH`.
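
As an illustration of the variables above, a minimal shell sketch could look as follows. The chosen values (`auto` mode, force-enabled CUDA, architectures 86 and 89) are only examples, and the aliBuild invocation in the final comment is an assumption about a typical setup, not something prescribed by this ticket.

```shell
# Select the detection mode (see the list of modes in this ticket).
export ALIBUILD_O2_FORCE_GPU=auto

# Force-enable the CUDA backend on top, even if it was not detected.
export ALIBUILD_O2_FORCE_GPU_CUDA=1

# Cross-compile for two CUDA architectures. The value must be quoted:
# an unquoted ';' would terminate the shell command.
export ALIBUILD_O2_FORCE_GPU_CUDA_ARCH="86;89"

# Show the GPU-related settings that are now part of the environment.
env | grep '^ALIBUILD_O2_FORCE_GPU' | sort

# Assumed follow-up step (adjust to your own setup):
#   aliBuild build O2 --defaults o2
```

Since these variables go into the version and thus the hash, changing any of them is expected to trigger a rebuild of the affected packages.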

Modes for `ALIBUILD_O2_FORCE_GPU`
- `force` / `1` / `ci`: Force that all backends / features are detected, fail if not. GPU architectures are set to the default ones if not specified by environment variables. `ci` is currently identical to `force`, but should allow special behavior when running in the CI.
- `auto`: Check for a supported system-cmake version, fail if not found. Auto-detect GPU backends / features and architectures. Selected features can be force-enabled on top via environment variable, but not selectively disabled (one can use the `manual` mode below for that).
- `onthefly`: Don't detect GPUs at the alidist level. GPUs are disabled in ONNX. GPUs are auto-detected in the O2 CMake during the build as before, but this means the O2 build hash does not depend on GPU features, so we also have the same problems as before. This is just a fallback, to allow users to build with GPUs if they don't have a compatible system CMake.
- `fullauto`: Detect a supported system-cmake. If found, behave as `auto`. If not found, behave as `onthefly`.
- `disabled`: Disable all GPU builds. No extra time during the aliBuild command.
- `manual`: All GPU builds are disabled by default, to be enabled manually via environment variable. No extra time during the aliBuild command.

Additional reasoning for this approach
Advantages:
- `gpu-system`.

Disadvantages:
- CMake >= 3.26 is needed for the detection at the aliBuild level.
- `FindO2GPU.cmake` is duplicated in O2 and alidist and must be kept in sync, but at least this is checked and gives an error otherwise.

GPU Tracking with CUDA
- `-DENABLE_CUDA=ON/OFF/AUTO` steers whether CUDA is force-enabled / unconditionally disabled / auto-detected.
- `-DCUDA_COMPUTETARGET=...` fixes a GPU target, e.g. 61 for Pascal or 75 for Turing (if unset, it compiles for the lowest supported architecture).

GPU Tracking with HIP
- `-DHIP_AMDGPUTARGET=...` / env variable `ALIBUILD_O2_FORCE_GPU_HIP_ARCH=...` forces a GPU target, e.g. gfx906 for MI50 (if unset, it auto-detects the GPU).

GPU Tracking with OpenCL (Needs Clang >= 18 for compilation)
OpenGL visualization of TPC tracking
Vulkan visualization
ITS GPU Tracking
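
For the CUDA and HIP tracking options listed above, the following shell sketch shows how the CMake flags might be assembled for a manual configure step. The variable names (`cuda_mode`, `cuda_target`, `hip_target`, `cmake_gpu_args`) are hypothetical helpers introduced only for this example, the targets are the example values named in this ticket, and the source path is left as a placeholder.

```shell
# Hypothetical helper variables, not part of O2 or alidist.
cuda_mode="ON"        # ON / OFF / AUTO
cuda_target="75"      # e.g. 61 for Pascal, 75 for Turing
hip_target="gfx906"   # e.g. gfx906 for MI50

# Assemble the GPU-related CMake options described in this ticket.
cmake_gpu_args="-DENABLE_CUDA=${cuda_mode} -DCUDA_COMPUTETARGET=${cuda_target} -DHIP_AMDGPUTARGET=${hip_target}"

# Print the configure command instead of running it, since the
# source directory depends on the local checkout.
echo "cmake ${cmake_gpu_args} <path-to-O2-source>"
```

Leaving out `-DCUDA_COMPUTETARGET` or `-DHIP_AMDGPUTARGET` falls back to the defaults described above (lowest supported architecture for CUDA, auto-detection for HIP).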
Using the GPU CI container
- `alisw/slc9-gpu-builder`.
- `ALIBUILD_O2_FORCE_GPU=1` env variable, which force-enables all GPU builds.

If you want to enforce the GPU builds on a system without a GPU, please export the following environment variables:
ALIBUILD_O2_FORCE_GPU_CUDA=ON
ALIBUILD_O2_FORCE_GPU_HIP=ON
ALIBUILD_O2_FORCE_GPU_OPENCL=ON
ALIBUILD_O2_FORCE_GPU_CUDA_ARCH=default
ALIBUILD_O2_FORCE_GPU_HIP_ARCH=default
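
In a shell session, the settings listed above could be applied roughly like this. The loop at the end is only an illustrative sanity check added for this sketch; it is not part of alidist.

```shell
# Force-enable all backends on a machine without a GPU.
export ALIBUILD_O2_FORCE_GPU_CUDA=ON
export ALIBUILD_O2_FORCE_GPU_HIP=ON
export ALIBUILD_O2_FORCE_GPU_OPENCL=ON

# Use the default architectures, since none can be auto-detected.
export ALIBUILD_O2_FORCE_GPU_CUDA_ARCH=default
export ALIBUILD_O2_FORCE_GPU_HIP_ARCH=default

# Illustrative check: abort early if one of the variables is empty.
for var in ALIBUILD_O2_FORCE_GPU_CUDA ALIBUILD_O2_FORCE_GPU_HIP \
           ALIBUILD_O2_FORCE_GPU_OPENCL ALIBUILD_O2_FORCE_GPU_CUDA_ARCH \
           ALIBUILD_O2_FORCE_GPU_HIP_ARCH; do
  eval "value=\${$var:-}"
  [ -n "$value" ] || { echo "$var is not set" >&2; exit 1; }
  echo "$var=$value"
done
```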