Description
Git commit
git rev-parse HEAD
725f23f
Operating systems
Linux
GGML backends
CUDA
Problem description & steps to reproduce
I have a brand-new DGX-H200 with compute capability 9.0. nvidia-smi reports: Driver Version: 575.51.03. The admin has installed the CUDA SDK, and I am able to build and run *.cu test programs.
When I configure with:
cmake -B build -DLLAMA_CURL=OFF -DGGML_CUDA=1
I get:
-- CUDA Toolkit found
-- Using CUDA architectures: native
-- The CUDA compiler identification is NVIDIA 12.8.93
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Check for working CUDA compiler: /usr/local/cuda/bin/nvcc - skipped
-- Detecting CUDA compile features
-- Detecting CUDA compile features - done
-- CUDA host compiler is GNU 11.5.0
-- Including CUDA backend
-- Configuring done (12.4s)
CMake Error in ggml/src/ggml-cuda/CMakeLists.txt:
CUDA_ARCHITECTURES is set to "native", but no GPU was detected.
I see that in ggml/src/ggml-cuda/CMakeLists.txt the listed architectures only go up to 89, whereas this GPU is compute capability 9.0 (architecture 90).
If I specify:
cmake -B build -DLLAMA_CURL=OFF -DGGML_CUDA=1 -DCMAKE_CUDA_ARCHITECTURES=89
I can configure and build with no errors. But when I execute with:
build/bin/llama-server -m <my_model> -ngl 100 ...
The model loads and runs, but only on the CPU, not the GPU, even though I specify -ngl 100.
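One way to narrow down the CPU-only behaviour is to check two things that, as far as I know, can hide GPUs from CUDA programs even when nvidia-smi lists them: an empty or restrictive CUDA_VISIBLE_DEVICES (which nvidia-smi ignores), and device nodes under /dev that the current user cannot open, in particular /dev/nvidia-uvm, which CUDA needs but nvidia-smi does not. The sketch below is only a diagnostic aid under those assumptions. The function names describe_visible_devices and inaccessible_nodes are hypothetical and are not part of llama.cpp or the CUDA toolkit.

```shell
#!/bin/sh
# Hypothetical diagnostic helpers; not part of llama.cpp or the CUDA toolkit.

# Report how CUDA_VISIBLE_DEVICES is set in the current environment.
describe_visible_devices() {
    # ${VAR+x} expands to "x" only when VAR is set, even if it is empty.
    if [ -z "${CUDA_VISIBLE_DEVICES+x}" ]; then
        echo "unset"        # no restriction from this variable
    elif [ -z "$CUDA_VISIBLE_DEVICES" ]; then
        echo "empty"        # an empty value hides every GPU from CUDA
    else
        echo "restricted: $CUDA_VISIBLE_DEVICES"
    fi
}

# Print each existing path from the arguments that the current user
# cannot both read and write. Paths that do not exist are skipped.
inaccessible_nodes() {
    for _node in "$@"; do
        [ -e "$_node" ] || continue
        if [ ! -r "$_node" ] || [ ! -w "$_node" ]; then
            printf '%s\n' "$_node"
        fi
    done
}
```

Running describe_visible_devices followed by inaccessible_nodes /dev/nvidia* as the same user that starts llama-server would show whether either condition applies. A missing /dev/nvidia-uvm would not be reported by the second helper, so that is worth checking separately with ls.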
In short, the GPUs are visible to the OS (nvidia-smi) and I can build a CUDA target, but the GPUs are not detected at configure time or at run time. I suspect this is a platform problem, but I wonder whether it could be detected and reported more clearly when configuring or building with CMake. I have tried to find the source of the "but no GPU was detected" message and work backwards from it, but I do not know how GPU detection is done.
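Since "native" detection fails here, one possible workaround for the configure step is to derive the architecture list from the compute capability that nvidia-smi reports, rather than hard-coding 89. This is a minimal sketch, assuming the installed nvidia-smi supports the compute_cap query field. cap_to_arch and caps_to_arch_list are hypothetical helper names. It only addresses the configure error and would not by itself explain or fix the CPU-only behaviour at run time.

```shell
#!/bin/sh
# Hypothetical helpers; not part of llama.cpp or CMake.

# Convert one compute capability ("9.0") into the integer form used by
# CMAKE_CUDA_ARCHITECTURES ("90") by dropping the dot and any whitespace.
cap_to_arch() {
    printf '%s\n' "$1" | tr -d '. \t\r'
}

# Read compute capabilities from stdin, one per line, and print a
# deduplicated, semicolon-separated architecture list in input order.
caps_to_arch_list() {
    _out=""
    while IFS= read -r _cap; do
        _arch="$(cap_to_arch "$_cap")"
        [ -n "$_arch" ] || continue          # skip blank lines
        case ";$_out;" in
            *";$_arch;"*) ;;                 # already in the list
            *) _out="${_out:+$_out;}$_arch" ;;
        esac
    done
    printf '%s\n' "$_out"
}
```

For example, piping nvidia-smi --query-gpu=compute_cap --format=csv,noheader into caps_to_arch_list would be expected to print 90 on a machine whose GPUs all report compute capability 9.0. That value could then be passed as -DCMAKE_CUDA_ARCHITECTURES=90 in place of 89.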
First Bad Commit
No response
Compile command
cmake -B build -DLLAMA_CURL=OFF -DGGML_CUDA=1 -DCMAKE_CUDA_ARCHITECTURES=89
cmake --build build -j 150
Relevant log output
cmake -B build -DLLAMA_CURL=OFF -DGGML_CUDA=1
-- The C compiler identification is GNU 11.5.0
-- The CXX compiler identification is GNU 11.5.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Git: /usr/bin/git (found version "2.43.5")
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- Warning: ccache not found - consider installing it for faster compilation or disable this warning with GGML_CCACHE=OFF
-- CMAKE_SYSTEM_PROCESSOR: x86_64
-- Including CPU backend
-- Found OpenMP_C: -fopenmp (found version "4.5")
-- Found OpenMP_CXX: -fopenmp (found version "4.5")
-- Found OpenMP: TRUE (found version "4.5")
-- x86 detected
-- Adding CPU backend variant ggml-cpu: -march=native
-- Found CUDAToolkit: /usr/local/cuda/include (found version "12.8.93")
-- CUDA Toolkit found
-- Using CUDA architectures: native
-- The CUDA compiler identification is NVIDIA 12.8.93
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Check for working CUDA compiler: /usr/local/cuda/bin/nvcc - skipped
-- Detecting CUDA compile features
-- Detecting CUDA compile features - done
-- CUDA host compiler is GNU 11.5.0
-- Including CUDA backend
-- Configuring done (12.3s)
CMake Error in ggml/src/ggml-cuda/CMakeLists.txt:
CUDA_ARCHITECTURES is set to "native", but no GPU was detected.
CMake Error in ggml/src/ggml-cuda/CMakeLists.txt:
CUDA_ARCHITECTURES is set to "native", but no GPU was detected.
-- Generating done (6.1s)
CMake Generate step failed. Build files cannot be regenerated correctly.