SYCL: Rename oneMKL to oneMath #12192
Changes from 15 commits
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -20,7 +20,7 @@ | |
**oneAPI** is an open ecosystem and a standard-based specification, supporting multiple architectures including but not limited to intel CPUs, GPUs and FPGAs. The key components of the oneAPI ecosystem include: | ||
|
||
- **DPCPP** *(Data Parallel C++)*: The primary oneAPI SYCL implementation, which includes the icpx/icx Compilers. | ||
- **oneAPI Libraries**: A set of highly optimized libraries targeting multiple domains *(e.g. oneMKL and oneDNN)*. | ||
- **oneAPI Libraries**: A set of highly optimized libraries targeting multiple domains *(e.g. Intel oneMKL, oneMath and oneDNN)*. | ||
- **oneAPI LevelZero**: A high performance low level interface for fine-grained control over intel iGPUs and dGPUs. | ||
- **Nvidia & AMD Plugins**: These are plugins extending oneAPI's DPCPP support to SYCL on Nvidia and AMD GPU targets. | ||
|
||
|
@@ -227,16 +227,6 @@ Upon a successful installation, SYCL is enabled for the available intel devices, | |
|
||
**oneAPI Plugin**: In order to enable SYCL support on Nvidia GPUs, please install the [Codeplay oneAPI Plugin for Nvidia GPUs](https://developer.codeplay.com/products/oneapi/nvidia/download). User should also make sure the plugin version matches the installed base toolkit one *(previous step)* for a seamless "oneAPI on Nvidia GPU" setup. | ||
|
||
|
||
**oneMKL for cuBlas**: The current oneMKL releases *(shipped with the oneAPI base-toolkit)* do not contain the cuBLAS backend. A build from source of the upstream [oneMKL](https://github.com/oneapi-src/oneMKL) with the *cuBLAS* backend enabled is thus required to run it on Nvidia GPUs. | ||
|
||
```sh | ||
git clone https://github.com/oneapi-src/oneMKL | ||
cd oneMKL | ||
cmake -B buildWithCublas -DCMAKE_CXX_COMPILER=icpx -DCMAKE_C_COMPILER=icx -DENABLE_MKLGPU_BACKEND=OFF -DENABLE_MKLCPU_BACKEND=OFF -DENABLE_CUBLAS_BACKEND=ON -DTARGET_DOMAINS=blas | ||
cmake --build buildWithCublas --config Release | ||
``` | ||
|
||
**oneDNN**: The current oneDNN releases *(shipped with the oneAPI base-toolkit)* do not include the NVIDIA backend. Therefore, oneDNN must be compiled from source to enable the NVIDIA target: | ||
|
||
```sh | ||
|
@@ -250,16 +240,6 @@ cmake --build build-nvidia --config Release | |
|
||
**oneAPI Plugin**: In order to enable SYCL support on AMD GPUs, please install the [Codeplay oneAPI Plugin for AMD GPUs](https://developer.codeplay.com/products/oneapi/amd/download). As with Nvidia GPUs, the user should also make sure the plugin version matches the installed base toolkit. | ||
|
||
**oneMKL for rocBlas**: The current oneMKL releases *(shipped with the oneAPI base-toolkit)* doesn't contain the rocBLAS backend. A build from source of the upstream [oneMKL](https://github.com/oneapi-src/oneMKL) with the *rocBLAS* backend enabled is thus required to run it on AMD GPUs. | ||
|
||
```sh | ||
git clone https://github.com/oneapi-src/oneMKL | ||
cd oneMKL | ||
# Find your HIPTARGET with rocminfo, under the key 'Name:' | ||
cmake -B buildWithrocBLAS -DCMAKE_CXX_COMPILER=icpx -DCMAKE_C_COMPILER=icx -DENABLE_MKLGPU_BACKEND=OFF -DENABLE_MKLCPU_BACKEND=OFF -DENABLE_ROCBLAS_BACKEND=ON -DHIPTARGETS=${HIPTARGET} -DTARGET_DOMAINS=blas | ||
cmake --build buildWithrocBLAS --config Release | ||
``` | ||
|
||
3. **Verify installation and environment** | ||
|
||
In order to check the available SYCL devices on the machine, please use the `sycl-ls` command. | ||
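As a rough illustration of this verification step, the check can be scripted. The sketch below assumes that `sycl-ls` prints one device per line with a bracketed tag of the form `[<backend>:gpu]`, such as `[hip:gpu]`; the helper name `has_sycl_gpu` is hypothetical and is not part of llama.cpp or oneAPI.

```sh
# Hypothetical helper (not part of llama.cpp or oneAPI): reads `sycl-ls`
# output on stdin and succeeds if a GPU device for the given backend is listed.
# Assumes each device line carries a tag of the form [<backend>:gpu].
has_sycl_gpu() {
    # -F treats the pattern as a fixed string, so the brackets are literal.
    grep -F -q "[$1:gpu]"
}

# Usage: sycl-ls | has_sycl_gpu hip && echo "SYCL-HIP device found"
```

If the tag format differs on a given toolkit version, only the pattern passed to `grep` needs adjusting.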
|
@@ -300,6 +280,8 @@ For AMD GPUs we should expect at least one SYCL-HIP device [`hip:gpu`]: | |
|
||
### II. Build llama.cpp | ||
|
||
The SYCL backend depends on [oneMath](https://github.com/uxlfoundation/oneMath). By default it is automatically built along with the project. A specific build can be provided by setting the CMake flag `-DoneMath_DIR=/path/to/oneMath/install/lib/cmake/oneMath`. | ||
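To make the two configurations concrete, here is a minimal sketch that only prints a configure command rather than running it. The helper name `sycl_configure_cmd` and the install prefix are hypothetical; `-DGGML_SYCL=ON` and the `icx`/`icpx` compiler settings are assumed from the usual llama.cpp SYCL build instructions and do not appear in this hunk.

```sh
# Hypothetical helper: prints (does not run) a configure command for the
# SYCL backend. $1 is the GGML_SYCL_TARGET value, $2 is an optional oneMath
# install prefix. When $2 is empty, no oneMath_DIR is passed and oneMath is
# left to be built automatically with the project.
sycl_configure_cmd() {
    cmd="cmake -B build -DGGML_SYCL=ON -DGGML_SYCL_TARGET=$1"
    cmd="$cmd -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx"
    if [ -n "${2:-}" ]; then
        # Point CMake at an existing oneMath install instead of fetching it.
        cmd="$cmd -DoneMath_DIR=$2/lib/cmake/oneMath"
    fi
    printf '%s\n' "$cmd"
}

# Example: use a pre-built oneMath for the Nvidia target (path is a placeholder).
sycl_configure_cmd NVIDIA /path/to/oneMath/install
```

For the AMD target, the CMake script in this PR additionally expects `GGML_SYCL_DEVICE_ARCH` to be set, so a real command would need that flag as well.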
|
||
|
||
#### Intel GPU | ||
|
||
``` | ||
|
@@ -325,12 +307,6 @@ cmake --build build --config Release -j -v | |
#### Nvidia GPU | ||
|
||
```sh | ||
# Export relevant ENV variables | ||
export LD_LIBRARY_PATH=/path/to/oneMKL/buildWithCublas/lib:$LD_LIBRARY_PATH | ||
export LIBRARY_PATH=/path/to/oneMKL/buildWithCublas/lib:$LIBRARY_PATH | ||
export CPLUS_INCLUDE_DIR=/path/to/oneMKL/buildWithCublas/include:$CPLUS_INCLUDE_DIR | ||
export CPLUS_INCLUDE_DIR=/path/to/oneMKL/include:$CPLUS_INCLUDE_DIR | ||
|
||
# Build LLAMA with Nvidia BLAS acceleration through SYCL | ||
# Setting GGML_SYCL_DEVICE_ARCH is optional but can improve performance | ||
GGML_SYCL_DEVICE_ARCH=sm_80 # Example architecture | ||
|
@@ -348,11 +324,6 @@ cmake --build build --config Release -j -v | |
#### AMD GPU | ||
|
||
```sh | ||
# Export relevant ENV variables | ||
export LD_LIBRARY_PATH=/path/to/oneMKL/buildWithrocBLAS/lib:$LD_LIBRARY_PATH | ||
export LIBRARY_PATH=/path/to/oneMKL/buildWithrocBLAS/lib:$LIBRARY_PATH | ||
export CPLUS_INCLUDE_DIR=/path/to/oneMKL/buildWithrocBLAS/include:$CPLUS_INCLUDE_DIR | ||
|
||
# Build LLAMA with rocBLAS acceleration through SYCL | ||
|
||
## AMD | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -23,6 +23,23 @@ ggml_add_backend_library(ggml-sycl | |
../../include/ggml-sycl.h | ||
) | ||
|
||
file(GLOB GGML_HEADERS_SYCL "*.hpp") | ||
file(GLOB GGML_SOURCES_SYCL "*.cpp") | ||
target_sources(ggml-sycl PRIVATE ${GGML_HEADERS_SYCL} ${GGML_SOURCES_SYCL}) | ||
|
||
find_package(IntelSYCL) | ||
if (IntelSYCL_FOUND) | ||
# Use oneAPI CMake when possible | ||
target_link_libraries(ggml-sycl PRIVATE IntelSYCL::SYCL_CXX) | ||
else() | ||
# Fallback to the simplest way of enabling SYCL when using intel/llvm nightly for instance | ||
target_compile_options(ggml-sycl PRIVATE "-fsycl") | ||
target_link_options(ggml-sycl PRIVATE "-fsycl") | ||
endif() | ||
|
||
target_compile_options(ggml-sycl PRIVATE "-Wno-narrowing") | ||
|
||
# Link against oneDNN | ||
find_package(DNNL) | ||
set(GGML_SYCL_DNNL 0) | ||
if(DNNL_FOUND) | ||
|
@@ -62,8 +79,6 @@ if (GGML_SYCL_F16) | |
add_compile_definitions(GGML_SYCL_F16) | ||
endif() | ||
|
||
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wno-narrowing -fsycl") | ||
|
||
if (GGML_SYCL_TARGET STREQUAL "NVIDIA") | ||
add_compile_definitions(GGML_SYCL_WARP_SIZE=32) | ||
elseif (GGML_SYCL_TARGET STREQUAL "AMD") | ||
|
@@ -76,34 +91,84 @@ else() | |
add_compile_definitions(GGML_SYCL_WARP_SIZE=16) | ||
endif() | ||
|
||
file(GLOB GGML_HEADERS_SYCL "*.hpp") | ||
file(GLOB GGML_SOURCES_SYCL "*.cpp") | ||
target_sources(ggml-sycl PRIVATE ${GGML_HEADERS_SYCL} ${GGML_SOURCES_SYCL}) | ||
|
||
if (GGML_SYCL_GRAPH) | ||
target_compile_definitions(ggml-sycl PRIVATE GGML_SYCL_GRAPH) | ||
endif() | ||
|
||
if (WIN32) | ||
find_package(IntelSYCL REQUIRED) | ||
# Link against Intel oneMKL or oneMath | ||
if (GGML_SYCL_TARGET STREQUAL "INTEL") | ||
# Intel devices use Intel oneMKL directly instead of oneMath to avoid the limitation of linking Intel oneMKL statically | ||
# See https://github.com/uxlfoundation/oneMath/issues/654 | ||
find_package(MKL REQUIRED) | ||
target_link_libraries(ggml-sycl PRIVATE IntelSYCL::SYCL_CXX MKL::MKL MKL::MKL_SYCL) | ||
target_link_libraries(ggml-sycl PRIVATE MKL::MKL MKL::MKL_SYCL) | ||
target_compile_definitions(ggml-sycl PRIVATE GGML_SYCL_USE_INTEL_ONEMKL) | ||
else() | ||
if (GGML_SYCL_GRAPH) | ||
add_compile_definitions(GGML_SYCL_GRAPH) | ||
find_package(oneMath QUIET) | ||
if (NOT oneMath_FOUND) | ||
message(STATUS "oneMath not found: oneMath will be automatically downloaded") | ||
# Use FetchContent to automatically pull and build oneMath | ||
include(FetchContent) | ||
set(BUILD_FUNCTIONAL_TESTS False) | ||
set(BUILD_EXAMPLES False) | ||
set(TARGET_DOMAINS blas) | ||
if (GGML_SYCL_TARGET STREQUAL "NVIDIA") | ||
set(ENABLE_MKLCPU_BACKEND False) | ||
set(ENABLE_MKLGPU_BACKEND False) | ||
set(ENABLE_CUBLAS_BACKEND True) | ||
elseif (GGML_SYCL_TARGET STREQUAL "AMD") | ||
set(ENABLE_MKLCPU_BACKEND False) | ||
set(ENABLE_MKLGPU_BACKEND False) | ||
set(ENABLE_ROCBLAS_BACKEND True) | ||
# Ensure setting a string variable here is not overridden by oneMath CACHE variables | ||
cmake_policy(SET CMP0126 NEW) | ||
# Setting the device architecture is only needed and useful for AMD devices in oneMath | ||
set(HIP_TARGETS ${GGML_SYCL_DEVICE_ARCH} CACHE STRING "oneMath HIP target" FORCE) | ||
endif() | ||
FetchContent_Declare( | ||
ONEMATH | ||
GIT_REPOSITORY https://github.com/uxlfoundation/oneMath.git | ||
GIT_TAG c255b1b4c41e2ee3059455c1f96a965d6a62568a | ||
) | ||
FetchContent_MakeAvailable(ONEMATH) | ||
# Create alias to match with find_package targets name | ||
function(onemath_alias target) | ||
if (TARGET ${target}_obj) | ||
# Silence verbose warnings from external libraries | ||
target_compile_options(${target}_obj PRIVATE -w) | ||
endif() | ||
if (TARGET ${target}) | ||
add_library(ONEMATH::${target} ALIAS ${target}) | ||
endif() | ||
endfunction() | ||
onemath_alias(onemath) | ||
onemath_alias(onemath_blas_mklcpu) | ||
onemath_alias(onemath_blas_mklgpu) | ||
onemath_alias(onemath_blas_cublas) | ||
onemath_alias(onemath_blas_rocblas) | ||
endif() | ||
if (GGML_SYCL_TARGET STREQUAL "INTEL") | ||
target_link_libraries(ggml-sycl PRIVATE sycl OpenCL mkl_core pthread m dl mkl_sycl_blas mkl_intel_ilp64 mkl_tbb_thread) | ||
|
||
elseif (GGML_SYCL_TARGET STREQUAL "NVIDIA") | ||
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fsycl-targets=nvptx64-nvidia-cuda") | ||
add_compile_definitions(GGML_SYCL_NVIDIA) | ||
target_link_libraries(ggml-sycl PRIVATE sycl pthread m dl onemkl_blas_cublas) | ||
|
||
# Below oneMath compile-time dispatching is used for better performance | ||
if (GGML_SYCL_TARGET STREQUAL "NVIDIA") | ||
target_link_libraries(ggml-sycl PRIVATE ONEMATH::onemath_blas_cublas) | ||
target_compile_options(ggml-sycl PRIVATE "-fsycl-targets=nvptx64-nvidia-cuda") | ||
target_link_options(ggml-sycl PRIVATE "-fsycl-targets=nvptx64-nvidia-cuda") | ||
target_compile_definitions(ggml-sycl PRIVATE GGML_SYCL_NVIDIA) | ||
elseif (GGML_SYCL_TARGET STREQUAL "AMD") | ||
if (NOT GGML_SYCL_DEVICE_ARCH) | ||
message(ERROR "Can't enable SYCL hip backend, GGML_SYCL_DEVICE_ARCH has not been set.") | ||
endif() | ||
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fsycl-targets=amdgcn-amd-amdhsa") | ||
target_link_libraries(ggml-sycl PRIVATE sycl pthread m dl onemkl) | ||
target_link_libraries(ggml-sycl PRIVATE ONEMATH::onemath_blas_rocblas) | ||
target_compile_options(ggml-sycl PRIVATE "-fsycl-targets=amdgcn-amd-amdhsa") | ||
target_link_options(ggml-sycl PRIVATE "-fsycl-targets=amdgcn-amd-amdhsa") | ||
target_compile_definitions(ggml-sycl PRIVATE GGML_SYCL_AMD) | ||
else() | ||
# Fallback to oneMath runtime dispatcher | ||
target_link_libraries(ggml-sycl PRIVATE ONEMATH::onemath) | ||
Review comment: This path is for Intel in fact.
Reply: Intel devices will not go through this path. This whole section is in the |
||
target_compile_definitions(ggml-sycl PRIVATE GGML_SYCL_GENERIC) | ||
endif() | ||
endif() | ||
|
||
if (GGML_SYCL_DEVICE_ARCH) | ||
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Xsycl-target-backend --offload-arch=${GGML_SYCL_DEVICE_ARCH}") | ||
endif() | ||
if (GGML_SYCL_DEVICE_ARCH) | ||
target_compile_options(ggml-sycl PRIVATE -Xsycl-target-backend --offload-arch=${GGML_SYCL_DEVICE_ARCH}) | ||
target_link_options(ggml-sycl PRIVATE -Xsycl-target-backend --offload-arch=${GGML_SYCL_DEVICE_ARCH}) | ||
endif() |
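The selection logic in the CMake changes above can be summarized as a lookup from `GGML_SYCL_TARGET` to the BLAS link targets and the compile definition. The sketch below is illustrative only: the function name `sycl_blas_for_target` and the `;`-separated output format are hypothetical, while the target and definition names are copied from the diff.

```sh
# Illustrative only: mirrors the target-to-library selection in the CMake
# changes above. Prints "<link targets>;<compile definition>".
sycl_blas_for_target() {
    case "$1" in
        # Intel devices link Intel oneMKL directly instead of oneMath.
        INTEL)  printf '%s\n' 'MKL::MKL MKL::MKL_SYCL;GGML_SYCL_USE_INTEL_ONEMKL' ;;
        # Nvidia and AMD use oneMath compile-time dispatching.
        NVIDIA) printf '%s\n' 'ONEMATH::onemath_blas_cublas;GGML_SYCL_NVIDIA' ;;
        AMD)    printf '%s\n' 'ONEMATH::onemath_blas_rocblas;GGML_SYCL_AMD' ;;
        # Any other target falls back to the oneMath runtime dispatcher.
        *)      printf '%s\n' 'ONEMATH::onemath;GGML_SYCL_GENERIC' ;;
    esac
}
```

Read this way, the point made in the review thread is visible directly: the runtime-dispatcher fallback is reached only for targets other than `INTEL`, `NVIDIA`, and `AMD`.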