[vulkan] Allow u8/u16 when only 8/16-bit storage is supported (no shaderInt16) #8759
Open: yunusberndt wants to merge 26 commits into taichi-dev:master from yunusberndt:Vulkan-patch-for-AMD
+368
−34
for more information, see https://pre-commit.ci
- Fix const-correctness issue in vulkan_device_creator.cpp by adding const_cast for create_info.pNext pointer casting
- Reorder SPIRV-Tools include directory in CMakeLists.txt to resolve build configuration issues
- Add CHANGELOG.md to document recent Vulkan patch work and improvements
These changes resolve compilation errors and improve Vulkan backend compatibility, particularly for AMD GPU support and 8-bit/16-bit operations. Part of Vulkan-patch-for-AMD branch development.
…lities
- Updated Vulkan device creation logic to set granular 16-bit storage capabilities based on supported features.
- Improved SPIRV IR builder to conditionally enable specific 16-bit capabilities for better compatibility.
- Added logging during Python module initialization to aid in debugging.
- Updated .gitignore to exclude new build artifacts.
- BUILD_CONFIGURATION.md contains info on how to build wheels using the correct linker and the CMake flags in entry.py. More recent linkers, or omitting the flags, seem to fail the build.
These changes address compatibility issues and enhance the Vulkan backend for AMD GPU users.
…/taichi into Vulkan-patch-for-AMD
- Improved formatting in `vulkan_device_creator.cpp` for better readability.
- Cleaned up whitespace in `entry.py` to adhere to style guidelines.
- Updated `BUILD_CONFIGURATION.md` to ensure consistent documentation of CMake flags.
These changes enhance code clarity and maintainability while ensuring accurate build instructions.
…end-of-file fixes
- Changed CHECK_VERSION(1, 2) to CHECK_VERSION(1, 1) for VK_KHR_SHADER_FLOAT16_INT8_EXTENSION_NAME
- This allows AMD GPUs with Vulkan 1.1.73 to properly detect 8-bit/16-bit arithmetic capabilities
- Fixes 'Type u8 not supported' error on AMD Radeon R7 450
…detected in device extensions
… and device levels
- Removed all TI_DEBUG statements added for troubleshooting
- Cleaned up AMD GPU debugging output
- Kept the core functionality and fixes intact
- Ready for production build
Issue: #8758
Brief Summary
Some Vulkan devices (e.g. my AMD Radeon R7 450) expose 8/16-bit storage features but do not expose 16-bit arithmetic (shaderInt16 = false). Taichi’s Vulkan/SPIR-V backend previously gated u8/u16 types on arithmetic caps only, so kernels using ti.u8/ti.u16 failed at compile time with:
[spirv_ir_builder.cpp:IRBuilder::get_primitive_type] Type u8/u16 not supported.
This PR separates storage from arithmetic capabilities, enables the right Vulkan features via the VkPhysicalDeviceFeatures2 chain, and teaches the SPIR-V builder to declare & use u8/u16 when storage support exists (with arithmetic widened to 32-bit as before).
Walkthrough
Why
Vulkan distinguishes:
Arithmetic features: VkPhysicalDeviceShaderFloat16Int8Features (shaderInt8, shaderFloat16), VkPhysicalDeviceFeatures.shaderInt16
Storage features: VkPhysicalDevice8BitStorageFeatures, VkPhysicalDevice16BitStorageFeatures
A device can legally have storage without arithmetic. Taichi should still allow u8/u16 as storage types (e.g. SSBO/UBO/push constants), performing math in 32-bit and truncating on store (which Taichi already warns about).
User-visible behavior
Programs can use ti.u8 / ti.u16 fields on Vulkan even when shaderInt16 == false, as long as the device exposes the corresponding 8/16-bit storage features.
Arithmetic still happens in 32-bit; assigning to u8/u16 may emit the existing “may lose precision” warning and truncates on store. No change for devices that already support shaderInt16.
What changed (walkthrough)
Add two caps in taichi/inc/rhi_constants.inc.h:
spirv_has_8bit_storage
spirv_has_16bit_storage
These are distinct from existing arithmetic caps spirv_has_int8/spirv_has_int16.
In vulkan_device_creator.cpp:
Query support with vkGetPhysicalDeviceFeatures2KHR for:
VkPhysicalDeviceFloat16Int8FeaturesKHR (arithmetic: shaderInt8 / shaderFloat16)
VkPhysicalDevice8BitStorageFeatures
VkPhysicalDevice16BitStorageFeatures
Build a single VkPhysicalDeviceFeatures2 enable chain (not pEnabledFeatures), and append it to the tail of create_info.pNext so validation layers remain intact.
Set caps:
spirv_has_int8 if shaderInt8
spirv_has_8bit_storage if any 8-bit storage bit is true
spirv_has_16bit_storage if any 16-bit storage bit is true
(Do not set spirv_has_int16 unless core shaderInt16 is true.)
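The cap-setting rules above can be sketched in plain Python. This is only an illustration of the decision logic, not the C++ in vulkan_device_creator.cpp: the `QueriedFeatures` class and `derive_caps` function are hypothetical names, while the field names mirror the members of the Vulkan feature structs in snake case.

```python
from dataclasses import dataclass


@dataclass
class QueriedFeatures:
    """Hypothetical stand-in for the results of the Features2 query."""
    # Arithmetic features
    shader_int8: bool = False
    shader_int16: bool = False
    # Members of VkPhysicalDevice8BitStorageFeatures
    storage_buffer_8bit_access: bool = False
    uniform_and_storage_buffer_8bit_access: bool = False
    storage_push_constant_8: bool = False
    # Members of VkPhysicalDevice16BitStorageFeatures
    storage_buffer_16bit_access: bool = False
    uniform_and_storage_buffer_16bit_access: bool = False
    storage_push_constant_16: bool = False
    storage_input_output_16: bool = False


def derive_caps(f):
    """Return the set of Taichi cap names implied by the queried features."""
    caps = set()
    if f.shader_int8:
        caps.add("spirv_has_int8")
    # spirv_has_int16 follows the core shaderInt16 bit only.
    if f.shader_int16:
        caps.add("spirv_has_int16")
    # Storage caps are set if any bit of the matching struct is true.
    if (f.storage_buffer_8bit_access
            or f.uniform_and_storage_buffer_8bit_access
            or f.storage_push_constant_8):
        caps.add("spirv_has_8bit_storage")
    if (f.storage_buffer_16bit_access
            or f.uniform_and_storage_buffer_16bit_access
            or f.storage_push_constant_16
            or f.storage_input_output_16):
        caps.add("spirv_has_16bit_storage")
    return caps


# Values reported for the R7 450 in this PR; shaderInt8 is not stated
# there, so it is left at the default (an assumption of this sketch).
r7_450 = QueriedFeatures(
    shader_int16=False,
    storage_buffer_8bit_access=True,
    uniform_and_storage_buffer_8bit_access=True,
    storage_buffer_16bit_access=True,
    uniform_and_storage_buffer_16bit_access=True,
)
print(sorted(derive_caps(r7_450)))
```

With these inputs only the two storage caps are set, which is the case the old arithmetic-only gating rejected.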
In IRBuilder::init_header() (spirv_ir_builder.cpp):
When spirv_has_8bit_storage is set, emit:
CapabilityStorageBuffer8BitAccess
CapabilityUniformAndStorageBuffer8BitAccess
CapabilityStoragePushConstant8
When spirv_has_16bit_storage is set, emit:
CapabilityStorageBuffer16BitAccess
CapabilityUniformAndStorageBuffer16BitAccess
CapabilityStoragePushConstant16
CapabilityStorageInputOutput16
OpExtension "SPV_KHR_16bit_storage"
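As a rough model of the header emission described above, the following Python sketch maps the two storage caps to the listed declarations. The real builder emits binary SPIR-V instructions; the textual `OpCapability` lines and the `header_declarations` helper are illustrative assumptions, and the lists simply copy the items named in this description.

```python
STORAGE_8BIT_CAPABILITIES = [
    "StorageBuffer8BitAccess",
    "UniformAndStorageBuffer8BitAccess",
    "StoragePushConstant8",
]
STORAGE_16BIT_CAPABILITIES = [
    "StorageBuffer16BitAccess",
    "UniformAndStorageBuffer16BitAccess",
    "StoragePushConstant16",
    "StorageInputOutput16",
]


def header_declarations(caps):
    """Return header lines for the storage caps in `caps` (hypothetical helper)."""
    capabilities = []
    extensions = []
    if "spirv_has_8bit_storage" in caps:
        capabilities += STORAGE_8BIT_CAPABILITIES
    if "spirv_has_16bit_storage" in caps:
        capabilities += STORAGE_16BIT_CAPABILITIES
        extensions.append("SPV_KHR_16bit_storage")
    # A SPIR-V module lists all capabilities before any extensions.
    lines = [f"OpCapability {name}" for name in capabilities]
    lines += [f'OpExtension "{name}"' for name in extensions]
    return lines


for line in header_declarations({"spirv_has_16bit_storage"}):
    print(line)
```

Nothing is emitted when neither storage cap is set, so modules for devices without these features keep the header they had before.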
In IRBuilder::init_pre_defs() and get_primitive_type(...):
Declare/allow i8/u8 if arithmetic OR storage is available.
Declare/allow i16/u16 if arithmetic OR storage is available.
This removes the hard requirement on shaderInt16 for u16 storage types.
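The "arithmetic OR storage" rule can be written as a small predicate. This is a sketch of the gating logic only; `narrow_int_allowed` and the `_REQUIRED_CAPS` table are hypothetical names, not functions in spirv_ir_builder.cpp.

```python
# For each narrow integer type: (arithmetic cap, storage cap).
_REQUIRED_CAPS = {
    "i8": ("spirv_has_int8", "spirv_has_8bit_storage"),
    "u8": ("spirv_has_int8", "spirv_has_8bit_storage"),
    "i16": ("spirv_has_int16", "spirv_has_16bit_storage"),
    "u16": ("spirv_has_int16", "spirv_has_16bit_storage"),
}


def narrow_int_allowed(dtype, caps):
    """True if `dtype` may be declared given the device caps in `caps`."""
    arithmetic_cap, storage_cap = _REQUIRED_CAPS[dtype]
    return arithmetic_cap in caps or storage_cap in caps


storage_only = {"spirv_has_8bit_storage", "spirv_has_16bit_storage"}
print(narrow_int_allowed("u16", storage_only))  # storage alone is enough
print(narrow_int_allowed("u16", set()))         # neither cap: still rejected
```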
Repro / test plan
Minimal program:
import taichi as ti

ti.init(arch=ti.vulkan, debug=True)
n = 16
x8 = ti.field(dtype=ti.u8, shape=n)
x16 = ti.field(dtype=ti.u16, shape=n)

@ti.kernel
def compute():
    for i in range(n):
        x8[i] = i * 7 + 3       # stored as u8; values above 255 would wrap mod 256
        x16[i] = i * 300 + 123  # stored as u16; values above 65535 would be truncated to 16 bits

compute()
print("u8 :", x8.to_numpy())
print("u16:", x16.to_numpy())
Expected on devices like AMD R7 450 (e.g., API dump shows shaderInt16 = false but 8/16-bit storage = true):
No runtime error.
Warnings about precision loss may appear (unchanged).
Arrays print correctly with 8/16-bit truncation behavior.
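For reference, the values the repro should print can be computed on the host without Taichi. This sketch assumes the store simply keeps the low 8 or 16 bits of the 32-bit result, as described above. Note that with n = 16 the largest values are 108 and 4623, so nothing actually wraps in this particular program.

```python
n = 16

# Low 8 / 16 bits of the 32-bit result, modelling truncation on store.
expected_u8 = [(i * 7 + 3) & 0xFF for i in range(n)]
expected_u16 = [(i * 300 + 123) & 0xFFFF for i in range(n)]

print("u8 :", expected_u8)
print("u16:", expected_u16)

# A larger index would exercise the wrap: 40 * 7 + 3 = 283, and 283 mod 256 = 27.
wrapped_example = (40 * 7 + 3) & 0xFF
print("wrap example:", wrapped_example)
```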
Negative guard:
If neither 16-bit storage nor arithmetic is available, Taichi still rejects u16 (unchanged).
Notes:
Clear Taichi’s offline cache between runs (e.g., C:\taichi_cache\ticache) to avoid stale kernels when switching builds/features.
Compatibility & risks
No behavior change on devices that already support shaderInt16/shaderInt8.
SPIR-V capabilities/extensions are only emitted when the corresponding device caps are set.
Validation/layers are preserved: the Features2 chain is appended after validation entries in pNext.
CPU/OpenGL/Metal backends untouched.
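The "append to the tail of pNext" idea can be illustrated with a linked list in Python. The `Struct` class, the `append_to_chain` helper, and the `validation_entry` placeholder are hypothetical; in the C++ code the nodes are Vulkan structs linked through their `pNext` members.

```python
class Struct:
    """Hypothetical stand-in for a Vulkan struct with a pNext pointer."""

    def __init__(self, name, p_next=None):
        self.name = name
        self.p_next = p_next


def append_to_chain(head, new_chain):
    """Attach `new_chain` after the last node of `head`; return the head."""
    if head is None:
        return new_chain
    node = head
    while node.p_next is not None:
        node = node.p_next
    node.p_next = new_chain
    return head


def chain_names(head):
    """Collect struct names in chain order, for inspection."""
    names = []
    while head is not None:
        names.append(head.name)
        head = head.p_next
    return names


# Existing chain on create_info.pNext (placeholder validation entry).
existing = Struct("validation_entry")

# Features2 enable chain built by the device creator.
features = Struct(
    "VkPhysicalDeviceFeatures2",
    Struct("VkPhysicalDevice8BitStorageFeatures",
           Struct("VkPhysicalDevice16BitStorageFeatures")),
)

head = append_to_chain(existing, features)
print(chain_names(head))
```

Appending at the tail leaves the entries already on the chain in place and in their original order, which is the property this PR relies on for validation layers.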
Files touched (high level)
taichi/inc/rhi_constants.inc.h — add spirv_has_8bit_storage, spirv_has_16bit_storage
taichi/rhi/vulkan/vulkan_device_creator.cpp — query & enable storage features via Features2; set new caps; keep validation chain intact
taichi/codegen/spirv/spirv_ir_builder.cpp — emit storage capabilities/extensions; allow/declare 8/16-bit types when storage is present
Motivation evidence (device logs)
On an AMD Radeon R7 450, vkCreateDevice API dump shows:
shaderInt16 = 0
storageBuffer8BitAccess = 1, uniformAndStorageBuffer8BitAccess = 1
storageBuffer16BitAccess = 1, uniformAndStorageBuffer16BitAccess = 1
…which previously caused Taichi to error on u8/u16 even though storage is supported.