yunusberndt
Issue: #8758

Brief Summary

Some Vulkan devices (e.g. my AMD Radeon R7 450) expose 8/16-bit storage features but do not expose 16-bit arithmetic (shaderInt16 = false). Taichi’s Vulkan/SPIR-V backend previously gated u8/u16 types on arithmetic caps only, so kernels using ti.u8/ti.u16 failed at compile time with:

[spirv_ir_builder.cpp:IRBuilder::get_primitive_type] Type u8/u16 not supported.

This PR separates storage from arithmetic capabilities, enables the right Vulkan features via the VkPhysicalDeviceFeatures2 chain, and teaches the SPIR-V builder to declare & use u8/u16 when storage support exists (with arithmetic widened to 32-bit as before).

Walkthrough

Why

Vulkan distinguishes:

Arithmetic features: VkPhysicalDeviceShaderFloat16Int8Features (shaderInt8, shaderFloat16), VkPhysicalDeviceFeatures.shaderInt16

Storage features: VkPhysicalDevice8BitStorageFeatures, VkPhysicalDevice16BitStorageFeatures

A device can legally have storage without arithmetic. Taichi should still allow u8/u16 as storage types (e.g. SSBO/UBO/push constants), performing math in 32-bit and truncating on store (which Taichi already warns about).
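The "compute in 32-bit, truncate on store" behavior can be modeled in a few lines of plain Python. This is only an illustrative sketch: `store_truncated` and `compute_then_store` are hypothetical helper names, not Taichi or Vulkan functions, and the masking stands in for what an unsigned narrow store does.

```python
def store_truncated(value: int, bits: int) -> int:
    """Model an unsigned narrow store: keep only the low `bits` bits."""
    return value & ((1 << bits) - 1)


def compute_then_store(a: int, b: int, bits: int) -> int:
    """Multiply in (modeled) 32-bit arithmetic, then store into a narrow type."""
    wide = (a * b) & 0xFFFFFFFF  # arithmetic is widened to 32-bit
    return store_truncated(wide, bits)  # truncation happens only at the store


if __name__ == "__main__":
    print(compute_then_store(20, 20, 8))     # 400 stored into a u8 slot
    print(compute_then_store(300, 300, 16))  # 90000 stored into a u16 slot
```

The point of the sketch is that no 8/16-bit arithmetic instruction is needed: only the final store has to know about the narrow width.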

User-visible behavior

Programs can use ti.u8 / ti.u16 fields on Vulkan even when shaderInt16 == false, as long as the device exposes the corresponding 8/16-bit storage features.

Arithmetic still happens in 32-bit; assigning to u8/u16 may emit the existing “may lose precision” warning and truncates on store. No change for devices that already support shaderInt16.

What changed (walkthrough)

  1. Device capabilities: add storage flags

Add two caps in taichi/inc/rhi_constants.inc.h:

spirv_has_8bit_storage

spirv_has_16bit_storage

These are distinct from existing arithmetic caps spirv_has_int8/spirv_has_int16.

  2. Vulkan device creation: enable features via Features2 chain

In vulkan_device_creator.cpp:

Query support with vkGetPhysicalDeviceFeatures2KHR for:

VkPhysicalDeviceFloat16Int8FeaturesKHR (arithmetic: shaderInt8 / shaderFloat16)

VkPhysicalDevice8BitStorageFeatures

VkPhysicalDevice16BitStorageFeatures

Build a single VkPhysicalDeviceFeatures2 enable chain (not pEnabledFeatures), and append it to the tail of create_info.pNext so validation layers remain intact.

Set caps:

spirv_has_int8 if shaderInt8

spirv_has_8bit_storage if any 8-bit storage bit is true

spirv_has_16bit_storage if any 16-bit storage bit is true

(Do not set spirv_has_int16 unless core shaderInt16 is true.)
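The cap-setting rules above can be summarized as a small decision function. The sketch below is Python rather than the C++ in vulkan_device_creator.cpp, and `derive_caps` is a hypothetical name; the dictionary keys mirror the Vulkan feature members, and which storage bits count toward each cap is an assumption based on "any storage bit is true".

```python
# Vulkan feature members assumed to count as "storage bits" for each width.
STORAGE_8BIT = (
    "storageBuffer8BitAccess",
    "uniformAndStorageBuffer8BitAccess",
    "storagePushConstant8",
)
STORAGE_16BIT = (
    "storageBuffer16BitAccess",
    "uniformAndStorageBuffer16BitAccess",
    "storagePushConstant16",
    "storageInputOutput16",
)


def derive_caps(features: dict) -> set:
    """Map queried feature bits to the device caps described in this PR."""
    caps = set()
    if features.get("shaderInt8"):
        caps.add("spirv_has_int8")
    if features.get("shaderInt16"):  # core feature; never implied by storage
        caps.add("spirv_has_int16")
    if any(features.get(name) for name in STORAGE_8BIT):
        caps.add("spirv_has_8bit_storage")
    if any(features.get(name) for name in STORAGE_16BIT):
        caps.add("spirv_has_16bit_storage")
    return caps


if __name__ == "__main__":
    # Feature bits taken from the R7 450 API dump quoted in this PR.
    r7_450 = {
        "shaderInt16": 0,
        "storageBuffer8BitAccess": 1,
        "uniformAndStorageBuffer8BitAccess": 1,
        "storageBuffer16BitAccess": 1,
        "uniformAndStorageBuffer16BitAccess": 1,
    }
    print(sorted(derive_caps(r7_450)))
```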

  3. SPIR-V header: emit storage capabilities

In spirv_ir_builder.cpp::init_header():

When spirv_has_8bit_storage is set, emit:

CapabilityStorageBuffer8BitAccess

CapabilityUniformAndStorageBuffer8BitAccess

CapabilityStoragePushConstant8

When spirv_has_16bit_storage is set, emit:

CapabilityStorageBuffer16BitAccess

CapabilityUniformAndStorageBuffer16BitAccess

CapabilityStoragePushConstant16

CapabilityStorageInputOutput16

OpExtension "SPV_KHR_16bit_storage"
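As a rough model of the header logic, the mapping from device caps to emitted header entries might look like the following. `header_entries` is a hypothetical Python helper, not the actual init_header() code, and it lists only the capabilities and the extension named above.

```python
def header_entries(caps: set) -> list:
    """Return the SPIR-V header entries implied by the storage caps."""
    entries = []
    if "spirv_has_8bit_storage" in caps:
        entries += [
            "CapabilityStorageBuffer8BitAccess",
            "CapabilityUniformAndStorageBuffer8BitAccess",
            "CapabilityStoragePushConstant8",
        ]
    if "spirv_has_16bit_storage" in caps:
        entries += [
            "CapabilityStorageBuffer16BitAccess",
            "CapabilityUniformAndStorageBuffer16BitAccess",
            "CapabilityStoragePushConstant16",
            "CapabilityStorageInputOutput16",
            'OpExtension "SPV_KHR_16bit_storage"',
        ]
    return entries


if __name__ == "__main__":
    for entry in header_entries({"spirv_has_16bit_storage"}):
        print(entry)
```

A device with neither cap gets an empty list, which matches the statement that nothing extra is emitted unless the corresponding cap is set.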

  4. SPIR-V type availability

In IRBuilder::init_pre_defs() and get_primitive_type(...):

Declare/allow i8/u8 if arithmetic OR storage is available.

Declare/allow i16/u16 if arithmetic OR storage is available.

This removes the hard requirement on shaderInt16 for u16 storage types.
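The "arithmetic OR storage" rule reduces to a simple predicate. The sketch below uses a hypothetical `type_allowed` function to illustrate the gating; it is not the real get_primitive_type(...) implementation, and it treats 32-bit types as always available for simplicity.

```python
def type_allowed(bits: int, caps: set) -> bool:
    """Decide whether an integer type of the given width may be declared."""
    if bits == 8:
        return "spirv_has_int8" in caps or "spirv_has_8bit_storage" in caps
    if bits == 16:
        return "spirv_has_int16" in caps or "spirv_has_16bit_storage" in caps
    return bits == 32  # assumption: only 8/16/32-bit widths are modeled here


if __name__ == "__main__":
    storage_only = {"spirv_has_8bit_storage", "spirv_has_16bit_storage"}
    print(type_allowed(16, storage_only))  # storage alone is now sufficient
    print(type_allowed(16, set()))         # still rejected with no support
```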

Repro / test plan

Minimal program:

import taichi as ti
ti.init(arch=ti.vulkan, debug=True)

n = 16
x8 = ti.field(dtype=ti.u8, shape=n)
x16 = ti.field(dtype=ti.u16, shape=n)

@ti.kernel
def compute():
    for i in range(n):
        x8[i] = i * 7 + 3  # stored as u8; values above 255 would wrap mod 256
        x16[i] = i * 300 + 123  # stored as u16; values above 65535 would be truncated

compute()
print("u8 :", x8.to_numpy())
print("u16:", x16.to_numpy())

Expected on devices like AMD R7 450 (e.g., API dump shows shaderInt16 = false but 8/16-bit storage = true):

No runtime error.

Warnings about precision loss may appear (unchanged).

Arrays print correctly with 8/16-bit truncation behavior.
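To check the printed arrays without a GPU, the expected contents can be computed on the host. This is a plain-Python sketch that assumes the unsigned wrap-around described above; with n = 16 every value stays below 256 (u8) and 65536 (u16), so the masks only start to matter for larger n.

```python
n = 16

# Mask to the storage width to model what ends up in each field.
expected_u8 = [(i * 7 + 3) & 0xFF for i in range(n)]
expected_u16 = [(i * 300 + 123) & 0xFFFF for i in range(n)]

if __name__ == "__main__":
    print("u8 :", expected_u8)
    print("u16:", expected_u16)
```

These lists can be compared against `x8.to_numpy().tolist()` and `x16.to_numpy().tolist()` from the repro program.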

Negative guard:

If neither 16-bit storage nor arithmetic is available, Taichi still rejects u16 (unchanged).

Notes:

Clear Taichi’s offline cache between runs (e.g., C:\taichi_cache\ticache) to avoid stale kernels when switching builds/features.
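Clearing the cache can be scripted. The helper below is a generic sketch using only the Python standard library; `clear_offline_cache` is a hypothetical name, and the cache location must be supplied by the caller, since it depends on the platform and configuration.

```python
import shutil
from pathlib import Path


def clear_offline_cache(cache_dir) -> bool:
    """Delete the given cache directory; return True if something was removed."""
    path = Path(cache_dir)
    if path.is_dir():
        shutil.rmtree(path)  # removes the directory and everything under it
        return True
    return False
```

For example, `clear_offline_cache(r"C:\taichi_cache\ticache")` would remove the directory mentioned above, if it exists on that machine.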

Compatibility & risks

No behavior change on devices that already support shaderInt16/shaderInt8.

SPIR-V capabilities/extensions are only emitted when the corresponding device caps are set.

Validation/layers are preserved: the Features2 chain is appended after validation entries in pNext.

CPU/OpenGL/Metal backends untouched.
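The pNext handling mentioned above (appending the Features2 chain after any validation entries) is a tail append on a singly linked list. The Python model below only illustrates that idea; `ChainNode` and `append_to_tail` are hypothetical names, not the C++ used in vulkan_device_creator.cpp, and the node labels are for illustration only.

```python
class ChainNode:
    """Minimal stand-in for a Vulkan struct with a pNext pointer."""

    def __init__(self, name):
        self.name = name
        self.p_next = None


def append_to_tail(head, new_chain):
    """Attach new_chain after the last node of head, preserving existing nodes."""
    if head is None:
        return new_chain
    node = head
    while node.p_next is not None:
        node = node.p_next
    node.p_next = new_chain
    return head


def chain_names(head):
    """Collect node names in traversal order."""
    names = []
    while head is not None:
        names.append(head.name)
        head = head.p_next
    return names


if __name__ == "__main__":
    validation = ChainNode("validation_features")  # already present in pNext
    features2 = ChainNode("VkPhysicalDeviceFeatures2")
    features2.p_next = ChainNode("VkPhysicalDevice16BitStorageFeatures")
    print(chain_names(append_to_tail(validation, features2)))
```

Appending at the tail, rather than overwriting the head pointer, is what keeps the entries that were already in the chain.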

Files touched (high level)

taichi/inc/rhi_constants.inc.h — add spirv_has_8bit_storage, spirv_has_16bit_storage

taichi/rhi/vulkan/vulkan_device_creator.cpp — query & enable storage features via Features2; set new caps; keep validation chain intact

taichi/codegen/spirv/spirv_ir_builder.cpp — emit storage capabilities/extensions; allow/declare 8/16-bit types when storage is present

Motivation evidence (device logs)

On an AMD Radeon R7 450, vkCreateDevice API dump shows:

shaderInt16 = 0

storageBuffer8BitAccess = 1, uniformAndStorageBuffer8BitAccess = 1

storageBuffer16BitAccess = 1, uniformAndStorageBuffer16BitAccess = 1

…which previously caused Taichi to error on u8/u16 even though storage is supported.

@CLAassistant
CLAassistant commented Aug 28, 2025

CLA assistant check
All committers have signed the CLA.

@yunusberndt yunusberndt changed the title Vulkan: allow u8/u16 when only 8/16-bit storage is supported (no shaderInt16) [vulkan] allow u8/u16 when only 8/16-bit storage is supported (no shaderInt16) Aug 28, 2025
@yunusberndt yunusberndt changed the title [vulkan] allow u8/u16 when only 8/16-bit storage is supported (no shaderInt16) [vulkan] Allow u8/u16 when only 8/16-bit storage is supported (no shaderInt16) Aug 28, 2025
yunusberndt and others added 24 commits August 28, 2025 15:59
- Fix const-correctness issue in vulkan_device_creator.cpp by adding const_cast
  for create_info.pNext pointer casting
- Reorder SPIRV-Tools include directory in CMakeLists.txt to resolve build
  configuration issues
- Add CHANGELOG.md to document recent Vulkan patch work and improvements

These changes resolve compilation errors and improve Vulkan backend
compatibility, particularly for AMD GPU support and 8-bit/16-bit operations.

Part of Vulkan-patch-for-AMD branch development.
…lities

- Updated Vulkan device creation logic to set granular 16-bit storage capabilities based on supported features.
- Improved SPIRV IR builder to conditionally enable specific 16-bit capabilities for better compatibility.
- Added logging during Python module initialization to aid in debugging.
- Updated .gitignore to exclude new build artifacts.
- BUILD_CONFIGURATION.md documents how to build wheels with the correct linker and the CMake flags set in entry.py. Builds appear to fail with more recent linkers or without these flags.

These changes address compatibility issues and enhance the Vulkan backend for AMD GPU users.
- Improved formatting in `vulkan_device_creator.cpp` for better readability.
- Cleaned up whitespace in `entry.py` to adhere to style guidelines.
- Updated `BUILD_CONFIGURATION.md` to ensure consistent documentation of CMake flags.

These changes enhance code clarity and maintainability while ensuring accurate build instructions.
- Changed CHECK_VERSION(1, 2) to CHECK_VERSION(1, 1) for VK_KHR_SHADER_FLOAT16_INT8_EXTENSION_NAME
- This allows AMD GPUs with Vulkan 1.1.73 to properly detect 8-bit/16-bit arithmetic capabilities
- Fixes 'Type u8 not supported' error on AMD Radeon R7 450
- Removed all TI_DEBUG statements added for troubleshooting
- Cleaned up AMD GPU debugging output
- Kept the core functionality and fixes intact
- Ready for production build
@yunusberndt yunusberndt marked this pull request as ready for review September 10, 2025 03:00