How to get bitfield insert / extract to emit corresponding DXIL instructions? #6902
Comments
Unfortunately, there is no built-in available to directly generate these DXIL opcodes. We could consider rephrasing this as a feature request to add an optimization to DXC to try and detect this case, but given our current priorities we're unlikely to address this ourselves, although we'd consider contributions adding it. We'd also need to understand if this is something that the driver backend might be better placed to detect and optimize. Alternatively, maybe this is a request to add some new HLSL intrinsics that would generate these opcodes? This has a similar funding issue, but we'd also accept contributions for this.
@damyanp I understand about budget. Adding some HLSL intrinsics doesn’t seem like too hard of a thing for me to submit a pull request for. Perhaps that would help solve the budget issue.
Thank you, that would be awesome if you wanted to contribute that. Before you start doing serious work on it, I suggest that you submit a PR with this proposal to the hlsl-specs repo using this template: https://github.com/microsoft/hlsl-specs/blob/main/proposals/templates/basic-template.md. That way we can be sure that we agree on the proposal independently of reviewing the code change itself. For the initial PR you only need to fill out the Introduction, Motivation and Proposed Solution sections. I think you'd want to add HLSL intrinsics for:
Note that since DXC also has a SPIR-V backend, we need to consider how these intrinsics will be implemented in SPIR-V.
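For anyone drafting that proposal, it may help to pin down what a signed extract is expected to compute, since the signed and unsigned variants differ only in how the extracted field is widened. Below is a rough reference sketch in C++, not anything from DXC or the HLSL spec. The name `bitfieldExtractS` and the argument order are hypothetical, and it assumes GLSL-style preconditions (`offset < 32` and `offset + bits <= 32`), which may not match what a proposal finally settles on.

```cpp
#include <cstdint>

// Hypothetical reference for a signed bitfield extract (illustration only).
// Assumes offset < 32 and offset + bits <= 32; other inputs are not handled.
constexpr std::int32_t bitfieldExtractS(std::uint32_t value,
                                        unsigned offset,
                                        unsigned bits) {
    if (bits == 0) {
        return 0;  // an empty field is treated as zero here
    }
    // Mask covering the low `bits` bits, avoiding an undefined shift by 32.
    const std::uint32_t mask = bits >= 32 ? 0xFFFFFFFFu : ((1u << bits) - 1u);
    // Move the field down to bit 0 and isolate it.
    const std::uint32_t field = (value >> offset) & mask;
    // Sign-extend from the field's top bit: flipping that bit and then
    // subtracting it (in wrapping unsigned arithmetic) fills the upper
    // bits with copies of the sign.
    const std::uint32_t sign = 1u << (bits - 1u);
    return static_cast<std::int32_t>((field ^ sign) - sign);
}
```

The unsigned variant would stop after the mask step, which is why the two are usually specified as a pair.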
Fwiw, in SPIR-V, these map to
What is the issue you're encountering?
I'm looking to implement bitfieldInsert and bitfieldExtract functions in Slang for reinterpreting structs of data, and want to make sure our Slang->HLSL->DXC->DXIL path is performing as efficiently as I can. In general, I've found that using the built-in intrinsics gives very large performance improvements, at least on NVIDIA hardware.
In Slang->SPIR-V, I can emit `OpBitfieldSExtract`, `OpBitfieldUExtract`, and `OpBitfieldInsert`.
And in Slang->GLSL, I can use GLSL's built-in `bitfieldInsert` and `bitfieldExtract`.
But in HLSL, there doesn't seem to be an equivalent, directly accessible intrinsic.
I do see that DXIL has bitfield insertion/extraction intrinsics, `Ibfe`/`Ubfe`/`Bfi`, listed here: https://github.com/microsoft/DirectXShaderCompiler/blob/main/docs/DXIL.rst
But DXC doesn't seem to be able to identify my manual shifts / masks as bitfield insertion and extraction operations, and so the code appears to revert to the slower emulation path.
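To make the "manual shifts / masks" concrete, here is a minimal sketch of the emulation pattern, written in C++ because the same expressions carry over to HLSL almost unchanged. This is not the code from the reproducer. The function names are hypothetical, and the sketch assumes `offset < 32` and `offset + bits <= 32`.

```cpp
#include <cstdint>

// Mask with the low `bits` bits set. bits == 32 is special-cased because
// shifting a 32-bit value by 32 is undefined.
constexpr std::uint32_t lowMask(unsigned bits) {
    return bits >= 32 ? 0xFFFFFFFFu : ((1u << bits) - 1u);
}

// Unsigned extract: shift the field down to bit 0, then mask off the rest.
// Assumes offset < 32 and offset + bits <= 32.
constexpr std::uint32_t bitfieldExtractU(std::uint32_t value,
                                         unsigned offset,
                                         unsigned bits) {
    return (value >> offset) & lowMask(bits);
}

// Insert: clear the destination field in `base`, then OR in the low
// `bits` bits of `insert` shifted into position.
// Assumes offset < 32 and offset + bits <= 32.
constexpr std::uint32_t bitfieldInsert(std::uint32_t base,
                                       std::uint32_t insert,
                                       unsigned offset,
                                       unsigned bits) {
    const std::uint32_t mask = lowMask(bits) << offset;
    return (base & ~mask) | ((insert << offset) & mask);
}
```

For example, `bitfieldExtractU(0xABCD1234u, 8, 8)` yields `0x12`, and `bitfieldInsert(0u, 0x1ABu, 4, 8)` yields `0xAB0`. A compiler would have to pattern-match these shift/and/or sequences to replace them with a single bitfield instruction.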
Steps to Reproduce (if applicable)
Here's a reproducer written in shader-playground.
https://shader-playground.timjones.io/a1580121ec0a9241961d4908ace11b24
For clarity, here's the code:
I'm hoping to see a `Ubfe` instruction emitted in the DXIL output, but instead I just see shifts and masks. At first glance, this seems like a tough case for DXC to pick up, but at the same time I'm not sure how else to get a `Ubfe` instruction to appear in the output.
Environment
For the repro, I'm using `cs_6_6` and optimization level 3. (Using optimization level 4 seems to crash.)