How to get bitfield insert / extract to emit corresponding DXIL instructions? #6902

Closed
natevm opened this issue Sep 6, 2024 · 4 comments
Labels: needs-triage (Awaiting triage), user-support (Issues from users requesting support)

Comments

natevm (Contributor) commented Sep 6, 2024

What is the issue you're encountering?

I'm looking to implement bitfieldInsert and bitfieldExtract functions in Slang for reinterpreting structs of data, and I want to make sure our Slang->HLSL->DXC->DXIL path performs as efficiently as possible. In general, I've found that using the built-in intrinsics gives very large performance improvements, at least on NVIDIA hardware.

In Slang->SPIR-V, I can emit OpBitFieldSExtract, OpBitFieldUExtract, and OpBitFieldInsert.
And in Slang->GLSL, I can use GLSL's built-in bitfieldInsert and bitfieldExtract.
But in HLSL, there doesn't seem to be an equivalent, directly accessible intrinsic.

I do see that DXIL has bitfield insertion/extraction operations, Ibfe / Ubfe / Bfi, listed here: https://github.com/microsoft/DirectXShaderCompiler/blob/main/docs/DXIL.rst

But DXC doesn't seem to be able to identify my manual shifts / masks as bitfield insertion and extraction operations, and so the code appears to revert to the slower emulation path.
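
For reference, the behaviour described for these three opcodes can be modelled in plain HLSL. The sketch below is an emulation based on a reading of the DXIL documentation linked above, not a way to make DXC emit the opcodes. The helper names (emulateUbfe, emulateIbfe, emulateBfi) are hypothetical, and the operand order (width, offset, value) follows the DXIL operands rather than GLSL's argument order.

```hlsl
// Hypothetical reference helpers; none of these are DXC built-ins.
// Width and offset are taken from the low 5 bits of their operands.

// Models Ubfe: unsigned extract of `width` bits starting at `offset`.
uint emulateUbfe(uint width, uint offset, uint value)
{
    width &= 31u;
    offset &= 31u;
    if (width == 0u)
        return 0u;
    if (width + offset < 32u)
        // Push the field to the top of the word, then logical-shift it down.
        return (value << (32u - (width + offset))) >> (32u - width);
    return value >> offset;
}

// Models Ibfe: same as above, but the result is sign-extended.
int emulateIbfe(uint width, uint offset, int value)
{
    int w = int(width & 31u);
    int o = int(offset & 31u);
    if (w == 0)
        return 0;
    if (w + o < 32)
        // Arithmetic right shift on int replicates the field's top bit.
        return (value << (32 - (w + o))) >> (32 - w);
    return value >> o;
}

// Models Bfi: write the low `width` bits of `insert` into `base` at `offset`.
uint emulateBfi(uint width, uint offset, uint insert, uint base)
{
    width &= 31u;
    offset &= 31u;
    uint mask = ((1u << width) - 1u) << offset;
    return ((insert << offset) & mask) | (base & ~mask);
}
```

Because the width operand is limited to 5 bits in this model, a full 32-bit field cannot be expressed directly, which differs from GLSL's bitfieldExtract and bitfieldInsert.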

Steps to Reproduce (if applicable)

Here's a reproducer written in shader-playground.
https://shader-playground.timjones.io/a1580121ec0a9241961d4908ace11b24

For clarity, here's the code:

RWStructuredBuffer<uint> result0;
RWStructuredBuffer<int> result1;

uint bitfieldExtract(uint value, int offset, int bits)
{
    return (value >> offset) & ((1u << bits) - 1);
}

[shader("compute")]
[numthreads(1, 1, 1)]
void main()
{
    uint val = result0[1];
    int off = result1[0];
    int bts = result1[1];
    result0[0] = bitfieldExtract(val, off, bts);
}

I'm hoping to see a Ubfe instruction emitted in the DXIL output, but instead I just see shifts and masks. At first glance, this seems like a tough case for DXC to pick up, but at the same time I'm not sure how else to get a Ubfe instruction to appear in the output.
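
The signed-extract and insert counterparts can be written in the same shift-and-mask style. This is a minimal sketch, assuming 0 <= offset, 0 <= bits, and offset + bits <= 32; the names lowBitsMask, bitfieldExtractSigned and bitfieldInsert are illustrative, and nothing here is known to make DXC emit Ibfe or Bfi.

```hlsl
// Illustrative helpers only; argument order follows GLSL's
// bitfieldExtract / bitfieldInsert.

// Mask with the low `bits` bits set. bits == 32 is handled explicitly
// because HLSL shifts use only the low 5 bits of the shift amount,
// so (1u << 32) would evaluate as (1u << 0).
uint lowBitsMask(int bits)
{
    return (bits >= 32) ? 0xFFFFFFFFu : ((1u << bits) - 1u);
}

// Signed extract: the top bit of the field is replicated into the result.
int bitfieldExtractSigned(int value, int offset, int bits)
{
    if (bits <= 0)
        return 0;
    // Move the field to the top of the word, then shift it back down
    // with an arithmetic (sign-propagating) right shift.
    return (value << (32 - bits - offset)) >> (32 - bits);
}

// Insert the low `bits` bits of `insert` into `base` at `offset`.
uint bitfieldInsert(uint base, uint insert, int offset, int bits)
{
    uint mask = lowBitsMask(bits) << offset;
    return (base & ~mask) | ((insert << offset) & mask);
}
```

Note that the unsigned bitfieldExtract in the reproducer has the same bits == 32 edge case: its mask evaluates to 0 there, so it returns 0 rather than the full value.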

Environment

  • DXC version: ToT 2024-04-29
  • Host System: Shader Playground

For the repro, I'm using cs_6_6 and optimization level 3. (Using optimization level 4 seems to crash.)

natevm added the needs-triage and user-support labels Sep 6, 2024

damyanp (Member) commented Sep 10, 2024

Unfortunately, there is no built-in available to directly generate these DXIL opcodes.

We could consider rephrasing this as a feature request to add an optimization to DXC that tries to detect this case, but given our current priorities we're unlikely to address this ourselves, although we'd consider contributions adding it. We'd also need to understand whether this is something that the driver backend might be better placed to detect and optimize.

Alternatively, maybe this is a request to add some new HLSL intrinsics that would generate these opcodes? This has a similar funding issue, but we'd also accept contributions for this.

damyanp closed this as not planned Sep 10, 2024
github-project-automation bot moved this to Triaged in HLSL Triage Sep 10, 2024

natevm (Contributor, Author) commented Sep 10, 2024

@damyanp I understand about budget.

Adding some HLSL intrinsics doesn’t seem like too hard of a thing for me to submit a pull request for. Perhaps that would help solve the budget issue.

damyanp (Member) commented Sep 10, 2024

Thank you, that would be awesome if you wanted to contribute that.

Before you start doing serious work on it, I suggest that you submit a PR with this proposal to the hlsl-specs repo using this template: https://github.com/microsoft/hlsl-specs/blob/main/proposals/templates/basic-template.md. That way we can be sure that we agree on the proposal independently of reviewing the code change itself. For the initial PR you only need to fill out the Introduction, Motivation and Proposed Solution sections.

I think you'd want to add HLSL intrinsics for:

Note that since DXC also has a SPIR-V backend, we need to consider how these intrinsics will be implemented in SPIR-V.

natevm (Contributor, Author) commented Sep 10, 2024

Fwiw, in SPIR-V, these map to OpBitFieldUExtract, OpBitFieldSExtract, and OpBitFieldInsert.
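
For the SPIR-V side only, one way to reach these instructions from HLSL today might be DXC's inline SPIR-V attribute, vk::ext_instruction. The following is a sketch under that assumption: the function names are hypothetical, the opcode numbers are taken from the SPIR-V specification, and it applies only when compiling with -spirv, so it does nothing for the DXIL path discussed above.

```hlsl
// Sketch for DXC's SPIR-V backend only (compile with -spirv).
// Function names are hypothetical; each declaration binds to a
// core SPIR-V opcode by number.

// OpBitFieldInsert = 201: operands are Base, Insert, Offset, Count.
[[vk::ext_instruction(201)]]
uint spvBitFieldInsert(uint base, uint insert, uint offset, uint count);

// OpBitFieldSExtract = 202: operands are Base, Offset, Count.
[[vk::ext_instruction(202)]]
int spvBitFieldSExtract(int base, uint offset, uint count);

// OpBitFieldUExtract = 203: operands are Base, Offset, Count.
[[vk::ext_instruction(203)]]
uint spvBitFieldUExtract(uint base, uint offset, uint count);

RWStructuredBuffer<uint> result0;
RWStructuredBuffer<int> result1;

[shader("compute")]
[numthreads(1, 1, 1)]
void main()
{
    uint val = result0[1];
    uint off = uint(result1[0]);
    uint bts = uint(result1[1]);
    result0[0] = spvBitFieldUExtract(val, off, bts);
}
```

A proposal for portable HLSL intrinsics would still need a DXIL lowering to Ubfe, Ibfe and Bfi alongside a SPIR-V lowering like this one.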
