Skip to content

JIT: Optimize ConditionalSelect with const zero when condition is not TYP_MASK #113864

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Apr 4, 2025

Conversation

saucecontrol
Copy link
Member

@saucecontrol saucecontrol commented Mar 25, 2025

AVX-512 embedded masking relies on use of ConditionalSelect, which makes this pattern common:

static Vector512<int> SubtractSaturateUnsigned(Vector512<int> a, Vector512<int> b) =>
    Vector512.ConditionalSelect(Vector512.GreaterThan(a, b), a - b, Vector512<int>.Zero);
       vpcmpgtd k1, zmm0, zmm1
       vpsubd   zmm0 {k1}{z}, zmm0, zmm1

This change improves codegen when the same pattern is used with a compare that does not produce a kmask result:

static Vector256<int> SubtractSaturateUnsigned(Vector256<int> a, Vector256<int> b) =>
    Vector256.ConditionalSelect(Vector256.GreaterThan(a, b), a - b, Vector256<int>.Zero);
        vpcmpgtd ymm2, ymm0, ymm1
        vpsubd   ymm0, ymm0, ymm1
-       vxorps   ymm1, ymm1, ymm1
-       vpblendvb ymm0, ymm1, ymm0, ymm2
+       vpand    ymm0, ymm0, ymm2

And more when pblendvb is not available:

static Vector128<int> SubtractSaturateUnsigned(Vector128<int> a, Vector128<int> b) =>
    Vector128.ConditionalSelect(Vector128.GreaterThan(a, b), a - b, Vector128<int>.Zero);
        movaps   xmm2, xmm0
        pcmpgtd  xmm2, xmm1
        psubd    xmm0, xmm1
        pand     xmm0, xmm2
-       xorps    xmm1, xmm1
-       pandn    xmm2, xmm1
-       por      xmm0, xmm2

Diffs show a few interesting variations in tests.

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Mar 25, 2025
@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Mar 25, 2025
Copy link
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@saucecontrol saucecontrol marked this pull request as ready for review March 25, 2025 02:47
@saucecontrol
Copy link
Member Author

cc @tannergooding

Copy link
Member

@tannergooding tannergooding left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CC. @dotnet/jit-contrib, @EgorBo for secondary review

@EgorBo
Copy link
Member

EgorBo commented Mar 27, 2025

/azp run Fuzzlyn

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@saucecontrol
Copy link
Member Author

saucecontrol commented Mar 27, 2025

The x86 Fuzzlyn failure looks like another one from #112728. I'll put up a separate fix for that.

Arm failures are #113939

@saucecontrol
Copy link
Member Author

ping @tannergooding @EgorBo for merge

// If either of the value operands is const zero, we can optimize down to AND or AND_NOT.
GenTree* binOp = nullptr;

if (op3->IsVectorZero())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we have nodes of type ConvertMaskToVector(Vector.Zero) for op2 or op3. I just opened #114272 to fix a case where I was not checking that for arm64.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see a way we'd create that currently on xarch, but I plan on reviving #110342 and will keep that possibility in mind since we'll more eagerly choose intrinsics that produce a mask with that.

Copy link
Member

@kunalspathak kunalspathak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@kunalspathak kunalspathak merged commit 2fe8937 into dotnet:main Apr 4, 2025
109 of 111 checks passed
@saucecontrol saucecontrol deleted the csel branch April 4, 2025 20:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI community-contribution Indicates that the PR has been added by a community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants