[libspirv] Have SPIR-V copysign use CLC copysign #17172

Merged
merged 1 commit into from
Feb 25, 2025

frasercrmck
Contributor

This mirrors the OpenCL copysign builtin and has the benefit of keeping vector types intact instead of scalarizing them.
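
As a rough illustration of the vector-versus-scalarized distinction, here is a minimal C sketch. It is not the libclc implementation: the type names `float2_t` and `uint2_t` and both function names are hypothetical, it uses the GCC/Clang `vector_size` extension, and it assumes 32-bit IEEE-754 `float`. It uses `float` rather than `half` only to stay portable.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: 32-bit IEEE-754 float, as on mainstream targets. */
static_assert(sizeof(float) == sizeof(uint32_t), "expects 32-bit float");

/* Hypothetical 2-lane vector types (GCC/Clang vector extension). */
typedef float    float2_t __attribute__((vector_size(8)));
typedef uint32_t uint2_t  __attribute__((vector_size(8)));

/* Whole-vector form: one AND/OR sequence covers both lanes. */
static float2_t copysign_vector(float2_t x, float2_t y) {
    const uint2_t mag_mask  = {0x7FFFFFFFu, 0x7FFFFFFFu};
    const uint2_t sign_mask = {0x80000000u, 0x80000000u};
    uint2_t xb, yb, rb;
    float2_t r;
    memcpy(&xb, &x, sizeof xb);
    memcpy(&yb, &y, sizeof yb);
    rb = (xb & mag_mask) | (yb & sign_mask);  /* magnitude of x, sign of y */
    memcpy(&r, &rb, sizeof r);
    return r;
}

/* Scalarized form: extract each lane, process it, reinsert it. */
static float2_t copysign_scalarized(float2_t x, float2_t y) {
    float2_t r = {0.0f, 0.0f};
    for (int i = 0; i < 2; ++i) {
        float xl = x[i], yl = y[i], rl;
        uint32_t xb, yb, rb;
        memcpy(&xb, &xl, sizeof xb);
        memcpy(&yb, &yl, sizeof yb);
        rb = (xb & 0x7FFFFFFFu) | (yb & 0x80000000u);
        memcpy(&rl, &rb, sizeof rl);
        r[i] = rl;
    }
    return r;
}
```

Both functions return the same lanes for the same inputs. The scalarized one just does the per-lane extract/insert work that a vector-typed builtin avoids.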

@frasercrmck frasercrmck requested a review from a team as a code owner February 25, 2025 16:39
@frasercrmck frasercrmck requested review from steffenlarsen and removed request for a team February 25, 2025 16:39
@frasercrmck
Contributor Author

Note, CLC copysign was introduced in cfc8ef0 which we pulled in

@frasercrmck
Contributor Author

See an example of the codegen difference for Native CPU:

in function _Z20__spirv_ocl_copysignDv2_DF16_S_:
  in block %entry:
    >   %2 = tail call <2 x half> @llvm.copysign.v2f16(<2 x half> %0, <2 x half> %1)
    >   %3 = bitcast <2 x half> %2 to i32
    >   ret i32 %3
    <   %2 = extractelement <2 x half> %0, i64 0
    <   %3 = extractelement <2 x half> %1, i64 0
    <   %conv.i7.i = fpext half %2 to double
    <   %conv1.i8.i = fpext half %3 to double
    <   %4 = tail call double @llvm.copysign.f64(double %conv.i7.i, double %conv1.i8.i)
    <   %conv2.i9.i = fptrunc double %4 to half
    <   %vecinit.i = insertelement <2 x half> poison, half %conv2.i9.i, i64 0
    <   %5 = extractelement <2 x half> %0, i64 1
    <   %6 = extractelement <2 x half> %1, i64 1
    <   %conv.i.i = fpext half %5 to double
    <   %conv1.i.i = fpext half %6 to double
    <   %7 = tail call double @llvm.copysign.f64(double %conv.i.i, double %conv1.i.i)
    <   %conv2.i.i = fptrunc double %7 to half
    <   %vecinit4.i = insertelement <2 x half> %vecinit.i, half %conv2.i.i, i64 1
    <   %8 = bitcast <2 x half> %vecinit4.i to i32
    <   ret i32 %8

Note also that half was previously being promoted to double. This was spotted in #17163.
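
To see why widening to double is unnecessary, recall that copysign only combines the magnitude bits of one operand with the sign bit of the other. The C sketch below works directly on IEEE-754 binary16 bit patterns. The function names are hypothetical and this is not how libclc or LLVM implement it. The packed variant assumes two halves stored in one 32-bit word, similar in spirit to the `<2 x half>` to `i32` bitcast in the listing above.

```c
#include <assert.h>
#include <stdint.h>

/* copysign on a single binary16 bit pattern: magnitude of x, sign of y. */
static uint16_t copysign_half_bits(uint16_t x, uint16_t y) {
    return (uint16_t)((x & 0x7FFFu) | (y & 0x8000u));
}

/* Same operation on two halves packed into one 32-bit word. */
static uint32_t copysign_half2_bits(uint32_t x, uint32_t y) {
    return (x & 0x7FFF7FFFu) | (y & 0x80008000u);
}

/* Usage example; the constants are binary16 encodings. */
void copysign_half_examples(void) {
    /* 1.0 (0x3C00) with the sign of -2.0 (0xC000) gives -1.0 (0xBC00). */
    assert(copysign_half_bits(0x3C00u, 0xC000u) == 0xBC00u);
    /* Lanes {1.0, -2.0} with signs taken from {-0.0, +0.0}
       give {-1.0, 2.0}. */
    assert(copysign_half2_bits(0xC0003C00u, 0x00008000u) == 0x4000BC00u);
}
```

No conversion instruction is involved in either function, so nothing here needs an `fpext` or `fptrunc` round trip.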

@jsji jsji merged commit d24e635 into intel:sycl-web Feb 25, 2025
1 check passed
@frasercrmck frasercrmck deleted the libspirv-copysign branch February 25, 2025 18:02