Skip to content

Add SIMD funnel shift and round-to-even intrinsics #142078

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

sayantn
Copy link
Contributor

@sayantn sayantn commented Jun 5, 2025

This PR adds 3 new SIMD intrinsics

  • simd_funnel_shl - funnel shift left
  • simd_funnel_shr - funnel shift right
  • simd_round_ties_even (vector version of round_ties_even_fN)

TODO: implement simd_fsh{l,r} in miri, cg_gcc and cg_clif

#t-compiler > More SIMD intrinsics

@rustbot label T-compiler T-libs A-intrinsics F-core_intrinsics
cc @workingjubilee

@rustbot
Copy link
Collaborator

rustbot commented Jun 5, 2025

r? @wesleywiser

rustbot has assigned @wesleywiser.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Jun 5, 2025
@rustbot
Copy link
Collaborator

rustbot commented Jun 5, 2025

Some changes occurred to the intrinsics. Make sure the CTFE / Miri interpreter
gets adapted for the changes, if necessary.

cc @rust-lang/miri, @RalfJung, @oli-obk, @lcnr

Some changes occurred to the platform-builtins intrinsics. Make sure the
LLVM backend as well as portable-simd gets adapted for the changes.

cc @antoyo, @GuillaumeGomez, @bjorn3, @calebzulawski, @programmerjake

@rustbot rustbot added A-intrinsics Area: Intrinsics F-core_intrinsics Issue in the "core intrinsics" for internal usage only. labels Jun 5, 2025
Copy link
Contributor

@folkertdev folkertdev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe you'd also need to implement this operation in MIRI.

@RalfJung
Copy link
Member

RalfJung commented Jun 5, 2025

TODO: implement these in cg_gcc and cg_clif

Please add Miri to the list. :) (Doesn't have to be in the first PR.)

@folkertdev
Copy link
Contributor

Decide: Should we allow signed integers in simd_fsh{l,r}?

Both s390x and avx512 do (want to) use the funnel shifts on signed types. We can use transmutes there of course, but I'd say we may as well also support signed types. It's already documented that this is a logical shift (i.e. that it always shifts in zeros)

@antoyo
Copy link
Contributor

antoyo commented Jun 5, 2025

@antoyo @bjorn3 some help about implementing fsh{l,r} will be appreciated

You should be able to have more or less the same implementation for cg_gcc by putting the code in this file. The complicated part is to find the equivalent intrinsic names for GCC.

@workingjubilee
Copy link
Member

Decide: Should we add fshr and fshl to normal int types as well? They seem useful enough (cc @rust-lang/libs-api)

If you want to block this PR on submitting a T-libs-api ACP, you can do that, but I don't see why you would.

@sayantn
Copy link
Contributor Author

sayantn commented Jun 5, 2025

@antoyo @bjorn3 some help about implementing fsh{l,r} will be appreciated

You should be able to have more or less the same implementation for cg_gcc by putting the code in this file. The complicated part is to find the equivalent intrinsic names for GCC.

I only found the intrinsic name for x86 as __builtin_ia32_vpsh{l,r}dv_vNiM (from your implementation of llvm.fsh{l,r}), should I go ahead and add it, or do I need to check for other archs as well?.

Decide: Should we add fshr and fshl to normal int types as well? They seem useful enough (cc @rust-lang/libs-api)

If you want to block this PR on submitting a T-libs-api ACP, you can do that, but I don't see why you would.

I don't think this should blocked on adding these to int types, just thought this might be a nice place to discuss it. I will go ahead and submit an independent ACP for that

@antoyo
Copy link
Contributor

antoyo commented Jun 5, 2025

I only found the intrinsic name for x86 as __builtin_ia32_vpsh{l,r}dv_vNiM (from your implementation of llvm.fsh{l,r}), should I go ahead and add it, or do I need to check for other archs as well?.

It seems GCC doesn't have an arch-independent builtin for this, so I guess it's better for this PR to not implement the intrinsic for cg_gcc.
I'll eventually come up with a general solution for this kind of issues.

@sayantn
Copy link
Contributor Author

sayantn commented Jun 5, 2025

ok thanks

@sayantn sayantn force-pushed the more-intrinsics branch from 9d0c1ad to 0733d6c Compare June 5, 2025 15:15
@rustbot
Copy link
Collaborator

rustbot commented Jun 5, 2025

Some changes occurred in compiler/rustc_codegen_cranelift

cc @bjorn3

Some changes occurred in compiler/rustc_codegen_gcc

cc @antoyo, @GuillaumeGomez

@sayantn
Copy link
Contributor Author

sayantn commented Jun 5, 2025

I have implemented simd_round_ties_even in miri, cg_clif and cg_gcc (because it seemed to be easy), and allowed signed integer types for simd_fsh{l,r}

@sayantn sayantn force-pushed the more-intrinsics branch from 0733d6c to c15ce79 Compare June 5, 2025 15:24
@rustbot
Copy link
Collaborator

rustbot commented Jun 5, 2025

The Miri subtree was changed

cc @rust-lang/miri

@sayantn sayantn requested a review from workingjubilee June 5, 2025 15:26
@sayantn sayantn force-pushed the more-intrinsics branch from c15ce79 to 33fe1ce Compare June 5, 2025 15:59
Comment on lines 144 to 161
pub unsafe fn simd_fshl<T>(a: T, b: T, shift: T) -> T;

/// Funnel Shifts vector right elementwise, with UB on overflow.
///
/// Concatenates `a` and `b` elementwise (with `a` in the most significant half),
/// creating a vector of the same length, but with each element being twice as
/// wide. Then shift this vector right elementwise by `shift`, shifting in zeros,
/// and extract the least significant half of each of the elements. If `a` and `b`
/// are the same, this is equivalent to an elementwise rotate right operation.
///
/// `T` must be a vector of integers.
///
/// # Safety
///
/// Each element of `shift` must be less than `<int>::BITS`.
#[rustc_intrinsic]
#[rustc_nounwind]
pub unsafe fn simd_fshr<T>(a: T, b: T, shift: T) -> T;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I have been thinking and I can't get over the bit where the current mnemonic sounds like it has something to do with floats in my head. I think we might want to call out these names as simd_funnel_shl and simd_funnel_shr? I looked around and we get names like __funnelshift_lc and __funnelshift_rc for similar intrinsics (e.g. CUDA), and I know that some languages actually do support shifts on floats, with... sometimes interesting semantics.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, simd_funnel_sh{l,r} sounds nice enough (with the scalar functions as funnel_sh{l,r}). I will change the name

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have changed the names. Btw @workingjubilee as you seem to be interested, should I r? you?

@sayantn sayantn changed the title Add simd_fsh{l,r} and simd_round_ties_even Add SIMD funnel shift and round-to-even intrinsics Jun 6, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-intrinsics Area: Intrinsics A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. F-core_intrinsics Issue in the "core intrinsics" for internal usage only. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants