[RISCV] Rewrite vrgather.vx undef, (vmv.s.x), 0, v0 as vmv.v.x #136010
Conversation
This extends the DAG combine introduced in 336b290 to handle the case where the prior value is defined by a vmv.s.x instead of a vmv.v.x. If the vrgather splats the single source element and has no passthru, we can replace it with a vmv.v.x, which will in turn usually get folded into a vmerge if a select follows.
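In rough terms the rewrite looks like this; a minimal sketch, where the specific registers (v8, v9, a0) are hypothetical and only illustrate the pattern:

  # before: splat lane 0 of a single-element insert
  vmv.s.x     v9, a0      # write scalar a0 to element 0, other lanes undef
  vrgather.vi v8, v9, 0   # splat element 0 across all lanes (no passthru)

  # after: splat the scalar directly
  vmv.v.x     v8, a0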
@llvm/pr-subscribers-backend-risc-v
Author: Philip Reames (preames)
Full diff: https://github.com/llvm/llvm-project/pull/136010.diff
3 Files Affected:
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index f24752b8721f5..4dd237b5415dd 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -5573,7 +5573,6 @@ static SDValue lowerVECTOR_SHUFFLE(SDValue Op, SelectionDAG &DAG,
const int Lane = SVN->getSplatIndex();
if (Lane >= 0) {
MVT SVT = VT.getVectorElementType();
-
// Turn splatted vector load into a strided load with an X0 stride.
SDValue V = V1;
// Peek through CONCAT_VECTORS as VectorCombine can concat a vector
@@ -19710,20 +19709,47 @@ SDValue RISCVTargetLowering::PerformDAGCombine(SDNode *N,
return V;
break;
case RISCVISD::VRGATHER_VX_VL: {
- // Drop a redundant vrgather_vx.
+ using namespace llvm::SDPatternMatch;
// Note this assumes that out of bounds indices produce poison
// and can thus be replaced without having to prove them inbounds..
+ EVT VT = N->getValueType(0);
SDValue Src = N->getOperand(0);
+ SDValue Idx = N->getOperand(1);
SDValue Passthru = N->getOperand(2);
SDValue VL = N->getOperand(4);
+
+ // Warning: Unlike most cases we strip an insert_subvector, this one
+ // does not require the first operand to be undef.
+ if (Src.getOpcode() == ISD::INSERT_SUBVECTOR &&
+ sd_match(Src.getOperand(2), m_Zero()))
+ Src = Src.getOperand(1);
+
switch (Src.getOpcode()) {
default:
break;
case RISCVISD::VMV_V_X_VL:
case RISCVISD::VFMV_V_F_VL:
- if (Passthru.isUndef() && VL == Src.getOperand(2))
+ // Drop a redundant vrgather_vx.
+ // TODO: Remove the type restriction if we find a motivating
+ // test case?
+ if (Passthru.isUndef() && VL == Src.getOperand(2) &&
+ Src.getValueType() == VT)
return Src;
break;
+ case RISCVISD::VMV_S_X_VL:
+ case RISCVISD::VFMV_S_F_VL:
+ // If this use only demands lane zero from the source vmv.s.x, and
+ // doesn't have a passthru, then this vrgather.vi/vx is equivalent to
+ // a vmv.v.x. Note that there can be other uses of the original
+ // vmv.s.x and thus we can't eliminate it. (vfmv.s.f is analogous)
+ if (sd_match(Idx, m_Zero()) && Passthru.isUndef() &&
+ VL == Src.getOperand(2)) {
+ unsigned Opc =
+ VT.isFloatingPoint() ? RISCVISD::VFMV_V_F_VL : RISCVISD::VMV_V_X_VL;
+ return DAG.getNode(Opc, DL, VT, DAG.getUNDEF(VT), Src.getOperand(1),
+ VL);
+ }
+ break;
}
break;
}
diff --git a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-fp.ll b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-fp.ll
index 5aac2687122ae..f580b1b993395 100644
--- a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-fp.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-fp.ll
@@ -96,13 +96,11 @@ define <8 x float> @vmerge_vxm(<8 x float> %v, float %s) {
; CHECK-LABEL: vmerge_vxm:
; CHECK: # %bb.0:
; CHECK-NEXT: li a0, 25
-; CHECK-NEXT: vsetivli zero, 8, e32, m1, tu, ma
-; CHECK-NEXT: vfmv.s.f v8, fa0
+; CHECK-NEXT: vsetivli zero, 1, e32, m4, tu, ma
; CHECK-NEXT: vmv.s.x v0, a0
-; CHECK-NEXT: vmv2r.v v10, v8
-; CHECK-NEXT: vsetivli zero, 8, e32, m2, ta, mu
-; CHECK-NEXT: vrgather.vi v10, v8, 0, v0.t
-; CHECK-NEXT: vmv.v.v v8, v10
+; CHECK-NEXT: vfmv.s.f v8, fa0
+; CHECK-NEXT: vsetivli zero, 8, e32, m2, ta, ma
+; CHECK-NEXT: vfmerge.vfm v8, v8, fa0, v0
; CHECK-NEXT: ret
%ins = insertelement <8 x float> %v, float %s, i32 0
%shuf = shufflevector <8 x float> %ins, <8 x float> poison, <8 x i32> <i32 0, i32 1, i32 2, i32 0, i32 0, i32 5, i32 6, i32 7>
@@ -112,15 +110,10 @@ define <8 x float> @vmerge_vxm(<8 x float> %v, float %s) {
define <8 x float> @vmerge_vxm2(<8 x float> %v, float %s) {
; CHECK-LABEL: vmerge_vxm2:
; CHECK: # %bb.0:
-; CHECK-NEXT: vsetivli zero, 1, e32, m4, tu, ma
-; CHECK-NEXT: vmv1r.v v12, v8
-; CHECK-NEXT: vmv2r.v v10, v8
; CHECK-NEXT: li a0, 25
-; CHECK-NEXT: vfmv.s.f v12, fa0
+; CHECK-NEXT: vsetivli zero, 8, e32, m2, ta, ma
; CHECK-NEXT: vmv.s.x v0, a0
-; CHECK-NEXT: vmv1r.v v10, v12
-; CHECK-NEXT: vsetivli zero, 8, e32, m2, ta, mu
-; CHECK-NEXT: vrgather.vi v8, v10, 0, v0.t
+; CHECK-NEXT: vfmerge.vfm v8, v8, fa0, v0
; CHECK-NEXT: ret
%ins = insertelement <8 x float> %v, float %s, i32 0
%shuf = shufflevector <8 x float> %v, <8 x float> %ins, <8 x i32> <i32 8, i32 1, i32 2, i32 8, i32 8, i32 5, i32 6, i32 7>
diff --git a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-int.ll b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-int.ll
index 5c4ef29d7d5b7..8676803e20e3b 100644
--- a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-int.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-int.ll
@@ -1448,13 +1448,11 @@ define <8 x i8> @vmerge_vxm(<8 x i8> %v, i8 %s) {
; CHECK-LABEL: vmerge_vxm:
; CHECK: # %bb.0:
; CHECK-NEXT: li a1, 25
-; CHECK-NEXT: vsetivli zero, 8, e8, m1, tu, ma
-; CHECK-NEXT: vmv.s.x v8, a0
+; CHECK-NEXT: vsetivli zero, 1, e8, m1, tu, ma
; CHECK-NEXT: vmv.s.x v0, a1
-; CHECK-NEXT: vmv1r.v v9, v8
-; CHECK-NEXT: vsetivli zero, 8, e8, mf2, ta, mu
-; CHECK-NEXT: vrgather.vi v9, v8, 0, v0.t
-; CHECK-NEXT: vmv1r.v v8, v9
+; CHECK-NEXT: vmv.s.x v8, a0
+; CHECK-NEXT: vsetivli zero, 8, e8, mf2, ta, ma
+; CHECK-NEXT: vmerge.vxm v8, v8, a0, v0
; CHECK-NEXT: ret
%ins = insertelement <8 x i8> %v, i8 %s, i32 0
%shuf = shufflevector <8 x i8> %ins, <8 x i8> poison, <8 x i32> <i32 0, i32 1, i32 2, i32 0, i32 0, i32 5, i32 6, i32 7>
@@ -1465,12 +1463,9 @@ define <8 x i8> @vmerge_vxm2(<8 x i8> %v, i8 %s) {
; CHECK-LABEL: vmerge_vxm2:
; CHECK: # %bb.0:
; CHECK-NEXT: li a1, 25
-; CHECK-NEXT: vsetivli zero, 1, e8, m1, tu, ma
+; CHECK-NEXT: vsetivli zero, 8, e8, mf2, ta, ma
; CHECK-NEXT: vmv.s.x v0, a1
-; CHECK-NEXT: vmv1r.v v9, v8
-; CHECK-NEXT: vmv.s.x v9, a0
-; CHECK-NEXT: vsetivli zero, 8, e8, mf2, ta, mu
-; CHECK-NEXT: vrgather.vi v8, v9, 0, v0.t
+; CHECK-NEXT: vmerge.vxm v8, v8, a0, v0
; CHECK-NEXT: ret
%ins = insertelement <8 x i8> %v, i8 %s, i32 0
%shuf = shufflevector <8 x i8> %v, <8 x i8> %ins, <8 x i32> <i32 8, i32 1, i32 2, i32 8, i32 8, i32 5, i32 6, i32 7>
@@ -96,13 +96,11 @@ define <8 x float> @vmerge_vxm(<8 x float> %v, float %s) {
 ; CHECK-LABEL: vmerge_vxm:
 ; CHECK: # %bb.0:
 ; CHECK-NEXT: li a0, 25
-; CHECK-NEXT: vsetivli zero, 8, e32, m1, tu, ma
-; CHECK-NEXT: vfmv.s.f v8, fa0
+; CHECK-NEXT: vsetivli zero, 1, e32, m4, tu, ma
The choice of m4 here is very odd, but not really related to this change. It happens in InsertVSETVLI: the LMUL isn't actually demanded, but it starts at m1, the original SEW of the vmv.s.x is e8, and we adjust the input to preserve the SEW/LMUL ratio. I don't believe this actually matters; it just creates an odd-looking diff.
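For the record, the ratio arithmetic (a worked sketch, using the e8/m1 starting state described above):

  # ratio = SEW / LMUL, which InsertVSETVLI preserves here:
  #   original state:    SEW = 8,  LMUL = 1  ->  ratio = 8 / 1 = 8
  #   new element width: SEW = 32; keeping ratio 8 -> LMUL = 32 / 8 = 4
  # hence the "e32, m4" in the emitted vsetivli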
-; CHECK-NEXT: vsetivli zero, 8, e32, m2, ta, mu
-; CHECK-NEXT: vrgather.vi v10, v8, 0, v0.t
-; CHECK-NEXT: vmv.v.v v8, v10
+; CHECK-NEXT: vfmv.s.f v8, fa0
Note that this vfmv.s.f could be eliminated if we rewrote the mask on the vmerge. I don't plan to do this, just noting it's vaguely possible. I'm mildly of the opinion that this approach (the post-lowering DAG) has been pushed as far as we should, and that if we want to improve further, we should instead start canonicalizing shuffles before lowering. I may change my mind based on what future cases I stumble into. :)
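For concreteness, a hypothetical shape of that further optimization (not implemented; the mask value and registers are borrowed from the vmerge_vxm test above):

  li          a0, 25             # 0b11001: merge mask covers lanes 0, 3, 4
  vmv.s.x     v0, a0
  vsetivli    zero, 8, e32, m2, ta, ma
  vfmerge.vfm v8, v8, fa0, v0    # lane 0 already comes from fa0 via the mask,
                                 # so the separate vfmv.s.f would be dead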
LGTM