Skip to content

Conversation

@niwinanto
Copy link
Collaborator

Since shuffle modes are different for different AIE targets, refactored the use in legalizer. Also, extract subvector 128-bit doesn`t need a vshuffle, if we want to extract from idx 0 and known at compile time.

// the second 128-bits to lsb positions of output.
int64_t ModeLoValue;
int64_t ModeHiValue;
if (ST.isAIE2P()) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can const-initialize the modes here:

      auto GetShuffleModes = [&] () {
        if (ST.isAIE2P()) 
          return std::make_pair(/*Lo*/ 8, /*Hi*/ 9);
        else
         llvm_unreachable("vshuffle mode value needed for target.");
      };

      const auto [ModeLoValue, ModeHiValue] = GetShuffleModes();

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is smart. Done!

MIRBuilder.buildConstant(ModeReg, LaneIdx ? 9 : 8);
MIRBuilder.buildConstant(ModeReg, ModeHiValue);
// step 4: Create vshuffle. For LaneIdx 0, we dont have to use the
// shuffle, padded register itself is enough.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unpaded?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, we pad 128-bit register to 512 in step 1. We could directly use it.

; CHECK-NEXT: [[COPY:%[0-9]+]]:_(<2 x s128>) = COPY $wl0
; CHECK-NEXT: [[BITCAST:%[0-9]+]]:_(<32 x s8>) = G_BITCAST [[COPY]](<2 x s128>)
; CHECK-NEXT: [[AIE_PAD_VECTOR_UNDEF:%[0-9]+]]:_(<64 x s8>) = G_AIE_PAD_VECTOR_UNDEF [[BITCAST]](<32 x s8>)
; CHECK-NEXT: [[DEF:%[0-9]+]]:_(<64 x s8>) = G_IMPLICIT_DEF
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great!

@andcarminati
Copy link
Collaborator

It looks great, I left few comments.

@niwinanto niwinanto force-pushed the niwin.vshuffle.legalizer branch from 7f0f8b1 to 71f80b8 Compare October 24, 2025 14:49
// step 4: Create vshuffle. For LaneIdx 0, we dont have to use the
// shuffle, padded register itself is enough.
if (LaneIdx) {
ShuffleInstr =
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: ShuffleOrCopyInstr.

const unsigned LaneIdx = IdxVal->Value.getZExtValue();
MIRBuilder.buildConstant(ModeReg, LaneIdx ? 9 : 8);
MIRBuilder.buildConstant(ModeReg, ModeHiValue);
// step 4: Create vshuffle. For LaneIdx 0, we dont have to use the
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: don't

const Register ModeHiReg =
MIRBuilder.buildConstant(S32, ModeHiValue).getReg(0);
MIRBuilder.buildSelect(ModeReg, IdxReg, ModeHiReg, ModeLoReg);
// step 4: Create vshuffle
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: ... vshuffle.

(dot)

Since shuffle modes are different for different AIE targets, refactored the use in legalizer. Also, extract subvector 128-bit doesn`t need a vshuffle, if we want to extract from idx 0 and known at compile time.
@niwinanto niwinanto force-pushed the niwin.vshuffle.legalizer branch from 71f80b8 to 524493c Compare October 27, 2025 08:51
@niwinanto niwinanto requested a review from mludevid as a code owner October 27, 2025 08:51
Copy link
Collaborator

@andcarminati andcarminati left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@niwinanto niwinanto enabled auto-merge (rebase) October 27, 2025 08:56
@niwinanto niwinanto merged commit 7f58447 into aie-public Oct 27, 2025
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants