[AIEX]Refactor extract 128-bit subvector legalizer. #690
// the second 128-bits to lsb positions of output.
int64_t ModeLoValue;
int64_t ModeHiValue;
if (ST.isAIE2P()) {
I think we can const-initialize the modes here:
    auto GetShuffleModes = [&]() {
      if (ST.isAIE2P())
        return std::make_pair(/*Lo*/ 8, /*Hi*/ 9);
      else
        llvm_unreachable("vshuffle mode value needed for target.");
    };
    const auto [ModeLoValue, ModeHiValue] = GetShuffleModes();
This is smart. Done!
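For readers outside the LLVM tree, the const-initialization idea in this thread can be tried in isolation. The following is a minimal C++17 sketch, not the code from the PR: `FakeSubtarget` and `getShuffleModes` are hypothetical stand-ins, and a thrown `std::logic_error` replaces `llvm_unreachable` so that the snippet compiles with only the standard library. Only the mode values 8 and 9 for AIE2P come from the review above.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <utility>

// Hypothetical stand-in for the subtarget object; only isAIE2P() is
// mentioned in the review, everything else here is an assumption.
struct FakeSubtarget {
  bool AIE2P;
  bool isAIE2P() const { return AIE2P; }
};

// Returns the {Lo, Hi} vshuffle mode values for the given target.
// Targets without known mode values are rejected instead of silently
// falling through with uninitialized values.
std::pair<int64_t, int64_t> getShuffleModes(const FakeSubtarget &ST) {
  if (ST.isAIE2P())
    return {8, 9};
  throw std::logic_error("vshuffle mode value needed for target.");
}
```

With a helper like this, the caller can write `const auto [ModeLoValue, ModeHiValue] = getShuffleModes(ST);`, so both values are `const` and initialized in one statement, rather than declared first and assigned inside an `if`.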
MIRBuilder.buildConstant(ModeReg, LaneIdx ? 9 : 8);
MIRBuilder.buildConstant(ModeReg, ModeHiValue);
// step 4: Create vshuffle. For LaneIdx 0, we dont have to use the
// shuffle, padded register itself is enough.
Unpadded?
I mean, we pad the 128-bit register to 512 bits in step 1, so we can use it directly.
; CHECK-NEXT: [[COPY:%[0-9]+]]:_(<2 x s128>) = COPY $wl0
; CHECK-NEXT: [[BITCAST:%[0-9]+]]:_(<32 x s8>) = G_BITCAST [[COPY]](<2 x s128>)
; CHECK-NEXT: [[AIE_PAD_VECTOR_UNDEF:%[0-9]+]]:_(<64 x s8>) = G_AIE_PAD_VECTOR_UNDEF [[BITCAST]](<32 x s8>)
; CHECK-NEXT: [[DEF:%[0-9]+]]:_(<64 x s8>) = G_IMPLICIT_DEF
Great!
It looks great; I left a few comments.
7f0f8b1 to 71f80b8
// step 4: Create vshuffle. For LaneIdx 0, we dont have to use the
// shuffle, padded register itself is enough.
if (LaneIdx) {
  ShuffleInstr =
nit: ShuffleOrCopyInstr.
const unsigned LaneIdx = IdxVal->Value.getZExtValue();
MIRBuilder.buildConstant(ModeReg, LaneIdx ? 9 : 8);
MIRBuilder.buildConstant(ModeReg, ModeHiValue);
// step 4: Create vshuffle. For LaneIdx 0, we dont have to use the
nit: don't
const Register ModeHiReg =
    MIRBuilder.buildConstant(S32, ModeHiValue).getReg(0);
MIRBuilder.buildSelect(ModeReg, IdxReg, ModeHiReg, ModeLoReg);
// step 4: Create vshuffle
nit: ... vshuffle.
(dot)
Since shuffle modes differ between AIE targets, refactor how the legalizer obtains them. Also, extracting a 128-bit subvector doesn't need a vshuffle when the index is 0 and known at compile time.
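The cases discussed in this review can be summarized in a small standalone model. This is a sketch under stated assumptions, not the legalizer code: `ExtractKind`, `classifyExtract`, and `selectMode` are hypothetical names, and the model only mirrors what the quoted snippets show (no shuffle for a constant index 0, a constant Hi mode for a constant non-zero index, and a select between the Hi and Lo modes when the index is only known at run time).

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// What a legalizer following this scheme would emit for a 128-bit
// subvector extract (names are illustrative only).
enum class ExtractKind {
  UsePaddedRegister, // constant index 0: the padded register is enough
  ShuffleConstMode,  // constant non-zero index: vshuffle with the Hi mode
  ShuffleSelectMode  // runtime index: vshuffle with a selected mode
};

// ConstLaneIdx is empty when the index is not a compile-time constant.
ExtractKind classifyExtract(std::optional<unsigned> ConstLaneIdx) {
  if (!ConstLaneIdx)
    return ExtractKind::ShuffleSelectMode;
  return *ConstLaneIdx == 0 ? ExtractKind::UsePaddedRegister
                            : ExtractKind::ShuffleConstMode;
}

// Models the select on a runtime index: a non-zero index picks the Hi
// mode, a zero index picks the Lo mode.
int64_t selectMode(unsigned LaneIdx, int64_t ModeLoValue,
                   int64_t ModeHiValue) {
  return LaneIdx ? ModeHiValue : ModeLoValue;
}
```

For example, with the AIE2P values from the review, `selectMode(1, 8, 9)` yields the Hi mode 9, while `classifyExtract(0u)` reports that no shuffle is needed at all.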
71f80b8 to 524493c
LGTM.