[AIEX] Shift G_CONCAT_VECTORS closer to the user #234
QoR results:
The results look quite modest at first glance, but the real goal of this PR is to remove one PADD from the innermost loop of GEMM.
GEMM aie-public:
GEMM this PR:
void llvm::applyUpdToConcat(MachineInstr &MI, MachineRegisterInfo &MRI,
                            MachineIRBuilder &B,
                            std::map<unsigned, Register> &IndexRegMap) {
  B.setInstrAndDebugLoc(MI);
  B.setDebugLoc(MI.getDebugLoc());
Why aren't the debug location and the insertion point for the MachineInstr the same?
Hi @F-Stuckmann, in this case the goal is to retain the same debug information (that of the instruction we are replacing) while building the new instruction at a different (shifted) position.
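To illustrate the point about keeping the debug location separate from the insertion point, here is a minimal, LLVM-free sketch. `ToyInstr`, `ToyBuilder` and `replaceCloserToUser` are hypothetical stand-ins invented for this illustration, not the real `MachineInstr` / `MachineIRBuilder` API. The only idea carried over from the PR is that the new instruction takes its debug information from the instruction being replaced and its position from the user.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in, NOT the LLVM class: an instruction is a name plus the
// source line recorded for debug info.
struct ToyInstr {
  std::string Name;
  int DebugLine;
};

// Hypothetical builder that tracks the insertion point and the debug
// location as two independent pieces of state.
struct ToyBuilder {
  std::vector<ToyInstr> &Block;
  std::size_t InsertIdx = 0; // where the next instruction is placed
  int DebugLine = 0;         // line stamped on the next instruction

  explicit ToyBuilder(std::vector<ToyInstr> &B) : Block(B) {}

  void setInsertIdx(std::size_t Idx) {
    assert(Idx <= Block.size() && "insertion point out of range");
    InsertIdx = Idx;
  }
  void setDebugLine(int Line) { DebugLine = Line; }

  // Insert a new instruction at the current insertion point.
  void build(const std::string &Name) {
    Block.insert(Block.begin() + static_cast<std::ptrdiff_t>(InsertIdx),
                 ToyInstr{Name, DebugLine});
    ++InsertIdx;
  }
};

// Replace Block[OldIdx] with a new instruction placed right before
// Block[UserIdx], keeping the debug line of the replaced instruction.
// Returns the index of the new instruction. Assumes OldIdx < UserIdx.
std::size_t replaceCloserToUser(std::vector<ToyInstr> &Block,
                                std::size_t OldIdx, std::size_t UserIdx,
                                const std::string &NewName) {
  assert(OldIdx < UserIdx && UserIdx < Block.size());
  ToyBuilder B(Block);
  B.setDebugLine(Block[OldIdx].DebugLine); // debug info: old instruction
  B.setInsertIdx(UserIdx);                 // position: just before the user
  B.build(NewName);
  Block.erase(Block.begin() + static_cast<std::ptrdiff_t>(OldIdx));
  return UserIdx - 1; // erasing the old instruction shifted the new one left
}
```

In this toy model, setting only the insertion point would stamp the new instruction with whatever location the builder last held, which is why the two pieces of state are set separately.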
MachineInstr &findClosestToUseInsertPoint(MachineInstr &MI,
                                          MachineRegisterInfo &MRI) {

  for (auto &User : MRI.use_instructions(MI.getOperand(0).getReg())) {
Nit: Maybe add a top-level comment like "Find a use of \p MI in the same block where it can be moved".
  for (auto &User : MRI.use_instructions(MI.getOperand(0).getReg())) {
    if (User.isPHI())
      continue;
    if (User.getParent() == MI.getParent() && canDelayMemOp(MI, User, MRI))
Super-nit: I think we should really rename canDelayMemOp to just canDelayOp.
I agree, because it sounds strange to use it for non-memory instructions. On the other hand, this function checks for specific side effects of crossing memory operations, unlike canAdvanceOp, where the instruction of interest is assumed not to be a load/store. What do you think?
I think we can leave it like this; it might just be a bit too conservative when encountering a store. We will need to revamp it at some point anyway if we want the combiners to move past certain intrinsics with side effects.
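As a rough illustration of the use-scanning loop discussed in this thread, the sketch below models the search for a same-block insertion point without any LLVM types. `ToyUse` and `findInsertPos` are hypothetical names, and the policy is an assumption, not the PR's exact logic: the sketch picks the earliest non-PHI user in the defining block and falls back to the original position if the move is not allowed. The `CanDelay` callback stands in for a legality check such as canDelayMemOp.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Toy stand-in for a using instruction (hypothetical, not an LLVM type).
struct ToyUse {
  int Block;       // id of the block containing the user
  std::size_t Pos; // position of the user inside that block
  bool IsPHI;      // PHIs are never valid targets for the move
};

// Returns the position the definition at (DefBlock, DefPos) may be moved in
// front of, or DefPos itself when no legal target exists.
// Assumption: the earliest same-block user is the only candidate, so the
// moved definition still precedes every later user in that block.
std::size_t
findInsertPos(int DefBlock, std::size_t DefPos,
              const std::vector<ToyUse> &Uses,
              const std::function<bool(std::size_t, std::size_t)> &CanDelay) {
  bool Found = false;
  std::size_t Earliest = DefPos;
  for (const ToyUse &U : Uses) {
    if (U.IsPHI)
      continue; // skip PHI users, as in the loop under review
    if (U.Block != DefBlock || U.Pos <= DefPos)
      continue; // only later users in the same block qualify
    if (!Found || U.Pos < Earliest) {
      Earliest = U.Pos;
      Found = true;
    }
  }
  // Move only when a candidate exists and crossing [DefPos, Earliest) is legal.
  if (Found && CanDelay(DefPos, Earliest))
    return Earliest;
  return DefPos;
}
```

A conservative `CanDelay`, for example one that refuses to cross any store, only reduces how often the move happens; it does not affect correctness, which matches the remark above that the current check may be "a bit too conservative".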
When committing applyUpdToConcat. This enables more postinc combiner cases.
1c65cc2 to cb3924b
LGTM