-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Martien.exhaustiveswp #242
Conversation
llvm/test/CodeGen/AIE/aie2/schedule/postpipeliner/conv2d_bf16-1.mir
Outdated
Show resolved
Hide resolved
llvm/test/CodeGen/AIE/aie2/schedule/postpipeliner/conv2d_bf16.mir
Outdated
Show resolved
Hide resolved
N.NumPushedLatest = 0; | ||
N.LastEarliestPusher = -1; | ||
N.LastLatestPusher = -1; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These parameters represents the path that pushed Earliest of the failing node beyond its modulo cycle in the previous iteration. If we don't reset it, we can use it to provide look-ahead in a rerun.
llvm/test/CodeGen/AIE/aie2/schedule/postpipeliner/conv2d_bf16-1.mir
Outdated
Show resolved
Hide resolved
} | ||
Scoreboard[C] |= Slots; | ||
MII = std::max(MII, C + 1); | ||
Counts += getSlotCounts(MI, TII); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
; CHECK-NEXT: nop | ||
; CHECK-NEXT: nop | ||
; CHECK-NEXT: nop | ||
; CHECK-NEXT: nop |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The original vst.conv
's can be also included, because they can be used a baseline for the epilogue merging.
38a992a
to
650a0ee
Compare
llvm/test/CodeGen/AIE/aie2/schedule/postpipeliner/conv2d_bf16-1.mir
Outdated
Show resolved
Hide resolved
f70cc9f
to
9257966
Compare
if (K < NInstr) { | ||
N.Earliest = N.StaticEarliest; | ||
N.Latest = N.StaticLatest; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does that imply that nodes outside of the first iteration never get an Earliest
or Latest
? Should we make that explicit with an optional
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They get a default 0, -1 from reset, which are sane values, not sentinels. Earliest of the second iteration is used to represent LCD latencies, but this is propagated back to the first iteration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'm done with the review, let me know if you want to discuss any of the comments :)
5a21673
to
25f8c16
Compare
25f8c16
to
b05dcd6
Compare
# (c) Copyright 2024 Advanced Micro Devices, Inc. or its affiliates | ||
|
||
# RUN: llc --mtriple=aie2 -O2 --start-before=postmisched %s \ | ||
# RUN: --debug-only=postpipeliner-summary -o - | FileCheck %s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the purpose of --debug-only=postpipeliner-summary
in this test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
developer convenience. update_llc_tests shows it on stderr, and the last line reports the final II.
Colors for discriminating edge types in dot graph
Include cycles neceessary for resources of ancesters/offspring in Earliest/Latest
b05dcd6
to
2a09596
Compare
int NumPushedLatest = 0; | ||
|
||
// Latest corrected by taking Earliest of an LCD successor into account | ||
int LCDLatest = -1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason not to have a symmetric LCDEarliest
for bottom-up scheduling?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. That's an oversight.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But I guess we would not have a strategy to make use of it anyway?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct. Leaving it for now, but we should keep in mind that LcdLatest is asymmetric wrt TopDown now.
We can choose the priority of a number of scheduling orderings This version only uses the previously used one.
1e61802
to
0eb616f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! I think this adds a lot of flexibility to the pipeliner and I'm sure we'll do great things with it 💪
Set the baseline for GEMM and Conv2D_bf16
Add a provision to try multiple strategies for a single II.
Add a critical path heuristic
For best effect on QoR, the pre-pipeliner and the prescheduler should be skipped. This is forced by
#pragma clang loop pipeliner(disable)
in the source, or globally using --enable-pipeliner=0 --enable-misched=0
QOR:
I ran a like-for like comparison with the base aie-public version The CSV reporting failed, but the result json file shows significant overall improvements (up to >10%) with only a few minor sub-percent regressions
This is the tail of the diff. Trust me the regression head is tiny in comparison