-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Convert Loop Metadata to Asserts to help Loop rotation #227
base: aie-public
Are you sure you want to change the base?
Conversation
060a2a8
to
d244f5c
Compare
1e4b876
to
e8f1b47
Compare
Maybe you can fix the CI failure by running your test under x86? |
e8f1b47
to
f9835c9
Compare
I moved the pass between LICM and loop rotate, so that the loop invariance check is simplified and performed by LICM. This behavior is demonstrated by unittests. Additionally, the assumption is inserted into the Preheader, so that the loop header is not evaluated to true and converted into an endless loop. |
f2d9afe
to
abe2846
Compare
abe2846
to
0c6fa61
Compare
I also only lazily insert the Iteration 0 assumption so that the IR changes are minimal. QoR improves by 1.14% and program memory by 0.73%, when this pass is enabled vs with the builtin_assumes |
6bf41ef
to
d8d9dba
Compare
if (ContainsEqualPredicate) { | ||
LLVM_DEBUG(dbgs() << "LoopIterCountAssumptions-Warning: IterCount " | ||
"Assumption Insertion for EQ and NE no supported!"); | ||
return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is that dead code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, adding the second assumption may help us not run into the unroll trap.
We always insert NE
assumptions, even with EQ
predicates, since it is an abort criterium
to exit the loop (in which case we invert the predicate for the assumption generation).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I missed that the previous if
isn't an early return. Does that mean that for predicates other than NE/EQ, we might insert two types of assumptions? How does hasVariableStepSize
work actually? Won't insertIterationAssumption
bail out because it can find neither an AddRec nor a loop-invariant value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I think I start to understand, the operand could be a SCEVAddRecExpr
of degree >1
, in which case the step is itself a SCEVAddRecExpr
. But still, why does that need to be handled specifically? Why not just evaluate the SCEV at iteration MinIterCount - 1
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes we insert two assumptions. with variable StepSizes, AddRec
is actually found and the Stepsize
is now a Value
instead of a Constant
.
Consider the following:
for (int i =0; i < n; i+=m) {
}
If we only add MinIterCount - 1
assumption, we only guarantee, that m* (MinIterCount-1) < n
, but this does not guarantee, that we will actually enter the loop. Consider m to be negative. We then guarantee, that n is negative. But we don't actually guarantee, that we enter the loop the first time.
Therefore we also have to guarantee that by also adding 0 < n
.
With NE this is even more clear
for (int i=0 ; i != n ; i++) {
}
MinIterCount-1 != n
, does not actually guarantee, that we will enter the loop, since we don't know if n != 0
is true
d8d9dba
to
9382a17
Compare
@@ -644,6 +653,9 @@ PassBuilder::buildFunctionSimplificationPipeline(OptimizationLevel Level, | |||
LPM1.addPass(LICMPass(PTO.LicmMssaOptCap, PTO.LicmMssaNoAccForPromotionCap, | |||
/*AllowSpeculation=*/false)); | |||
|
|||
if (EnableLoopIterCountToAssumptions) | |||
LPM1.addPass(LoopIterCountAssumptions()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we potentially run the pass twice?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We run the pass and then immediately rotate the loop.
So on the second entry, we already have a rotated loop and bail
SCEV::NoWrapFlags NWF = AddRec->getNoWrapFlags( | ||
SCEV::NoWrapFlags(/*Mask=*/SCEV::FlagNUW | SCEV::FlagNSW)); | ||
|
||
// IterSCEV can either be an AddExpr or simplify to a MulExpr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or simplify to a MulExpr
, is this really true? Do you have a test that shows this example?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, for example this unit test LoopVariantUpperBound
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it always occurs, when we have a variable stepsize and start of 0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't sure if adding a reference to a unittest would be appropriate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh right, that makes sense. Somehow I read it as "recursive" MulExpr
, but this here simply represents a multiplication of two SCEVs.
note the first commit will be removed once the ( #225 ) is merged
Please note: I am still going through QoR to make sure I am not introducing any functional errors.