Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert Loop Metadata to Asserts to help Loop rotation #227

Open
wants to merge 2 commits into
base: aie-public
Choose a base branch
from

Conversation

F-Stuckmann
Copy link
Collaborator

@F-Stuckmann F-Stuckmann commented Oct 28, 2024

note the first commit will be removed once the ( #225 ) is merged

Please note: I am still going through QoR to make sure I am not introducing any functional errors.

@F-Stuckmann F-Stuckmann force-pushed the stuckmann.loop.metadata.asserts branch from 060a2a8 to d244f5c Compare December 4, 2024 13:50
@F-Stuckmann F-Stuckmann force-pushed the stuckmann.loop.metadata.asserts branch 2 times, most recently from 1e4b876 to e8f1b47 Compare December 5, 2024 12:51
@andcarminati
Copy link
Collaborator

Maybe you can fix the CI failure by running your test under x86?

@F-Stuckmann F-Stuckmann force-pushed the stuckmann.loop.metadata.asserts branch from e8f1b47 to f9835c9 Compare December 10, 2024 09:49
@F-Stuckmann
Copy link
Collaborator Author

I moved the pass between LICM and loop rotate, so that the loop invariance check is simplified and performed by LICM. This behavior is demonstrated by unittests.

Additionally, the assumption is inserted into the Preheader, so that the loop header is not evaluated to true and converted into an endless loop.

@F-Stuckmann F-Stuckmann force-pushed the stuckmann.loop.metadata.asserts branch 2 times, most recently from f2d9afe to abe2846 Compare December 10, 2024 13:45
@F-Stuckmann F-Stuckmann force-pushed the stuckmann.loop.metadata.asserts branch from abe2846 to 0c6fa61 Compare December 11, 2024 14:44
@F-Stuckmann
Copy link
Collaborator Author

F-Stuckmann commented Dec 11, 2024

  • Assumption Insertion: Evaluate at MinIterCount to not introduce loop guards after loop unrolling
  • Assumption Insertion: Evaluate at Iteration 0, so that EQ/NE and variable stepsizes can be processed. Hint: EQ/NE only get this assumption inserted

I also only lazily insert the Iteration 0 assumption so that the IR changes are minimal.

QoR improves by 1.14% and program memory by 0.73%, when this pass is enabled vs with the builtin_assumes

@F-Stuckmann F-Stuckmann force-pushed the stuckmann.loop.metadata.asserts branch 2 times, most recently from 6bf41ef to d8d9dba Compare December 11, 2024 17:04
if (ContainsEqualPredicate) {
LLVM_DEBUG(dbgs() << "LoopIterCountAssumptions-Warning: IterCount "
"Assumption Insertion for EQ and NE no supported!");
return;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is that dead code?

Copy link
Collaborator Author

@F-Stuckmann F-Stuckmann Dec 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, adding the second assumption may help us not run into the unroll trap.

We always insert NE assumptions, even with EQ predicates, since it is an abort criterium to exit the loop (in which case we invert the predicate for the assumption generation).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I missed that the previous if isn't an early return. Does that mean that for predicates other than NE/EQ, we might insert two types of assumptions? How does hasVariableStepSize work actually? Won't insertIterationAssumption bail out because it can find neither an AddRec nor a loop-invariant value?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I think I start to understand, the operand could be a SCEVAddRecExpr of degree >1, in which case the step is itself a SCEVAddRecExpr. But still, why does that need to be handled specifically? Why not just evaluate the SCEV at iteration MinIterCount - 1?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes we insert two assumptions. with variable StepSizes, AddRec is actually found and the Stepsize is now a Value instead of a Constant.

Consider the following:

for (int i =0; i < n; i+=m) {
}

If we only add MinIterCount - 1 assumption, we only guarantee, that m* (MinIterCount-1) < n , but this does not guarantee, that we will actually enter the loop. Consider m to be negative. We then guarantee, that n is negative. But we don't actually guarantee, that we enter the loop the first time.
Therefore we also have to guarantee that by also adding 0 < n.

With NE this is even more clear

for (int i=0 ; i != n ; i++) {
}

MinIterCount-1 != n, does not actually guarantee, that we will enter the loop, since we don't know if n != 0 is true

@F-Stuckmann F-Stuckmann force-pushed the stuckmann.loop.metadata.asserts branch from d8d9dba to 9382a17 Compare December 12, 2024 15:33
@@ -644,6 +653,9 @@ PassBuilder::buildFunctionSimplificationPipeline(OptimizationLevel Level,
LPM1.addPass(LICMPass(PTO.LicmMssaOptCap, PTO.LicmMssaNoAccForPromotionCap,
/*AllowSpeculation=*/false));

if (EnableLoopIterCountToAssumptions)
LPM1.addPass(LoopIterCountAssumptions());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we potentially run the pass twice?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We run the pass and then immediately rotate the loop.

So on the second entry, we already have a rotated loop and bail

SCEV::NoWrapFlags NWF = AddRec->getNoWrapFlags(
SCEV::NoWrapFlags(/*Mask=*/SCEV::FlagNUW | SCEV::FlagNSW));

// IterSCEV can either be an AddExpr or simplify to a MulExpr
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or simplify to a MulExpr, is this really true? Do you have a test that shows this example?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, for example this unit test LoopVariantUpperBound

Copy link
Collaborator Author

@F-Stuckmann F-Stuckmann Dec 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it always occurs, when we have a variable stepsize and start of 0

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't sure if adding a reference to a unittest would be appropriate

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh right, that makes sense. Somehow I read it as "recursive" MulExpr, but this here simply represents a multiplication of two SCEVs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants