-
Notifications
You must be signed in to change notification settings - Fork 245
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactoring & improvements to algorithmic complexity of inline.rs. #811
base: main
Are you sure you want to change the base?
Conversation
I've tried to explain what exactly I am doing in each commit, and the rationale behind it. This is a second draft, but if some parts seem unclear or I missed something simple, be sure to tell. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! This is a whole heck of a lot, and my burnt-out brainthinker can barely cope with reviewing it, but hopefully we can get this through. (I'd definitely like @eddyb to take a look here, too)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just want to wait a bit to see if eddyb wants to review!
…c inlining. By inlining in callee -> caller order, we avoid the need to continue inlining the code we just inlined. A simple reachability test from one of the entry points helps avoid unnecessary work as well. The algorithm however remains quadratic in case where OpAccessChains repeatedly find their way into function parameters. There are two ways out: either a more complex control flow analysis, or conservatively inlining all function calls which reference FunctionParameters as arguments. I don't think either case is very worth it.
We need pointer types, and re-checking all the types to see if we already have one is rather slow, it's better to keep track.
…ned. The functions we are going to delete definitely either need to be inlined, or are never called (so we don't care what to decide about them).
Since during inlining, the only escaping value is the return value, we can calculate and update whether it has an invalid-to-call-with value as well. (Note that this is, strictly speaking, more rigor than get_invalid_values() applies, because it doesn't look behind OpPhis) As a nice bonus, we got rid of OpLoad/OpStore in favor of OpPhi, which means no type mucking and no work created for mem2reg.
Originally, this algorithm walked a linked list by the back-edges, copying and skipping. It is easier to just go with front-edges and gobble up a series of potential blocks at once. The predecessor finding algorithm really just wanted to find 1-to-1 edges (it was split between `compute_all_preds` and `fuse_trivial_branches`), so made it that.
Just inlining entry points deletes functions from tests and makes everyone sad.
This partially reverts commit 990425b.
Updated the MR and resolved conflicts, @eddyb do you wish to take a look? |
inline.rs was a sore spot for me reading the codebase because of how many TODOs and quadratic algorithms it had sitting there.
Looking at it, I was able to see some nicer algorithms trying to poke out. Also a little bit more uniformity for doing the same stuff (like finding variable split points), less cloning, and more caching :)
Tests aren't affected at all, which I'd consider a success.
P.S. Idle hands shave yaks, or so they say.