Inlining stages #12

Open
osa1 opened this issue Feb 12, 2025 · 3 comments

Comments

@osa1

osa1 commented Feb 12, 2025

Inlining large functions can sometimes be important for runtime performance. For example, inlining a higher-order function can eliminate the closure allocation at its call site and the indirect calls to its function arguments inside its body.
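As a rough illustration of the effect described above (written in Python only for readability; Python itself does not perform this optimization, and the names `sum_mapped`, `caller`, and `caller_inlined` are hypothetical), compare a call through a higher-order function with what an inliner could produce:

```python
def sum_mapped(values, f):
    """Higher-order function: applies `f` to every element and sums the results."""
    total = 0
    for v in values:
        total += f(v)  # indirect call through the function argument
    return total


def caller(values, offset):
    # A closure capturing `offset` has to be allocated for this call.
    return sum_mapped(values, lambda v: v * 2 + offset)


def caller_inlined(values, offset):
    # What the caller could look like after `sum_mapped` and the closure body
    # are inlined: no closure allocation and no indirect call remain.
    total = 0
    for v in values:
        total += v * 2 + offset
    return total
```

Both versions compute the same result; the second simply no longer needs the closure or the call through `f`.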

However, inlining large functions often makes binaries larger: unlike small functions (which are often optimized into a small number of instructions when inlined), large functions are less likely to be optimized down to a small number of instructions.

In these cases, having control over inlining stages would be useful, to be able to say "never inline this function before runtime, always inline it at runtime". This makes it possible to keep binaries smaller while still inlining large functions and eliminating indirect calls, allocations, etc.

@ecmziegler
Collaborator

I see two ways of addressing this:

  1. Leave this to the AOT optimizer to defer inlining to later if the function body is too large.
  2. Have a separate section for AOT hints and for runtime hints. Only the latter section would be part of this proposal, the former would likely be a toolchain convention but could follow the same structure.

Do you see a reason why this decision should not be left to the AOT optimizer, potentially even dependent on command-line flags (optimize for size vs optimize for speed) or compilation target (optimize for web vs optimize for backend resp.)?

@osa1
Author

osa1 commented Feb 13, 2025

> Do you see a reason why this decision should not be left to the AOT optimizer, potentially even dependent on command-line flags (optimize for size vs optimize for speed) or compilation target (optimize for web vs optimize for backend resp.)?

Which decision do you mean exactly?

I think ideally we should give control to the user (who writes the performance-sensitive code and generates hints via the source language's pragmas, annotations, etc.) rather than rely on heuristics.

In this particular case, we wouldn't want to tweak heuristics to inline large functions, as that would be wasteful in the majority of cases. We just want to inline some specific large functions at runtime.

To be more concrete, this is the compilation pipeline that we have in dart2wasm:

  1. Dart to Wasm compiler. This does some basic optimizations and inlining.
  2. wasm-opt. Most of the inlining and optimization is done here.
  3. The engine, e.g. V8.

What we want is to inline a large function in (3), and to inline only the specified large function (so we don't want to tweak heuristics). If we inline it in (2), the binary gets larger.

Currently, the only solution I can think of that is not specific to a particular optimizer in step (2) is generating the hints after step (2). The index of the function to be inlined can be looked up in the name section after (2) to generate the extra hints. It should work, but it feels hacky.
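A minimal sketch of that lookup, assuming the optimized binary still carries the standard custom "name" section with a function-names subsection. It only recovers name-to-index pairs; emitting the hint section itself is left out, since its encoding is not discussed in this thread. The helper names (`read_u32_leb`, `read_name`, `function_names`) are hypothetical.

```python
def read_u32_leb(buf: bytes, pos: int) -> tuple[int, int]:
    """Decode an unsigned LEB128 integer; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, pos
        shift += 7


def read_name(buf: bytes, pos: int) -> tuple[str, int]:
    """Decode a length-prefixed UTF-8 name; return (name, next position)."""
    length, pos = read_u32_leb(buf, pos)
    return buf[pos:pos + length].decode("utf-8"), pos + length


def function_names(module: bytes) -> dict[str, int]:
    """Map function names to function indices using the custom "name" section."""
    if module[:4] != b"\x00asm":
        raise ValueError("not a Wasm binary")
    names: dict[str, int] = {}
    pos = 8  # skip the 4-byte magic and 4-byte version
    while pos < len(module):
        section_id = module[pos]
        size, pos = read_u32_leb(module, pos + 1)
        end = pos + size
        if section_id == 0:  # custom section
            section_name, body = read_name(module, pos)
            if section_name == "name":
                while body < end:
                    sub_id = module[body]
                    sub_size, body = read_u32_leb(module, body + 1)
                    sub_end = body + sub_size
                    if sub_id == 1:  # function names subsection
                        count, p = read_u32_leb(module, body)
                        for _ in range(count):
                            index, p = read_u32_leb(module, p)
                            name, p = read_name(module, p)
                            names[name] = index
                    body = sub_end
        pos = end
    return names
```

With the output of step (2) read into `module`, `function_names(module)` returns a dictionary that can be queried for the function to hint. This only works if the name section is kept through step (2) and the function was not itself removed or renamed there.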

@ecmziegler
Collaborator

Sorry if I wasn't clear. I mean the decision of what to inline AOT vs. in the runtime. The compiler could output inlining hints in step (1), the optimizer could decide which ones it wants to inline itself based on its own heuristics in (2), and then emit a new inlining section to be processed by the runtime in step (3).
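The flow above could be sketched roughly as follows. This is not how wasm-opt works today; `InlineHint`, `body_size`, and the `max_aot_size` threshold are hypothetical stand-ins for whatever heuristics the optimizer would actually apply in step (2).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InlineHint:
    function_index: int  # function the compiler marked for inlining in step (1)
    body_size: int       # hypothetical size measure of the function body


def split_hints(hints, max_aot_size):
    """Decide per hint whether the optimizer inlines it or defers it to the runtime.

    Returns (inline_now, deferred): hints the optimizer acts on in step (2),
    and hints it re-emits in a new section for the runtime in step (3).
    """
    inline_now = [h for h in hints if h.body_size <= max_aot_size]
    deferred = [h for h in hints if h.body_size > max_aot_size]
    return inline_now, deferred
```

For example, with a size-oriented threshold, a small hinted function would be inlined ahead of time, while a large one would be passed through to the runtime section.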

I'm not sure what information the compiler would have that wasm-opt doesn't and that would allow it to make better decisions. wasm-opt has access to the inlining information, which tells it that a function is likely hot, and it has access to the function size.

While we could have different hints for wasm-opt and the runtime (the former expressed as a toolchain convention, the latter as a spec'd standard), it would ultimately always be a hint and not a hard rule. I know that toolchains would often like more control, but that's historically not what the Wasm community has favored.
