Generalize 'inline' hints to 'instruction_frequency' or 'block_frequency'? #8

eqrion · 2024-06-18T15:18:42Z

I've been trying to understand how to interpret the 'inline' hint section.

If I understand correctly, it only specifies how many times a call instruction is executed during an average execution of the function that contains it. This can be used to hint that a call is 'cold' and so even if the call target is completely monomorphic and inlineable, it may not be worth inlining.

This seems to be generally useful, and not specific to call instructions. Could we generalize so that these hints can point at any instruction? If we did that we could rename this to something like 'instruction_frequency' (or something else).

There's some strong overlap here with branch hinting, because the frequency of an instruction getting executed is really tied to how often its block is executed, which is tied to how often the branches that target it execute. So an alternate design could be to specify how frequently blocks/branches are executed, instead of the call instruction within them.

ecmziegler · 2024-06-20T12:18:35Z

We could do that, sure. Such information is very expensive to collect through instrumentation, but could be easily obtained through sampling-based profiling with incomplete but likely good enough information to be interesting.

I haven't thought much about what we could to with that information, but there is certainly potential. We could e.g. decide not to do certain expensive optimizations within rarely executed blocks (e.g. loop unrolling). Knowing that a loop gets executed 10000 times vs only once could make a difference.

eqrion · 2024-06-20T20:54:50Z

We could do that, sure. Such information is very expensive to collect through instrumentation, but could be easily obtained through sampling-based profiling with incomplete but likely good enough information to be interesting.

I haven't thought much about what we could to with that information, but there is certainly potential. We could e.g. decide not to do certain expensive optimizations within rarely executed blocks (e.g. loop unrolling). Knowing that a loop gets executed 10000 times vs only once could make a difference.

I agree that collecting this information for all blocks would be expensive, and I don't have a big use-case for it outside of call instructions yet.

It seems like there wouldn't be much of a cost to generalizing this hint, as long as we can let the profilers (that record the hints) decide how much of this information is worth collecting and shipping in a binary. I'd guess they would by default only collect for call sites.

titzer · 2024-06-20T22:46:31Z

I like the idea of generalizing to instruction frequency. Personally I wouldn't over-index on how expensive it is to collect instruction frequency, because there are valid offline profiling scenarios where instrumentation cost isn't an issue. E.g. we're looking at using Wizard to analyze large repeatable program traces now and it's easy to run them over and over again with different analyses.

ecmziegler · 2024-06-26T09:22:58Z

We can certainly allow for additional hints here. I just want to limit the scope of the first version to quantities that we can reliable generate and that will impact optimizations in at least one engine.

I will try to come up with a list of potential annotatable instruction types and what we could use them for so people understand if it's worth collecting (and adding, because binary size is also a factor here) such information.

Generally, I want all of these optimizations to be extensible and optional, so adding future frequency annotations will be backwards compatible. Perhaps this needs to made clearer in the overview.

ecmziegler · 2025-05-13T10:14:31Z

I've been toying with the idea a little and I think we have three options:

We can allow call frequencies to be added anywhere and leave it up to the engine which ones to use. -- I think this is not very desirable because without any guidance where to put hints, it's hard to make the right call for tool chains where to generate them.
We can allow frequencies at specific sites that we consider interesting for optimizations. At the moment, these would be calls and loops. Other block-level information for infrequently used blocks are likely sufficiently well covered by the binary information provided in branch hints. -- This is currently my preferred option and is open to future extensions.
We could extend the branch hints proposal to allow more quantitative information and extend its scope to other branching instructions like br_table and potentially also exception handling (though one can argue that this should always be so unlikely that it's never worth optimizing for). In theory, we could extract call frequencies for any instruction from that, but it requires modeling the program flow in the engine just to get useful information out of it. -- The complexity doesn't seem to be worth the effort, especially since any missing annotation spoils the whole model.

Option 2 would cover inlining, loop unrolling/peeling and deferred blocks. I'm wondering if there's an optimization use case for block-level annotations that I've overlooked that would require additional annotations.

The instruction_frequency name should allow extending the list of annotations in a semantically meaningful way in the future if needed.

…15) This change renames the inline section to instr_freq and allows such frequencies for arbitrary instructions with the given frequencies applying to all following instructions. It also changes the convoluted logarithmic scheme to a more transparent log2 based scheme at an acceptable loss of resolution. This implements the changes requested in issue #8.

ecmziegler mentioned this issue Jun 10, 2025

Refactor inline section into more general instruction frequencies #15

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Generalize 'inline' hints to 'instruction_frequency' or 'block_frequency'? #8

Generalize 'inline' hints to 'instruction_frequency' or 'block_frequency'? #8

eqrion commented Jun 18, 2024

ecmziegler commented Jun 20, 2024

Uh oh!

eqrion commented Jun 20, 2024

Uh oh!

titzer commented Jun 20, 2024

Uh oh!

ecmziegler commented Jun 26, 2024

Uh oh!

ecmziegler commented May 13, 2025

Uh oh!

Generalize 'inline' hints to 'instruction_frequency' or 'block_frequency'? #8

Generalize 'inline' hints to 'instruction_frequency' or 'block_frequency'? #8

Comments

eqrion commented Jun 18, 2024

ecmziegler commented Jun 20, 2024

Uh oh!

eqrion commented Jun 20, 2024

Uh oh!

titzer commented Jun 20, 2024

Uh oh!

ecmziegler commented Jun 26, 2024

Uh oh!

ecmziegler commented May 13, 2025

Uh oh!