Commit 61bad0f

Refactor inline section into more general instruction frequencies (PR #15)
This change renames the inline section to instr_freq and allows such frequencies for arbitrary instructions with the given frequencies applying to all following instructions. It also changes the convoluted logarithmic scheme to a more transparent log2 based scheme at an acceptable loss of resolution. This implements the changes requested in issue #8.
2 parents e4ad4c5 + e45e148 commit 61bad0f

1 file changed: +34 -24 lines changed
proposals/compilation-hints/Overview.md

Lines changed: 34 additions & 24 deletions
@@ -49,33 +49,43 @@ It is expected and even desired that not all functions are annotated to keep thi
 *Note: This should be moved to `metadata.function.compilation_order` without the byte offset if such a namespace will be supported by custom annotations.*


-### Inlining
+### Instruction frequencies

-An engine might decide to inline certain call targets based on its own feedback collection or other hints (e.g. *call targets* section), but explicit hints can be added per call target and per function using the following annotations.
+Instruction frequencies might be useful to guide optimizations like inlining, loop unrolling, block deferrals, etc. Within a function, these frequencies indicate which blocks lie on the hot path and deserve more expensive optimizations, and which lie on the cold path, where even very expensive steps to execute the code (e.g. outlining or de-optimization) may be acceptable. An engine can take those decisions based on the observed instruction frequency, but it cannot assume that any part of the code is unreachable based on the instruction frequency.

-The `metadata.code.inline` section contains instruction level annotations for all affected call sites.
-* *byte offset* |U32| from the beginning of the function to the wire byte index of the call instruction (this must be a `call`, `call_ref` or a `call_indirect`, otherwise the hint will be ignored)
+The `metadata.code.instr_freq` section contains instruction level annotations for optimizable instruction sequences.
+* *byte offset* |U32| from the beginning of the function to the wire byte index of an instruction (e.g. start of a loop or a block containing a call instruction).
 * *hint length* |U32| in bytes (always 1 for now, might be higher for future extensions)
-* *log call frequency* |U8| determining the estimated number of times the callee gets called per call of the caller.
-
-The call frequency can be thought of as the estimated number of times a callee gets called during one call of the caller. It is a logarithmic value based on the formula $f = \max(1, \min(126, 10 \log_{10} \frac{n}{N} + 32))$ where $n$ is the number of callee calls from this call site and $N$ is the number of caller calls.
-
-The actual decision which function should be inlined can be based on runtime data that the engine collected, additional heuristics and available resources. There is no guarantee that a function is or is not inlined, but it should roughly be expected that functions of higher call frequency are preferred over ones with lower frequency.
-Special values of 0 and +127 indicate that a function should never or always be inlined respectively. Engines should respect such annotations over their own heuristics and toolchains should therefore avoid generating such annotations unless there is a good reason for it (e.g. "no inline" annotations in the source).
-
-|log call frequency|calls per parent call|
-|-----------------:|:-------------------:|
-|                 0|       *never inline*|
-|                 1|              <0.0008|
-|                22|                0.1  |
-|                32|                1    |
-|                42|               10    |
-|                52|              100    |
-|                62|            1,000    |
-|               126|       >2,511,886,432 |
-|               127|      *always inline*|
-
-If the *byte offset* is 0, the hint applies to all call sites where the function is the **target**. It serves as a shorthand notation unless explicitly overridden. In this case, the call frequency should be a rough estimate of the average call frequency of all potential sites. *Note: This should likely be moved to a dedicated section for clearer separation, e.g. `metadata.function.inline` if such a namespace will be supported by custom annotations.*
+* *offset log2 frequency* |U8| determining the estimated number of times the instruction gets executed per execution of the containing function.
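
The three fields of one `instr_freq` entry can be sketched as a small serializer. This is only an illustration: the text above does not state how |U32| is encoded, so the sketch assumes unsigned LEB128 as used for `u32` elsewhere in the WebAssembly binary format. The helper names `encode_u32_leb128` and `encode_instr_freq_entry` are hypothetical.

```python
def encode_u32_leb128(value: int) -> bytes:
    """Encode an unsigned 32-bit integer as LEB128 (assumed encoding for |U32|)."""
    if not 0 <= value < 2**32:
        raise ValueError("value does not fit in a u32")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_instr_freq_entry(byte_offset: int, freq_value: int) -> bytes:
    """Serialize one entry: byte offset, hint length (always 1 for now), U8 value."""
    if not 0 <= freq_value <= 0xFF:
        raise ValueError("frequency value must fit in a U8")
    hint_length = 1  # per the field description above
    return (
        encode_u32_leb128(byte_offset)
        + encode_u32_leb128(hint_length)
        + bytes([freq_value])
    )
```

For example, under these assumptions `encode_instr_freq_entry(5, 32)` would yield the three bytes `05 01 20`.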
+
+Instruction frequencies are always relative to the surrounding function and therefore every instruction at the beginning of a function has an implicit frequency of 1 assigned to it. Each annotation remains valid for all following instructions until the next control flow instruction (`br`, `br_table`, `br_if`, `if`, `return`, `unreachable`, `br_catch`, `throw`).
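
The scoping rule above could be applied by a consumer roughly as follows. This is a minimal sketch and not part of the proposal: the function name and the input shapes are hypothetical, the implicit frequency of 1 is represented by the binary value 32, and two points that the text leaves open are resolved by assumption (the control flow instruction itself is still covered by the current annotation, and instructions after it have an unknown frequency, `None`, until the next annotation).

```python
# Control flow instructions that end the validity of an annotation (from the text above).
CONTROL_FLOW = {"br", "br_table", "br_if", "if", "return", "unreachable", "br_catch", "throw"}

IMPLICIT_START = 32  # binary value corresponding to a frequency of 1


def effective_frequencies(instructions, annotations):
    """Return {byte_offset: binary value or None} for every instruction.

    instructions: list of (byte_offset, mnemonic) in function order (hypothetical shape).
    annotations:  dict mapping byte_offset -> binary frequency value.
    """
    current = IMPLICIT_START
    result = {}
    for offset, mnemonic in instructions:
        if offset in annotations:
            current = annotations[offset]  # a new annotation takes over
        result[offset] = current
        if mnemonic in CONTROL_FLOW:
            current = None  # assumption: unknown until the next annotation
    return result
```

With `instructions = [(1, "local.get"), (3, "if"), (5, "call"), (7, "end"), (8, "loop")]` and `annotations = {5: 30, 8: 40}`, the first two instructions keep the implicit value 32, offsets 5 and 7 get 30, and the loop at offset 8 gets 40.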
+
+For now, we expect annotations to only affect `call`, `call_ref`, `call_indirect` and `loop` instructions. Tools can focus on providing annotations for these instructions to make sure they have the expected impact without unnecessary binary bloat. In the future, this can be extended to more instructions if needed. The structure and the name of this annotation have been specifically chosen to allow for that.
+
+The instruction frequency can be thought of as the estimated number of times an instruction gets executed during one call of the function. It is a logarithmic value based on the formula $f = \max(1, \min(64, \lfloor \log_{2} \frac{n}{N} \rfloor + 32))$ where $n$ is the total number of executions of this instruction and $N$ is the number of calls to this function.
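
A toolchain could compute the binary value from profile counts along the following lines. This is a sketch of the formula above, not a reference implementation: the function name is hypothetical, the floor of the logarithm is computed with integer arithmetic to avoid floating point rounding, and mapping `n == 0` to the lowest value 1 is an assumption, since the formula is undefined there.

```python
def encode_instr_freq(n: int, N: int) -> int:
    """Compute f = max(1, min(64, floor(log2(n / N)) + 32)) for integer counts."""
    if n < 0 or N <= 0:
        raise ValueError("n must be non-negative and N must be positive")
    if n == 0:
        return 1  # assumption: never-observed instructions get the lowest value
    # floor(log2(n / N)) is either k or k - 1 for k = bit_length(n) - bit_length(N).
    k = n.bit_length() - N.bit_length()
    if k >= 0:
        if (N << k) > n:
            k -= 1
    elif N > (n << -k):
        k -= 1
    return max(1, min(64, k + 32))
```

For example, an instruction executed 256 times per call gives `encode_instr_freq(256, 1) == 40`, and one executed once per three calls gives `encode_instr_freq(1, 3) == 30`, because $\lfloor \log_2 \frac{1}{3} \rfloor = -2$.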
+
+The main expected use of this hint by engines is to guide inlining decisions. However, the actual decision which function should be inlined can be based on runtime data that the engine collected, additional heuristics and available resources. There is no guarantee that a function is or is not inlined, but it should roughly be expected that functions of higher call frequency are preferred over ones with lower frequency.
+Special values of 0 and 127 indicate that a function should never or always be inlined respectively. Engines should respect such annotations over their own heuristics and toolchains should therefore avoid generating such annotations unless there is a good reason for it (e.g. "no inline" annotations in the source).
+
+|binary value|log2 frequency|executions per call|
+|-----------:|-------------:|:------------------:|
+|           0|              |*never optimize*   |
+|           1|           -31|          <9.313e-10|
+|           8|           -24|           5.960e-08|
+|          16|           -16|           1.526e-05|
+|          24|            -8|           0.00391  |
+|          28|            -4|           0.0625   |
+|          30|            -2|           0.25     |
+|          31|            -1|           0.5      |
+|          32|             0|           1        |
+|          33|            +1|           2        |
+|          34|            +2|           4        |
+|          36|            +4|          16        |
+|          40|            +8|         256        |
+|          48|           +16|       65536        |
+|          56|           +24|           1.678e+07|
+|          64|           +32|          >4.295e+09|
+|         127|              |*always optimize*  |
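
Going the other way, an engine might map a binary value back to the range of executions per call that it stands for. The sketch below assumes that each value `v` between 2 and 63 covers the half-open range from `2**(v - 32)` up to `2**(v - 31)`, which follows from the floor in the formula, and that the two clamped ends are open-ended. The function name is hypothetical, and the special values 0 and 127 are rejected because they are markers rather than frequencies.

```python
import math


def instr_freq_bounds(value: int) -> tuple[float, float]:
    """Return (lower, upper) bounds on executions per call for a binary value in 1..64."""
    if not 1 <= value <= 64:
        raise ValueError("only 1..64 encode a frequency; 0 and 127 are special markers")
    lower = 0.0 if value == 1 else 2.0 ** (value - 32)
    upper = math.inf if value == 64 else 2.0 ** (value - 31)
    return lower, upper
```

The lower bounds match the table rows, for example `instr_freq_bounds(40)[0] == 256.0`, while the row for binary value 1 corresponds to the upper bound `2**-30` (about 9.313e-10).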


 ### Call targets
