Refactor inline section into more general instruction frequencies (PR #15)
This change renames the inline section to instr_freq and allows such frequencies for arbitrary instructions, with each given frequency applying to all following instructions.
It also replaces the convoluted logarithmic scheme with a more transparent log2-based scheme, at an acceptable loss of resolution.
This implements the changes requested in issue #8.
proposals/compilation-hints/Overview.md (34 additions & 24 deletions)
@@ -49,33 +49,43 @@ It is expected and even desired that not all functions are annotated to keep thi
49
49
*Note: This should be moved to `metadata.function.compilation_order` without the byte offset if such a namespace will be supported by custom annotations.*
50
50
51
51
52
-
### Inlining
52
+
### Instruction frequencies
53
53
54
-
An engine might decide to inline certain call targets based on its own feedback collection or other hints (e.g. *call targets* section), but explicit hints can be added per call target and per function using the following annotations.
54
+
Instruction frequencies might be useful to guide optimizations like inlining, loop unrolling, block deferrals, etc. Within a function, these frequencies inform which blocks lie on the hot path and deserve more expensive optimizations, as well as which are on the cold path, where it may even be acceptable for executing the code within to require very expensive steps (e.g. outlining or de-optimization). An engine can make those decisions based on the observed instruction frequency, but cannot assume that any part of the code is unreachable based on the instruction frequency.
55
55
56
-
The `metadata.code.inline` section contains instruction level annotations for all affected call sites.
57
-
* *byte offset* |U32| from the beginning of the function to the wire byte index of the call instruction (this must be a `call`, `call_ref` or a `call_indirect`, otherwise the hint will be ignored)
56
+
The `metadata.code.instr_freq` section contains instruction level annotations for optimizable instruction sequences.
57
+
* *byte offset* |U32| from the beginning of the function to the wire byte index of an instruction (e.g. start of a loop or a block containing a call instruction).
58
58
* *hint length* |U32| in bytes (always 1 for now, might be higher for future extensions)
59
-
* *log call frequency* |U8| determining the estimated number of times the callee gets called per call of the caller.
60
-
61
-
The call frequency can be thought of as the estimated number of times a callee gets called during one call of the caller. It is a logarithmic value based on the formula $f = \max(1, \min(126, 10 \log_{10} \frac{n}{N} + 32))$ where $n$ is the number of callee calls from this call site and $N$ is the number of caller calls.
62
-
63
-
The actual decision which function should be inlined can be based on runtime data that the engine collected, additional heuristics and available resources. There is no guarantee that a function is or is not inlined, but it should roughly be expected that functions of higher call frequency are preferred over ones with lower frequency.
64
-
Special values of 0 and +127 indicate that a function should never or always be inlined respectively. Engines should respect such annotations over their own heuristics and toolchains should therefore avoid generating such annotations unless there is a good reason for it (e.g. "no inline" annotations in the source).
65
-
66
-
|log call frequency|calls per parent call|
67
-
|-----------------:|:-------------------:|
68
-
| 0|*never inline*|
69
-
| 1| <0.0008|
70
-
| 22| 0.1 |
71
-
| 32| 1 |
72
-
| 42| 10 |
73
-
| 52| 100 |
74
-
| 62| 1,000 |
75
-
| 126| >2,511,886,432 |
76
-
| 127|*always inline*|
77
-
78
-
If the *byte offset* is 0, the hint applies to all call sites where the function is the **target**. It serves as a shorthand notation unless explicitly overridden. In this case, the call frequency should be a rough estimate of the average call frequency of all potential sites. *Note: This should likely be moved to a dedicated section for clearer separation, e.g. `metadata.function.inline` if such a namespace will be supported by custom annotations.*
59
+
* *offset log2 frequency* |U8| determining the estimated number of times the instruction gets executed per execution of the containing function.
60
+
61
+
Instruction frequencies are always relative to the surrounding function and therefore every instruction at the beginning of a function has an implicit frequency of 1 assigned to it. Each annotation remains valid for all following instructions until the next control flow instruction (`br`, `br_table`, `br_if`, `if`, `return`, `unreachable`, `br_catch`, `throw`).
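The scoping rule above can be sketched as a small pass over a function's instructions. This is only an illustration under stated assumptions, not the proposal's reference behavior: the function name `effective_frequencies`, the `(byte_offset, opcode_name)` input shape, and the annotation dictionary are hypothetical. The sketch also assumes that the control flow instruction itself is still covered by the current annotation, and that instructions after it have no known frequency (`None`) until a new annotation appears; the text does not specify either point. The implicit frequency of 1 at function entry is represented by the encoded value 32, which is what the log2 formula in this section yields for $n = N$.

```python
# Control flow instructions as listed in the text above.
CONTROL_FLOW = {
    "br", "br_table", "br_if", "if",
    "return", "unreachable", "br_catch", "throw",
}

# Encoded value for a relative frequency of 1 (floor(log2(1)) + 32).
IMPLICIT_ENTRY_FREQUENCY = 32


def effective_frequencies(instructions, annotations):
    """Map each instruction's byte offset to its effective encoded frequency.

    instructions: list of (byte_offset, opcode_name) in function order
                  (hypothetical input shape).
    annotations:  dict of byte_offset -> encoded frequency (hypothetical).
    """
    current = IMPLICIT_ENTRY_FREQUENCY
    result = {}
    for offset, name in instructions:
        # An annotation at this offset replaces whatever was in effect.
        if offset in annotations:
            current = annotations[offset]
        result[offset] = current
        # Assumption: the annotation stops applying after a control flow
        # instruction, so later instructions are unknown until re-annotated.
        if name in CONTROL_FLOW:
            current = None
    return result
```

For example, with a single annotation on a `loop` at offset 4, the instructions before it keep the implicit entry value, the loop and the following call share the annotated value, and anything after the `return` is left unknown in this sketch.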
62
+
63
+
For now, we expect annotations to only affect `call`, `call_ref`, `call_indirect` and `loop` instructions. Tools can focus on providing annotations for these instructions to make sure they have the expected impact without unnecessary binary bloat. In the future, this can be extended to more instructions if needed. The structure and the name of this annotation have been specifically chosen to allow for that.
64
+
65
+
The instruction frequency can be thought of as the estimated number of times an instruction gets executed during one call of the function. It is a logarithmic value based on the formula $f = \max(1, \min(64, \lfloor \log_{2} \frac{n}{N} \rfloor + 32))$ where $n$ is the total number of executions of this instruction and $N$ is the number of calls to this function.
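As a worked illustration of the formula above, the following minimal sketch computes the encoded value from raw counts. The function names `encode_log2_frequency` and `decode_log2_frequency` are hypothetical and not part of the proposal. Treating $n = 0$ as the lower clamp value 1 is an assumption, since the formula itself is undefined there.

```python
import math


def encode_log2_frequency(n: int, N: int) -> int:
    """Return f = max(1, min(64, floor(log2(n / N)) + 32)).

    n: total number of executions of the instruction.
    N: number of calls to the containing function.
    """
    if N <= 0 or n < 0:
        raise ValueError("N must be positive and n must be non-negative")
    if n == 0:
        # Assumption: log2(0) is undefined, so fall back to the lower clamp.
        return 1
    raw = math.floor(math.log2(n / N)) + 32
    return max(1, min(64, raw))


def decode_log2_frequency(f: int) -> float:
    """Approximate executions per function call for an encoded value in 1..64."""
    if not 1 <= f <= 64:
        raise ValueError("only regular values 1..64 carry a frequency")
    return 2.0 ** (f - 32)
```

An instruction executed once per call encodes to 32, one executed 8 times per call encodes to 35, and one executed once every 8 calls encodes to 29. Because of the floor, decoding returns the lower bound of a power-of-two bucket, which is the loss of resolution the PR description mentions.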
66
+
67
+
The main expected use of this hint by engines is to guide inlining decisions. However, the actual decision which function should be inlined can be based on runtime data that the engine collected, additional heuristics and available resources. There is no guarantee that a function is or is not inlined, but it should roughly be expected that functions of higher call frequency are preferred over ones with lower frequency.
68
+
Special values of 0 and 127 indicate that a function should never or always be inlined respectively. Engines should respect such annotations over their own heuristics and toolchains should therefore avoid generating such annotations unless there is a good reason for it (e.g. "no inline" annotations in the source).