Document ILC generic analysis #123566
Pull request overview
This pull request adds comprehensive documentation about how ILC (IL Compiler) handles generic types and methods in NativeAOT compilation. The documentation is based on learnings from PR #122012, which dealt with tracking concrete dependencies of open generic methods.
Changes:
- Adds a new "Compiling generics" section to the ILC architecture documentation covering shared generics, canonicalization, runtime-determined types, generic dictionaries, generic virtual methods (GVMs), and shadow method nodes.
agocke left a comment
Really useful additions, thank you!
This sharing only applies to reference type instantiations. Value type instantiations (e.g. `List<int>`) require separate native code bodies because differences such as size affect code generation. When type arguments are themselves generic types with mixed value/reference components, the canonical form reflects this. For example, `List<KeyValuePair<int, string>>` canonicalizes to `List<KeyValuePair<int, __Canon>>`.
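A minimal sketch of the canonicalization rule described above, written against `System.Type` reflection rather than ILC's own type system. `CanonSketch` and its methods are hypothetical names introduced for illustration; the sketch assumes the rule is simply "reference-type arguments become `__Canon`, value-type arguments are kept and canonicalized recursively".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy model only: it mirrors the rule in the text using reflection.
// It is not how ILC implements canonicalization.
public static class CanonSketch
{
    // Returns a display string for the canonical form of a constructed type.
    public static string Canonicalize(Type type)
    {
        if (!type.IsGenericType)
            return type.Name;

        // Strip the arity suffix, e.g. "List`1" -> "List".
        int tick = type.Name.IndexOf('`');
        string name = tick < 0 ? type.Name : type.Name.Substring(0, tick);

        IEnumerable<string> args = type.GetGenericArguments().Select(CanonicalizeArgument);
        return name + "<" + string.Join(", ", args) + ">";
    }

    // Reference-type arguments collapse to __Canon. Value-type arguments are
    // kept, but their own type arguments are canonicalized recursively.
    private static string CanonicalizeArgument(Type arg) =>
        arg.IsValueType ? Canonicalize(arg) : "__Canon";
}
```

Under these assumptions, `CanonSketch.Canonicalize(typeof(List<KeyValuePair<int, string>>))` produces `List<KeyValuePair<Int32, __Canon>>` (reflection prints `int` as `Int32`), while `typeof(List<int>)` is left as `List<Int32>`.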
When the dependency graph includes a method like `List<string>.Add`, ILC adds a dependency on the canonical method body `List<__Canon>.Add` and generates a *generic dictionary* for the `List<string>` instantiation. When ILC invokes RyuJIT to compile a method, it passes the canonical form (e.g. `List<__Canon>.Add`).
Feels like we could use a definition for generic dictionary (in another file?).
@copilot Add a link to shared-generics.md to resolve this
Ah, this is not a copilot PR so this request won't work.
### Runtime-determined types
When analyzing shared generic code, ILC uses *runtime-determined types* (`RuntimeDeterminedType`) to represent types with type arguments that will only be resolved at runtime. A runtime-determined type pairs the canonical type with the generic parameter it originated from.
Example?
Also, when you say it "pairs" with the canonical type, what does that mean? As in, there's a pointer in the RuntimeDeterminedType structure to the canonical type? (which is represented by another data structure, I assume?)
Neither the canonical form nor the uninstantiated form alone is sufficient for dependency analysis. The canonical form loses parameter identity (both `T` and `U` become `__Canon`), while the uninstantiated form with signature variables (`!!0`, `!!1`) lacks concrete type information needed for operations like `sizeof(T)`. Runtime-determined types preserve both: the parameter binding and the type's shape (size, GC layout). These types are used in dependency analysis and for communicating dictionary lookup requirements to the codegen backend.
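One way to picture the pairing is a small object that holds both pieces of information. The sketch below is a toy model, assuming "pairs" means the type keeps a reference to its canonical type and to the generic parameter; `RuntimeDeterminedSketch` and its members are hypothetical names and do not reflect the real `RuntimeDeterminedType` API.

```csharp
using System;
using System.Collections.Generic;

// Toy model: every name here is hypothetical, not ILC's type system API.
public sealed class RuntimeDeterminedSketch
{
    // Shape: what the shared code is compiled against.
    public string CanonicalType { get; }

    // Identity: which generic parameter this type came from.
    public string Parameter { get; }

    public RuntimeDeterminedSketch(string canonicalType, string parameter)
    {
        CanonicalType = canonicalType;
        Parameter = parameter;
    }

    // Resolves to a concrete type once an instantiation is known.
    public string Resolve(IReadOnlyDictionary<string, string> instantiation) =>
        instantiation[Parameter];

    public override string ToString() => Parameter + "_" + CanonicalType;
}
```

For a method with parameters `T` and `U`, both sketch objects share the canonical type `__Canon` but stay distinguishable by parameter, so resolving against `{ T = string, U = object }` gives `string` for one and `object` for the other.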
### Generic dictionaries
Ah, I see, it's down here. If these were separate files you could presumably use hyperlinks? Not a bad way to structure this, imho
We have a file that describes generic sharing and generic dictionaries (shared-generics.md)
Canonical code cannot embed instantiation-specific information directly. Instead, it obtains type handles, method handles, field offsets, and other instantiation-specific data at runtime from a *generic dictionary*, an array of slots associated with each concrete instantiation.
The dictionary is provided to the method via the MethodTable pointer from the object reference (for instance methods on generic reference types), a hidden MethodTable parameter (for static methods or methods on value types), or a hidden method dictionary parameter (for generic methods). See `GenericContextSource`.
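To make the "array of slots" idea concrete, here is a toy model in which the hidden dictionary parameter is written out explicitly. `DictionarySketch`, `ElementTypeSlot`, and the single-slot layout are assumptions made for this sketch; real dictionary layouts and the way the dictionary reaches the method are decided by the compiler and runtime.

```csharp
using System;

// Toy model: one shared body, one dictionary per concrete instantiation.
public static class DictionarySketch
{
    // Hypothetical slot layout chosen for this sketch.
    public const int ElementTypeSlot = 0;

    // Builds the per-instantiation dictionary (here a single type handle).
    public static object[] BuildDictionary(Type elementType) =>
        new object[] { elementType };

    // Stands in for the single canonical body. It cannot name T directly,
    // so it reads the instantiation-specific type from a dictionary slot.
    public static string DescribeElement(object[] dictionary) =>
        "element type: " + ((Type)dictionary[ElementTypeSlot]).Name;
}
```

Calling `DescribeElement` with the dictionary built for `string` and again with the one built for `object` runs the same body both times; only the slot contents differ.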
Link to description of MethodTable?
| } | ||
| ``` | ||
At a call site like `baseRef.Method<string>()`, the compiler needs the runtime type of `baseRef` to determine which implementation to call.
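The call-site problem can be reproduced in plain C#. The code block this sentence refers to is cut off in this view, so the `Base` and `Derived` classes below are assumed stand-ins rather than the types from the PR; `GvmSketch` is likewise a hypothetical helper.

```csharp
using System;

// Assumed class shapes: a generic virtual method and one override.
public class Base
{
    public virtual string Method<T>() => "Base.Method<" + typeof(T).Name + ">";
}

public class Derived : Base
{
    public override string Method<T>() => "Derived.Method<" + typeof(T).Name + ">";
}

public static class GvmSketch
{
    // The static type of baseRef is Base, but the target of the call depends
    // on the runtime type of baseRef together with the type argument T.
    public static string CallThroughBase<T>(Base baseRef) => baseRef.Method<T>();
}
```

`GvmSketch.CallThroughBase<string>(new Derived())` reaches the `Derived` override even though the call is written against `Base`, which is why the target cannot be fixed from the call site alone.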
And, if the type argument is a struct, presumably you also need the specific instantiation.
Canonical code like `List<__Canon>.Add` is shared across instantiations, but its *dependencies* may differ per instantiation. For example, if the code references `List<T>`, then `List<string>.Add` needs `List<string>` while `List<object>.Add` needs `List<object>`.
Shadow method nodes track these instantiation-specific dependencies without generating separate code. Both `ShadowConcreteMethodNode` and `ShadowNonConcreteMethodNode` extend `ShadowMethodNode`, which works by examining the canonical method's dependencies. Dependencies that implement `INodeWithRuntimeDeterminedDependencies` are *instantiated* with the shadow node's type/method arguments, converting abstract dependencies (e.g. "MethodTable for `List<T>`") into concrete ones (e.g. "MethodTable for `List<string>`").
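The instantiation step can be sketched as a substitution over the canonical method's dependency list. This is a deliberately crude toy model: dependencies are plain strings, the substitution is textual, and `ShadowSketch.Instantiate` is a hypothetical helper, not an ILC API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy model: dependencies are strings written in terms of the parameter T.
public static class ShadowSketch
{
    // Produces the concrete dependencies for one instantiation without
    // touching (or recompiling) the shared method body.
    public static List<string> Instantiate(
        IEnumerable<string> runtimeDeterminedDependencies,
        string typeArgument) =>
        runtimeDeterminedDependencies
            .Select(d => d.Replace("<T>", "<" + typeArgument + ">"))
            .ToList();
}
```

With the single abstract dependency `MethodTable for List<T>`, instantiating for `string` gives `MethodTable for List<string>` and instantiating for `object` gives `MethodTable for List<object>`, while the code being described stays the one canonical body.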
I honestly do not understand the "shadow" terminology at all.
I tried to write down some of what I learned (with much help from @MichalStrehovsky) while working on #122012 in a way that might be helpful to others learning about ILC in the future.