-
-
Notifications
You must be signed in to change notification settings - Fork 1.5k
leave type section symbols unchanged on resem, fix overly general double semcheck for forward types #24888
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Unfortunately the issue opened a rabbit hole that required fixing forward types. If all we want is to fix (really workaround) the issue then this 1 line change is enough and not incompatible, this would work backported as well. The PR as it is now is likely to cause regressions. The forward type changes don't depend on the rest of the PR, maybe it could be moved out and the resem changes can go in a PR after it. |
Thanks for your hard work on this PR! Hint: mm: orc; opt: speed; options: -d:release |
…ble semcheck for forward types (#24888) fixes #24887 (really just this [1 line commit](632c7b3) would have been enough to fix the issue but it would ignore the general problem) When a type definition is encountered where the symbol already has a type (not a forward type), the type is left alone (not reset to `tyForward`) and the RHS is handled differently: The RHS is still semchecked, but the type of the symbol is not updated, and nominal type nodes are ignored entirely (specifically if they are the same kind as the symbol's existing type but this restriction is not really needed). If the existing type of the symbol is an enum and and the RHS has a nominal enum type node, the enum fields of the existing type are added to scope rather than creating a new type from the RHS and adding its symbols instead. The goal is to prevent any incompatible nominal types from being generated during resem as in #24887. But it also restricts what macros can do if they generate type section AST, for example if we have: ```nim type Foo = int ``` and a macro modifies the type section while keeping the symbol node for `Foo` like: ```nim type Foo = float ``` Then the type of `Foo` will still remain `int`, while it previously became `float`. While we could maybe allow this and make it so only nominal types cannot be changed, it gets even more complex when considering generic params and whether or not they get updated. So to keep it as simple as possible the rule is that the symbol type does not change, but maybe this behavior was useful for macros. Only nominal type nodes are ignored for semchecking on the RHS, so that cases like this do not cause a regression: ```nim template foo(): untyped = proc bar() {.inject.} = discard int type Foo = foo() bar() # normally works ``` However this specific code exposed a problem with forward type handling: --- In specific cases, when the type section is undergoing the final pass, if the type fits some overly general criteria (it is not an object, enum, alias or a sink type and its node is not a nominal type node), the entire RHS is semchecked for a 2nd time as a standalone type (with `nil` prev) and *maybe* reassigned to the new semchecked type, depending on its type kind. (for some reason including nominal types when we excluded them before?) This causes a redefinition error if the RHS defines a symbol. This code goes all the way back to the first commit and I could not find the reason why it was there, but removing it showed a failure in `thard_tyforward`: If a generic forward type is invoked, it is left as an unresolved `tyGenericInvocation` on the first run. Semchecking it again at the end turns it into a `tyGenericInst`. So my understanding is that it exists to handle these loose forward types, but it is way too general and there is a similar mechanism `c.skipTypes` which is supposed to do the same thing but doesn't. So this is no longer done, and `c.skipTypes` is revamped (and renamed): It is now a list of types and the nodes that are supposed to evaluate to them, such that types needing to be updated later due to containing forward types are added to it along with their nodes. When finishing the type section, these types are reassigned to the semchecked value of their nodes so that the forward types in them are fully resolved. The "reassigning" here works due to updating the data inside the type pointer directly, and is how forward types work by themselves normally (`tyForward` types are modified in place as `s.typ`). For example, as mentioned before, generic invocations of forward types are first created as `tyGenericInvocation` and need to become `tyGenericInst` later. So they are now added to this list along with their node. Object types with forward types as their base types also need to be updated later to check that the base type is correct/inherit fields from it: For this the entire object type and its node are added to the list. Similarly, any case where whether a component type is `tyGenericInst` or `tyGenericInvocation` matters also needs to cascade this (`set` does presumably to check the instantiated type). This is not complete: Generic invocations with forward types only check that their base type is a forward type, but not any of their arguments, which causes #16754 and #24133. The generated invocations also need to cascade properly: `Foo[Bar[ForwardType]]` for example would see that `Bar[ForwardType]` is a generic invocation and stay as a generic invocation itself, but it might not queue itself to be updated later. Even if it did, only the entire type `Foo[Bar[ForwardType]]` needs to be queued, updating `Bar[ForwardType]` by itself would be redundant or it would not change anything at all. But these can be done later. (cherry picked from commit 525d64f)
fixes #24887 (really just this 1 line commit would have been enough to fix the issue but it would ignore the general problem)
When a type definition is encountered where the symbol already has a type (not a forward type), the type is left alone (not reset to
tyForward
) and the RHS is handled differently: The RHS is still semchecked, but the type of the symbol is not updated, and nominal type nodes are ignored entirely (specifically if they are the same kind as the symbol's existing type but this restriction is not really needed). If the existing type of the symbol is an enum and and the RHS has a nominal enum type node, the enum fields of the existing type are added to scope rather than creating a new type from the RHS and adding its symbols instead.The goal is to prevent any incompatible nominal types from being generated during resem as in #24887. But it also restricts what macros can do if they generate type section AST, for example if we have:
and a macro modifies the type section while keeping the symbol node for
Foo
like:Then the type of
Foo
will still remainint
, while it previously becamefloat
. While we could maybe allow this and make it so only nominal types cannot be changed, it gets even more complex when considering generic params and whether or not they get updated. So to keep it as simple as possible the rule is that the symbol type does not change, but maybe this behavior was useful for macros.Only nominal type nodes are ignored for semchecking on the RHS, so that cases like this do not cause a regression:
However this specific code exposed a problem with forward type handling:
In specific cases, when the type section is undergoing the final pass, if the type fits some overly general criteria (it is not an object, enum, alias or a sink type and its node is not a nominal type node), the entire RHS is semchecked for a 2nd time as a standalone type (with
nil
prev) and maybe reassigned to the new semchecked type, depending on its type kind. (for some reason including nominal types when we excluded them before?) This causes a redefinition error if the RHS defines a symbol.This code goes all the way back to the first commit and I could not find the reason why it was there, but removing it showed a failure in
thard_tyforward
: If a generic forward type is invoked, it is left as an unresolvedtyGenericInvocation
on the first run. Semchecking it again at the end turns it into atyGenericInst
. So my understanding is that it exists to handle these loose forward types, but it is way too general and there is a similar mechanismc.skipTypes
which is supposed to do the same thing but doesn't.So this is no longer done, and
c.skipTypes
is revamped (and renamed): It is now a list of types and the nodes that are supposed to evaluate to them, such that types needing to be updated later due to containing forward types are added to it along with their nodes. When finishing the type section, these types are reassigned to the semchecked value of their nodes so that the forward types in them are fully resolved. The "reassigning" here works due to updating the data inside the type pointer directly, and is how forward types work by themselves normally (tyForward
types are modified in place ass.typ
).For example, as mentioned before, generic invocations of forward types are first created as
tyGenericInvocation
and need to becometyGenericInst
later. So they are now added to this list along with their node. Object types with forward types as their base types also need to be updated later to check that the base type is correct/inherit fields from it: For this the entire object type and its node are added to the list. Similarly, any case where whether a component type istyGenericInst
ortyGenericInvocation
matters also needs to cascade this (set
does presumably to check the instantiated type).This is not complete: Generic invocations with forward types only check that their base type is a forward type, but not any of their arguments, which causes #16754 and #24133. The generated invocations also need to cascade properly:
Foo[Bar[ForwardType]]
for example would see thatBar[ForwardType]
is a generic invocation and stay as a generic invocation itself, but it might not queue itself to be updated later. Even if it did, only the entire typeFoo[Bar[ForwardType]]
needs to be queued, updatingBar[ForwardType]
by itself would be redundant or it would not change anything at all. But these can be done later.