Skip to content

Conversation

metagn
Copy link
Collaborator

@metagn metagn commented Apr 18, 2025

fixes #24887 (really just this 1 line commit would have been enough to fix the issue but it would ignore the general problem)

When a type definition is encountered where the symbol already has a type (not a forward type), the type is left alone (not reset to tyForward) and the RHS is handled differently: The RHS is still semchecked, but the type of the symbol is not updated, and nominal type nodes are ignored entirely (specifically if they are the same kind as the symbol's existing type but this restriction is not really needed). If the existing type of the symbol is an enum and and the RHS has a nominal enum type node, the enum fields of the existing type are added to scope rather than creating a new type from the RHS and adding its symbols instead.

The goal is to prevent any incompatible nominal types from being generated during resem as in #24887. But it also restricts what macros can do if they generate type section AST, for example if we have:

type Foo = int

and a macro modifies the type section while keeping the symbol node for Foo like:

type Foo = float

Then the type of Foo will still remain int, while it previously became float. While we could maybe allow this and make it so only nominal types cannot be changed, it gets even more complex when considering generic params and whether or not they get updated. So to keep it as simple as possible the rule is that the symbol type does not change, but maybe this behavior was useful for macros.

Only nominal type nodes are ignored for semchecking on the RHS, so that cases like this do not cause a regression:

template foo(): untyped =
  proc bar() {.inject.} = discard
  int

type Foo = foo()
bar() # normally works

However this specific code exposed a problem with forward type handling:


In specific cases, when the type section is undergoing the final pass, if the type fits some overly general criteria (it is not an object, enum, alias or a sink type and its node is not a nominal type node), the entire RHS is semchecked for a 2nd time as a standalone type (with nil prev) and maybe reassigned to the new semchecked type, depending on its type kind. (for some reason including nominal types when we excluded them before?) This causes a redefinition error if the RHS defines a symbol.

This code goes all the way back to the first commit and I could not find the reason why it was there, but removing it showed a failure in thard_tyforward: If a generic forward type is invoked, it is left as an unresolved tyGenericInvocation on the first run. Semchecking it again at the end turns it into a tyGenericInst. So my understanding is that it exists to handle these loose forward types, but it is way too general and there is a similar mechanism c.skipTypes which is supposed to do the same thing but doesn't.

So this is no longer done, and c.skipTypes is revamped (and renamed): It is now a list of types and the nodes that are supposed to evaluate to them, such that types needing to be updated later due to containing forward types are added to it along with their nodes. When finishing the type section, these types are reassigned to the semchecked value of their nodes so that the forward types in them are fully resolved. The "reassigning" here works due to updating the data inside the type pointer directly, and is how forward types work by themselves normally (tyForward types are modified in place as s.typ).

For example, as mentioned before, generic invocations of forward types are first created as tyGenericInvocation and need to become tyGenericInst later. So they are now added to this list along with their node. Object types with forward types as their base types also need to be updated later to check that the base type is correct/inherit fields from it: For this the entire object type and its node are added to the list. Similarly, any case where whether a component type is tyGenericInst or tyGenericInvocation matters also needs to cascade this (set does presumably to check the instantiated type).

This is not complete: Generic invocations with forward types only check that their base type is a forward type, but not any of their arguments, which causes #16754 and #24133. The generated invocations also need to cascade properly: Foo[Bar[ForwardType]] for example would see that Bar[ForwardType] is a generic invocation and stay as a generic invocation itself, but it might not queue itself to be updated later. Even if it did, only the entire type Foo[Bar[ForwardType]] needs to be queued, updating Bar[ForwardType] by itself would be redundant or it would not change anything at all. But these can be done later.

@metagn metagn changed the title test not resetting type of type symbols on resem leave type section symbols unchanged on resem, fix overly general double semcheck for forward types Apr 21, 2025
@metagn metagn marked this pull request as ready for review April 21, 2025 05:23
@metagn
Copy link
Collaborator Author

metagn commented Apr 21, 2025

Unfortunately the issue opened a rabbit hole that required fixing forward types. If all we want is to fix (really workaround) the issue then this 1 line change is enough and not incompatible, this would work backported as well. The PR as it is now is likely to cause regressions.

The forward type changes don't depend on the rest of the PR, maybe it could be moved out and the resem changes can go in a PR after it.

@Araq Araq merged commit 525d64f into nim-lang:devel Apr 21, 2025
18 checks passed
Copy link
Contributor

Thanks for your hard work on this PR!
The lines below are statistics of the Nim compiler built from 525d64f

Hint: mm: orc; opt: speed; options: -d:release
179371 lines; 9.005s; 651.309MiB peakmem

metagn added a commit that referenced this pull request Apr 21, 2025
narimiran pushed a commit that referenced this pull request Apr 21, 2025
…ble semcheck for forward types (#24888)

fixes #24887 (really just this [1 line
commit](632c7b3)
would have been enough to fix the issue but it would ignore the general
problem)

When a type definition is encountered where the symbol already has a
type (not a forward type), the type is left alone (not reset to
`tyForward`) and the RHS is handled differently: The RHS is still
semchecked, but the type of the symbol is not updated, and nominal type
nodes are ignored entirely (specifically if they are the same kind as
the symbol's existing type but this restriction is not really needed).
If the existing type of the symbol is an enum and and the RHS has a
nominal enum type node, the enum fields of the existing type are added
to scope rather than creating a new type from the RHS and adding its
symbols instead.

The goal is to prevent any incompatible nominal types from being
generated during resem as in #24887. But it also restricts what macros
can do if they generate type section AST, for example if we have:

```nim
type Foo = int
```

and a macro modifies the type section while keeping the symbol node for
`Foo` like:

```nim
type Foo = float
```

Then the type of `Foo` will still remain `int`, while it previously
became `float`. While we could maybe allow this and make it so only
nominal types cannot be changed, it gets even more complex when
considering generic params and whether or not they get updated. So to
keep it as simple as possible the rule is that the symbol type does not
change, but maybe this behavior was useful for macros.

Only nominal type nodes are ignored for semchecking on the RHS, so that
cases like this do not cause a regression:

```nim
template foo(): untyped =
  proc bar() {.inject.} = discard
  int

type Foo = foo()
bar() # normally works
```

However this specific code exposed a problem with forward type handling:

---

In specific cases, when the type section is undergoing the final pass,
if the type fits some overly general criteria (it is not an object,
enum, alias or a sink type and its node is not a nominal type node), the
entire RHS is semchecked for a 2nd time as a standalone type (with `nil`
prev) and *maybe* reassigned to the new semchecked type, depending on
its type kind. (for some reason including nominal types when we excluded
them before?) This causes a redefinition error if the RHS defines a
symbol.

This code goes all the way back to the first commit and I could not find
the reason why it was there, but removing it showed a failure in
`thard_tyforward`: If a generic forward type is invoked, it is left as
an unresolved `tyGenericInvocation` on the first run. Semchecking it
again at the end turns it into a `tyGenericInst`. So my understanding is
that it exists to handle these loose forward types, but it is way too
general and there is a similar mechanism `c.skipTypes` which is supposed
to do the same thing but doesn't.

So this is no longer done, and `c.skipTypes` is revamped (and renamed):
It is now a list of types and the nodes that are supposed to evaluate to
them, such that types needing to be updated later due to containing
forward types are added to it along with their nodes. When finishing the
type section, these types are reassigned to the semchecked value of
their nodes so that the forward types in them are fully resolved. The
"reassigning" here works due to updating the data inside the type
pointer directly, and is how forward types work by themselves normally
(`tyForward` types are modified in place as `s.typ`).

For example, as mentioned before, generic invocations of forward types
are first created as `tyGenericInvocation` and need to become
`tyGenericInst` later. So they are now added to this list along with
their node. Object types with forward types as their base types also
need to be updated later to check that the base type is correct/inherit
fields from it: For this the entire object type and its node are added
to the list. Similarly, any case where whether a component type is
`tyGenericInst` or `tyGenericInvocation` matters also needs to cascade
this (`set` does presumably to check the instantiated type).

This is not complete: Generic invocations with forward types only check
that their base type is a forward type, but not any of their arguments,
which causes #16754 and #24133. The generated invocations also need to
cascade properly: `Foo[Bar[ForwardType]]` for example would see that
`Bar[ForwardType]` is a generic invocation and stay as a generic
invocation itself, but it might not queue itself to be updated later.
Even if it did, only the entire type `Foo[Bar[ForwardType]]` needs to be
queued, updating `Bar[ForwardType]` by itself would be redundant or it
would not change anything at all. But these can be done later.

(cherry picked from commit 525d64f)
narimiran pushed a commit that referenced this pull request Apr 21, 2025
narimiran added a commit that referenced this pull request Apr 21, 2025
…eral double semcheck for forward types (#24888)"

This reverts commit cfe8909.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

const values of distinct ints are coerced to base type literals (in expandMacros block)

2 participants