Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support error recovery in check for invalid parse trees #4689

Open
dwblaikie opened this issue Dec 16, 2024 · 4 comments
Open

Support error recovery in check for invalid parse trees #4689

dwblaikie opened this issue Dec 16, 2024 · 4 comments

Comments

@dwblaikie
Copy link
Contributor

Description of the bug:

No response

What did you do, or what's a simple way to reproduce the bug?

base class Foo { }
class Base {
  extend base Foo;
}
$ carbon compile a.carbon

What did you expect to happen?

In this particular case, it could've recovered with an assumed :, but I'd also (or perhaps initially/first) like to fix the more general case, where a recovery might not be feasible/correct/helpful.

What actually happened?

CHECK-fail/crash:

FATAL failure at toolchain/parse/parse.cpp:61: Invalid tree returned by Parse(): NodeId #node14 couldn't be extracted as a ClassDefinition. Trace:
Aggregate N6Carbon5Parse15ClassDefinitionE: begin
Vector: begin
NodeIdInCategory Decl: kind InvalidParseSubtree consumed
NodeIdInCategory Decl: kind InvalidParse consumed
NodeIdInCategory Decl error: kind ExtendModifier doesn't match
Vector: end
NodeIdForKind error: wrong kind ExtendModifier, expected ClassDefinitionStart
Aggregate N6Carbon5Parse15ClassDefinitionE: error
Error: ExtendModifier node left unconsumed.

Any other information, logs, or outputs that you want to share?

I know we (myself, @zygoloid and @jonmeow ) have had some conversations about this, but some was in person/un-recorded, or the records are hard for me to find/synthesize.

Currently, the failure in

if (!context.ConsumeAndAddLeafNodeIf(Lex::TokenKind::Colon,
NodeKind::BaseColon)) {
// TODO: If the next token isn't a colon or `class`, try to recover
// based on whether we're in a class, whether we have an `extend`
// modifier, and the following tokens.
CARBON_DIAGNOSTIC(ExpectedAfterBase, Error,
"`class` or `:` expected after `base`");
context.emitter().Emit(*context.position(), ExpectedAfterBase);
context.RecoverFromDeclError(state, NodeKind::BaseDecl,
/*skip_past_likely_end=*/true);
return;
}
constructs an invalid parse tree that looks like this:

  {kind: 'BaseIntroducer', text: 'base'},
  {kind: 'ExtendModifier', text: 'extend'},
{kind: 'BaseDecl', text: ';', has_error: yes, subtree_size: 3},

Specifically, this does not match the "shape" a BaseDecl expects, which is this:

  {kind: 'BaseIntroducer', text: 'base'},
  {kind: 'ExtendModifier', text: 'extend'},
  {kind: 'BaseColon', text: ':'},
  {kind: 'IdentifierNameExpr', text: 'Foo'},
{kind: 'BaseDecl', text: ';', subtree_size: 5},

And so the above crash/check-fail occurs.

I know we had talked about different representations for the failure and recovery.

But I'd like to focus on the case where recovery isn't feasible.

What should we do in that case? (eg: if the name (Foo) is missing entirely)

I thought we could, in check, skip any error nodes in Check's

while (auto maybe_node_id = traversal.Next()) {
node_id = *maybe_node_id;
auto parse_kind = context_.parse_tree().node_kind(node_id);

  while (auto maybe_node_id = traversal.Next()) {
    node_id = *maybe_node_id;
    auto parse_kind = context.parse_tree().node_kind(node_id);
    if (context.parse_tree().node_has_error(node_id)) {
      return false;
    }

But judging by the code in https://github.com/carbon-language/carbon-lang/blob/f922988c8ccaaf68e703e3126a4a26709efb460f/toolchain/check/handle_noop.cpp that's not the intent (the intent seems to be that we do walk through even the error nodes) - but perhaps the proposed skip ^ is the way to address these TODOs? (or perhaps some more general "we shouldn't check a parse tree with any invalid nodes at all" might be even more general/failing earlier?)

After this discussion, I wouldn't mind discussing how to improve the error recovery - I know we talked about some kind of "synthetic token" we could use to put the colon into the parse tree?

@jonmeow
Copy link
Contributor

jonmeow commented Dec 16, 2024

Stepping back a moment, we currently expect that (a) parse can create (subtly) invalid parse trees, and (b) check will crash on these. We do not fuzz test invalid parse trees as a consequence. So broadly speaking, I think it's fair to say crashes like this are probably common. Similarly, I think we should be cautious about trying to chase down all the issues like this unless we're ready to really commit to error recovery -- my leaning has been that we benefit from prioritizing language features first.

Regarding just bumping out, my expectation has been that we'd try to check invalid parse trees as best as possible. At a minimum, we should be okay checking a parse tree until the first invalid node. However, you might have something like:

class Foo {
  fn
}

class Bar {
  fn Foo() -> i32 { return 1.0; }
}

My own expectation is that this should print a diagnostic for the invalid fn declaration, and also a diagnostic for 1.0 not being i32. So I think that means trying to recover in check.

In this specific case, I think the right choice is to recover in parse. For now, I believe that's creating BaseColon with has_error = true. Maybe in the future we should designate that better with some kind of synthetic token, but if you want a fix right now, I think that's what it should look like.

Short-term, we could exit with a TODO in ProcessNodeIds when node_has_error.

@dwblaikie
Copy link
Contributor Author

Thanks for the discussion/details - I'll work on the short-term fix for now, after that don't mind leaving this bug open or closing for now since we aren't planning to work on this feature (check recovery from parse failures) at this time.

@jonmeow
Copy link
Contributor

jonmeow commented Dec 17, 2024

Maybe re-title the issue to reflect the broader scope of it, or I could create a new issue that captures the high-level problem/plan?

@dwblaikie dwblaikie changed the title Crash in check on missing colon in extend base Support error recovery in check for invalid parse trees Dec 17, 2024
@dwblaikie
Copy link
Contributor Author

Retitling sounds good - I could edit the main post at the top to discuss this/focus on this broader issue?

github-merge-queue bot pushed a commit that referenced this issue Dec 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants