Skip to content

[umbrella] Type aware control flow #2219

@stroxler

Description

@stroxler

Describe the Bug

This issue is to track a large ongoing effort of related features in Pyrefly around type-aware control flow.

For the most part, python control flow is independent of types (this makes sense - the actual runtime control flow is in some sense always independent - the compiler has no knowledge of types, so it's impossible for static types to affect the bytecode per se). This is a property Pyrefly is built around, in the sense that control flow analysis happens at binding time in an AST traversal before we have types, and we do type analysis on a raw graph generated from then.

But there are cases where a type checker should account for types in control flow. The cases I'm aware of fall into two buckets:

  • Cases where type annotations or other structural properties of code that a type checker analyzes are used to actually declare control flow properties of code. Examples of this include Never and NoReturn function calls that indicate termination of control flow and the presence of error handlers in context managers
  • Cases where runtime type introspection affects control flow directly and a static type checker is expected to understand. This mostly comes up in narrowing with things like isinstance checks.

The main reason this presents a problem is that when we hit a branching data structure, we need to know whether the branching was exhaustive or not in order to know (a) whether the control flow terminates (if all branches terminate, the entire flow does) for things like return type inference and (b) whether the base flow is part of the Phi that creates the environment below the branching (which affects types, uninitialized local checks, and so forth below the branch).

This task tracks the effort to handle know gaps:

  • implement support for type aware control flow termination (Never and NoReturn)
  • implement type aware exhaustiveness for if/else and match statements

Each of these is already partially implemented on trunk (as of writing this issue I've already begun) but it needs to be feature complete which is essentially a single large issue: the two problems have to be solved together to fully work from a user perspective because termination in branches impacts termination and Phi behavior of the branching statement, so a gap at any point will frequently lead to incorrect results on examples that stress test anything.

The things we need are:

  • basic support for Never / NoReturn analysis to understand branch termination (done)
    • support for accounting for ^^ when creating a Phi for the joined types (done)
    • support for accounting for ^^ in uninitialized local checks (in progress)
  • basic plumbing for understanding exhaustiveness of match and if/else (partially done but very limited)
    • support for accounting for ^^ + termination in return analysis (partially done)
    • support for accounting for ^^ in the Phi (not done)
    • support for accounting for ^^ in uninitialized local check (not done)
  • making sure that these things compose properly
    • it should already be true that Phis compose, so nothing to do there I think
    • deferred uninitialized local check information has to compose (not done)
    • termination analysis has to compose (i.e. an exhaustive branching statement where all branches
      terminate, if it is that last statement inside a branch of an enclosing branching statement, needs
      to terminate that branch): without this, all the other features will break as soon as we nest
      exhaustive match and if/else statements inside one another.

All of that plumbing is more or less independent of the actual type analysis for exhaustiveness - it can and should be built with only the exhaustiveness checks we support today, which are basically bool, Literal, and Enum types.

Then there's a separate and ambiguously-scoped problem of expanding exhaustiveness checks beyond just the simple types where we support them today:

  • We definitely need them for unions of types when cases fully "consume" the type (e.g. a A | B where we match A() and B()
  • We definitely need to then handle some amount of type composition for ^^
    • bound and constrained TypeVars need to work for sure
    • intersections might need to work?
      • we can likely skip this in v0, but it probably is needed, e.g. if code narrows to a union
        and later exhaustively match on the union we'll have gaps if we don't handle intersection
  • We might need to implement some more edge cases for relatively simple union matches:
    • Typed dicts that are discriminated on a single field are the main edge case I know right now
  • Then, we can potentially consider pattern matches that decompose
    • This is where the problem becomes much harder because
      • The complexity of the code itself increases dramatically
      • The complexity of the type analysis increases proportional to the depth of of decomposition

This task description explicitly captures my judgment that we should fully implement the control flow analysis itself, which is a nontrivial but fixed-scope problem, before we worry too much about expanding the types that are relevant.


In addition to the feature list above, which is mandatory, we also have non-exhaustiveness warnings on match statements. This is unspecified and it's an open question which kinds of matches should be expected to be exhaustive: all of them? Just enums and bools? Some larger set? Or do we need different error kinds so people can opt into different levels of rigor?

I generally consider the details exhaustiveness warnings out-of-scope for this issue, because they are not a mandatory feature, and it's unclear to me whether most Python devs today want their match statements to be exhaustive. But as we implement we'll at least have to make decisions about when to produce the existing warning.

Sandbox Link

No response

(Only applicable for extension issues) IDE Information

No response

Metadata

Metadata

Assignees

Labels

typecheckingumbrellaAn issue that is an umbrella for work tracked in other issues

Type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions