-
Notifications
You must be signed in to change notification settings - Fork 259
Description
Describe the Bug
This issue is to track a large ongoing effort of related features in Pyrefly around type-aware control flow.
For the most part, python control flow is independent of types (this makes sense - the actual runtime control flow is in some sense always independent - the compiler has no knowledge of types, so it's impossible for static types to affect the bytecode per se). This is a property Pyrefly is built around, in the sense that control flow analysis happens at binding time in an AST traversal before we have types, and we do type analysis on a raw graph generated from then.
But there are cases where a type checker should account for types in control flow. The cases I'm aware of fall into two buckets:
- Cases where type annotations or other structural properties of code that a type checker analyzes are used to actually declare control flow properties of code. Examples of this include
NeverandNoReturnfunction calls that indicate termination of control flow and the presence of error handlers in context managers - Cases where runtime type introspection affects control flow directly and a static type checker is expected to understand. This mostly comes up in narrowing with things like
isinstancechecks.
The main reason this presents a problem is that when we hit a branching data structure, we need to know whether the branching was exhaustive or not in order to know (a) whether the control flow terminates (if all branches terminate, the entire flow does) for things like return type inference and (b) whether the base flow is part of the Phi that creates the environment below the branching (which affects types, uninitialized local checks, and so forth below the branch).
This task tracks the effort to handle know gaps:
- implement support for type aware control flow termination (Never and NoReturn)
- implement type aware exhaustiveness for if/else and match statements
Each of these is already partially implemented on trunk (as of writing this issue I've already begun) but it needs to be feature complete which is essentially a single large issue: the two problems have to be solved together to fully work from a user perspective because termination in branches impacts termination and Phi behavior of the branching statement, so a gap at any point will frequently lead to incorrect results on examples that stress test anything.
The things we need are:
- basic support for Never / NoReturn analysis to understand branch termination (done)
- support for accounting for ^^ when creating a Phi for the joined types (done)
- support for accounting for ^^ in uninitialized local checks (in progress)
- basic plumbing for understanding exhaustiveness of match and if/else (partially done but very limited)
- support for accounting for ^^ + termination in return analysis (partially done)
- support for accounting for ^^ in the Phi (not done)
- support for accounting for ^^ in uninitialized local check (not done)
- making sure that these things compose properly
- it should already be true that Phis compose, so nothing to do there I think
- deferred uninitialized local check information has to compose (not done)
- termination analysis has to compose (i.e. an exhaustive branching statement where all branches
terminate, if it is that last statement inside a branch of an enclosing branching statement, needs
to terminate that branch): without this, all the other features will break as soon as we nest
exhaustivematchandif/elsestatements inside one another.
All of that plumbing is more or less independent of the actual type analysis for exhaustiveness - it can and should be built with only the exhaustiveness checks we support today, which are basically bool, Literal, and Enum types.
Then there's a separate and ambiguously-scoped problem of expanding exhaustiveness checks beyond just the simple types where we support them today:
- We definitely need them for unions of types when cases fully "consume" the type (e.g. a
A | Bwhere we matchA()andB() - We definitely need to then handle some amount of type composition for ^^
- bound and constrained TypeVars need to work for sure
- intersections might need to work?
- we can likely skip this in v0, but it probably is needed, e.g. if code narrows to a union
and later exhaustively match on the union we'll have gaps if we don't handle intersection
- we can likely skip this in v0, but it probably is needed, e.g. if code narrows to a union
- We might need to implement some more edge cases for relatively simple union matches:
- Typed dicts that are discriminated on a single field are the main edge case I know right now
- Then, we can potentially consider pattern matches that decompose
- This is where the problem becomes much harder because
- The complexity of the code itself increases dramatically
- The complexity of the type analysis increases proportional to the depth of of decomposition
- This is where the problem becomes much harder because
This task description explicitly captures my judgment that we should fully implement the control flow analysis itself, which is a nontrivial but fixed-scope problem, before we worry too much about expanding the types that are relevant.
In addition to the feature list above, which is mandatory, we also have non-exhaustiveness warnings on match statements. This is unspecified and it's an open question which kinds of matches should be expected to be exhaustive: all of them? Just enums and bools? Some larger set? Or do we need different error kinds so people can opt into different levels of rigor?
I generally consider the details exhaustiveness warnings out-of-scope for this issue, because they are not a mandatory feature, and it's unclear to me whether most Python devs today want their match statements to be exhaustive. But as we implement we'll at least have to make decisions about when to produce the existing warning.
Sandbox Link
No response
(Only applicable for extension issues) IDE Information
No response