
Preserve Literal types through TypeVar solving for unions#2883

Closed
migeed-z wants to merge 2 commits into facebook:main from migeed-z:export-D97844612

Conversation

migeed-z (Contributor) commented Mar 23, 2026

Summary:
When solving a TypeVar from a union of implicit literals (e.g.,
Literal["a"] | Literal["b"] from tuple element types), skip literal
promotion. Previously, enumerate(("a", "b")) would promote the literals
to str when solving enumerate[T], losing type information.

The fix: when the type being checked against a TypeVar is a union where
all members are implicit literals, preserve them as-is instead of promoting
to the base type.

Fixes #1323

Differential Revision: D97844612
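
For orientation, here is a minimal sketch of the pattern this PR targets. The names `letters` and `takes_letter` are illustrative assumptions, and the comments restate what the summary says a checker infers; they are not captured tool output.

```python
from typing import Literal

# Assumed example data: a tuple whose element types are the implicit
# literals Literal["a"] and Literal["b"].
letters = ("a", "b")


def takes_letter(x: Literal["a", "b"]) -> str:
    """Hypothetical consumer that only accepts the two literal values."""
    return x.upper()


# Direct iteration: the element type stays Literal["a"] | Literal["b"],
# so this call is accepted.
direct = [takes_letter(x) for x in letters]

# Via enumerate: before this PR, solving enumerate[T] promoted T to str,
# so the same call was rejected statically even though it runs fine.
indexed = [(i, takes_letter(x)) for i, x in enumerate(letters)]
```

Both comprehensions behave identically at runtime; the difference is only in the statically inferred type of `x`.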

Summary:
Iterating over a tuple of literals with `enumerate()` promotes the
Literal types to their base type (e.g., `Literal["a"]` becomes `str`),
while direct iteration preserves them. This is because `enumerate[T]`
solves `T` via TypeVar solving, which promotes implicit literals.

Adds a bug-marked test capturing the current behavior.

Related: facebook#1323

Differential Revision: D97841359
meta-cla bot added the cla signed label Mar 23, 2026
meta-codesync bot commented Mar 23, 2026

@migeed-z has exported this pull request. If you are a Meta employee, you can view the originating Diff in D97844612.

@github-actions

Diff from mypy_primer, showing the effect of this PR on open source code:

vision (https://github.com/pytorch/vision)
+ ERROR torchvision/prototype/datasets/_builtin/imagenet.py:157:20-161:10: No matching overload found for function `dict.__init__` called with arguments: (dict[Literal['label', 'wnid'], Label | str], path=str, image=EncodedImage) [no-matching-overload]
+ ERROR torchvision/prototype/datasets/_builtin/mnist.py:402:22-409:10: No matching overload found for function `typing.MutableMapping.update` called with arguments: (dict[Literal['digit_index', 'global_digit_index', 'nist_hsf_series', 'nist_label', 'nist_writer_id'], int]) [no-matching-overload]
+ ERROR torchvision/prototype/datasets/_builtin/mnist.py:410:22-102: No matching overload found for function `typing.MutableMapping.update` called with arguments: (dict[Literal['duplicate', 'unused'], bool]) [no-matching-overload]

pandas (https://github.com/pandas-dev/pandas)
+ ERROR pandas/core/computation/ops.py:326:28-31: No matching overload found for function `typing.MutableMapping.update` called with arguments: (dict[Literal['!=', '<', '<=', '==', '>', '>=', 'in', 'not in'], ((a: _SupportsComparison, b: _SupportsComparison, /) -> Any) | ((a: object, b: object, /) -> Any) | ((x: Unknown, y: Unknown) -> Unknown)] | dict[Literal['%', '*', '**', '+', '-', '/', '//'], ((a: Any, b: Any, /) -> Any)] | dict[Literal['&', 'and', 'or', '|'], ((a: Any, b: Any, /) -> Any)]) [no-matching-overload]
- ERROR pandas/io/html.py:967:35-39: Argument `str` is not assignable to parameter `flavor` with type `Literal['bs4', 'html5lib', 'lxml'] | None` in function `_parser_dispatch` [bad-argument-type]
+ ERROR pandas/io/html.py:967:35-39: Argument `Literal['bs4', 'lxml'] | str` is not assignable to parameter `flavor` with type `Literal['bs4', 'html5lib', 'lxml'] | None` in function `_parser_dispatch` [bad-argument-type]

core (https://github.com/home-assistant/core)
+ ERROR homeassistant/components/withings/sensor.py:691:12-18: Returned type `set[Literal['sleep', 'steps', 'weight']]` is not assignable to declared return type `set[str]` [bad-return]

parso (https://github.com/davidhalter/parso)
+ ERROR parso/python/errors.py:753:44-57: Argument `Literal['annotations']` is not assignable to parameter `object` with type `Literal['absolute_import', 'division', 'generator_stop', 'generators', 'nested_scopes', 'print_function', 'unicode_literals', 'with_statement']` in function `list.append` [bad-argument-type]

jax (https://github.com/google/jax)
+ ERROR jax/experimental/pallas/ops/gpu/ragged_dot_mgpu.py:299:23-29: `dict[Literal['block_k', 'block_m', 'block_n', 'grid_block_n', 'max_concurrent_steps'], int]` is not assignable to variable `best_kwargs` with type `dict[str, int]` [bad-assignment]

@github-actions

Primer Diff Classification

❌ 4 regression(s) | ✅ 1 improvement(s) | 5 project(s) total | +8, -1 errors

4 regression(s) across vision, pandas, core, parso. Error kinds: no-matching-overload on dict/MutableMapping with Literal keys from zip(), no-matching-overload on dict.update with preserved literals, bad-argument-type message change on html.py. 1 improvement(s) across jax.

| Project | Verdict | Changes | Error Kinds | Root Cause |
| --- | --- | --- | --- | --- |
| vision | ❌ Regression | +3 | no-matching-overload on dict/MutableMapping with Literal keys from zip() | pyrefly/lib/solver/solver.rs |
| pandas | ❌ Regression | +2, -1 | no-matching-overload on dict.update with preserved literals | pyrefly/lib/solver/solver.rs |
| core | ❌ Regression | +1 | bad-return | pyrefly/lib/solver/solver.rs |
| parso | ❌ Regression | +1 | False positive from overly precise literal inference on list() | pyrefly/lib/solver/solver.rs |
| jax | ✅ Improvement | +1 | dict invariance with preserved Literal key types | pyrefly/lib/solver/solver.rs |

Detailed analysis

❌ Regression (4)

vision (+3)

no-matching-overload on dict/MutableMapping with Literal keys from zip(): All 3 errors stem from the same root cause: the PR's change to preserve Literal types through TypeVar solving causes zip() with tuple-of-string-literals to produce dicts with Literal key types instead of str. These Literal-keyed dicts then fail overload resolution for dict.__init__ and MutableMapping.update, even though Literal string types are subtypes of str and should be compatible. For the mnist.py cases, sample is typed as dict[str, Any], so calling .update() with dict[Literal[...], int] or dict[Literal[...], bool] should be valid since the Literal keys are subtypes of str and int/bool are subtypes of Any. The code is correct and works at runtime. Neither mypy nor pyright flags these. These are regressions — false positives caused by the PR's Literal preservation interacting poorly with pyrefly's overload resolution.

Overall: The PR preserves Literal types through TypeVar solving for unions of implicit literals. While this is beneficial for the enumerate use case shown in the test, it has a negative side effect: zip() with a tuple of string literals now produces dict[Literal['label', 'wnid'], ...] instead of dict[str, ...]. This more precise Literal union type should still be compatible with str-keyed dict operations since Literal['label', 'wnid'] is a subtype of str, but pyrefly's overload resolution fails to match these Literal-keyed dicts against the overloads for dict.__init__ and MutableMapping.update. The code is perfectly valid Python — dict(zip(('label', 'wnid'), values)) is a standard idiom, and sample (typed dict[str, Any]) calling .update() with a dict[Literal[...], int] should work since Literal strings are subtypes of str. Neither mypy nor pyright flags these. These are false positives introduced by the PR's Literal preservation interacting poorly with overload resolution.

Attribution: The change in pyrefly/lib/solver/solver.rs in the Subset impl block modifies TypeVar solving to skip literal promotion when the type is a union of implicit literals. This causes zip(("label", "wnid"), ...) to produce dict[Literal["label"] | Literal["wnid"], ...] instead of dict[str, ...]. The resulting Literal-keyed dict type then fails to match the overload signatures of dict.__init__ and MutableMapping.update, producing the three no-matching-overload errors.
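
A reduced sketch of the vision pattern follows. The values and the `sample` contents are hypothetical stand-ins rather than the torchvision source; the comments restate the inference described in the analysis above.

```python
from typing import Any

# Hypothetical stand-ins for the values zipped in the torchvision code.
values = (3, "n01440764")

# Standard idiom: build a dict from a tuple of literal keys.
# With the PR, the key type is inferred as Literal["label", "wnid"]
# rather than str.
info = dict(zip(("label", "wnid"), values))

sample: dict[str, Any] = {"path": "img.jpg"}

# Reported with the PR as no-matching-overload, per the primer diff,
# although the call is valid and works at runtime.
sample.update(info)
```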

pandas (+2, -1)

no-matching-overload on dict.update with preserved literals: The PR preserves literal types in dict keys from zip(tuple_of_literals, ...), causing _binary_ops_dict.update(d) to fail overload resolution when d is a union of dicts with different Literal key types. This is a false positive — dict.update should accept these dicts since all Literal string types are subtypes of str. Pyrefly-only, not flagged by mypy or pyright.
bad-argument-type message change on html.py: The error message changed from str to Literal['bs4', 'lxml'] | str due to literal preservation. The _validate_flavor function has no return type annotation. When flavor is None, it assigns flavor = "lxml", "bs4" — with the PR, these are preserved as literals. Other code paths (str input, arbitrary iterable) still produce str. So the inferred element type when iterating the return value is Literal['bs4', 'lxml'] | str. The underlying issue is the same — _validate_flavor returns an untyped tuple so the element type doesn't narrow to HTMLFlavors. This is a neutral message change for an existing error.

Overall: The PR preserves literal types through TypeVar solving for unions of implicit literals. This is a targeted fix for enumerate over literal tuples, but it has unintended side effects:

ops.py line 326 (new no-matching-overload): CMP_OPS_SYMS is a tuple of literal strings ('>', '<', '>=', '<=', '==', '!=', 'in', 'not in'). After the PR, dict(zip(CMP_OPS_SYMS, _cmp_ops_funcs)) produces a dict with Literal['>', '<', ...] keys instead of str keys. The same holds for _bool_ops_dict and _arith_ops_dict. When the loop calls _binary_ops_dict.update(d), where d is a union of these three dict types with different Literal key types, pyrefly can't find a matching overload for MutableMapping.update. This is a false positive: the code is perfectly valid, and dict.update() should accept any dict[str, V] or dict[Literal[...], V] since Literal[...] is a subtype of str. Neither mypy nor pyright flags this.

html.py line 967 (changed error message): The old error reported str not assignable; the new error reports Literal['bs4', 'lxml'] | str not assignable. The _validate_flavor function has no return type annotation. When flavor is None, it sets flavor = "lxml", "bs4" (a tuple of two literal strings). With the PR's literal preservation, this path produces Literal['lxml', 'bs4'] elements. But other code paths (when flavor is a str or arbitrary iterable) produce str elements. So the inferred element type when iterating the returned tuple is Literal['bs4', 'lxml'] | str. The underlying issue is the same — _validate_flavor has no return type annotation, so the inferred type doesn't narrow to HTMLFlavors. The error message changed slightly due to the literal preservation, but it's the same false positive. Net effect: one error replaced with a slightly different version (neutral), plus one new false positive added.

The new no-matching-overload error on ops.py is clearly a regression — it's a false positive caused by the PR's overly broad literal preservation interacting poorly with dict.update() overload resolution.

Attribution: The change in pyrefly/lib/solver/solver.rs in the Subset impl modifies TypeVar solving to skip literal promotion when the type is a union of all implicit literals. This causes: (1) In ops.py, dict(zip(CMP_OPS_SYMS, _cmp_ops_funcs, strict=True)) now preserves the literal string keys from the tuples (e.g., Literal['>', '<', ...]) instead of promoting to str. When iterating over the three dicts and calling .update(d), the union of these three dict types with different Literal key sets doesn't match any overload of MutableMapping.update. (2) In html.py, _validate_flavor's default ('lxml', 'bs4') now preserves as Literal['lxml'] | Literal['bs4'] instead of promoting to str, changing the inferred type of flav when iterating.
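
The ops.py case can be reduced to the sketch below. The symbol tuples are shortened and the names are illustrative, not the pandas source; the point is that the loop variable `d` is a union of dict types whose Literal key sets differ.

```python
import operator

# Shortened, hypothetical versions of the pandas symbol tuples.
CMP_SYMS = (">", "<", "==")
ARITH_SYMS = ("+", "-", "*")

# With the PR, each dict gets its own Literal[...] key type instead of str.
cmp_ops = dict(zip(CMP_SYMS, (operator.gt, operator.lt, operator.eq)))
arith_ops = dict(zip(ARITH_SYMS, (operator.add, operator.sub, operator.mul)))

binary_ops = {}
for d in (cmp_ops, arith_ops):
    # d is a union of two dict types with different Literal key sets;
    # this is the call reported as no-matching-overload in the primer diff.
    binary_ops.update(d)
```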

core (+1)

This is a regression. The PR's change to preserve literal types through TypeVar solving has a side effect: when iterating over a tuple of module-level string constants (STEP_GOAL = "steps", SLEEP_GOAL = "sleep", WEIGHT_GOAL = "weight") and adding them to a set via .add(), the set's element type is now inferred as a union of literals rather than str. The function get_current_goals declares -> set[str] and returns a set built by calling result = set() and then result.add(goal) in a loop. With literals preserved, the type parameter T of set.add(self, T) -> None resolves to the literal types of these constants, so the set is inferred as set[Literal['sleep', 'steps', 'weight']]. Because set is invariant, that type is not assignable to set[str].

This is a case where the type checker is being too strict: neither mypy nor pyright flags this pattern. The set is constructed locally with result = set() (which should be inferred as set[str] given the return type annotation context), and the .add() calls with literal strings should not narrow the set's type parameter. The real issue is that pyrefly now infers a more specific type for the set than necessary and then fails to widen it when checking against the return type. This is a false positive that creates noise in real-world code.
Attribution: The change in pyrefly/lib/solver/solver.rs in the Subset impl preserves literal types through TypeVar solving for unions of implicit literals instead of promoting them to their base types. Previously, iterating over (STEP_GOAL, SLEEP_GOAL, WEIGHT_GOAL) — which are module-level string constants assigned literal values 'steps', 'sleep', 'weight' — would promote the literals to str. Now pyrefly preserves them as Literal['sleep'] | Literal['steps'] | Literal['weight'], causing the inferred type of result to be set[Literal['sleep', 'steps', 'weight']] instead of set[str]. This then triggers the bad-return error because set is invariant.
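
A reduced sketch of the pattern described for core is shown here. The `enabled` parameter and the filtering condition are assumptions added to make the example self-contained; only the constants, the function name, and the set-building loop come from the analysis above.

```python
STEP_GOAL = "steps"
SLEEP_GOAL = "sleep"
WEIGHT_GOAL = "weight"


def get_current_goals(enabled: dict[str, bool]) -> set[str]:
    """Collect the goals that are switched on (hypothetical condition)."""
    result = set()
    for goal in (STEP_GOAL, SLEEP_GOAL, WEIGHT_GOAL):
        if enabled.get(goal):
            # With the PR, goal keeps its literal type, so result is
            # inferred as set[Literal['sleep', 'steps', 'weight']].
            result.add(goal)
    # Reported as bad-return: the inferred set type is not assignable
    # to set[str] because set is invariant.
    return result
```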

parso (+1)

False positive from overly precise literal inference on list(): The PR preserves literal unions through TypeVar solving, which causes list(ALLOWED_FUTURES) to be typed as list[Literal[...specific values...]] instead of list[str]. This prevents appending new string values that aren't in the original literal union, which is a common and valid pattern. This is a regression — a new false positive.

Overall: The analysis is factually correct. The key claims are:

  1. ALLOWED_FUTURES is a tuple of string literals (lines 16-19), which is correct.
  2. list(ALLOWED_FUTURES) on line 751 would, with literal type preservation through TypeVar solving, produce list[Literal['absolute_import', 'division', 'generator_stop', 'generators', 'nested_scopes', 'print_function', 'unicode_literals', 'with_statement']] instead of list[str]. This is correct because list() is defined as class list(Generic[T]) with __init__(self, iterable: Iterable[T]), and if the tuple type preserves its literal element types, the TypeVar T would be solved to the union of those literal types.
  3. allowed_futures.append('annotations') on line 753 would then fail because Literal['annotations'] is not assignable to the parameter type which would be the literal union type. The list.append method signature is append(self, object: T) -> None, and Literal['annotations'] is not a member of that literal union. This is correct.
  4. The error message confirms this: it says Literal['annotations'] is not assignable to the parameter type which is the union of the specific literal values.
  5. The annotation [mypy: no, pyright: yes] is consistent with the claim that pyright also flags this but mypy does not.
  6. The code is clearly valid at runtime - it creates a mutable list copy and conditionally appends to it.
  7. The characterization as a false positive from overly precise literal inference is accurate.

All factual claims check out against the source code and Python type system behavior.

Attribution: The change in pyrefly/lib/solver/solver.rs in the Subset impl specifically preserves unions of implicit literals through TypeVar solving instead of promoting them to their base type (str). This means when list(ALLOWED_FUTURES) is called, the tuple type tuple[Literal['nested_scopes'], Literal['generators'], ...] gets preserved as list[Literal['nested_scopes'] | Literal['generators'] | ...] instead of being promoted to list[str]. This causes the subsequent allowed_futures.append('annotations') to fail because Literal['annotations'] is not in the inferred literal union.
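
The parso pattern can be sketched as follows. The tuple is shortened, and the function wrapper and version check are hypothetical; the copy-then-append shape is what the analysis describes.

```python
# Shortened, illustrative version of the tuple of allowed __future__ names.
ALLOWED_FUTURES = ("nested_scopes", "generators", "division")


def allowed_futures_for(version):
    """Return a mutable copy, extended for newer versions (hypothetical rule)."""
    # With the PR, this is inferred as a list of the literal union
    # rather than list[str].
    allowed = list(ALLOWED_FUTURES)
    if version >= (3, 7):
        # Reported as bad-argument-type: Literal['annotations'] is not
        # a member of the inferred literal union.
        allowed.append("annotations")
    return allowed
```

The original tuple is never mutated, which is why the code is valid at runtime regardless of the inferred element type.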

✅ Improvement (1)

jax (+1)

dict invariance with preserved Literal key types: The PR preserves Literal types through TypeVar solving, causing dict(zip(names, config)) to infer dict[Literal['block_k', ...], int] instead of dict[str, int]. Due to dict invariance, this more precise type is not assignable to the annotated dict[str, int]. Pyright agrees this is a type error. This is a correct error arising from improved type inference — the code genuinely has a type-level invariance mismatch, even though it works at runtime.

Overall: This is a real type-level issue caused by dict invariance. dict is invariant in its key type per the typing spec (https://typing.readthedocs.io/en/latest/spec/generics.html#variance). A dict[Literal['block_k', ...], int] is not assignable to dict[str, int] because code could write best_kwargs['arbitrary_key'] = 42 through the dict[str, int] reference, which would violate the more specific type. The error is a direct consequence of the PR's change to preserve literal types through TypeVar solving: the more precise inference is correct, and the resulting error is a genuine (if pedantic) invariance violation. The variable is annotated dict[str, int] and the inferred type is more specific, which is a known friction point with dict invariance. Since pyright also flags this, pyrefly is now correctly catching a real type issue thanks to better inference.

Attribution: The change in pyrefly/lib/solver/solver.rs in the Subset impl preserves literal types through TypeVar solving for unions of implicit literals. Previously, dict(zip(names, config)) where names is a tuple of literal strings would have the key type promoted to str, producing dict[str, int]. Now, the literal types are preserved, producing dict[Literal['block_k', 'block_m', 'block_n', 'grid_block_n', 'max_concurrent_steps'], int]. This more precise type then fails the invariance check when assigned to best_kwargs: dict[str, int].
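
The invariance argument can be made concrete with a small sketch. The names `precise` and `add_key` are hypothetical and the key list is shortened; the example only shows why a literal-keyed dict cannot safely stand in for dict[str, int].

```python
names = ("block_m", "block_n", "block_k")
config = (128, 64, 32)

# With the PR, inferred as a dict keyed by the literal union of names.
precise = dict(zip(names, config))


def add_key(kwargs: dict[str, int]) -> None:
    # Legal for dict[str, int]: any str key may be written.
    kwargs["arbitrary_key"] = 42


# A checker that enforces invariance rejects this call. At runtime it
# succeeds and leaves a key outside the literal set in the dict.
add_key(precise)
```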

Suggested fixes

Summary: The PR's change to preserve literal unions through TypeVar solving causes false positives when literal-typed collections interact with invariant generic containers (dict, set, list) and their mutation methods (update, append, add).

1. In the Subset impl in pyrefly/lib/solver/solver.rs (https://github.com/facebook/pyrefly/blob/main/pyrefly/lib/solver/solver.rs), the guard condition matches!(t1, Type::Union(u) if u.members.iter().all(|t| t.is_implicit_literal())) is too broad: it preserves literal unions in ALL TypeVar solving contexts, including when solving the type parameter for mutable containers like list(), dict(), set(), and their methods. The fix should narrow the scope: only preserve literal unions when the TypeVar is being solved in a covariant or read-only context (e.g., function parameter matching like enumerate), NOT when it is being used to determine the type parameter of a mutable generic container.

A more targeted approach: in addition to checking whether t1 is a union of all implicit literals, check whether the quantified variable q is covariant. If the TypeVar is invariant (as it is for list[T], dict[K, V], set[T]), still promote implicit literals to their base types. Pseudo-code: change the condition from if matches!(t1, Type::Union(u) if u.members.iter().all(|t| t.is_implicit_literal())) to if matches!(t1, Type::Union(u) if u.members.iter().all(|t| t.is_implicit_literal())) && q.kind() == QuantifiedKind::Covariant. This way, enumerate (which uses covariant iteration) preserves literals, but list(), dict(), set() (which have invariant type parameters) promote literals to their base types.

Files: pyrefly/lib/solver/solver.rs
Confidence: medium
Affected projects: vision, pandas, core, parso
Fixes: no-matching-overload, bad-return, bad-argument-type
The root cause is that the literal preservation applies uniformly to all TypeVar solving, but the problematic cases all involve invariant type parameters of mutable containers. For enumerate, the TypeVar flows through Iterator/Iterable, which are covariant in their yield type, so preserving literals is correct. For list(), dict(), set(), the type parameters are invariant, and preserving literals creates types that are too narrow: you can't append/update with values outside the literal union. Checking variance of the quantified variable would preserve the test case (enumerate over literal tuple) while avoiding the regressions. This would eliminate 3 errors in vision, 1 in pandas (ops.py), 1 in core, and 1 in parso (6 new errors across 4 projects; the parso one is also flagged by pyright per the analysis above). However, the variance information may not be directly available on q at this point in the code, so an alternative approach may be needed; see the next suggestion.
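
The variance-gated rule can be illustrated with a toy model. This is not pyrefly's solver: `Variance`, `Lit`, and `solve_typevar` are hypothetical names invented for this sketch, and types are represented as plain strings purely to show the decision.

```python
from dataclasses import dataclass
from enum import Enum


class Variance(Enum):
    COVARIANT = "covariant"
    INVARIANT = "invariant"


@dataclass(frozen=True)
class Lit:
    """Toy implicit literal, e.g. Lit("a") stands for Literal['a']."""

    value: object

    @property
    def base(self) -> str:
        # Name of the type the literal would be promoted to.
        return type(self.value).__name__


def solve_typevar(members: tuple[Lit, ...], variance: Variance) -> tuple[str, ...]:
    """Return the members of the solved union, rendered as strings."""
    if variance is Variance.COVARIANT:
        # Read-only position: keep the literal union as-is.
        return tuple(f"Literal[{m.value!r}]" for m in members)
    # Invariant position: promote each literal and drop duplicates.
    return tuple(dict.fromkeys(m.base for m in members))
```

Under this model, a covariant solve of ("a", "b") keeps both literals, while an invariant solve collapses to a single `str` member, which matches the split between enumerate and list()/dict()/set() proposed above.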

2. Alternative simpler fix: In the Subset impl in pyrefly/lib/solver/solver.rs, instead of completely skipping literal promotion for unions of all implicit literals, promote individual implicit literals but preserve the union structure. The real issue the PR tries to solve is that enumerate over tuple[Literal['a'], Literal['b']] was promoting each element to str before unioning. A better approach: only skip promotion when is_partial is false (i.e., for fully quantified TypeVars that appear in covariant positions like iterator yields), or alternatively, revert this change entirely and fix the enumerate case differently — e.g., by ensuring tuple element types are preserved when iterating tuples in enumerate. The simplest safe fix is to revert the guard and instead handle the enumerate case specifically in tuple iteration logic, where the tuple's element types should be preserved as-is when yielded through __iter__.

Files: pyrefly/lib/solver/solver.rs
Confidence: medium
Affected projects: vision, pandas, core, parso
Fixes: no-matching-overload, bad-return, bad-argument-type
The test case test_enumerate_preserves_literal_type shows the desired behavior: iterating a tuple[Literal['a'], Literal['b']] should yield Literal['a'] | Literal['b'], not str. This should already work if tuple iteration correctly produces the union of element types without going through literal promotion. The promotion happens during TypeVar solving in Subset, but if the tuple's __iter__ return type already carries the literal union, it shouldn't need special handling in the solver. The current fix is too broad: it affects ALL TypeVar solving with literal unions, causing 6 new false positives across 4 projects (plus one changed error message) while fixing 1 case. A more targeted fix in tuple iteration would be safer.

3. Most conservative fix: In the Subset impl in pyrefly/lib/solver/solver.rs, add an additional condition to the literal-preservation guard that checks whether the TypeVar being solved (q) is used as a type parameter of a known mutable container class. Specifically, change the condition to also require that is_partial is true (i.e., PartialQuantified), since partial quantification is used during incremental type inference where preserving precision matters, while full quantification (Quantified) is used for function signatures where promotion is safer. Pseudo-code: if matches!(t1, Type::Union(u) if u.members.iter().all(|t| t.is_implicit_literal())) && is_partial.

Files: pyrefly/lib/solver/solver.rs
Confidence: low
Affected projects: vision, pandas, core, parso
Fixes: no-matching-overload, bad-return, bad-argument-type
This is speculative — the is_partial flag may not correctly distinguish the enumerate case from the container construction cases. But it's worth investigating since the code already extracts is_partial right before the literal promotion logic. If partial quantification corresponds to the intermediate inference steps (like inferring tuple element types during iteration) while full quantification corresponds to solving constructor/method TypeVars, this would correctly preserve literals for enumerate while promoting them for list/dict/set constructors.



Classification by primer-classifier (5 LLM)

migeed-z closed this Mar 24, 2026
Development

Successfully merging this pull request may close these issues.

Iterating over enumerated tuple loses typing information
