chore(typing): enable ty checks for unit tests by jxnl · Pull Request #1982 · 567-labs/instructor

jxnl · 2026-01-04T15:08:32Z

This starts expanding ty coverage to the test suite.

What changed

Add ty-tests.toml and run uv run ty check --config-file ty-tests.toml tests in CI.
- Excludes: tests/llm/, tests/v2/, tests/docs/ (tracked separately).
Fix ty errors in unit tests by tightening mocks/casts and removing patterns ty can’t infer.
- tests/test_function_calls.py: build typed ChatCompletion objects instead of dicts; add casts where from_response types are too narrow.
- tests/test_exceptions.py: avoid direct assignment to NamedTuple fields; add None narrowing for failed_attempts.
- tests/test_json_extraction.py, tests/processing/test_anthropic_json.py: cast parse helpers to concrete schema types.
- tests/test_retry_json_mode.py: narrow pytest exception type.
- tests/test_simple_types.py, tests/test_batch_in_memory.py, tests/test_multimodal.py: add targeted type: ignore[arg-type] where tests intentionally pass wrong types.

Follow-ups

tests/llm/ and tests/v2/ still have ty diagnostics and remain excluded for now.

Important

Enable type checking for unit tests with ty-tests.toml and fix type errors in various test files.

Type Checking:
- Adds ty-tests.toml for type checking configuration, excluding tests/llm/, tests/v2/, tests/docs/.
- Updates CI in .github/workflows/ty.yml to run uv run ty check --config-file ty-tests.toml tests.
Type Error Fixes:
- tests/test_function_calls.py: Builds typed ChatCompletion objects; adds casts for from_response.
- tests/test_exceptions.py: Avoids direct assignment to NamedTuple fields; narrows None for failed_attempts.
- tests/test_json_extraction.py, tests/processing/test_anthropic_json.py: Casts parse helpers to schema types.
- tests/test_retry_json_mode.py: Narrows pytest exception type.
- tests/test_simple_types.py, tests/test_batch_in_memory.py, tests/test_multimodal.py: Adds type: ignore[arg-type] for intentional wrong types.
Follow-ups:
- tests/llm/ and tests/v2/ still have ty diagnostics and remain excluded for now.

^{This description was created by}^{for 7498512. You can customize this summary. It will automatically update as commits are pushed.}

cloudflare-workers-and-pages · 2026-01-04T15:10:02Z

Deploying with Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

Status	Name	Latest Commit	Preview URL	Updated (UTC)
✅ Deployment successful! View logs	instructor	`eb173c2`	Commit Preview URL Branch Preview URL	Jan 08 2026, 04:13 PM

ellipsis-dev

Important

Looks good to me! 👍

Reviewed everything up to 7498512 in 2 minutes and 5 seconds. Click for details.

Reviewed 499 lines of code in 17 files
Skipped 0 files when reviewing.
Skipped posting 15 draft comments. View those below.
Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.

1. .github/workflows/ty.yml:28

Draft comment:
The CI workflow now includes a ty check for tests using a custom config file. This is a good addition. Make sure to update the excluded directories in ty-tests.toml as typing support for tests improves.
Reason this comment was not posted:
Comment did not seem useful. Confidence is useful = 0% <= threshold 85% The comment is asking the author to update a configuration file as typing support improves. This is a suggestion related to maintaining the codebase as changes occur, which is not directly actionable or specific to the current PR. It doesn't align with the rules provided, as it doesn't suggest a specific code change or test addition.

2. tests/docs/test_mkdocs.py:13

Draft comment:
Using cast(Any, importlib.import_module('mktestdocs')) improves type inference for an untyped module. Consider adding an inline comment explaining that mktestdocs lacks type annotations.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 30% vs. threshold = 85% The comment is suggesting adding an explanatory inline comment. While this could be helpful, I need to consider: 1) The rules state "Extremely Complicated Code Needs Comments" - is this code extremely complicated? Not really, it's a fairly straightforward workaround. 2) The use of cast(Any, ...) is already self-documenting to some extent - it signals to readers that there's a type issue. 3) This is in a test file that's currently skipped, so the importance is lower. 4) The comment is somewhat minor/optional - it's not pointing out a bug or critical issue. 5) The rules say "Do NOT comment unless there is clearly a code change required" and "Do NOT make comments that are obvious or unimportant." This suggestion is borderline - it's a minor quality improvement but not clearly required. The code pattern of using cast(Any, importlib.import_module(...)) might not be immediately obvious to all developers, and adding a comment could genuinely improve code maintainability. The suggestion is actionable and specific, which aligns with the rule that "Comments that suggest code quality refactors are good! But only if they are actionable and clear." While the suggestion is actionable, the use of cast(Any, ...) is already a strong signal that there's a type issue, and experienced Python developers would understand this pattern. The comment doesn't point to a bug or necessary change, just a minor documentation improvement. Given the rule to "Do NOT comment unless there is clearly a code change required" and that this is a low-priority improvement in a skipped test, this comment is likely not meeting the bar for "clearly required." This comment suggests a minor documentation improvement that is not clearly required. While it could be helpful, it falls into the category of optional/nice-to-have rather than necessary. The code pattern is already somewhat self-documenting through the use of cast(Any, ...). Given the rules emphasizing only commenting when changes are clearly required, this comment should be deleted.

3. tests/llm/test_core_providers/conftest.py:5

Draft comment:
Removing sys.path manipulation and directly importing from tests.llm.shared_config makes the code cleaner and avoids manual path adjustments.
Reason this comment was not posted:
Comment did not seem useful. Confidence is useful = 0% <= threshold 85% This comment is purely informative and does not provide any actionable feedback or suggestions for improvement. It simply states that the change is good, which violates the rule against making purely informative comments.

4. tests/processing/test_anthropic_json.py:45

Draft comment:
Casting the result of _AnthropicTestModel.parse_anthropic_json to _AnthropicTestModel improves type safety. This explicit cast is a good workaround for type narrowing.
Reason this comment was not posted:
Comment did not seem useful. Confidence is useful = 0% <= threshold 85% This comment is purely informative and does not provide any actionable feedback or suggestions for improvement. It simply praises a change without offering any constructive criticism or questions.

5. tests/test_batch_in_memory.py:148

Draft comment:
Using 'type: ignore[arg-type]' for intentionally invalid type inputs clarifies the intent of the test. This targeted ignore is acceptable for testing error conditions.
Reason this comment was not posted:
Comment did not seem useful. Confidence is useful = 0% <= threshold 85% This comment is purely informative and does not provide any actionable feedback or suggestions for improvement. It simply acknowledges the use of a specific type ignore directive, which does not align with the rules provided.

6. tests/test_exceptions.py:418

Draft comment:
Adding assertions that 'failed_attempts is not None' improves the robustness of exception tests, ensuring detailed exception information is preserved.
Reason this comment was not posted:
Comment did not seem useful. Confidence is useful = 0% <= threshold 85% This comment is purely informative and does not provide any actionable feedback or suggestions for improvement. It simply states that adding assertions improves robustness, which is not aligned with the rules provided.

7. tests/test_function_calls.py:22

Draft comment:
Updating the mock fixtures with proper casting (e.g. using cast(dict[str, Any], request.param)) enhances type safety in test_function_calls. This pattern is consistently applied and is a pragmatic solution.
Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.

8. tests/test_json_extraction.py:307

Draft comment:
Casting the result of PersonSchema.parse_bedrock_json to PersonSchema helps satisfy the type checker. Consider adding a brief comment on why the cast is necessary.
Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.

9. tests/test_multimodal.py:189

Draft comment:
Using 'type: ignore[arg-type]' in tests for intentionally passing wrong types is acceptable. It clearly documents that the wrong type is expected for error testing.
Reason this comment was not posted:
Comment did not seem useful. Confidence is useful = 0% <= threshold 85% This comment is purely informative and does not provide any actionable feedback or suggestions for improvement. It simply states that using type: ignore[arg-type] is acceptable in tests, which is not necessary for the PR author to know.

10. tests/test_multitask.py:13

Draft comment:
Casting IterableModel(Search) to type[OpenAISchema] is a pragmatic solution to satisfy the type-checker. This clearly indicates the expected schema type.
Reason this comment was not posted:
Comment did not seem useful. Confidence is useful = 0% <= threshold 85% This comment is purely informative and does not provide any actionable feedback or suggestions for improvement. It simply states that the casting is a good solution, which does not align with the rules provided.

11. tests/test_retry_json_mode.py:87

Draft comment:
Explicitly casting exc_info.value to InstructorRetryException clarifies the expected exception type in retry tests. This improves type clarity.
Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.

12. tests/test_simple_types.py:25

Draft comment:
The error message in test_partial_not_simple now hard-codes 'Partial[int]'. This might limit debugging if the actual type information is different. Consider printing dynamic type information if appropriate.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 85% vs. threshold = 85% The diff shows that the code was changed from str(Partial[int]) to the hardcoded string "Partial[int]". But wait - the actual assertion is testing Partial[SampleModel], not Partial[int]. So there's a mismatch. The comment is pointing out that the error message is now hardcoded and suggests using str(Partial[SampleModel]) instead, which would be the correct type. This is a legitimate bug - the error message would be misleading if the test fails. The comment is about a change made in the diff and identifies a real issue. Could the original code have intentionally had Partial[int] in the error message for some reason? Maybe it was already wrong before this change, and the PR author just simplified it without realizing the error. If the bug existed before, should we still comment on it? Looking at the diff, the original code had str(Partial[int]) which would have evaluated to the string "Partial[int]" at runtime anyway. So the change from dynamic to hardcoded didn't introduce the bug - the bug already existed. However, the comment is still about the changed line and suggests a fix that would correct the error message to match the actual type being tested. The comment correctly identifies that the error message doesn't match the actual type being tested (says "Partial[int]" but tests "Partial[SampleModel]"). While the bug may have existed before, the comment is about a changed line and provides an actionable fix. This is a legitimate issue that should be addressed.

13. tests/v2/test_provider_modes.py:21

Draft comment:
Loading instructor.v2 with importlib and casting to Any is a reasonable temporary workaround for missing type definitions. Monitor this code for updates when v2 becomes fully typed.
Reason this comment was not posted:
Comment did not seem useful. Confidence is useful = 0% <= threshold 85% This comment is purely informative and does not provide a specific suggestion or request for action. It does not align with the rules for useful comments, as it does not ask for a specific change or confirmation of intent.

14. ty-tests.toml:3

Draft comment:
The new ty-tests.toml configuration file correctly excludes problematic directories and respects ignore files. This is a good setup for gradually expanding type coverage in tests.
Reason this comment was not posted:
Comment did not seem useful. Confidence is useful = 0% <= threshold 85% This comment is purely informative and does not provide any actionable feedback or suggestions for improvement. It simply praises the current setup without offering any specific guidance or questions.

15. tests/docs/test_mkdocs.py:10

Draft comment:
There is an extraneous ): on line 10 that appears to be a typographical error. It should be corrected or removed.
Reason this comment was not posted:
Comment was on unchanged code.

Workflow ID: wflow_bbmT4gxGHjVh6b4x

^{You can customize}^{by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.}

chore(typing): add ty checks for tests

7498512

github-actions bot added enhancement New feature or request python Pull requests that update python code size:L This PR changes 100-499 lines, ignoring generated files. labels Jan 4, 2026

ellipsis-dev bot reviewed Jan 4, 2026

View reviewed changes

Merge branch 'main' into chore/ty-tests

eb173c2

jxnl closed this Jan 24, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

chore(typing): enable ty checks for unit tests#1982

chore(typing): enable ty checks for unit tests#1982
jxnl wants to merge 2 commits intomainfrom
chore/ty-tests

jxnl commented Jan 4, 2026 •

edited by ellipsis-dev bot

Loading

Uh oh!

cloudflare-workers-and-pages bot commented Jan 4, 2026 •

edited

Loading

Uh oh!

ellipsis-dev bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

jxnl commented Jan 4, 2026 • edited by ellipsis-dev bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

cloudflare-workers-and-pages bot commented Jan 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Deploying with Cloudflare Workers

Uh oh!

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

jxnl commented Jan 4, 2026 •

edited by ellipsis-dev bot

Loading

cloudflare-workers-and-pages bot commented Jan 4, 2026 •

edited

Loading