Conversation

@hsaeed3
@hsaeed3 hsaeed3 commented Jan 6, 2026

TOON (Token-Oriented Object Notation) is a YAML-like format that achieves
~17-28% token reduction compared to the JSON/MD_JSON modes.
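
Most of the saving comes from TOON's tabular form for uniform arrays, which declares the keys once in a header instead of repeating them for every object. The following is an illustrative sketch only, not the toon-format library's actual encoder:

```python
import json

def toon_like_table(name: str, rows: list[dict]) -> str:
    """Render a uniform list of objects in a TOON-style tabular form.

    Sketch only: the real TOON encoder handles nesting, quoting, and
    mixed shapes; here we assume every row has the same flat keys.
    """
    keys = list(rows[0].keys())
    # Keys appear once in the header, e.g. users[2]{id,name,role}:
    header = f"{name}[{len(rows)}]{{{','.join(keys)}}}:"
    lines = ["  " + ",".join(str(row[k]) for k in keys) for row in rows]
    return "\n".join([header, *lines])

users = [
    {"id": 1, "name": "Ada", "role": "admin"},
    {"id": 2, "name": "Bob", "role": "user"},
]
as_json = json.dumps({"users": users})
as_toon = toon_like_table("users", users)
print(len(as_json), len(as_toon))  # the TOON-like form is noticeably shorter
```

Because every repeated field name and most of the JSON punctuation disappears, the character (and token) count drops roughly in line with the percentages quoted above for array-heavy payloads.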

Core implementation:

  • Add Mode.TOON and provider-specific modes (ANTHROPIC_TOON, MISTRAL_TOON,
    GENAI_TOON, COHERE_TOON, BEDROCK_TOON, XAI_TOON)
  • Centralized TOON logic in processing/toon.py for maintainability
  • Recursive structure generation for nested Pydantic models
  • Support for all type annotations: Enum, Literal, Union, Optional, Annotated
  • Streaming support via extract_code_block_from_stream utilities
  • Iterable streaming support in dsl/iterable.py
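
As a rough illustration of the recursive structure generation listed above, here is a self-contained sketch using plain dataclasses in place of Pydantic models; the real logic lives in processing/toon.py and is not reproduced here:

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import Optional, Union, get_args, get_origin

def toon_structure(model: type, indent: int = 0) -> str:
    """Recursively render a TOON-style field skeleton for a model.

    Simplified sketch: handles nested dataclasses and Union/Optional
    annotations only, whereas the PR also covers Enum, Literal, and
    Annotated types.
    """
    pad = "  " * indent
    lines = []
    for f in fields(model):
        tp = f.type
        if is_dataclass(tp):
            # Nested model: recurse one indentation level deeper.
            lines.append(f"{pad}{f.name}:")
            lines.append(toon_structure(tp, indent + 1))
        elif get_origin(tp) is Union:
            # Optional[X] / Union[...] rendered as alternatives.
            names = "|".join(getattr(a, "__name__", str(a)) for a in get_args(tp))
            lines.append(f"{pad}{f.name}: {names}")
        else:
            lines.append(f"{pad}{f.name}: {tp.__name__}")
    return "\n".join(lines)

@dataclass
class Address:
    city: str

@dataclass
class User:
    name: str
    age: Optional[int]
    address: Address

print(toon_structure(User))
```

The recursion is what lets arbitrarily nested models produce a schema the model can imitate, without any provider-specific code.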

Supported providers:

  • OpenAI, OpenRouter, Together, Anyscale, Groq (Mode.TOON)
  • Anthropic Claude (Mode.ANTHROPIC_TOON)
  • Mistral (Mode.MISTRAL_TOON)
  • Google GenAI/Gemini (Mode.GENAI_TOON)
  • Cohere Command (Mode.COHERE_TOON)
  • AWS Bedrock (Mode.BEDROCK_TOON)
  • xAI Grok (Mode.XAI_TOON)
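
The streaming support mentioned under the core implementation can be sketched as an incremental code-fence extractor. The names and behavior of instructor's actual extract_code_block_from_stream utilities may differ; this is a simplified stand-in:

```python
from collections.abc import Iterable, Iterator

def extract_code_block(chunks: Iterable[str]) -> Iterator[str]:
    """Yield the text inside the first ``` fenced block as chunks arrive.

    Simplified sketch: assumes fences arrive on their own lines and
    ignores language tags split across chunk boundaries.
    """
    buffer = ""
    inside = False
    for chunk in chunks:
        buffer += chunk
        while "\n" in buffer:
            line, buffer = buffer.split("\n", 1)
            if line.strip().startswith("```"):
                if inside:
                    return  # closing fence: stop streaming
                inside = True  # opening fence (e.g. the ```toon line)
            elif inside:
                yield line + "\n"

stream = ["Here is the output:\n```toon\n", "name: Ada\n", "age: 36\n```\nDone."]
print("".join(extract_code_block(stream)))  # prints "name: Ada" then "age: 36"
```

Yielding lines as soon as the opening fence is seen is what allows partial TOON output to feed Partial/Iterable streaming before the response finishes.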

Refactored (from original PR):

  • Consolidated duplicate TOON logic from provider utils into processing/toon.py
  • Unified system prompt and reask message generation across all providers
  • Added Mode.toon_modes() classification method
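
A hypothetical sketch of what a toon_modes() classifier could look like; the actual instructor Mode enum has many more members and its implementation may differ:

```python
from enum import Enum

class Mode(Enum):
    # TOON modes (subset shown; member names are taken from this PR).
    TOON = "toon"
    ANTHROPIC_TOON = "anthropic_toon"
    MISTRAL_TOON = "mistral_toon"
    GENAI_TOON = "genai_toon"
    COHERE_TOON = "cohere_toon"
    BEDROCK_TOON = "bedrock_toon"
    XAI_TOON = "xai_toon"
    # Non-TOON mode, for contrast.
    JSON = "json"

    @classmethod
    def toon_modes(cls) -> set["Mode"]:
        """Return every mode that marks TOON-formatted output."""
        return {m for m in cls if m.name.endswith("TOON")}
```

Centralizing the classification this way lets shared code ask "is this a TOON mode?" once, instead of each provider handler hard-coding its own membership check.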

Includes:

  • Comprehensive unit tests for TOON parsing and type coercion
  • Example scripts for each provider in examples/toon/
  • Documentation updates to modes-comparison.md

Requires: toon-format library (https://github.com/toon-format/toon-python)

This PR was written by Cursor


Important

Adds TOON mode for token-efficient structured outputs across major providers, with core logic in processing/toon.py and updates to tests and documentation.

  • Behavior:
    • Introduces Mode.TOON and provider-specific modes like ANTHROPIC_TOON, MISTRAL_TOON, etc.
    • Implements TOON logic in processing/toon.py for structured output generation.
    • Supports nested Pydantic models and type annotations like Enum, Literal, Union.
    • Adds streaming support via extract_code_block_from_stream utilities.
  • Providers:
    • Adds TOON mode support for OpenAI, Anthropic, Mistral, Google GenAI, Cohere, AWS Bedrock, and xAI.
    • Updates provider-specific handlers in utils.py files for each provider.
  • Tests:
    • Adds comprehensive tests for TOON mode in test_toon_mode.py.
  • Documentation:
    • Updates modes-comparison.md with TOON mode details.
  • Misc:
    • Requires toon-format library for TOON mode functionality.

This description was created by Ellipsis for a70bd3b. You can customize this summary. It will automatically update as commits are pushed.

@ellipsis-dev ellipsis-dev bot left a comment

Important

Looks good to me! 👍

Reviewed everything up to a70bd3b in 3 minutes and 3 seconds. Click for details.
  • Reviewed 2990 lines of code in 30 files
  • Skipped 0 files when reviewing.
  • Skipped posting 16 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. instructor/processing/toon.py:18
  • Draft comment:
    Good use of type annotations and dedent for system prompt. Consider adding more inline comments in complex functions (e.g., _format_type_for_toon and generate_toon_structure) to clarify the handling of various type annotations.
  • Reason this comment was not posted:
    Confidence changes required: 80% <= threshold 85% None
2. instructor/dsl/partial.py:174
  • Draft comment:
    Nice integration of TOON mode in from_streaming_response_async. Ensure similar test coverage is maintained for both streaming and non-streaming TOON outputs.
  • Reason this comment was not posted:
    Confidence changes required: 70% <= threshold 85% None
3. instructor/mode.py:26
  • Draft comment:
    New TOON and provider-specific TOON modes are clearly defined. Consider documenting in the enum docstring the token reduction benefit for each mode.
  • Reason this comment was not posted:
    Confidence changes required: 80% <= threshold 85% None
4. instructor/processing/function_calls.py:913
  • Draft comment:
    The new parse_toon method is well structured. Consider logging detailed errors on decode failures to aid troubleshooting.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 35% vs. threshold = 85%.

    The comment is about adding logging to aid troubleshooting. Looking at the parse_toon method (lines 917-974), I see that decode failures are caught at line 963 and wrapped in a ResponseParsingError. However, there's no logger.debug() or logger.error() call before raising the exception. Comparing this to other methods like _validate_model_from_json (lines 105-110), I see they do log errors before re-raising. This suggests the comment has merit: adding logging would be consistent with the codebase pattern and would help with debugging. The comment is about code added in the diff (parse_toon is a new method), so it's about changes. It's actionable and follows good practices.

    On the other hand, the comment is somewhat vague; it says "consider logging detailed errors" but doesn't specify what level (debug, error, warning) or exactly what information should be logged. Also, the exception already includes the error message in the ResponseParsingError, so the value of additional logging might be debatable. The error is being raised with context, which might be sufficient for troubleshooting.

    While the comment could be more specific, it follows an established pattern in the codebase where other parsing methods log errors before raising exceptions (see lines 106, 109), so the suggestion is actionable and would improve consistency. However, since the exception already includes detailed information and will be caught/logged at a higher level, the additional logging might be redundant. This is more of a "nice to have" than a critical issue: a code quality improvement that would make the code more consistent with other methods in the file.

    The comment is somewhat vague and could be more specific about what to log and at what level. Given the rules that comments should be clearly actionable and not speculative, and that we should only keep comments with strong evidence they're correct, this falls into a gray area.
5. tests/test_toon_mode.py:171
  • Draft comment:
    Tests for the TOON handling changes are comprehensive. Make sure edge cases such as missing system messages or non-string message content are also covered.
  • Reason this comment was not posted:
    Confidence changes required: 80% <= threshold 85% None
6. instructor/providers/anthropic/utils.py:485
  • Draft comment:
    The reask handler for Anthropic TOON mode uses get_toon_reask_message appropriately; ensure that the fallback text when no text is available is sufficient.
  • Reason this comment was not posted:
    Confidence changes required: 75% <= threshold 85% None
7. instructor/providers/cohere/utils.py:90
  • Draft comment:
    Reask for Cohere TOON mode handles both V1 and V2 nicely. Consider unifying similar logic in reask functions across providers to reduce duplication.
  • Reason this comment was not posted:
    Confidence changes required: 70% <= threshold 85% None
8. instructor/providers/gemini/utils.py:115
  • Draft comment:
    In the function handle_genai_structured_outputs, consider adding inline comments to explain merging of user thinking_config, as this is critical for GenAI output.
  • Reason this comment was not posted:
    Confidence changes required: 80% <= threshold 85% None
9. instructor/providers/openai/utils.py:552
  • Draft comment:
    In the OPENAI_HANDLERS registry, the TOON entry is added. Consider validating that the reask message clearly instructs users to reformat the output with the TOON schema.
  • Reason this comment was not posted:
    Confidence changes required: 80% <= threshold 85% None
10. instructor/providers/xai/utils.py:139
  • Draft comment:
    The conversion function _convert_messages is simple. Consider expanding to support additional content types (beyond string) if needed later.
  • Reason this comment was not posted:
    Confidence changes required: 70% <= threshold 85% None
11. instructor/utils/__init__.py:62
  • Draft comment:
    The re-export and lazy import pattern is nicely implemented. Ensure that new TOON-related exports are added as needed.
  • Reason this comment was not posted:
    Confidence changes required: 70% <= threshold 85% None
12. instructor/utils/core.py:703
  • Draft comment:
    In prepare_response_model, the check for Iterable types might need to ensure the correct use of get_origin. Consider verifying that the case when response_model is e.g. List[...] is correctly handled.
  • Reason this comment was not posted:
    Confidence changes required: 75% <= threshold 85% None
13. tests/test_toon_mode.py:560
  • Draft comment:
    Comprehensive tests cover structure generation and reask behavior. Consider adding tests for cases where the TOON extraction fails (e.g. invalid code block formatting).
  • Reason this comment was not posted:
    Confidence changes required: 80% <= threshold 85% None
14. examples/toon/run_mistral.py:48
  • Draft comment:
    Typo: In the comment, "this wasnt pulling key on it's own" should be corrected to "this wasn't pulling the key on its own". Consider fixing the missing apostrophe in "wasn’t" and replacing "it's" with "its" for proper possessive.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 85%.

    This is a grammar/spelling correction in a code comment. While technically correct, I need to check if this violates the rules. The rules say "Do NOT comment unless there is clearly a code change required" and "Do NOT make comments that are obvious or unimportant." Grammar fixes in comments are not code changes; they don't affect functionality. This is a very minor issue in an informal NOTE comment, and the comment is also somewhat pedantic and doesn't improve code quality or functionality. This seems like an unimportant, obvious issue that doesn't require a code change to fix.

    Could this be considered important for code quality or documentation standards? Some teams have strict standards for grammar in comments, especially in example files that users might read. But this is an informal NOTE comment (not user-facing documentation), and the rules explicitly state not to make obvious or unimportant comments. Grammar nitpicks in informal code comments fall into this category, and the meaning is perfectly clear despite the minor grammar issues.

    This comment should be deleted. It's a minor grammar correction in an informal code comment that doesn't affect functionality or code quality in any meaningful way, and it violates the rule about not making obvious or unimportant comments.
15. instructor/providers/anthropic/utils.py:501
  • Draft comment:
    There's a minor grammatical error in the assertion message. Consider changing "Response must be a Anthropic Message" to "Response must be an Anthropic Message".
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 70% vs. threshold = 85%.

    The comment is about line 501, which is in the new reask_anthropic_toon function that was added in this diff, so it IS about changed code, and the grammatical correction is technically correct. However, looking at the rules, I need to consider: 1) Is this an obvious or unimportant comment? It's a very minor grammatical fix in an error message. 2) The same error exists in lines 151 and 200 (in unchanged code), so fixing only this one creates inconsistency. 3) The rules say "Do NOT make comments that are obvious or unimportant," and a grammatical article in an assertion message that will rarely be seen (only when there's a type error) seems quite minor. 4) However, the rules also mention "Assertions should always have an error message that is formatted well," which could support keeping the comment.

    While the comment is technically correct and the code was changed in this diff, this is an extremely minor grammatical issue in an assertion message, and the same error exists elsewhere in the file (lines 151, 200) in unchanged code, so fixing only this instance creates inconsistency. The rule about assertions having well-formatted error messages could support it, but the rule about not making obvious or unimportant comments might override it.

    On balance, the rule that assertions should have well-formatted error messages applies directly here. Even though it's minor, proper grammar in error messages is part of being well formatted, and the fact that other instances exist doesn't mean this one shouldn't be fixed; those are in unchanged code and outside the scope of this review. This is actionable, clear, and improves code quality.

    Keep the comment. While minor, it addresses a grammatical error in an assertion message, the rules explicitly state that assertions should have well-formatted error messages, and the comment is about changed code, is technically correct, and provides an actionable fix.
16. instructor/providers/openai/utils.py:554
  • Draft comment:
    It appears that the handler registry is using reask_tools (line 554), whereas the newly added function is named reask_toon. Please confirm if this is a typographical error and the correct function should be reask_toon.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.

Workflow ID: wflow_dq6voS9bC95Y0RYC

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

@hsaeed3

hsaeed3 commented Jan 6, 2026

@jxnl I redid the PR because the original classified TOON as a JSON mode and was not structured for extensibility. This newer PR supports TOON across a wider range of providers with cleaner processing logic.
