perf: Limit enum options in tool schemas to reduce token usage #11370
Walkthrough
This PR introduces a cap of 50 options (MAX_OPTIONS_FOR_TOOL_ENUM) for Literal enum generation in tool schemas. When the number of options exceeds this limit, the code skips enum creation to reduce token consumption. New tests validate the schema optimization and CurrentDateComponent functionality.
Estimated code review effort: 2 (Simple), ~12 minutes
Pre-merge checks: 5 passed, 2 failed (2 warnings)
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@src/lfx/src/lfx/io/schema.py`:
- Around line 225-229: Combine the nested checks into one conditional: replace
the two-level if that tests hasattr(input_model, "options") and
isinstance(input_model.options, list) and input_model.options followed by a
length check against MAX_OPTIONS_FOR_TOOL_ENUM with a single if that combines
all four predicates; then keep the existing body that builds literal_string and
sets field_type using eval (referencing input_model.options,
MAX_OPTIONS_FOR_TOOL_ENUM, literal_string, and field_type) unchanged.
In `@src/lfx/tests/unit/components/utilities/test_current_date.py`:
- Around line 1-8: Add an empty __init__.py to the tests package directory
containing the CurrentDateComponent test (the directory housing
test_current_date.py) so Python treats it as a regular package; this will
eliminate implicit namespace packaging and linter failures—create the file with
no content (or a single comment) alongside test_current_date.py and rerun the
linter to confirm the issue is resolved.
🧹 Nitpick comments (1)
src/lfx/src/lfx/io/schema.py (1)
263-267: Apply the same fix for consistency.
This has the same nested if pattern. Apply the same consolidation for consistency and to preempt a similar linter warning.
Proposed fix
- if hasattr(input_model, "options") and isinstance(input_model.options, list) and input_model.options:
-     # Skip enum for large option lists to avoid token waste (issue `#8226`)
-     if len(input_model.options) <= MAX_OPTIONS_FOR_TOOL_ENUM:
-         literal_string = f"Literal{input_model.options}"
-         field_type = eval(literal_string, {"Literal": Literal})  # noqa: S307
+ # Skip enum for large option lists to avoid token waste (issue `#8226`)
+ if (
+     hasattr(input_model, "options")
+     and isinstance(input_model.options, list)
+     and input_model.options
+     and len(input_model.options) <= MAX_OPTIONS_FOR_TOOL_ENUM
+ ):
+     literal_string = f"Literal{input_model.options}"
+     field_type = eval(literal_string, {"Literal": Literal})  # noqa: S307
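For readers who want to try the consolidated check in isolation, here is a minimal standalone sketch. It is not the code in lfx/io/schema.py: the helper name resolve_field_type and its fallback parameter are hypothetical, the constant's value is assumed from the walkthrough's stated cap of 50, and the Literal is built by subscripting with a tuple rather than with the eval call the PR keeps.

```python
from typing import Any, Literal, get_args

# Assumed value, taken from the walkthrough's stated cap of 50 options.
MAX_OPTIONS_FOR_TOOL_ENUM = 50


def resolve_field_type(options: Any, fallback: type = str) -> Any:
    """Hypothetical helper: Literal of the options if the list is small, else fallback."""
    if (
        isinstance(options, list)
        and options
        and len(options) <= MAX_OPTIONS_FOR_TOOL_ENUM
    ):
        # Subscripting Literal with a tuple is equivalent to Literal[a, b, ...].
        # This replaces the eval-based construction used in the PR.
        return Literal[tuple(options)]
    # Empty, non-list, or oversized option lists keep the plain fallback type.
    return fallback


small = resolve_field_type(["UTC", "Europe/London"])
large = resolve_field_type([f"option_{i}" for i in range(200)])
print(get_args(small))
print(large)
```

The single combined condition mirrors the reviewer's proposal: all four predicates must hold before an enum is generated, so any failing check falls through to the plain type.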
…angflow into cz/fix-current-date-tokens
* fix current date tokens usage
* Update src/lfx/src/lfx/io/schema.py
* remove comment
---------
Co-authored-by: Himavarsha <[email protected]>
* Add nightly hash history script to nightly workflow
* [autofix.ci] apply automated fixes
* [autofix.ci] apply automated fixes (attempt 2/3)
* Add lfx-nightly to script
* Handle first run
* Try fixing version on nightly hash history
* remove lfx lockfile since it does not exist
* Get full version in build, handle the [extras] in pyprojects
* [autofix.ci] apply automated fixes
* [autofix.ci] apply automated fixes (attempt 2/3)
* language update
* Handle extras in langflow-base dependency in all workflows
* [autofix.ci] apply automated fixes
* Fix import in lfx status response
* [autofix.ci] apply automated fixes
* Use built artifact for jobs, remove wait period
* use [complete] when building test cli job
* skip slack message added to success
* Update merge hash histry job to only run when ref is main
* Updates pyproject naming to add nightly suffix
* [autofix.ci] apply automated fixes
* Fix ordering of lfx imports'
* [autofix.ci] apply automated fixes
* Ah, ignore auto-import fixes by ruff
* [autofix.ci] apply automated fixes
* update test to look at _all_ exported instead
* [autofix.ci] apply automated fixes
* perf: Limit enum options in tool schemas to reduce token usage (langflow-ai#11370)
* fix current date tokens usage
* Update src/lfx/src/lfx/io/schema.py
* remove comment
---------
Co-authored-by: Himavarsha <[email protected]>
* update date test to reflect changes to lfx
* ruff
* [autofix.ci] apply automated fixes
---------
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: Cristhian Zanforlin Lousa <[email protected]>
Co-authored-by: Himavarsha <[email protected]>
This pull request optimizes the tool schema generation for components with large option lists, specifically addressing excessive token usage when serializing schemas for LLMs (see issue #8226). It introduces a maximum threshold for enumerating options, modifies schema generation logic to skip enums for large lists, and adds comprehensive tests to ensure both schema efficiency and component functionality.
Tool schema optimization:
- A MAX_OPTIONS_FOR_TOOL_ENUM constant in lfx/io/schema.py limits the number of options included as an enum in tool schemas; when the limit is exceeded, the field defaults to string type to avoid wasting tokens.
- The create_input_schema and create_input_schema_from_dict functions in lfx/io/schema.py skip enum generation for option lists exceeding the threshold, using string type with a default value instead.
Testing and validation:
- Tests in test_current_date.py verify that schemas for tools with large option lists do not include enums, include default values, and are significantly smaller in size.
- Tests confirm that CurrentDateComponent continues to function correctly, handling default, specific, and invalid timezone cases.
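The size effect described above can be illustrated with a small standalone sketch. It does not call create_input_schema or any lfx code: the field_schema helper, the 400 placeholder option names, and the constant's value of 50 are all assumptions made for illustration, and serialized character count is used only as a rough proxy for token usage.

```python
import json

# Assumed value, taken from the walkthrough's stated cap of 50 options.
MAX_OPTIONS_FOR_TOOL_ENUM = 50


def field_schema(options: list[str], default: str) -> dict:
    """Hypothetical helper: string field schema that lists an enum only for small lists."""
    schema = {"type": "string", "default": default}
    if options and len(options) <= MAX_OPTIONS_FOR_TOOL_ENUM:
        schema["enum"] = list(options)
    return schema


# Placeholder option names standing in for a large option list.
options = [f"Region/City_{i}" for i in range(400)]

capped = field_schema(options, default=options[0])
# What the schema would look like if every option were enumerated.
uncapped = {"type": "string", "default": options[0], "enum": options}

capped_size = len(json.dumps(capped))
uncapped_size = len(json.dumps(uncapped))
print(capped_size, uncapped_size)
```

With the cap in place the schema keeps its type and default value but drops the enum, which is the property the tests in test_current_date.py are described as checking.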