@Cristhianzl Cristhianzl commented Jan 20, 2026

This pull request optimizes the tool schema generation for components with large option lists, specifically addressing excessive token usage when serializing schemas for LLMs (see issue #8226). It introduces a maximum threshold for enumerating options, modifies schema generation logic to skip enums for large lists, and adds comprehensive tests to ensure both schema efficiency and component functionality.

Tool schema optimization:

  • Introduced MAX_OPTIONS_FOR_TOOL_ENUM constant in lfx/io/schema.py to limit the number of options included as enum in tool schemas, defaulting to string type when the limit is exceeded to avoid wasting tokens.
  • Updated create_input_schema and create_input_schema_from_dict functions in lfx/io/schema.py to skip enum generation for option lists exceeding the threshold, using string type with a default value instead. [1] [2]
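
A minimal sketch of the threshold rule described in the two bullets above, not the actual code in lfx/io/schema.py. The helper name field_type_for_options is hypothetical, the value 50 is taken from the walkthrough further down, and the Literal is built by subscription here, whereas the diff quoted later in this conversation builds it with eval.

```python
from typing import Literal, get_args

# Value taken from the PR walkthrough; treated as an assumption in this sketch.
MAX_OPTIONS_FOR_TOOL_ENUM = 50


def field_type_for_options(options: list) -> object:
    """Hypothetical helper: pick the field type for a tool-schema input.

    Short, non-empty option lists become a Literal (rendered as an enum in
    the JSON schema); empty or oversized lists fall back to plain str.
    """
    if isinstance(options, list) and options and len(options) <= MAX_OPTIONS_FOR_TOOL_ENUM:
        # Subscripting Literal with a tuple is the same as Literal["a", "b", ...].
        return Literal[tuple(options)]
    return str


short = field_type_for_options(["UTC", "Europe/London"])
large = field_type_for_options([f"Zone/{i}" for i in range(400)])

print(get_args(short))  # the two option strings survive as Literal arguments
print(large)            # plain str: no enum is generated for 400 options
```

With this rule the caller still receives a string field for large lists, so validation of the value has to happen inside the component rather than in the schema.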

Testing and validation:

  • Added new unit tests in test_current_date.py to verify that schemas for tools with large option lists do not include enums, include default values, and are significantly smaller in size.
  • Included tests to ensure that the CurrentDateComponent continues to function correctly, handling default, specific, and invalid timezone cases.
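
The three schema properties these tests check (no enum, a preserved default, a smaller serialized size) can be illustrated with hand-built JSON Schema fragments. This is a sketch under stated assumptions: build_property is a hypothetical helper, the option names are synthetic, and the real tests in test_current_date.py exercise CurrentDateComponent rather than plain dictionaries.

```python
import json

# Value taken from the PR walkthrough; treated as an assumption in this sketch.
MAX_OPTIONS_FOR_TOOL_ENUM = 50


def build_property(options: list, default: str) -> dict:
    """Hypothetical helper: build one JSON Schema property for a string input."""
    prop = {"type": "string", "default": default}
    # Only enumerate the options when the list is small enough.
    if options and len(options) <= MAX_OPTIONS_FOR_TOOL_ENUM:
        prop["enum"] = list(options)
    return prop


# Synthetic stand-in for a large option list such as a timezone list.
options = [f"Region/City_{i}" for i in range(400)]

optimized = build_property(options, default="UTC")
unoptimized = {"type": "string", "default": "UTC", "enum": options}

optimized_size = len(json.dumps(optimized))
unoptimized_size = len(json.dumps(unoptimized))
print(optimized_size, unoptimized_size)
```

The serialized size is only a rough proxy for token count, but it is enough to show why dropping the enum matters when the schema is sent to an LLM on every tool call.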

Summary by CodeRabbit

  • Chores

    • Optimized schema generation for tools with large option lists (>50 options) to reduce token consumption and improve efficiency.
  • Tests

    • Added tests validating optimized schema behavior and tool functionality across various scenarios.


@Cristhianzl Cristhianzl self-assigned this Jan 20, 2026

coderabbitai bot commented Jan 20, 2026

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


Walkthrough

This PR introduces a cap of 50 options (MAX_OPTIONS_FOR_TOOL_ENUM) for Literal enum generation in tool schemas. When options exceed this limit, the code skips enum creation to reduce token consumption. New tests validate the schema optimization and CurrentDateComponent functionality.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Schema optimization: src/lfx/src/lfx/io/schema.py | Adds MAX_OPTIONS_FOR_TOOL_ENUM constant set to 50 and introduces conditional guards in three code paths (create_input_schema, create_input_schema_from_dict, and in-context option handling) to skip Literal enum creation when options exceed the limit. |
| Test coverage: src/lfx/tests/unit/components/utilities/test_current_date.py | New test module with TestCurrentDateToolSchema (verifying enum omission for large option lists, default timezone value, and schema size reduction) and TestCurrentDateFunctionality (validating UTC default, timezone handling, and error cases). |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

🚥 Pre-merge checks | ✅ 5 | ❌ 2
❌ Failed checks (2 warnings)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 75.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Test Quality And Coverage | ⚠️ Warning | Test file lacks comprehensive unit test coverage for schema.py implementation changes, missing direct tests for create_input_schema, create_input_schema_from_dict, in-context option handling, and boundary edge cases. | Add dedicated unit tests for each code path with boundary conditions (49, 50, 51 options), different option types, and edge cases to validate threshold enforcement and enum creation logic. |
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely summarizes the main change: limiting enum options in tool schemas to reduce token usage, which aligns with the core objective of the PR. |
| Test Coverage For New Implementations | ✅ Passed | The PR includes comprehensive test coverage for new enum optimization functionality with tests validating that large option lists skip enum generation, schema size reduction, and preserved defaults. |
| Test File Naming And Structure | ✅ Passed | Test file test_current_date.py follows correct pytest naming convention with descriptive test classes and methods covering both positive and negative scenarios. |
| Excessive Mock Usage Warning | ✅ Passed | The test file contains zero mock objects, decorators, or mocking patterns. All six tests instantiate real CurrentDateComponent objects and utilize actual Python libraries to validate genuine component behavior. |
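
The boundary conditions requested in the Test Quality And Coverage warning (49, 50, and 51 options) could be exercised roughly as follows. This is a hedged sketch, not a test from the PR: resolve_field_type is a hypothetical stand-in for the guarded logic in schema.py, and it uses unittest from the standard library so it runs without extra dependencies, whereas the repository's tests follow pytest conventions.

```python
import unittest
from typing import Literal, get_args

# Value taken from the PR walkthrough; treated as an assumption in this sketch.
MAX_OPTIONS_FOR_TOOL_ENUM = 50


def resolve_field_type(options: list) -> object:
    """Hypothetical stand-in for the guarded enum logic in schema.py."""
    if isinstance(options, list) and options and len(options) <= MAX_OPTIONS_FOR_TOOL_ENUM:
        return Literal[tuple(options)]
    return str


class TestEnumThreshold(unittest.TestCase):
    def test_boundaries(self):
        # 49 and 50 options should still produce an enum; 51 should not.
        for count, expect_enum in [(49, True), (50, True), (51, False)]:
            with self.subTest(count=count):
                options = [f"opt_{i}" for i in range(count)]
                field_type = resolve_field_type(options)
                if expect_enum:
                    self.assertEqual(get_args(field_type), tuple(options))
                else:
                    self.assertIs(field_type, str)

    def test_empty_list_falls_back_to_str(self):
        self.assertIs(resolve_field_type([]), str)
```

The comparison at exactly 50 is the case most likely to regress, since it depends on the guard using `<=` rather than `<`.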


@github-actions github-actions bot added the performance Maintenance tasks and housekeeping label Jan 20, 2026
@github-actions github-actions bot added performance Maintenance tasks and housekeeping and removed performance Maintenance tasks and housekeeping labels Jan 20, 2026
@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@src/lfx/src/lfx/io/schema.py`:
- Around line 225-229: Combine the nested checks into one conditional: replace
the two-level if that tests hasattr(input_model, "options") and
isinstance(input_model.options, list) and input_model.options followed by a
length check against MAX_OPTIONS_FOR_TOOL_ENUM with a single if that combines
all four predicates; then keep the existing body that builds literal_string and
sets field_type using eval (referencing input_model.options,
MAX_OPTIONS_FOR_TOOL_ENUM, literal_string, and field_type) unchanged.

In `@src/lfx/tests/unit/components/utilities/test_current_date.py`:
- Around line 1-8: Add an empty __init__.py to the tests package directory
containing the CurrentDateComponent test (the directory housing
test_current_date.py) so Python treats it as a regular package; this will
eliminate implicit namespace packaging and linter failures—create the file with
no content (or a single comment) alongside test_current_date.py and rerun the
linter to confirm the issue is resolved.
🧹 Nitpick comments (1)
src/lfx/src/lfx/io/schema.py (1)

263-267: Apply the same fix for consistency.

This has the same nested if pattern. Apply the same consolidation for consistency and to preempt a similar linter warning.

Proposed fix
-        if hasattr(input_model, "options") and isinstance(input_model.options, list) and input_model.options:
-            # Skip enum for large option lists to avoid token waste (issue `#8226`)
-            if len(input_model.options) <= MAX_OPTIONS_FOR_TOOL_ENUM:
-                literal_string = f"Literal{input_model.options}"
-                field_type = eval(literal_string, {"Literal": Literal})  # noqa: S307
+        # Skip enum for large option lists to avoid token waste (issue `#8226`)
+        if (
+            hasattr(input_model, "options")
+            and isinstance(input_model.options, list)
+            and input_model.options
+            and len(input_model.options) <= MAX_OPTIONS_FOR_TOOL_ENUM
+        ):
+            literal_string = f"Literal{input_model.options}"
+            field_type = eval(literal_string, {"Literal": Literal})  # noqa: S307
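
To check that the consolidation proposed above does not change behavior, the nested and combined forms can be compared side by side. This is an illustrative sketch: the function names and the SimpleNamespace stand-ins for input_model are hypothetical, 50 is the value stated in the walkthrough, and Literal is built by subscription instead of the eval call used in the diff.

```python
from types import SimpleNamespace
from typing import Literal

# Value taken from the PR walkthrough; treated as an assumption in this sketch.
MAX_OPTIONS_FOR_TOOL_ENUM = 50


def nested_form(input_model) -> object:
    """Two-level check, mirroring the lines removed in the diff."""
    field_type = str
    if hasattr(input_model, "options") and isinstance(input_model.options, list) and input_model.options:
        if len(input_model.options) <= MAX_OPTIONS_FOR_TOOL_ENUM:
            field_type = Literal[tuple(input_model.options)]
    return field_type


def combined_form(input_model) -> object:
    """Single conditional, mirroring the lines added in the diff."""
    field_type = str
    if (
        hasattr(input_model, "options")
        and isinstance(input_model.options, list)
        and input_model.options
        and len(input_model.options) <= MAX_OPTIONS_FOR_TOOL_ENUM
    ):
        field_type = Literal[tuple(input_model.options)]
    return field_type


# Hypothetical inputs: no attribute, None, empty, small, and oversized lists.
cases = [
    SimpleNamespace(),
    SimpleNamespace(options=None),
    SimpleNamespace(options=[]),
    SimpleNamespace(options=["a", "b"]),
    SimpleNamespace(options=[str(i) for i in range(51)]),
]
results = [(nested_form(c), combined_form(c)) for c in cases]
```

Because `and` short-circuits, the length check in the combined form is only reached once options is known to be a non-empty list, so the two forms agree on every case.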

@github-actions github-actions bot added performance Maintenance tasks and housekeeping and removed performance Maintenance tasks and housekeeping labels Jan 20, 2026
@github-actions github-actions bot added performance Maintenance tasks and housekeeping and removed performance Maintenance tasks and housekeeping labels Jan 26, 2026
@github-actions github-actions bot added the lgtm This PR has been approved by a maintainer label Jan 26, 2026
@Cristhianzl Cristhianzl added this pull request to the merge queue Jan 26, 2026
Merged via the queue into main with commit d7f81cb Jan 26, 2026
24 checks passed
@Cristhianzl Cristhianzl deleted the cz/fix-current-date-tokens branch January 26, 2026 21:10
jordanrfrazier pushed a commit that referenced this pull request Jan 26, 2026
* fix current date tokens usage

* Update src/lfx/src/lfx/io/schema.py

* remove comment

---------

Co-authored-by: Himavarsha <[email protected]>
pull bot pushed a commit to KornaAI/langflow that referenced this pull request Jan 27, 2026
* Add nightly hash history script to nightly workflow

* [autofix.ci] apply automated fixes

* [autofix.ci] apply automated fixes (attempt 2/3)

* Add lfx-nightly to script

* Handle first run

* Try fixing version on nightly hash history

* remove lfx lockfile since it does not exist

* Get full version in build, handle the [extras] in pyprojects

* [autofix.ci] apply automated fixes

* [autofix.ci] apply automated fixes (attempt 2/3)

* language update

* Handle extras in langflow-base dependency in all workflows

* [autofix.ci] apply automated fixes

* Fix import in lfx status response

* [autofix.ci] apply automated fixes

* Use built artifact for jobs, remove wait period

* use [complete] when building test cli job

* skip slack message added to success

* Update merge hash histry job to only run when ref is main

* Updates pyproject naming to add nightly suffix

* [autofix.ci] apply automated fixes

* Fix ordering of lfx imports'

* [autofix.ci] apply automated fixes

* Ah, ignore auto-import fixes by ruff

* [autofix.ci] apply automated fixes

* update test to look at _all_ exported instead

* [autofix.ci] apply automated fixes

* perf: Limit enum options in tool schemas to reduce token usage (langflow-ai#11370)

* fix current date tokens usage

* Update src/lfx/src/lfx/io/schema.py

* remove comment

---------

Co-authored-by: Himavarsha <[email protected]>

* update date test to reflect changes to lfx

* ruff

* [autofix.ci] apply automated fixes

---------

Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: Cristhian Zanforlin Lousa <[email protected]>
Co-authored-by: Himavarsha <[email protected]>