⚡️ Speed up function calculate_text_metrics by 20% in PR #11114 (feat/langchain-1.0)
#11360
⚡️ This pull request contains optimizations for PR #11114
If you approve this dependent PR, these changes will be merged into the original PR branch `feat/langchain-1.0`.
📄 20% (0.20x) speedup for `calculate_text_metrics` in `src/backend/base/langflow/api/v1/knowledge_bases.py`
⏱️ Runtime: 48.4 milliseconds → 40.4 milliseconds (best of 90 runs)
📝 Explanation and details
The optimized code achieves a roughly 20% speedup (48.4 ms to 40.4 ms) by replacing an expensive string-splitting operation with a more efficient regex-based counting approach.
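The headline percentage can be checked against the reported runtimes. The short calculation below is only a sanity check on the two figures quoted in this description; the variable names are illustrative and are not part of the PR.

```python
# Runtimes reported in this PR description (best of 90 runs).
original_ms = 48.4
optimized_ms = 40.4

# "Speedup" here means how much more work fits in the same time:
# original runtime divided by optimized runtime, minus one.
speedup = original_ms / optimized_ms - 1

# A related but different figure: the fraction of wall-clock time removed.
time_saved = 1 - optimized_ms / original_ms

print(f"speedup: {speedup:.1%}, time saved: {time_saved:.1%}")
```

The two numbers answer different questions, which is why a speedup figure is always somewhat larger than the corresponding reduction in runtime.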
Key Optimization
Original approach: `str.split().str.len()`
Optimized approach: `str.count(_WORD_RE)`, where `_WORD_RE` is a pre-compiled regex `r"\S+"` (non-whitespace sequences)
Why This Is Faster
The original code splits each string and then takes the length of the result with `str.split().str.len()`. The optimized code counts matches directly with `str.count(_WORD_RE)`.
From the line profiler results, this change reduces the word-counting line from 75.3 ms (42.5% of total time) to 45.7 ms (31.2% of total time), a ~40% improvement on this specific operation.
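The swap is only safe if both approaches return the same word count. The sketch below illustrates that equivalence on single strings using the standard library alone; it is not the PR's code, which applies the same idea through the pandas `.str` accessor. The helper names `count_words_split` and `count_words_regex` and the sample strings are hypothetical, and only the `_WORD_RE` name and the `r"\S+"` pattern come from the description above.

```python
import re

# Compiled once at module level, mirroring the `_WORD_RE` name used in the PR description.
_WORD_RE = re.compile(r"\S+")


def count_words_split(text: str) -> int:
    """Word count via splitting: builds a list of substrings, then measures it."""
    return len(text.split())


def count_words_regex(text: str) -> int:
    """Word count via regex: counts runs of non-whitespace without keeping them."""
    return sum(1 for _ in _WORD_RE.finditer(text))


# Hypothetical inputs covering ordinary text and whitespace edge cases.
samples = [
    "hello world",
    "  leading and trailing  ",
    "tabs\tand\nnewlines",
    "",
    "   ",
]

for sample in samples:
    assert count_words_split(sample) == count_words_regex(sample)
```

Both helpers treat any run of whitespace as a single separator and return 0 for empty or whitespace-only input, so the edge cases agree. Whether the regex form is faster in a given setting depends on the data and the pandas version, so the timings in this description should be read as measurements for this PR rather than a general guarantee.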
Test Case Performance
The optimization benefits all test cases.
Impact Considerations
Since `calculate_text_metrics` processes knowledge base text data:
✅ Correctness verification report:
To edit these changes, run `git checkout codeflash/optimize-pr11114-2026-01-19T22.17.34` and push.