Architecture Discussion: Economy Mode and Vector Database Full-Text Search Capabilities
Suggestions for New Features #22955
yunqiqiliang started this conversation in Suggestion
Self Checks
1. Is this request related to a challenge you're experiencing? Tell me about your story.
Hi Dify community!
I'd like to start a discussion about a fundamental architecture issue I discovered while integrating ClickZetta vector database with Dify. This affects all vector databases with full-text search capabilities, not just ClickZetta.
🔍 The Problem
Dify currently supports three search modes when you choose High Quality indexing:
However, if you want to use Economy Mode to save costs:
The core issue: You're forced to choose High Quality mode (with expensive vector embeddings) just to access full-text search capabilities that are actually cost-efficient.
Current forced choice:
What users actually want:
📊 Impact Analysis
I checked Dify's codebase and found that ALL 43 vector database implementations support the search_by_full_text() method:
🚨 Real-World Customer Issue
This issue was discovered by an actual customer who reported it to me today. They were trying to use ClickZetta (which has excellent inverted index capabilities) with Dify's Economy Mode to save costs while still getting professional search quality.
Customer's expectation:
What actually happened:
Customer impact:
🔧 Code Evidence
Looking at the architecture:
BaseVector abstract class requires both methods:
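As a rough illustration of that contract, here is a simplified sketch. It is not a copy of Dify's source: the `Document` stand-in, the toy `InMemoryVector` backend, and the exact signatures are assumptions made for the example. The point is only that a backend cannot be instantiated unless it provides both a vector search and a full-text search method.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Document:
    """Stand-in for Dify's document model (assumed shape)."""
    page_content: str


class BaseVector(ABC):
    """Simplified sketch of the abstract backend contract."""

    @abstractmethod
    def search_by_vector(self, query_vector: list[float], **kwargs) -> list[Document]:
        ...

    @abstractmethod
    def search_by_full_text(self, query: str, **kwargs) -> list[Document]:
        ...


class InMemoryVector(BaseVector):
    """Hypothetical toy backend, used only for illustration."""

    def __init__(self, docs: list[Document]):
        self._docs = docs

    def search_by_vector(self, query_vector: list[float], **kwargs) -> list[Document]:
        # A real backend would run a nearest-neighbour query here.
        return []

    def search_by_full_text(self, query: str, **kwargs) -> list[Document]:
        # Naive substring match standing in for an inverted index.
        return [d for d in self._docs if query.lower() in d.page_content.lower()]


store = InMemoryVector([Document("ClickZetta inverted index"), Document("unrelated")])
hits = store.search_by_full_text("inverted")
```

Note that the full-text path in this sketch never needs an embedding, which is why it could in principle be offered without the embedding cost.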
Economy mode logic in indexing_runner.py (lines 533-540):
All vector databases implement full-text search, but economy mode never uses it.
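The effect can be sketched roughly as follows. This is a hypothetical simplification and not the code at those lines: `load_segments`, `FakeVectorStore`, and `FakeKeywordTable` are invented names, and the only behaviour taken from the discussion is that the vector store is written to solely when the indexing technique is high quality.

```python
class FakeKeywordTable:
    """Hypothetical stand-in for the economy-mode keyword index."""

    def __init__(self):
        self.texts: list[str] = []

    def add_texts(self, texts: list[str]) -> None:
        self.texts.extend(texts)


class FakeVectorStore:
    """Hypothetical stand-in for a vector database with full-text search."""

    def __init__(self):
        self.texts: list[str] = []

    def add_texts(self, texts: list[str]) -> None:
        self.texts.extend(texts)

    def search_by_full_text(self, query: str) -> list[str]:
        # Naive substring match standing in for an inverted index.
        return [t for t in self.texts if query.lower() in t.lower()]


def load_segments(indexing_technique, texts, vector_store, keyword_table):
    """Route segments by indexing technique (assumed, simplified logic)."""
    if indexing_technique == "high_quality":
        # Only this branch ever reaches the vector database.
        vector_store.add_texts(texts)
    else:
        # Economy mode goes straight to the keyword table instead.
        keyword_table.add_texts(texts)


vector_store = FakeVectorStore()
keyword_table = FakeKeywordTable()
load_segments("economy", ["inverted index search"], vector_store, keyword_table)

# The vector database received nothing, so its full-text search has no data.
economy_hits = vector_store.search_by_full_text("inverted")
```

Under these assumptions, the backend's full-text capability exists but is unreachable in economy mode because nothing is ever stored there.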
💡 Proposed Solutions
Option 1: Redesign Economy Mode (Recommended)
Option 2: Improve Current Modes
Current state (already supported):
The issue: You MUST choose High Quality mode to get any vector database functionality, even if you only want full-text search.
Option 3: Storage Backend Choice
Let users explicitly choose where to store their data, regardless of search strategy.
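One way to picture this separation is a configuration object with two independent fields. The sketch below is purely illustrative: `StorageBackend`, `SearchStrategy`, and `IndexConfig` are hypothetical names that do not exist in Dify, and the validation rule is an assumption about which combinations would make sense.

```python
from dataclasses import dataclass
from enum import Enum


class StorageBackend(Enum):
    KEYWORD_TABLE = "keyword_table"
    VECTOR_DATABASE = "vector_database"


class SearchStrategy(Enum):
    KEYWORD = "keyword"
    FULL_TEXT = "full_text"
    VECTOR = "vector"
    HYBRID = "hybrid"


@dataclass(frozen=True)
class IndexConfig:
    """Hypothetical config that decouples where data lives from how it is searched."""
    storage: StorageBackend
    search: SearchStrategy

    def __post_init__(self):
        # Assumed rule: a plain keyword table can only serve keyword search.
        if (self.storage is StorageBackend.KEYWORD_TABLE
                and self.search is not SearchStrategy.KEYWORD):
            raise ValueError("keyword table storage supports keyword search only")

    @property
    def needs_embeddings(self) -> bool:
        # Embeddings are only required when vectors take part in retrieval.
        return self.search in (SearchStrategy.VECTOR, SearchStrategy.HYBRID)


# Vector database storage with full-text search and no embedding cost.
full_text_only = IndexConfig(StorageBackend.VECTOR_DATABASE, SearchStrategy.FULL_TEXT)
```

With a shape like this, the cost question (are embeddings needed?) follows from the search strategy alone, not from the choice of storage.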
🤔 Discussion Questions
Cost vs. Functionality: Should users be able to use vector database full-text search without paying for vector embeddings?
Architecture Design: Would it make sense to add a "Full-text Only" mode that uses vector databases but skips embeddings?
User Choice: Should Economy mode be redesigned to allow vector database storage with keyword-only search?
Performance Trade-offs: What are the real performance differences between:
Hybrid Architecture: How can we better separate storage decisions from search strategy decisions?
🎯 Why This Matters
For Real Customers (like the one who reported this):
For Dify:
For the Ecosystem:
🚀 Call to Action
I believe this deserves community-wide discussion because:
🤝 What's Next?
I'd love to hear from:
Let's discuss! I'm happy to contribute code once we align on the direction.
Tags: #architecture #vector-database #search #economy-mode #full-text-search
What are your thoughts on this architectural challenge? 🤔
2. Additional context or comments
No response
3. Can you help us with this feature?