Releases: scientist-labs/lancelot
Lancelot 0.3.0 Release
New Features
- Reciprocal Rank Fusion (RRF): Added support for hybrid search combining vector and text search results with configurable alpha weighting (#4)
- Optional Field Support: Records can now be added with missing fields - unspecified fields will be set to null instead of raising errors
Improvements
- Enhanced README with comprehensive quick start guide and usage examples
- Added example demonstrating optional field usage (examples/optional_fields_demo.rb)
- Expanded test coverage for new features
Bug Fixes
- Fixed field requirement validation to properly handle optional fields in dataset schemas
This release focuses on making Lancelot more flexible for real-world use cases where not all data fields are always present, and enables sophisticated hybrid search capabilities through RRF.
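The optional-field behaviour can be pictured with a small plain-Ruby sketch. It does not call Lancelot: `fill_missing_fields` is a hypothetical helper that only mirrors the behaviour described above, where any schema field absent from a record ends up as nil (Ruby's null) instead of causing an error.

```ruby
# Hypothetical helper, not part of Lancelot: fills schema fields that a
# record omits with nil, mirroring the behaviour described for 0.3.0.
def fill_missing_fields(schema, record)
  schema.keys.to_h { |field| [field, record[field]] }
end

schema = { title: :string, content: :string, year: :int32 }
record = { title: "Introduction to Ruby" } # content and year are omitted

normalized = fill_missing_fields(schema, record)
p normalized
```

For a runnable example against the gem itself, see examples/optional_fields_demo.rb in the repository.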
🎯 Lancelot 0.2.0 - Hybrid Search with Reciprocal Rank Fusion
We're thrilled to announce Lancelot 0.2.0, bringing powerful hybrid search capabilities to Ruby! This release introduces Reciprocal Rank Fusion (RRF), enabling you to combine vector and text search results for superior search quality.
🆕 What's New
Hybrid Search with RRF
The star feature of this release is the new hybrid_search method, which intelligently combines results from different search modalities:
# Combine semantic vector search with keyword text search
results = dataset.hybrid_search(
  "machine learning",                  # Text query
  vector: text_to_embedding("ML/AI"),  # Vector query
  vector_column: "embedding",
  text_column: "content",
  limit: 10
)

# Results include RRF scores for ranking
results.each do |doc|
  puts "#{doc[:title]} - Score: #{doc[:rrf_score]}"
end

Flexible Search Combinations
- Same Query, Multiple Modalities: Use the same query for both vector and text search to capture both semantic and lexical matches
- Different Queries per Modality: Use conceptual queries for vectors and specific keywords for text
- Multi-Column Text Search: Search across multiple text columns while doing vector similarity
- Custom Fusion Parameters: Tune the RRF k parameter to control result blending
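To get a feel for what tuning k does, the sketch below uses only the RRF formula quoted in these notes, 1/(k+rank). It does not call Lancelot, `rrf_contribution` is a hypothetical helper name, and ranks are assumed to start at 1. A small k lets top-ranked results dominate; a large k flattens the difference between ranks.

```ruby
# Hypothetical helper: what one result list contributes to a document's
# RRF score, given the document's 1-based rank in that list.
def rrf_contribution(rank, k:)
  1.0 / (k + rank)
end

[1, 60, 1000].each do |k|
  # How much more a rank-1 hit counts than a rank-10 hit for this k.
  ratio = rrf_contribution(1, k: k) / rrf_contribution(10, k: k)
  puts format("k=%-4d rank 1 counts %.2fx as much as rank 10", k, ratio)
end
```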
RankFusion Module
For advanced use cases, directly access the RRF algorithm:
# Combine results from multiple independent searches
vector_results = dataset.vector_search(embedding1, limit: 20)
text_results = dataset.text_search("neural networks", limit: 20)
keyword_results = dataset.text_search("PyTorch", limit: 20)

fused = Lancelot::RankFusion.reciprocal_rank_fusion(
  [vector_results, text_results, keyword_results],
  k: 60
)

🔬 Why RRF?
Reciprocal Rank Fusion is a robust fusion method with a single tuning constant (k) that:
- Combines rankings from different search types without score normalization
- Handles missing documents gracefully (documents that appear in only some result sets)
- Provides consistent, high-quality results across diverse query types
- Uses the formula: score = Σ(1/(k+rank)) where k=60 by default
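As a worked illustration of the formula, here is a minimal plain-Ruby sketch that fuses two ranked lists of document ids. It is not Lancelot's implementation: `rrf_fuse` is a hypothetical name, and ranks are assumed to start at 1.

```ruby
# Sketch of Reciprocal Rank Fusion over lists of document ids.
# Assumption: ranks are 1-based, so the top hit contributes 1/(k + 1).
def rrf_fuse(rankings, k: 60)
  scores = Hash.new(0.0)
  rankings.each do |ranking|
    ranking.each_with_index do |doc_id, index|
      scores[doc_id] += 1.0 / (k + index + 1)
    end
  end
  # Highest fused score first.
  scores.sort_by { |_doc_id, score| -score }
end

vector_ranking = %w[a b c] # "d" is missing from this list
text_ranking   = %w[b d a] # "c" is missing from this list

fused = rrf_fuse([vector_ranking, text_ranking])
fused.each { |doc_id, score| puts format("%s %.5f", doc_id, score) }
```

Here "b" comes out first because it sits near the top of both lists, while "c" and "d" still receive a score from the single list they appear in. No score normalization is involved, only ranks.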
📈 Performance & Quality
- Better Recall: Captures relevant documents that might rank poorly in one modality but well in another
- Improved Precision: Documents appearing high in multiple result lists get boosted
- Flexible Querying: Supports everything from single-modality searches to complex multi-query fusion
💎 Ruby-First Design
As always, we've kept the API idiomatic and intuitive:
- Named parameters for clarity
- Sensible defaults (k=60, limit=10)
- Graceful handling of edge cases
- Comprehensive error messages
🚀 Upgrade Guide
gem update lancelot
The new features are additive - all existing code continues to work. To use hybrid search, ensure you have both vector and text indices created:
dataset.create_vector_index("embedding")
dataset.create_text_index("content")

🙏 Acknowledgments
Thanks to our contributors and the Ruby ML community for feedback and suggestions.
Documentation: Updated examples and API docs at https://github.com/cpetersen/lancelot
🚀 Lancelot 0.1.0 - Initial Release
We're excited to announce the first release of Lancelot - Ruby bindings for https://github.com/lancedb/lance, a modern columnar data format for ML!
✨ Features
Core Functionality
- Dataset Management: Create and open Lance datasets with Ruby-native APIs
- Schema Support: Define schemas with multiple data types including vectors
- Document Operations: Add, retrieve, and iterate through documents with full Enumerable support
- Vector Search: Build ANN indices and perform fast similarity search
- Full-Text Search: Create inverted indices for BM25-powered text search across single or multiple columns
- SQL Filtering: Query datasets using SQL-like WHERE clauses
- Ruby-First Design: Idiomatic Ruby API with operator overloading (<<), enumerable methods, and familiar patterns
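The Ruby-first pattern in the last bullet (a `<<` operator plus Enumerable) can be sketched with a toy in-memory class. `ToyDataset` is hypothetical and says nothing about how Lancelot stores data; it only shows how defining `each` and including Enumerable yields `map`, `select`, `count`, and the rest.

```ruby
# Toy stand-in for a dataset; not Lancelot's implementation.
class ToyDataset
  include Enumerable

  def initialize
    @documents = []
  end

  # Append a document and return self so calls can be chained
  # (chaining is a choice made for this toy class).
  def <<(document)
    @documents << document
    self
  end

  # Enumerable builds map, select, count, etc. on top of each.
  def each(&block)
    return enum_for(:each) unless block_given?

    @documents.each(&block)
    self
  end
end

dataset = ToyDataset.new
dataset << { title: "Introduction to Ruby" } << { title: "Vector search basics" }

titles = dataset.map { |doc| doc[:title] }
ruby_docs = dataset.select { |doc| doc[:title].include?("Ruby") }
puts titles.inspect
puts ruby_docs.size
```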
Supported Data Types
- Strings (:string)
- Floating point (:float32, :float64)
- Integers (:int32, :int64)
- Booleans (:boolean)
- Fixed-size vectors with configurable dimensions
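"Fixed-size" means every vector stored in a column has exactly the declared dimension. The check below is a plain-Ruby sketch of that constraint. `check_vector_dimension!` is a hypothetical helper, and how Lancelot itself reports a mismatch is not stated in these notes.

```ruby
# Hypothetical helper, not Lancelot's API: verifies that a vector has the
# dimension declared in a schema entry such as { type: "vector", dimension: 384 }.
def check_vector_dimension!(field_spec, vector)
  expected = field_spec.fetch(:dimension)
  return vector if vector.length == expected

  raise ArgumentError, "expected #{expected} dimensions, got #{vector.length}"
end

spec = { type: "vector", dimension: 3 }
check_vector_dimension!(spec, [0.1, 0.2, 0.3]) # passes and returns the vector

begin
  check_vector_dimension!(spec, [0.1, 0.2]) # one element short
rescue ArgumentError => e
  puts e.message
end
```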
🔧 Technical Details
- Built with Magnus for Ruby-Rust interop
- Embedded Tokio runtime for async Lance operations
- Clean separation between Ruby API and Rust implementation
- Comprehensive test coverage with RSpec
📚 Example Usage
require 'lancelot'

# Create a dataset
dataset = Lancelot::Dataset.create("./my_data", schema: {
  title: :string,
  content: :string,
  embedding: { type: "vector", dimension: 384 }
})

# Add documents
dataset << {
  title: "Introduction to Ruby",
  content: "Ruby is a dynamic programming language...",
  embedding: [0.1, 0.2, ...]
}

# Create indices
dataset.create_vector_index("embedding")
dataset.create_text_index("content")

# Search
vector_results = dataset.vector_search(query_embedding, column: "embedding", limit: 10)
text_results = dataset.text_search("ruby programming", column: "content", limit: 10)

🎯 Who is this for?
Lancelot is perfect for Ruby developers who need:
- Efficient storage and retrieval of embeddings
- Combined vector and text search capabilities
- A columnar format optimized for ML workloads
- Integration with the Ruby ML ecosystem (works great with red-candle!)
🚧 Coming Soon
- Hybrid search with Reciprocal Rank Fusion (RRF)
🙏 Acknowledgments
Special thanks to the Lance team for creating such a powerful columnar format, and to the Magnus project for making Ruby-Rust interop a pleasure to work with.
Installation: gem install lancelot or add to your Gemfile: gem 'lancelot'
Documentation: https://github.com/cpetersen/lancelot
License: MIT