🎉 Ruby 3.1+ Support
Compatibility Expansion: This release lowers the minimum required Ruby version from 3.2.0 to 3.1.0, making Red Candle accessible to more developers and systems.
✨ Major Features
Custom Tokenizer Support for All Models
All safetensors model types (Llama, Mistral, Gemma, Qwen, and Phi) now support custom tokenizers:
```ruby
# Load a model with a specific tokenizer
llm = Candle::LLM.from_pretrained(
  "model-without-bundled-tokenizer",
  tokenizer: "appropriate-tokenizer-repo"
)

# Useful for models that don't include tokenizers or when using specialized tokenizers
llm = Candle::LLM.from_pretrained(
  "meta-llama/Llama-2-7b",
  tokenizer: "meta-llama/Llama-2-7b-chat-hf" # Use the chat-tuned tokenizer
)
```
Enhanced Phi Model Support
- Phi-3-mini-128k models now work correctly
- Robust configuration preprocessing handles non-standard fields
- Automatic handling of `gegelu` activation and `ff_intermediate_size` parameters
- Improved compatibility with various Phi-3 model variants
Reranker Performance Improvements
- Configurable max length for better performance tuning
- Automatic truncation of long inputs to prevent errors
- Significant performance optimizations for document ranking tasks
```ruby
# Configure max length for optimal performance
reranker = Candle::Reranker.from_pretrained(
  "BAAI/bge-reranker-base",
  max_length: 512 # Customize based on your use case
)
```
🛡️ Security & Stability
Dependency Updates
- Security Fix: Resolved PyO3 buffer overflow vulnerability (CVE in PyO3 < 0.24.1)
- Major Updates:
  - `tokenizers`: 0.21.1 → 0.22.0
  - `outlines-core`: 0.2 → 0.2.11 (eliminates the PyO3 dependency)
  - Most Rust dependencies updated to latest stable versions
- Note: `hf-hub` downgraded from 0.4.3 to 0.4.1 for compatibility
Improved Error Messages
Error messages now provide actionable guidance:
- Missing-tokenizer errors suggest passing an explicit `tokenizer:` parameter
- Authentication errors mention the `HF_TOKEN` requirement
- Network issues include connectivity troubleshooting tips
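In practice, an actionable missing-tokenizer error lets you retry the load with an explicit tokenizer. The sketch below illustrates that fallback pattern only; the `load_with_fallback` helper, the `ArgumentError` class, and the stub loader are hypothetical stand-ins, not Red Candle's actual API:

```ruby
# Hypothetical retry pattern: fall back to an explicit tokenizer when the
# first load fails with a tokenizer-related error. Error class and message
# are illustrative, not the gem's real exceptions.
def load_with_fallback(model_id, fallback_tokenizer:, loader:)
  loader.call(model_id, nil)
rescue ArgumentError => e
  raise unless e.message.include?("tokenizer")
  loader.call(model_id, fallback_tokenizer)
end

# Stub standing in for Candle::LLM.from_pretrained, for illustration only
loader = lambda do |model, tokenizer|
  raise ArgumentError, "no tokenizer found; pass tokenizer:" if tokenizer.nil?
  "loaded #{model} with #{tokenizer}"
end

result = load_with_fallback("model-without-tokenizer",
                            fallback_tokenizer: "appropriate-tokenizer",
                            loader: loader)
puts result
```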
🔧 Developer Experience
Logging System
New Ruby logger integration for cleaner output:
- Controlled via the `CANDLE_VERBOSE` environment variable
- Reduced noise from Rust internals
- Better debugging experience
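If you want your own application's logging to follow the same convention, you can key your log level off `CANDLE_VERBOSE` too. This is a minimal sketch using the standard library `Logger`; how the gem consumes the variable internally may differ:

```ruby
require "logger"

# Sketch: mirror the CANDLE_VERBOSE convention in your own code.
# Returns DEBUG when CANDLE_VERBOSE=1, WARN otherwise.
def candle_log_level(env = ENV)
  env["CANDLE_VERBOSE"] == "1" ? Logger::DEBUG : Logger::WARN
end

logger = Logger.new($stdout)
logger.level = candle_log_level
logger.debug("model load details") # shown only when CANDLE_VERBOSE=1
```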
Repository Migration
- Moved from the `assaydepot` to the `scientist-labs` organization
- Updated all documentation and CI/CD pipelines
- Maintained backward compatibility
📊 Testing
- Comprehensive test suite for custom tokenizer functionality
- All tests passing with improved coverage
- Added specs for truncation and max length configuration
- Security vulnerability regression tests
🔄 Migration Guide
From v1.2.x to v1.3.0
1. Ruby Version: Ensure you have Ruby 3.1.0 or higher (previously 3.2.0 was required)

2. Custom Tokenizers: If you have models failing due to missing tokenizers, you can now specify one:

```ruby
# Before: would fail
llm = Candle::LLM.from_pretrained("model-without-tokenizer")

# After: works with an explicit tokenizer
llm = Candle::LLM.from_pretrained(
  "model-without-tokenizer",
  tokenizer: "appropriate-tokenizer"
)
```

3. Reranker Configuration: Take advantage of performance tuning:

```ruby
# Optimize for shorter documents
reranker = Candle::Reranker.from_pretrained(
  "BAAI/bge-reranker-base",
  max_length: 256
)
```

4. Logging: Use the new logging system:

```shell
# Enable verbose logging
CANDLE_VERBOSE=1 ruby your_script.rb
```
🐛 Bug Fixes
- Fixed regex validation security vulnerabilities
- Resolved build issues with certain dependency combinations
- Fixed Phi-3 model loading failures
- Corrected truncation behavior in rerankers
📝 Documentation
- Updated MODEL_SUPPORT.md with current compatibility matrix
- Enhanced HUGGINGFACE.md with custom tokenizer examples
- Added comprehensive examples for new features
📦 Installation
```shell
gem install red-candle -v 1.3.0
```
Or add to your Gemfile:
```ruby
gem 'red-candle', '~> 1.3.0'
```