
Releases: scientist-labs/red-candle

Red Candle v1.3.0 Release Notes

13 Sep 16:44
f5d8072


🎉 Ruby 3.1+ Support

Expanded Compatibility: This release lowers the minimum required Ruby version from 3.2.0 to 3.1.0, making Red Candle accessible to more developers and systems.

✨ Major Features

Custom Tokenizer Support for All Models

All safetensors model types (Llama, Mistral, Gemma, Qwen, and Phi) now support custom tokenizers:

# Load a model with a specific tokenizer
llm = Candle::LLM.from_pretrained(
  "model-without-bundled-tokenizer",
  tokenizer: "appropriate-tokenizer-repo"
)

# Useful for models that don't include tokenizers or when using specialized tokenizers
llm = Candle::LLM.from_pretrained(
  "meta-llama/Llama-2-7b",
  tokenizer: "meta-llama/Llama-2-7b-chat-hf"  # Use chat-tuned tokenizer
)

Enhanced Phi Model Support

  • Phi-3-mini-128k models now work correctly (see the sketch after this list)
  • Robust configuration preprocessing handles non-standard fields
  • Automatic handling of gegelu activation and ff_intermediate_size parameters
  • Improved compatibility with various Phi-3 model variants
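
For example, a long-context Phi-3 model should now load without manual config fixes (a minimal sketch; microsoft/Phi-3-mini-128k-instruct is an illustrative repo id):

require 'candle'

# Config preprocessing handles the non-standard gegelu activation and
# ff_intermediate_size fields behind the scenes
llm = Candle::LLM.from_pretrained("microsoft/Phi-3-mini-128k-instruct")
response = llm.generate("Summarize the benefits of long-context models:")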

Reranker Performance Improvements

  • Configurable max length for better performance tuning
  • Automatic truncation of long inputs to prevent errors
  • Significant performance optimizations for document ranking tasks

# Configure max length for optimal performance
reranker = Candle::Reranker.from_pretrained(
  "BAAI/bge-reranker-base",
  max_length: 512  # Customize based on your use case
)
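
Long inputs are then truncated to max_length before scoring (a hedged usage sketch; the rerank method and its return shape follow the project README and may differ slightly):

# Documents longer than max_length are truncated rather than raising an error
results = reranker.rerank("ruby machine learning", [
  "Red Candle brings Candle ML models to Ruby.",
  "A very long, mostly unrelated document that would previously overflow the model."
])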

🛡️ Security & Stability

Dependency Updates

  • Security Fix: Resolved PyO3 buffer overflow vulnerability (CVE in PyO3 < 0.24.1)
  • Major Updates:
    • tokenizers: 0.21.1 → 0.22.0
    • outlines-core: 0.2 → 0.2.11 (eliminates PyO3 dependency)
    • Most Rust dependencies updated to latest stable versions
  • Note: hf-hub downgraded from 0.4.3 to 0.4.1 for compatibility

Improved Error Messages

Error messages now provide actionable guidance:

  • Missing tokenizers suggest explicit tokenizer parameter
  • Authentication errors mention the HF_TOKEN requirement (example below)
  • Network issues include connectivity troubleshooting tips
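
For instance, a gated repository that previously failed with an opaque error now points at HF_TOKEN (a sketch; exporting the token in your shell is the usual remedy):

# Export a Hugging Face access token before launching Ruby:
#   export HF_TOKEN=<your token>
llm = Candle::LLM.from_pretrained("meta-llama/Llama-2-7b")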

🔧 Developer Experience

Logging System

New Ruby logger integration for cleaner output:

  • Controlled via CANDLE_VERBOSE environment variable
  • Reduced noise from Rust internals
  • Better debugging experience

Repository Migration

  • Moved from assaydepot to scientist-labs organization
  • Updated all documentation and CI/CD pipelines
  • Maintained backward compatibility

📊 Testing

  • Comprehensive test suite for custom tokenizer functionality
  • All tests passing with improved coverage
  • Added specs for truncation and max length configuration
  • Security vulnerability regression tests

🔄 Migration Guide

From v1.2.x to v1.3.0

  1. Ruby Version: Ensure you have Ruby 3.1.0 or higher (previously required 3.2.0)

  2. Custom Tokenizers: If you have models failing due to missing tokenizers, you can now specify one:

    # Before: Would fail
    llm = Candle::LLM.from_pretrained("model-without-tokenizer")
    
    # After: Works with explicit tokenizer
    llm = Candle::LLM.from_pretrained(
      "model-without-tokenizer",
      tokenizer: "appropriate-tokenizer"
    )
  3. Reranker Configuration: Take advantage of performance tuning:

    # Optimize for shorter documents
    reranker = Candle::Reranker.from_pretrained(
      "BAAI/bge-reranker-base",
      max_length: 256
    )
  4. Logging: Use the new logging system:

    # Enable verbose logging
    CANDLE_VERBOSE=1 ruby your_script.rb

🐛 Bug Fixes

  • Fixed regex validation security vulnerabilities
  • Resolved build issues with certain dependency combinations
  • Fixed Phi-3 model loading failures
  • Corrected truncation behavior in rerankers

📝 Documentation

  • Updated MODEL_SUPPORT.md with current compatibility matrix
  • Enhanced HUGGINGFACE.md with custom tokenizer examples
  • Added comprehensive examples for new features

📦 Installation

gem install red-candle -v 1.3.0

Or add to your Gemfile:

gem 'red-candle', '~> 1.3.0'

Red Candle v1.2.0 Release

10 Aug 03:04
ae1a949


🎯 Major API Standardization

This release brings significant API improvements focused on consistency and developer experience.

Standardized from_pretrained Interface

All model classes now use a consistent from_pretrained method for loading models:

# Before (v1.1.0) - inconsistent APIs
embedding = Candle::EmbeddingModel.new(repo: "BAAI/bge-base-en-v1.5")
reranker = Candle::Reranker.new(repo: "BAAI/bge-reranker-base")
llm = Candle::LLM.new(model_path: "path/to/model")

# After (v1.2.0) - standardized
embedding = Candle::EmbeddingModel.from_pretrained("BAAI/bge-base-en-v1.5")
reranker = Candle::Reranker.from_pretrained("BAAI/bge-reranker-base")
llm = Candle::LLM.from_pretrained("TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF", 
                                  gguf_file: "tinyllama-1.1b-chat-v1.0.Q3_K_S.gguf")

Consistent Parameter Naming

  • Renamed repo → model_id across all model types
  • Standardized device parameter handling
  • Unified tokenizer specification approach

Symbol-keyed Hashes

Model metadata and configuration hashes now use symbol keys, in keeping with Ruby idioms:

# Configuration hashes use symbols
config = { temperature: 0.7, max_length: 100, seed: 42 }
llm.generate("Hello", config: config)

🧪 Testing Infrastructure Overhaul

Migration to RSpec

  • Complete migration from Minitest to RSpec
  • 274 comprehensive specs with 82.73% code coverage
  • Shared examples for common model architectures
  • Improved test organization and maintainability

CI/CD Improvements

  • S3 Model Caching: Pre-downloaded models cached in AWS S3 for faster CI runs
  • HuggingFace Rate Limiting: Eliminated HTTP 429 errors with offline mode
  • Optimized Test Pipeline: Removed Valgrind from regular CI (20+ min → 2 min)
  • Separate workflows for memory profiling

🚀 Additional Improvements

  • Enhanced error messages with actionable solutions
  • Improved tokenizer auto-detection for GGUF models
  • Better device compatibility testing
  • Consolidated model caching logic
  • Documentation reorganization (moved to docs/ directory)
  • New Red Candle logo assets

📦 Dependency Updates

  • Added RSpec as development dependency
  • Updated rake tasks for new test framework

🔄 Breaking Changes

  • EmbeddingModel.new(repo:) → EmbeddingModel.from_pretrained(model_id)
  • Reranker.new(repo:) → Reranker.from_pretrained(model_id)
  • LLM.new(model_path:) → LLM.from_pretrained(model_id)
  • Test suite now uses RSpec instead of Minitest

🔧 Migration Guide

Update your code to use the new standardized API:

# Update model loading
model = Candle::EmbeddingModel.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")

# Use symbolic keys in configurations
config = { temperature: 0.8, max_length: 200 }

For testing, switch from Minitest to RSpec patterns and run tests with rake spec instead of rake test.

Red Candle 1.1.0 Release 🚀

27 Jul 03:07
a383e6c


We're excited to announce Red Candle 1.1.0, our first major update bringing powerful structured generation capabilities and expanded model support!

🎯 Structured Generation with Outlines

Red Candle now integrates outlines-core to enable structured generation. Control your model outputs with JSON schemas or regular expressions to ensure you always get valid, parseable results.

Quick Examples

JSON Schema Constraints

require 'candle'

llm = Candle::LLM.from_pretrained("TheBloke/Mistral-7B-Instruct-v0.2-GGUF",
                                  gguf_file: "mistral-7b-instruct-v0.2.Q4_K_M.gguf")

# Define your schema
schema = {
  type: "object",
  properties: {
    answer: { type: "string", enum: ["yes", "no"] },
    confidence: { type: "number", minimum: 0, maximum: 1 }
  },
  required: ["answer"]
}

# Get structured output (automatically parsed!)
result = llm.generate_structured("Is Ruby a programming language?", schema: schema)
puts result
# => {"answer"=>"yes", "confidence"=>0.95}

Regular Expression Constraints

# Phone numbers
phone_constraint = llm.constraint_from_regex('\d{3}-\d{3}-\d{4}')
config = Candle::GenerationConfig.balanced(constraint: phone_constraint)
phone = llm.generate("Generate a phone number:", config: config)
# => "555-123-4567"

# Dates
date_constraint = llm.constraint_from_regex('\d{4}-\d{2}-\d{2}')
date = llm.generate("Today's date:", config: Candle::GenerationConfig.new(constraint: date_constraint))
# => "2024-12-19"

🆕 New Model Support

Phi Models

Red Candle now supports Microsoft's Phi family of models:

# Phi-2 (2.7B parameters)
llm = Candle::LLM.from_pretrained("microsoft/phi-2")
result = llm.generate("Write a factorial function in Python:")

# Phi-2 GGUF (quantized)
llm = Candle::LLM.from_pretrained("TheBloke/phi-2-GGUF",
                                  gguf_file: "phi-2.Q4_K_M.gguf")

# Phi-3 with chat interface
llm = Candle::LLM.from_pretrained("microsoft/Phi-3-mini-4k-instruct")
messages = [
  { role: "system", content: "You are a helpful coding assistant." },
  { role: "user", content: "How do I reverse a string in Ruby?" }
]
response = llm.chat(messages)

Qwen Models

Support for Alibaba's Qwen family:

# Qwen2 GGUF models (recommended)
llm = Candle::LLM.from_pretrained("Qwen/Qwen2-7B-Instruct-GGUF",
                                  gguf_file: "qwen2-7b-instruct-q4_k_m.gguf")

# Qwen2.5 models
llm = Candle::LLM.from_pretrained("Qwen/Qwen2.5-7B-Instruct-GGUF",
                                  gguf_file: "qwen2.5-7b-instruct-q4_k_m.gguf")

# Chat with Qwen
messages = [
  { role: "user", content: "Explain Ruby blocks in simple terms" }
]
response = llm.chat(messages)

Note: Qwen3 GGUF support requires candle-transformers > 0.9.1. We recommend using Qwen2 or Qwen2.5 models for now.

📊 Structured Generation + New Models

Combine structured generation with any supported model:

# Use Phi for structured code generation
phi = Candle::LLM.from_pretrained("microsoft/phi-2")
code_schema = {
  type: "object",
  properties: {
    language: { type: "string", enum: ["python", "ruby", "javascript"] },
    code: { type: "string" },
    complexity: { type: "string", enum: ["simple", "intermediate", "advanced"] }
  }
}
code = phi.generate_structured("Write a quicksort implementation", schema: code_schema)

# Use Qwen for structured analysis
qwen = Candle::LLM.from_pretrained("Qwen/Qwen2-7B-Instruct-GGUF",
                                   gguf_file: "qwen2-7b-instruct-q4_k_m.gguf")
sentiment_schema = {
  type: "object",
  properties: {
    sentiment: { type: "string", enum: ["positive", "negative", "neutral"] },
    score: { type: "number", minimum: -1, maximum: 1 }
  }
}
analysis = qwen.generate_structured("Ruby makes programming fun!", schema: sentiment_schema)

🚀 Performance Tips

  1. Model Selection: Larger models (7B+) have 90-95% success rate with complex schemas
  2. Temperature: Use lower temperatures (0.1-0.5) for more reliable structured generation (see the sketch after this list)
  3. Max Length: Ensure sufficient tokens for complete JSON output
  4. Schema Design: Keep schemas as simple as possible while meeting requirements
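
Putting tips 2 and 3 into practice (a hedged sketch; passing a GenerationConfig to generate_structured via config: is an assumption based on the generate examples above):

# Low temperature plus a generous max_length for complete JSON output
config = Candle::GenerationConfig.new(temperature: 0.2, max_length: 512)
result = llm.generate_structured("Classify this review:", schema: schema, config: config)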

📦 Installation

gem install red-candle

Or add to your Gemfile:

gem 'red-candle', '~> 1.1.0'

🔧 Breaking Changes

None! This release maintains full backward compatibility with 1.0.x.

🐛 Bug Fixes

  • Improved tokenizer auto-detection for GGUF models
  • Better error messages for model loading failures
  • Fixed memory leaks in streaming generation

🙏 Acknowledgments

Thanks to the outlines-core team for their excellent structured generation library, and to all contributors who helped with testing and feedback!


Ready to add structure to your LLM outputs? Update to 1.1.0 and start generating exactly what you need! 🎯✨

Red Candle 1.0.0 Release 🚀

27 Jul 02:57
8fd0179


We're thrilled to announce the 1.0.0 release of red-candle, bringing the power of Hugging Face's Candle ML library to the Ruby ecosystem!

What is Red Candle?

Red Candle is a Ruby gem that provides native access to state-of-the-art machine learning models through Rust bindings. It enables Ruby developers to leverage modern NLP capabilities with the performance and efficiency of Rust.

Features

🤖 Large Language Models (LLMs)

  • Support for both quantized (GGUF) and unquantized (SafeTensors) formats
  • Built-in support for popular models:
    • Mistral - Efficient and powerful instruction-following models
    • Gemma - Google's lightweight, open models
    • Llama - Meta's family of foundation models
  • Streaming generation with customizable parameters
  • Chat interface with automatic template formatting

📊 NLP Tools

  • Embedding Models - Generate high-quality text embeddings for semantic search and similarity (see the sketch after this list)
  • Rerankers - Improve search relevance with cross-encoder models
  • Named Entity Recognition (NER) - Extract entities from text with pre-trained models
  • Tokenizers - Direct access to model tokenization with full vocabulary control
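
A quick taste of the embedding and reranking tools (a minimal sketch using the 1.0-era constructors; the embedding and rerank method names are assumptions based on the project README):

require 'candle'

# Embed a sentence for semantic search
model = Candle::EmbeddingModel.new(repo: "BAAI/bge-base-en-v1.5")
vector = model.embedding("Ruby is a joy to write")

# Improve search relevance with a cross-encoder reranker
reranker = Candle::Reranker.new(repo: "BAAI/bge-reranker-base")
ranked = reranker.rerank("fast ml inference",
                         ["Candle is a Rust ML framework.", "Cooking with candles."])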

🚀 Performance

  • Hardware acceleration support:
    • Metal (Apple Silicon)
    • CUDA (NVIDIA GPUs)
    • Optimized CPU inference
  • Automatic device detection and selection (explicit override shown below)
  • Memory-efficient model loading
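
Devices are selected automatically, but can be overridden (a sketch; the Candle::Device constructors and the device: keyword follow the project README and are assumptions here):

# Force CPU; Candle::Device.metal or Candle::Device.cuda where available
device = Candle::Device.cpu
llm = Candle::LLM.from_pretrained("TheBloke/Mistral-7B-Instruct-v0.2-GGUF",
                                  gguf_file: "mistral-7b-instruct-v0.2.Q4_K_M.gguf",
                                  device: device)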

Getting Started

require 'candle'

# Load an LLM
llm = Candle::LLM.from_pretrained("TheBloke/Mistral-7B-Instruct-v0.2-GGUF",
                                  gguf_file: "mistral-7b-instruct-v0.2.Q4_K_M.gguf")

# Generate text
response = llm.generate("What is Ruby?")

# Or use the chat interface
messages = [
  { role: "user", content: "Explain Ruby in one sentence." }
]
response = llm.chat(messages)

Installation

gem install red-candle

Requirements

  • Ruby 3.0+
  • Rust toolchain (for compilation)
  • Optional: CUDA toolkit or Metal framework for GPU acceleration

What's Next?

This is just the beginning! We're committed to expanding Red Candle's capabilities and would love your feedback and contributions. Check out our GitHub repository for examples, documentation, and to report issues.

Acknowledgments

Special thanks to the Hugging Face team for the incredible Candle library, and to all our early testers and contributors who helped shape this release.


Happy coding with Red Candle! 🕯️✨