
Add Ollama as a supported provider #10


Closed

Wants to merge 62 commits.

Changes from all commits (62 commits):
af06338
WIP: first bits of Ollama provider, adapted from Gemini
ldmosquera Mar 12, 2025
55151a0
Basic chat functionality
ldmosquera Mar 12, 2025
c7b9860
Basic streaming functionality
ldmosquera Mar 12, 2025
f294070
Basic embedding functionality
ldmosquera Mar 12, 2025
8b9899f
Mention Ollama in getting-started.md
ldmosquera Mar 12, 2025
a0617fe
Rubocop autocorrects
ldmosquera Mar 12, 2025
b65132e
More rubocop appeasement
ldmosquera Mar 12, 2025
88cffc3
WIP: start adding Ollama provider tests
ldmosquera Mar 24, 2025
836e961
Fix embeddings return value
ldmosquera Mar 24, 2025
4d8151d
Rubocop appeasement
ldmosquera Mar 25, 2025
fb9697c
Add VCR cassettes
ldmosquera Mar 25, 2025
1298c19
Resolve FIXMEs
ldmosquera Mar 25, 2025
88ca59e
Hint about need for models.refresh!
ldmosquera Mar 25, 2025
5dac2bc
Expose more model metadata
ldmosquera Mar 25, 2025
099978e
Streamline usage in docs
ldmosquera Mar 26, 2025
d668c75
Remove leftovers
ldmosquera Mar 26, 2025
ad7ce33
Streamline provider to be more like OpenAI's
ldmosquera Mar 26, 2025
0f8684c
Rubocop appeasement
ldmosquera Mar 26, 2025
4cd54fb
WIP: fold back all ollama specs into existing files
ldmosquera Mar 26, 2025
af55015
Small fixes
ldmosquera Mar 26, 2025
b3c0296
Merge branch 'main' into ollama-provider
crmne Mar 26, 2025
5cdbd1a
WIP: closer to a refresh! solution
ldmosquera Mar 26, 2025
a9e9cdc
Remove leftovers
ldmosquera Mar 26, 2025
18ec084
Describe the streaming mystery some more; still no proper solution
ldmosquera Mar 26, 2025
8735f95
Streamline capabilities some more (but still mostly placeholders)
ldmosquera Mar 26, 2025
8d7fecf
Fix un-VCR'd `models.refresh!`
ldmosquera Mar 27, 2025
f377120
Parse errors correctly
ldmosquera Mar 27, 2025
b8d3c9b
Fix streaming
ldmosquera Mar 27, 2025
ce9f430
Settle on llama3.1:8b model
ldmosquera Mar 27, 2025
e0d072f
Implement vision
ldmosquera Mar 26, 2025
5a62188
Implement tool calling
ldmosquera Mar 27, 2025
87f8955
Attempt to parse tool calling markup in text response
ldmosquera Mar 27, 2025
e810a7a
Use low temperature when tools are involved
ldmosquera Mar 27, 2025
68c257d
Passing tools spec
ldmosquera Mar 27, 2025
bb2e1bf
Rubocop appeasement
ldmosquera Mar 27, 2025
c57613c
Refresh cassettes for good measure
ldmosquera Mar 27, 2025
24a8eed
Models involved in tests are no longer tiny :shrug:
ldmosquera Mar 27, 2025
4284eb9
Don't send empty tools property
ldmosquera Mar 28, 2025
79fe8e0
Rubocop appeasement
ldmosquera Mar 28, 2025
2f27a4e
Fix tools
ldmosquera Mar 28, 2025
a3635b5
Add unit test for preprocess_tool_calls
ldmosquera Mar 28, 2025
af04773
No need to replace tool call markup
ldmosquera Mar 28, 2025
b26a2de
Merge remote-tracking branch 'origin/main' into ollama-provider
ldmosquera Apr 1, 2025
ba56fa2
Update cassettes
ldmosquera Apr 2, 2025
414d009
Merge remote-tracking branch 'origin/main' into ollama-provider
ldmosquera Apr 11, 2025
7b4406f
Revert "Use low temperature when tools are involved"
ldmosquera Apr 11, 2025
a7b40af
Configure Ollama from env var when needed
ldmosquera Apr 11, 2025
b22460b
Update ollama related cassetes
ldmosquera Apr 11, 2025
111bda1
Merge remote-tracking branch 'origin/main' into ollama-provider
ldmosquera Apr 17, 2025
758374f
Merge model list
ldmosquera Apr 17, 2025
45aaf1c
HACK: manually add in gpt-4.1-nano to model list response
ldmosquera Apr 17, 2025
cafac09
Use a low temperature for tool specs
ldmosquera Apr 17, 2025
1667851
Refresh cassettes
ldmosquera Apr 17, 2025
bd9138f
Merge remote-tracking branch 'origin/main' into ollama-provider
ldmosquera Apr 17, 2025
eb846e5
Merge remote-tracking branch 'origin/main' into ollama-provider
ldmosquera Apr 17, 2025
591668c
Fix streaming response token count report
ldmosquera Apr 17, 2025
26be17e
Merge remote-tracking branch 'origin/main' into ollama-provider
ldmosquera Apr 22, 2025
6ad89b3
Adapt to new config API
ldmosquera Apr 22, 2025
0fd1759
Don't assume all providers add an api_key setting
ldmosquera Apr 22, 2025
2fcc890
Update cassettes
ldmosquera Apr 22, 2025
ad1bb2b
Remove stale code from main
ldmosquera Apr 22, 2025
7ab29e0
Appease rubocop
ldmosquera Apr 22, 2025
1 change: 1 addition & 0 deletions bin/console
@@ -17,6 +17,7 @@ RubyLLM.configure do |config|
config.bedrock_secret_key = ENV.fetch('AWS_SECRET_ACCESS_KEY', nil)
config.bedrock_region = ENV.fetch('AWS_REGION', nil)
config.bedrock_session_token = ENV.fetch('AWS_SESSION_TOKEN', nil)
config.ollama_api_base_url = ENV.fetch('OLLAMA_API_BASE_URL', nil)
end

IRB.start(__FILE__)
2 changes: 1 addition & 1 deletion docs/guides/getting-started.md
@@ -132,4 +132,4 @@ You've seen the basics! Now you're ready to explore RubyLLM's features in more d
* [Using Tools]({% link guides/tools.md %}) (Letting AI call your code)
* [Streaming Responses]({% link guides/streaming.md %})
* [Rails Integration]({% link guides/rails.md %})
* [Error Handling]({% link guides/error-handling.md %})
* [Error Handling]({% link guides/error-handling.md %})
3 changes: 1 addition & 2 deletions docs/installation.md
@@ -71,7 +71,6 @@ require 'ruby_llm'
RubyLLM.configure do |config|
# Set keys for the providers you need. Using environment variables is best practice.
config.openai_api_key = ENV.fetch('OPENAI_API_KEY', nil)
# Add other keys like config.anthropic_api_key if needed
end
```

@@ -112,4 +111,4 @@ Now that you've installed RubyLLM:

* Read the **[Configuration Guide]({% link configuration.md %})** for all setup options.
* Check out the **[Getting Started Guide]({% link guides/getting-started.md %})** for basic usage examples.
* Explore other **[Guides]({% link guides/index.md %})** for specific features like Chat, Tools, Embeddings, etc.
* Explore other **[Guides]({% link guides/index.md %})** for specific features like Chat, Tools, Embeddings, etc.
4 changes: 3 additions & 1 deletion lib/ruby_llm.rb
@@ -17,7 +17,8 @@
'api' => 'API',
'deepseek' => 'DeepSeek',
'bedrock' => 'Bedrock',
'openrouter' => 'OpenRouter'
'openrouter' => 'OpenRouter',
'ollama' => 'Ollama'
)
loader.ignore("#{__dir__}/tasks")
loader.ignore("#{__dir__}/ruby_llm/railtie")
@@ -79,6 +80,7 @@ def logger
RubyLLM::Provider.register :anthropic, RubyLLM::Providers::Anthropic
RubyLLM::Provider.register :gemini, RubyLLM::Providers::Gemini
RubyLLM::Provider.register :deepseek, RubyLLM::Providers::DeepSeek
RubyLLM::Provider.register :ollama, RubyLLM::Providers::Ollama
RubyLLM::Provider.register :bedrock, RubyLLM::Providers::Bedrock
RubyLLM::Provider.register :openrouter, RubyLLM::Providers::OpenRouter

1 change: 1 addition & 0 deletions lib/ruby_llm/configuration.rb
@@ -16,6 +16,7 @@ class Configuration
:anthropic_api_key,
:gemini_api_key,
:deepseek_api_key,
:ollama_api_base_url,
:bedrock_api_key,
:bedrock_secret_key,
:bedrock_region,
47 changes: 47 additions & 0 deletions lib/ruby_llm/providers/ollama.rb
@@ -0,0 +1,47 @@
# frozen_string_literal: true

module RubyLLM
module Providers
# Native Ollama API implementation
module Ollama
extend Provider
extend Ollama::Chat
extend Ollama::Embeddings
extend Ollama::Models
extend Ollama::Streaming
extend Ollama::Media
extend Ollama::Tools

module_function

def api_base(config)
# no default since this is the only configuration for this provider,
# so it must be provided deliberately
config.ollama_api_base_url
end

def headers(_config)
{}
end

def capabilities
Ollama::Capabilities
end

def slug
'ollama'
end

def configuration_requirements
%i[ollama_api_base_url]
end

def parse_error(response)
return if response.body.empty?

body = try_parse_json(response.body)
body['error']
end
end
end
end
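
For orientation, here is a minimal usage sketch of the provider defined above. It assumes the public RubyLLM entry points (`RubyLLM.configure`, `RubyLLM.models.refresh!`, `RubyLLM.chat`) and a local Ollama server; the base URL value and the `llama3.1:8b` tag (the model the specs settled on) must match whatever your server has pulled.

```ruby
require 'ruby_llm'

RubyLLM.configure do |config|
  # The only setting this provider needs; there is no API key.
  # For a default local install this is typically http://localhost:11434 (assumption).
  config.ollama_api_base_url = ENV.fetch('OLLAMA_API_BASE_URL', nil)
end

# Ollama models aren't shipped in models.json, so pull the list from the server first.
RubyLLM.models.refresh!

chat = RubyLLM.chat(model: 'llama3.1:8b')
puts chat.ask('Summarise what Ollama is in one sentence.').content
```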
63 changes: 63 additions & 0 deletions lib/ruby_llm/providers/ollama/capabilities.rb
@@ -0,0 +1,63 @@
# frozen_string_literal: true

module RubyLLM
module Providers
module Ollama
# Determines capabilities for Ollama
module Capabilities
module_function

# FIXME: none of these facts are available from the Ollama server,
# or from the Ollama library (https://ollama.com/library) in a structured way.

# Returns the context window size (input token limit) for the given model
# @param model_id [String] the model identifier
# @return [Integer] the context window size in tokens
def context_window_for(_model_id)
# FIXME: placeholder
4_192 # Sensible (and conservative) default for unknown models
end

# Returns the maximum output tokens for the given model
# @param model_id [String] the model identifier
# @return [Integer] the maximum output tokens
def max_tokens_for(_model_id)
# FIXME: placeholder
32_768
end

# Determines if the model supports vision (image/video) inputs
# @param model_id [String] the model identifier
# @return [Boolean] true if the model supports vision inputs
def supports_vision?(_model_id)
# FIXME: placeholder
false
end

# Determines if the model supports function calling
# @param model_id [String] the model identifier
# @return [Boolean] true if the model supports function calling
def supports_functions?(_model_id)
# FIXME: placeholder
true
end

# Determines if the model supports JSON mode
# @param model_id [String] the model identifier
# @return [Boolean] true if the model supports JSON mode
def supports_json_mode?(_model_id)
# FIXME: placeholder
false
end

# Returns the type of model (chat, embedding, image)
# @param model_id [String] the model identifier
# @return [String] the model type
def model_type(_model_id)
# FIXME: placeholder
'chat'
end
end
end
end
end
40 changes: 40 additions & 0 deletions lib/ruby_llm/providers/ollama/chat.rb
@@ -0,0 +1,40 @@
# frozen_string_literal: true

module RubyLLM
module Providers
module Ollama
# Chat methods for the Ollama API implementation
module Chat
module_function

def completion_url
'api/chat'
end

def render_payload(messages, tools:, temperature:, model:, stream: false)
{
model: model,
messages: Media.format_messages(messages),
options: {
temperature: temperature
},
stream: stream
}.tap { |h| h.merge!(tools: tools.map { |_, t| tool_for(t) }) if tools.any? }
end

def parse_completion_response(response)
data = Tools.preprocess_tool_calls(response.body)

Message.new(
role: :assistant,
content: data.dig('message', 'content'),
model_id: data['model'],
input_tokens: data['prompt_eval_count'].to_i,
output_tokens: data['eval_count'].to_i,
tool_calls: parse_tool_calls(data.dig('message', 'tool_calls'))
)
end
end
end
end
end
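
For reference, `parse_completion_response` above expects a body shaped like Ollama's non-streaming `/api/chat` reply. The field names are taken from the code; the literal values below are illustrative only:

```ruby
body = {
  'model'             => 'llama3.1:8b',
  'message'           => { 'role' => 'assistant', 'content' => 'Hello there!' },
  # 'message' may also carry a 'tool_calls' array when the model invokes a tool
  'prompt_eval_count' => 12, # mapped to input_tokens
  'eval_count'        => 5   # mapped to output_tokens
}
```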
48 changes: 48 additions & 0 deletions lib/ruby_llm/providers/ollama/embeddings.rb
@@ -0,0 +1,48 @@
# frozen_string_literal: true

module RubyLLM
module Providers
module Ollama
# Embeddings methods for the Ollama API integration
module Embeddings
module_function

def embedding_url
'api/embed'
end

def render_embedding_payload(text, model:)
{
model: model,
input: format_text_for_embedding(text)
}
end

def parse_embedding_response(response)
vectors = response.body['embeddings']
model_id = response.body['model']
input_tokens = response.body['prompt_eval_count'] || 0
vectors = vectors.first if vectors.size == 1

Embedding.new(
vectors: vectors,
model: model_id,
# only available when passing a single string input
input_tokens: input_tokens
)
end

private

def format_text_for_embedding(text)
# Ollama supports either a string or a string array here
unless text.is_a?(Array) || text.is_a?(String)
raise NotImplementedError, "unsupported argument for Ollama embedding: #{text.class}"
end

text
end
end
end
end
end
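
A usage sketch for the two input shapes handled above, assuming the Ollama configuration from earlier and an embedding model already pulled into the server (the `nomic-embed-text` tag is an example, not something this PR pins):

```ruby
# Single string: parse_embedding_response unwraps the outer array,
# so .vectors is one embedding (an array of floats).
single = RubyLLM.embed('Ruby is a nice language', model: 'nomic-embed-text')
puts single.vectors.length # embedding dimension

# String array: /api/embed handles the batch and .vectors stays an array of embeddings.
batch = RubyLLM.embed(['first document', 'second document'], model: 'nomic-embed-text')
puts batch.vectors.size # => 2
```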
43 changes: 43 additions & 0 deletions lib/ruby_llm/providers/ollama/media.rb
@@ -0,0 +1,43 @@
# frozen_string_literal: true

module RubyLLM
module Providers
module Ollama
# Handles formatting of text or media content for Ollama
module Media
module_function

def format_messages(messages)
messages.map do |msg|
text, images = separate_by_type(msg)

{
role: msg.role.to_s,
content: text
}.tap { |h| h.merge!(images: images) if images.any? }
end
end

def separate_by_type(msg) # rubocop:disable Metrics/MethodLength
text = nil
images = []

if msg.content.is_a?(Array)
msg.content.each do |part|
case part[:type]
when 'text'
text = part[:text]
when 'image'
images << part[:source][:data]
end
end
else
text = msg.content
end

[text, images]
end
end
end
end
end
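
A sketch of exercising the image path above through the chat API. It assumes RubyLLM's `with:` attachment option and a vision-capable tag such as `llava` pulled locally; the model name and file path are illustrative, not part of this diff:

```ruby
chat = RubyLLM.chat(model: 'llava:7b')

# Media.format_messages puts the text in `content` and the image data
# into the message's `images` array, which is what Ollama expects.
response = chat.ask('What is shown in this picture?', with: { image: 'path/to/picture.jpg' })
puts response.content
```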
71 changes: 71 additions & 0 deletions lib/ruby_llm/providers/ollama/models.rb
@@ -0,0 +1,71 @@
# frozen_string_literal: true

module RubyLLM
module Providers
module Ollama
# Models methods for the Ollama API integration
module Models
# Methods needed by Provider - must be public
def models_url
'api/tags'
end

# FIXME: include aliases for tags with the format \d+m or \d+b
# i.e. given these models on the server,
# - gemma3:27b
# - gemma3:9b
#
# create an alias gemma3 for gemma3:27b

# NOTE: Unlike other providers for well-known APIs with stable model
# offerings, the Ollama provider deals with local servers which
# might have arbitrarily named models, or even zero models installed.
#
# This provider therefore can't ship hardcoded assumptions in models.json,
# so no Ollama models are known at runtime until you call
# `RubyLLM.models.refresh!` to populate your instance's model registry.

def list_models(connection:)
config = connection.config
response = connection.get('api/tags') do |req|
req.headers.merge!(headers(config))
end

parse_list_models_response(response, slug, capabilities)
end

private

def parse_list_models_response(response, slug, capabilities) # rubocop:disable Metrics/MethodLength,Metrics/AbcSize
(response.body['models'] || []).map do |model|
model_id = model['name']

ModelInfo.new(
id: model_id,
# NOTE: this is the date the model was pulled into the Ollama server, not the date the model was introduced
created_at: model['modified_at'],
display_name: model_id,
provider: slug,
type: capabilities.model_type(model_id),
family: model['family'],
context_window: capabilities.context_window_for(model_id),
max_tokens: capabilities.max_tokens_for(model_id),
supports_vision: capabilities.supports_vision?(model_id),
supports_functions: capabilities.supports_functions?(model_id),
supports_json_mode: capabilities.supports_json_mode?(model_id),
input_price_per_million: 0.0,
output_price_per_million: 0.0,
metadata: {
byte_size: model['size']&.to_i,
parameter_size: model.dig('details', 'parameter_size'),
quantization_level: model.dig('details', 'quantization_level'),
format: model.dig('details', 'format'),
parent_model: model.dig('details', 'parent_model')
}
)
end
end
end
end
end
end
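
Following the note above, a small sketch of inspecting what a local server actually serves after a refresh; it assumes `RubyLLM.models.all` plus the `ModelInfo` readers and the metadata keys used in `parse_list_models_response`:

```ruby
RubyLLM.models.refresh!

RubyLLM.models.all.select { |m| m.provider == 'ollama' }.each do |m|
  puts "#{m.id} (#{m.metadata[:parameter_size]}, #{m.metadata[:quantization_level]})"
end
```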