Add Ollama as a supported provider #10
Conversation
This was done by copying and adapting the existing Gemini provider. Tool usage and media were excised for now; I still need to research what Ollama offers and how it overlaps with this project.

Also, commit 9b387c1 disables all providers by default. The intention is to be able to work on Ollama (or any other single provider) without being forced to provide valid keys for EVERY provider, which also incurs live calls. This matters in particular because Ollama doesn't come with any default models. As mentioned in the commit message, this might be too intrusive for the intended usage of this project, so an alternative is to only do this when a specific env var puts ruby-llm into such a "default offline" mode. This might interfere with tests and/or the models update rake task, neither of which I touched yet since they also require valid keys for all providers.

As for tests, I'd like some guidance on how to implement unit testing and eventually integration testing specifically for Ollama, in a way that does not require configuring and using all APIs (and the attendant cost). Since Ollama does not come with default models, I suggest a separate test suite that first ensures that a tiny model is downloaded into the Ollama server via its API.
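For illustration, a minimal sketch of that "ensure a tiny test model is present" idea, calling Ollama's `/api/pull` endpoint directly. The model name and host are illustrative, and whatever this PR ends up doing in a rake task or spec hook may differ:

```ruby
require 'net/http'
require 'json'

# Ask the local Ollama server to download a small model if it isn't present yet.
uri = URI('http://localhost:11434/api/pull')
request = Net::HTTP::Post.new(uri, 'Content-Type' => 'application/json')
request.body = { name: 'qwen2.5:0.5b', stream: false }.to_json

response = Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
puts response.body # e.g. {"status":"success"} once the model has finished downloading
```

Whether this lives in a rake task or a `before(:suite)` hook is an open question.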
Was just about to file an issue for this — awesome!
Two new features to consider in your provider implementation:
In general, I understand this can be used to specify a provider when the model is served by more than one provider. But for that, we have to know for sure that the provider supports that model, and in the case of Ollama there are no guarantees that a given server will have any models installed, so we can't hardcode any static assumptions into the gem. Rather, the list of models is only available after querying the server at runtime.

I realize this deviates from the general pattern of the gem, which aims to provide a good out-of-the-box experience for well-known API providers, but by definition Ollama is not that. It can be used with arbitrary models even beyond Ollama's official library - you can pull from HuggingFace or even create your own models with arbitrary names. I still think supporting Ollama within reasonable limits is a valuable addition to the gem; we just have to figure out those limits.
Any guidance on this?
And on this? I didn't consider it originally, but I see now that the test suite has no support for being run without valid keys for all providers, which is a barrier for me. Is there any chance the test suite can be mocked as much as possible (requests for metadata, etc.) so that only tests that NEED to run API inference fail? This is the only way I can see to mitigate the key requirement problem while still running as much of the test suite as possible. Another option is to create an entirely separate suite of tests just for the Ollama provider, but the reduced integration test coverage is obviously problematic even on paper.
Because of the recently added VCR support, I was able to get the tests to pass (with empty strings in all the API key environment variables). I wish the tests didn't require these, but it's an easy fix.
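For illustration, a cassette-backed spec along these lines is the kind of thing VCR enables here; this is a generic VCR + WebMock sketch with illustrative names, not necessarily how this repo's `spec_helper` is actually wired up:

```ruby
require 'ruby_llm'
require 'vcr'
require 'webmock/rspec'

VCR.configure do |vcr|
  vcr.cassette_library_dir = 'spec/fixtures/vcr_cassettes'
  vcr.hook_into :webmock
  # Keep real keys out of recorded cassettes
  vcr.filter_sensitive_data('<OPENAI_API_KEY>') { ENV['OPENAI_API_KEY'] }
end

RubyLLM.configure do |config|
  # An empty string is enough when replaying existing cassettes
  config.openai_api_key = ENV.fetch('OPENAI_API_KEY', '')
end

RSpec.describe 'chat' do
  it 'answers a simple prompt from a recorded cassette' do
    VCR.use_cassette('chat_simple_prompt') do
      chat = RubyLLM.chat(model: 'gpt-4.1-nano')
      response = chat.ask('Say hello')
      expect(response.content).not_to be_empty
    end
  end
end
```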
Excellent! I somehow missed that commit earlier. I'm now unblocked and will work on tests for this provider.
Branch force-pushed from `6bcdfa2` to `31034be`.
Tests added, but a few provisional hacks are still in place and will likely need revisiting.

Taking a step back, all of these hacks have to do with the requirement for valid API keys for every provider. It's a bit of a rough marriage, but we can make it work :D We just need to figure out a sane way to conditionally turn off all but one provider.

All current tests pass when relying on VCR cassettes.
Valid point @ldmosquera! I'm working on a solution and will commit it shortly.
Added configuration requirements handling in 75f99a1. Each provider now specifies which configuration keys it requires via a simple declaration.
Example of the new error messages:

    RubyLLM::ConfigurationError: anthropic provider is not configured. Add this to your initialization:

    RubyLLM.configure do |config|
      config.anthropic_api_key = ENV['ANTHROPIC_API_KEY']
    end
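For illustration, the per-provider declaration could look something like the sketch below; the method names (`configuration_requirements`, `configured?`) and the `ollama_api_base` key are assumptions, not necessarily what 75f99a1 actually implements:

```ruby
module RubyLLM
  module Providers
    module Ollama
      module_function

      # Config keys that must be set in RubyLLM.configure before this provider is usable
      def configuration_requirements
        %i[ollama_api_base]
      end

      def configured?(config)
        configuration_requirements.all? { |key| !config.send(key).to_s.empty? }
      end
    end
  end
end
```

Whatever the exact shape, the nice property is that only the providers you actually use need real keys.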
Branch force-pushed from `31034be` to `c8cd658`.
Smashing 😎 All major hacks dropped, save for one.
Branch force-pushed from `62f6859` to `18978b5`.
The implementation path is straightforward - Ollama is just another provider that needs to:
- Implement the standard provider interface
- Override specific methods where needed
- Use the existing test infrastructure
- Get documented like other providers
I think the only change w.r.t. other providers is that the documentation should note that models must be refreshed prior to using the API.
Happy to help out. The goal is making Ollama work cleanly within RubyLLM's existing patterns.
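As a sketch of that documented flow, assuming an `ollama_api_base` setting and a model that has already been pulled into the local server (both names are illustrative):

```ruby
RubyLLM.configure do |config|
  config.ollama_api_base = 'http://localhost:11434' # assumed config key; the PR may name it differently
end

# Ollama has no static model list, so ask the running server before chatting
RubyLLM.models.refresh!

chat = RubyLLM.chat(model: 'llama3.1')
puts chat.ask('Hello!').content
```

The `refresh!` call is the piece the documentation note above would call out, since unlike other providers there is no useful static model registry to fall back on.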
Conflicts:
- `docs/guides/getting-started.md`
- `lib/ruby_llm.rb`
- `lib/ruby_llm/configuration.rb`
- `spec/fixtures/vcr_cassettes/models_refresh_updates_models_and_returns_a_chainable_models_instance.yml`
- `spec/fixtures/vcr_cassettes/models_refresh_works_as_a_class_method_too.yml`
- `spec/ruby_llm/chat_spec.rb`
- `spec/ruby_llm/chat_streaming_spec.rb`
- `spec/ruby_llm/chat_tools_spec.rb`
All cassettes for Ollama are up to date; I manually copied the new Bedrock responses for model listings. VCR data will be synthetic until someone can run the tests with all providers properly enabled, including a local Ollama server that has run the rake task to download the test models.
Merged main and resolved conflicts; echoing the commit message in ba56fa2 for visibility.
@crmne @ldmosquera just dropping appreciation! 🙇 🙏
Conflicts:
- `spec/fixtures/vcr_cassettes/models_refresh_updates_models_and_returns_a_chainable_models_instance.yml`
- `spec/fixtures/vcr_cassettes/models_refresh_works_as_a_class_method_too.yml`
- `spec/ruby_llm/chat_content_spec.rb`
- `spec/ruby_llm/chat_spec.rb`
- `spec/ruby_llm/chat_streaming_spec.rb`
- `spec/ruby_llm/chat_tools_spec.rb`
- `spec/ruby_llm/embeddings_spec.rb`
- `spec/spec_helper.rb`
In retrospect, this is too automagic and ultimately not needed at the library level. This reverts commit e810a7a.
@ldmosquera I'd love to see Ollama support merged soon. Do you think you can resolve the conflicts? Once you're ready, set this PR as ready for review and I'll take a look.
Conflicts:
- `docs/guides/getting-started.md`
- `spec/ruby_llm/chat_content_spec.rb`
- `spec/ruby_llm/chat_spec.rb`
- `spec/ruby_llm/chat_streaming_spec.rb`
- `spec/ruby_llm/chat_tools_spec.rb`
After tests were changed to use `gpt-4.1-nano` in 7017dcf, these cassettes were not refreshed:

- `spec/fixtures/vcr_cassettes/models_refresh_updates_models_and_returns_a_chainable_models_instance.yml`
- `spec/fixtures/vcr_cassettes/models_refresh_works_as_a_class_method_too.yml`

My branch performs `models.refresh!` under a cassette named `spec/fixtures/vcr_cassettes/initial_model_refresh.yml`, into which I have been copying responses from the original files above, but right now I don't have model-list responses that include the new `gpt-4.1-nano`, and tests can't run with unknown models. The proper solution is to delete all these files and re-run the tests to get fresh data from each source; meanwhile, this commit jury-rigs nano into the response.
Llama-3.1 has trouble with tool usage at default temperatures: it will sometimes use the available tools, but most of the time it opts not to and hallucinates instead, making the tests brittle. A lower temperature mitigates this and will probably reflect real-world usage anyway.
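For context, the kind of spec setup this affects would look roughly like the following sketch; `Weather` is a made-up tool for illustration, and `with_temperature`/`with_tool` are assumed to be available on the chat object:

```ruby
# Made-up tool, used only to exercise tool calling in specs
class Weather < RubyLLM::Tool
  description 'Returns a canned weather report'
  param :city, desc: 'City name'

  def execute(city:)
    "It is sunny in #{city}."
  end
end

chat = RubyLLM.chat(model: 'llama3.1')
              .with_temperature(0.2) # low temperature => more consistent tool use
              .with_tool(Weather)

response = chat.ask("What's the weather in Berlin?")
puts response.content
```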
Ready for review; I imagine there will be some more back and forth before it's mergeable. Apologies and thanks in advance 😬
Conflicts: spec/ruby_llm/chat_tools_spec.rb
Conflicts: spec/ruby_llm/chat_tools_spec.rb
Branch force-pushed from `bf80ff9` to `591668c`.
Conflicts:
- `docs/installation.md`
- `lib/ruby_llm.rb`
- `lib/tasks/model_updater.rb`
- `spec/ruby_llm/chat_content_spec.rb`
- `spec/ruby_llm/chat_spec.rb`
- `spec/ruby_llm/chat_streaming_spec.rb`
- `spec/ruby_llm/chat_tools_spec.rb`
Merged main; diff to yesterday here.
Just curious - wouldn't it be easier to use OpenAI's compatibility layer in Ollama? https://ollama.com/blog/openai-compatibility The same approach worked for OpenRouter, so maybe it's also the case here.
Hey @khasinski, I'm pondering the same. The reasons why I preferred a true Ollama implementation are summed up perfectly by Ollama themselves:
However, we could simply test these assumptions and roll it out as OpenAI-compatible if everything works. Case in point: OpenRouter claims in their documentation that they don't support PDFs, but you can check out the tests I just committed in RubyLLM.
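For reference, a rough sketch of what routing through the compatibility layer could look like, assuming the gem exposes a configurable OpenAI base URL (called `openai_api_base` here purely for illustration):

```ruby
RubyLLM.configure do |config|
  config.openai_api_key  = 'ollama'                    # Ollama ignores the key, but an OpenAI-style client expects one
  config.openai_api_base = 'http://localhost:11434/v1' # Ollama's OpenAI-compatible endpoint (assumed setting)
end

# Model names still come from the local server rather than a static registry,
# so a refresh (or an explicit override) would likely still be needed.
RubyLLM.models.refresh!
chat = RubyLLM.chat(model: 'llama3.1', provider: :openai)
puts chat.ask('Why is the sky blue?').content
```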
At this point the PR has full coverage for the Ollama API, including repeatable tests plus a rake task that automates downloading the models needed for those tests. If we were to scale back to simply using Ollama's OpenAI compatibility layer, some Ollama-specific code for parsing responses could be dropped, but other code for querying models would still need to remain, and the tests would need to be rethought. That, along with the fact that Ollama's OpenAI compatibility is not yet complete as mentioned above, makes me think we should continue targeting the Ollama API, at least for now. I can commit to watching for incoming Ollama-related issues and providing support.

This could be revisited in the future; over time it would be great to have generic live model-querying capabilities, so that OpenAI compatibility can be used to point at any compatible self-hosted engine like Ollama, KoboldCPP, LlamaCPP, TabbyAPI and so on. The messy part is still dealing with uncertain self-hosted model capabilities - this is inherent, as people can stand up models with any context size, with or without a media projector in the case of GGUF-based vision models, etc.
As mentioned in issue #2, Ollama support is valuable for users interested in self-hosted/offline inference.
This PR adds initial support for Ollama, including chat completions, streaming, and embeddings. No tool support yet; it needs further investigation.
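A quick usage sketch of those pieces, assuming the provider is configured and the models are present locally as in the earlier examples (model names are illustrative):

```ruby
chat = RubyLLM.chat(model: 'llama3.1')

# Chat completion
puts chat.ask('Give me a one-line Ruby tip').content

# Streaming: the block receives chunks as they arrive
chat.ask('Tell me a short story') do |chunk|
  print chunk.content
end

# Embeddings
embedding = RubyLLM.embed('Ruby is red', model: 'nomic-embed-text')
puts embedding.vectors.length
```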
PR is a rough draft and will likely take some back and forth to get it merge-ready; more comments to follow.
Closes #2