WIP: Replace llama.cpp with ollama #3542

Draft: wants to merge 40 commits into main
Conversation

cebtenzzre (Member)

Summary of changes as of 3/19

new directories:

  • gpt4all-backend: an entirely new backend for GPT4All which includes a REST client for ollama (a request sketch follows this list).
  • gpt4all-backend-test: a test for ollama client functionality written before the gpt4all-chat changes.
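
For orientation, here is a minimal standalone sketch of the kind of request the new client issues, targeting ollama's documented REST endpoint GET /api/tags (model listing) on the default port. It uses QJson for brevity; the actual client listed below is built on Boost.JSON and has its own class names, which are not shown here.

```cpp
// Minimal sketch, not the PR's client API: list the models an ollama server
// exposes via its documented REST endpoint, GET /api/tags.
#include <QCoreApplication>
#include <QDebug>
#include <QJsonArray>
#include <QJsonDocument>
#include <QJsonObject>
#include <QNetworkAccessManager>
#include <QNetworkReply>

int main(int argc, char *argv[])
{
    QCoreApplication app(argc, argv);
    QNetworkAccessManager nam;
    // 11434 is ollama's default listen port; adjust for a remote server.
    auto *reply = nam.get(QNetworkRequest(QUrl("http://localhost:11434/api/tags")));
    QObject::connect(reply, &QNetworkReply::finished, [&] {
        const auto doc = QJsonDocument::fromJson(reply->readAll());
        for (const auto &model : doc["models"].toArray())
            qInfo() << model.toObject()["name"].toString();
        reply->deleteLater();
        app.quit();
    });
    return app.exec();
}
```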

removed directories:

  • gpt4all-bindings: there is no longer a llama.cpp backend for them to use. no plan for Embed4All users yet.

moved directories:

  • gpt4all-backend -> gpt4all-backend-old: renaming this directory while we transition gpt4all-chat off of it.

new files:

  • deps/CMakeLists.txt: cmake configuration for shared dependencies between gpt4all-chat and gpt4all-backend.
  • gpt4all-backend/deps/CMakeLists.txt: cmake configuration for backend dependencies.
  • gpt4all-backend/include/gpt4all-backend/formatters.h: fmt helpers split off from gpt4all-chat/utils.h.
  • gpt4all-backend/include/gpt4all-backend/generation-params.h: obsolete; these are now defined in gpt4all-chat.
  • gpt4all-backend/include/gpt4all-backend/json-helpers.h: helpers for (de-)serializing Qt types with Boost.JSON (sketched below the file list).
  • gpt4all-backend/include/gpt4all-backend/ollama-client.h: a REST client for ollama.
  • gpt4all-backend/include/gpt4all-backend/ollama-model.h: obsolete; this is now defined in gpt4all-chat.
  • gpt4all-backend/include/gpt4all-backend/ollama-types.h: (de-)serializable types used in ollama client requests/responses.
  • gpt4all-backend/include/gpt4all-backend/rest.h: helpers for working with REST APIs. used by ollama-client and gpt4all-chat.
  • gpt4all-backend/src/CMakeLists.txt: cmake configuration for backend sources.
  • gpt4all-backend/src/json-helpers.cpp: json-helpers.h implementation.
  • gpt4all-backend/src/ollama-client.cpp: ollama-client.h implementation.
  • gpt4all-backend/src/ollama-types.cpp: ollama-types.h implementation.
  • gpt4all-backend/src/qt-json-stream.{cpp,h}: a QIODevice wrapping a boost::json::value, similar in concept to a std::ostringstream (sketched below the file list).
  • gpt4all-backend/src/rest.cpp: implementation for rest.h.
  • gpt4all-chat/qml/AddCustomProviderView.qml: a new tab of the "add model" page for adding new custom OpenAI or ollama providers.
  • gpt4all-chat/qml/CustomProviderCard.qml: a card in AddCustomProviderView representing an ollama or OpenAI provider.
  • gpt4all-chat/src/creatable.h: helper for classes derived from std::enable_shared_from_this.
  • gpt4all-chat/src/json-helpers.{cpp,h}: helpers for (de-)serializing Qt types with Boost.JSON which only gpt4all-chat needs.
  • gpt4all-chat/src/llmodel_chat.{cpp,h}: base class for working with an LLM that can generate text.
  • gpt4all-chat/src/llmodel_description.{cpp,h}: base class for working with the description of an ollama or OpenAI model.
  • gpt4all-chat/src/llmodel_ollama.{cpp,h}: classes for working with ollama providers and models.
  • gpt4all-chat/src/llmodel_openai.{cpp,h}: classes for working with OpenAI providers and models. derived from the obsolete chatapi.h.
  • gpt4all-chat/src/llmodel_provider.{cpp,h,inl}: class for representing a model provider (type + name + base URL) which is either builtin or custom, and is serialized to the models directory if needed.
  • gpt4all-chat/src/llmodel_provider_builtins.cpp: hardcoded, ordered list of built-in model providers, taken from qml.
  • gpt4all-chat/src/main.h: exposes a singleton QNetworkAccessManager which should be used for all network requests on the main thread.
  • gpt4all-chat/src/qmlfunctions.{cpp,h}: free functions needed by QML go here, since QML can only call instance methods.
  • gpt4all-chat/src/qmlsharedptr.{cpp,h}: a basic shared pointer for QML. QSharedPointer has no built-in QML equivalent.
  • gpt4all-chat/src/store_base.{cpp,h,inl}: base class for managing a collection of (de-)serialized objects (such as providers) using Boost.JSON.
  • gpt4all-chat/src/store_provider.{cpp,h}: a class that manages a collection of (de-)serialized providers.
  • requirements-docs.txt: defines the dependencies required to build the docs. extracted from the python bindings' pyproject.toml.
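
Two of the files above deserve a concrete illustration. First, the json-helpers listed for both the backend and the chat app hook Qt types into Boost.JSON through its tag_invoke customization point. The PR's exact overload set isn't shown here, so this is a minimal sketch of what such a helper typically looks like for QString:

```cpp
// Minimal sketch of Boost.JSON (de-)serialization helpers for a Qt type.
// These exact overloads are an assumption; the PR's json-helpers.h may differ.
#include <boost/json.hpp>
#include <QString>

void tag_invoke(const boost::json::value_from_tag &, boost::json::value &jv, const QString &s)
{
    const QByteArray utf8 = s.toUtf8();
    jv = boost::json::string_view(utf8.constData(), std::size_t(utf8.size()));
}

QString tag_invoke(const boost::json::value_to_tag<QString> &, const boost::json::value &jv)
{
    const auto &str = jv.as_string();
    return QString::fromUtf8(str.data(), int(str.size()));
}
```

With overloads like these in scope, boost::json::value_from(someQString) and boost::json::value_to<QString>(jv) resolve through argument-dependent lookup, which is presumably how the store and client code round-trip Qt types.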
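Second, qt-json-stream is described as a QIODevice over a boost::json::value. The core idea can be sketched by driving boost::json::serializer from QIODevice::readData, so the JSON is produced incrementally rather than materialized up front; the class name and details here are assumptions, not the PR's code:

```cpp
#include <QIODevice>
#include <boost/json.hpp>

// Sketch: stream a boost::json::value through the QIODevice interface by
// serializing it chunk by chunk instead of building the whole string at once.
class JsonStream : public QIODevice // hypothetical name
{
public:
    explicit JsonStream(boost::json::value value, QObject *parent = nullptr)
        : QIODevice(parent), m_value(std::move(value))
    {
        m_serializer.reset(&m_value);
        open(QIODevice::ReadOnly);
    }

    bool isSequential() const override { return true; }

protected:
    qint64 readData(char *data, qint64 maxSize) override
    {
        if (m_serializer.done())
            return -1; // end of stream
        // serializer::read() fills the buffer and returns the chunk written
        return qint64(m_serializer.read(data, std::size_t(maxSize)).size());
    }

    qint64 writeData(const char *, qint64) override { return -1; } // read-only

private:
    boost::json::value m_value;
    boost::json::serializer m_serializer;
};
```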

removed files:

  • .github/ISSUE_TEMPLATE/bindings-bug.md: removed as the bindings are now gone.
  • gpt4all-chat/src/chatapi.{cpp,h}: replaced by llmodel_openai.{cpp,h}.

changed files:

  • .circleci/config.yml: removed references to the bindings.
  • .circleci/continue_config.yml: removed references to the bindings.
  • .codespellrc: updated whitelist.
  • MAINTAINERS.md: removed references to the bindings.
  • README.md: removed references to llama.cpp and the bindings.
  • common/common.cmake: added color diagnostics for ninja.
  • gpt4all-chat/CMakeLists.txt: modified lists of source files and dependencies.
  • gpt4all-chat/deps/CMakeLists.txt: modified dependencies.
  • gpt4all-chat/qml/AddModelView.qml: added custom providers tab.
  • gpt4all-chat/qml/AddRemoteModelView.qml: replaced static providers with a dynamic list.
  • gpt4all-chat/qml/ApplicationSettings.qml: removed llama.cpp-specific settings.
  • gpt4all-chat/qml/ModelSettings.qml: removed llama.cpp-specific settings.
  • gpt4all-chat/qml/RemoteModelCard.qml: remote providers now use the classes in gpt4all-chat.
  • gpt4all-chat/src/chat.{cpp,h}: removed llama.cpp-specific state.
  • gpt4all-chat/src/chatlistmodel.cpp: bumped .chat file version.
  • gpt4all-chat/src/chatllm.{cpp,h}: partway through replacing llama.cpp use with ChatLLMInstance use.
  • gpt4all-chat/src/chatmodel.h: replaced an #include since the referenced code moved.
  • gpt4all-chat/src/database.cpp: replaced an #include since the referenced code moved.
  • gpt4all-chat/src/embllm.cpp: partway through replacing llama.cpp-specific code.
  • gpt4all-chat/src/jinja_helpers.cpp: unnecessary #include change.
  • gpt4all-chat/src/main.cpp: added a global QNetworkAccessManager, and exposed the ProviderRegistry to QML.
  • gpt4all-chat/src/modellist.{cpp,h}: moved out OpenAI-specific code and started integrating ModelDescription.
  • gpt4all-chat/src/mysettings.{cpp,h}: removed llama.cpp-specific settings, added a hardcoded user agent, and started changing generation params.
  • gpt4all-chat/src/network.cpp: removed llama.cpp-specific analytics.
  • gpt4all-chat/src/server.{cpp,h}: started adapting to chatllm.{cpp,h} changes.
  • gpt4all-chat/src/utils.{h,inl}: added more simple utilities.
  • gpt4all-training/old-README.md: removed references to the bindings.

new deps:

  • deps/qcoro: a library for building C++20 coroutines (async functions) using Qt's event loop; a usage sketch follows this list.
  • gpt4all-backend/deps/date: time zone aware date parsing required for parsing timestamps in ollama responses. will drop when all platforms support __cpp_lib_chrono >= 201907L.
  • gpt4all-chat/deps/generator: third-party implementation of std::generator. will drop when all platforms support __cpp_lib_generator >= 202207L.
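
qcoro is what lets the REST client read as straight-line async code. A minimal sketch of the pattern, using QCoro's stock QNetworkReply wrapper (the helper name and URL handling here are placeholders, not the PR's code):

```cpp
#include <QCoroNetworkReply>
#include <QCoroTask>
#include <QNetworkAccessManager>
#include <QNetworkReply>
#include <memory>

// Sketch: a C++20 coroutine that suspends on Qt's event loop while an HTTP
// request is in flight, instead of blocking or chaining signal handlers.
QCoro::Task<QByteArray> fetchBody(QNetworkAccessManager &nam, const QUrl &url) // hypothetical helper
{
    std::unique_ptr<QNetworkReply> reply(nam.get(QNetworkRequest(url)));
    co_await qCoro(reply.get()).waitForFinished(); // non-blocking wait
    co_return reply->readAll();
}
```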

moved deps:

  • gpt4all-chat/deps/fmt -> deps/fmt: fmt is now used by the backend as well.

changed deps:

  • gpt4all-chat/deps/minja: changed to fix #include path of nlohmann/json.

Signed-off-by: Jared Van Bortel <[email protected]>
2025 is too soon to use C++ features from 2020 without running into bugs
in every build tool that touches the project.
@Titaniumtown

Is this simply going to use an ollama API endpoint? Or is ollama actually integrated inside of gpt4all in this PR?

@iwr-redmond

iwr-redmond commented Mar 22, 2025

I have the same question as @Titaniumtown. I recently catalogued Ollama's recurring issues with non-standard installation processes, and wouldn't like to see GPT4all jump into that quagmire.

@Titaniumtown

Thank you @iwr-redmond, issues such as those were going to be my follow-up. If ollama can somehow be integrated inside of gpt4all so that it is seamless to the user, I would be in favor, as long as it is used simply as an abstraction layer over llama.cpp and not an external server you need to connect to.

@iwr-redmond

The Ollama devs have decided to shoot a hole in the screen door and abandon llama.cpp in favor of a custom inference engine. I reckon that pushes this PR into wet shoe territory.

@Titaniumtown

Oh yikes. Ollama is really going down the drain.
