We decided to use the gpt4all library for its ease of use and modularity, with support for our first offline LLM, but the library does not yet support GPU usage. There is an open PR to add support for some GPUs, but it has not yet been merged.
We can look into going lower in the stack and experimenting with the ctransformers library directly to run Llama 2 with GPU acceleration.
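A minimal sketch of what that experiment could look like, assuming a GGML-format Llama 2 model from the Hugging Face Hub (the repo name below is an example, not one chosen by this issue) and a CUDA- or Metal-enabled build of ctransformers. The `gpu_layers` parameter controls how many transformer layers are offloaded to the GPU:

```python
# Hypothetical experiment: load a quantized Llama 2 model with
# ctransformers and offload layers to the GPU. Requires a GPU-enabled
# ctransformers install and downloads the model on first run.
from ctransformers import AutoModelForCausalLM

llm = AutoModelForCausalLM.from_pretrained(
    "TheBloke/Llama-2-7B-Chat-GGML",  # example model repo, not prescribed here
    model_type="llama",
    gpu_layers=50,  # number of layers to offload; 0 falls back to CPU-only
)

# Quick smoke test of GPU-backed generation.
print(llm("Q: What is the capital of France? A:", max_new_tokens=32))
```

If this works well, it would give us direct control over GPU offloading without waiting on the gpt4all PR, at the cost of reimplementing the convenience layer gpt4all provides.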