Replies: 1 comment
This is such a clean proposal, and honestly you're not wrong: being able to explicitly control `num_thread` per call (rather than relying on the global default) has a real impact on indexing performance, especially when running mixed workloads or testing different quantized models. I've run into similar bottlenecks before where LLM calls saturated CPU threads unintentionally, just because the defaults weren't adaptive, and patching this manually every time got old fast.

One small note: depending on how LightRAG wires up the config layer, you might need to guard against `.env` getting loaded after the internal Ollama client gets instantiated. We once hit a silent override because the thread count was frozen too early in the call graph; a rough sketch of that pitfall follows. Just tossing that in, in case it saves someone a few hours of head-scratching later.

If this ends up moving forward, I'd be curious to test it on a few fringe setups.
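A minimal sketch of the load-order pitfall, assuming python-dotenv and the official `ollama` Python client; the variable name `OLLAMA_NUM_THREAD` mirrors the proposal below, and none of this is LightRAG's actual code:

```python
import os

from dotenv import load_dotenv  # python-dotenv
import ollama

# Pitfall: if the env var is read (or the client/config is built) before
# load_dotenv() runs, the value from .env is silently ignored and Ollama
# keeps its default thread count.

load_dotenv()  # load .env FIRST, before any config is frozen

raw = os.getenv("OLLAMA_NUM_THREAD")
NUM_THREAD = int(raw) if raw else None  # None means "leave Ollama's default"

client = ollama.Client()  # instantiate only after the config is final
```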
It would be helpful to add a setting to the LightRAG `.env` file that allows users to specify the number of CPU threads passed to Ollama in LLM requests (e.g., for generating completions and embeddings). Ollama can take a `num_thread` parameter in the API that controls how many cores it uses for each request. Right now, there's no way to set this from the LightRAG configuration, so it always defaults to whatever system or global setting is in place. For high-load indexing or performance tuning, it's useful to be able to control this per run or workspace.

Proposal

• Introduce a new optional setting in `.env`, something like `OLLAMA_NUM_THREAD=8`.
• On each request to Ollama, if this variable is set, LightRAG includes `"num_thread": <value>` in the API call (see the sketch after this list).
• If it's left unset, default behavior should be unchanged.
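A minimal sketch of what the per-request wiring could look like, using the official `ollama` Python client (whose API does accept `num_thread` inside the `options` object); the helper `ollama_options()` and the model name are assumptions for illustration, not LightRAG's actual code:

```python
import os
import ollama

def ollama_options() -> dict:
    """Hypothetical helper: build per-request options; empty when unset."""
    raw = os.getenv("OLLAMA_NUM_THREAD")
    return {"num_thread": int(raw)} if raw else {}

response = ollama.chat(
    model="qwen2.5",  # illustrative model name
    messages=[{"role": "user", "content": "hello"}],
    options=ollama_options(),  # {} leaves Ollama's default threading unchanged
)
```

Passing an empty `options` dict keeps the unset case identical to today's behavior, which satisfies the third bullet above.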
Benefits
• Easier performance tuning for different hardware (especially multi-core CPUs like M1/M2/M3).
• More predictable and efficient CPU allocation when running multiple jobs or sharing a system.
• Avoids needing to manually patch code or set shell variables every time.