
Feat: Streaming support for all OpenAI-compatible API proxies (LiteLLM, llama.cpp, llamafile, LM Studio, ...) #2355

Open
quantumalchemy opened this issue Jan 16, 2025 · 1 comment


@quantumalchemy

Is your feature request related to a problem? Please describe.

All OpenAI-compatible API proxies (LiteLLM, llama.cpp, llamafile, LM Studio) support streaming.
It would be great if Letta could support this as well, rather than only the OpenAI backend.

The biggest advantage would be lower latency for small local LLMs served through OpenAI-compatible API proxies such as LiteLLM, llama.cpp, llamafile, and LM Studio. Open-source LLMs are getting faster, better, and smaller.
I have been testing this over the last week with a 1.5B model (Dolphin3.0-Qwen2.5-1.5B.i1-Q6_K.gguf) working with Letta, including function calling etc., running on a CPU / edge device and served by llamafile - https://github.com/Mozilla-Ocho/llamafile (llama.cpp).
If we could get streaming enabled, that would be fantastic! Thanks!
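
For reference, here is a minimal sketch of what streaming already looks like against one of these local proxies using the `openai` Python client; the `base_url`, API key, and model name are placeholders for whatever the local server exposes, not Letta configuration:

```python
# Minimal sketch: streaming tokens from a local OpenAI-compatible server
# (llamafile / llama.cpp / LiteLLM / LM Studio) via the openai Python client.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",  # placeholder: point at your local proxy
    api_key="sk-no-key-required",         # most local proxies accept any non-empty key
)

stream = client.chat.completions.create(
    model="local-model",                  # placeholder: use the model name your proxy reports
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,                          # server streams chunks instead of one final response
)

# Tokens arrive as they are generated, which is what cuts perceived latency.
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
```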

@sarahwooders
Collaborator

Yes, we are looking into this!
