Description
Problem Statement:
Currently, Dapr Agents waits for the full LLM response before returning results. This increases time-to-first-token and degrades the user experience, especially for long-form completions. LLMs inherently generate output token-by-token, and many use cases benefit from streaming responses as they're produced.
Objective:
Implement streaming support for LLM output in Dapr Agents to improve perceived latency and responsiveness. Ensure consistent behavior across agent types, and specifically in DaprChatClient (and its internal dependency, the Dapr Conversation API).
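A minimal sketch of what a token-by-token interface could look like on the client side. `StreamingChatClient`, `chat_stream`, and the fake client below are assumptions for illustration, not existing Dapr Agents APIs:

```python
import asyncio
from typing import AsyncIterator, Protocol


class StreamingChatClient(Protocol):
    """Proposed shape for a streaming-capable chat client (hypothetical)."""

    def chat_stream(self, prompt: str) -> AsyncIterator[str]:
        """Yield completion tokens as the backing LLM produces them."""
        ...


class FakeStreamingClient:
    """Stand-in for a streaming DaprChatClient; emits a canned reply token by token."""

    async def chat_stream(self, prompt: str) -> AsyncIterator[str]:
        for token in ["Streaming ", "reduces ", "time ", "to ", "first ", "token."]:
            await asyncio.sleep(0.05)  # simulate per-token generation latency
            yield token


async def main() -> None:
    client: StreamingChatClient = FakeStreamingClient()
    async for token in client.chat_stream("Explain streaming"):
        print(token, end="", flush=True)
    print()


asyncio.run(main())
```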
Acceptance Criteria:
- Token-by-token streaming supported for all agent types.
- DaprChatClient exposes a streaming API or callback mechanism (see the callback sketch after this list).
- Dapr Conversation API supports HTTP and gRPC streaming (as applicable).
- Examples and docs updated to demonstrate streaming usage.
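For the callback mechanism mentioned above, one possible shape for synchronous callers. `complete_stream`, `on_token`, and the canned tokens are illustrative assumptions only; a real implementation would receive chunks from the Dapr Conversation API's HTTP/gRPC streaming endpoint:

```python
from typing import Callable, List, Optional

# Hypothetical callback-based streaming; not part of the current DaprChatClient surface.
TokenCallback = Callable[[str], None]


def complete_stream(prompt: str, on_token: Optional[TokenCallback] = None) -> str:
    """Return the full completion, invoking `on_token` for each chunk as it arrives."""
    tokens: List[str] = []
    for token in ["Callbacks ", "also ", "work ", "for ", "sync ", "callers."]:
        tokens.append(token)
        if on_token is not None:
            on_token(token)
    return "".join(tokens)


full_text = complete_stream("Explain streaming", on_token=lambda t: print(t, end="", flush=True))
print()
```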