Enable Streaming LLM Output #80

Open

@bibryam

Description

Problem Statement:
Currently, Dapr Agents waits for the full LLM response before returning results. This creates latency and degrades user experience, especially for long-form completions. LLMs inherently generate output token-by-token, and many use cases benefit from streaming responses as they’re produced.

Objective:
Implement streaming support for LLM output in Dapr Agents to improve perceived latency and responsiveness. Ensure consistent behavior across all agent types, and specifically in DaprChatClient (and its internal dependency, the Dapr Conversation API).

Acceptance Criteria:

  • Token-by-token streaming supported for all agent types.
  • DaprChatClient exposes a streaming API or callback mechanism (a possible shape is sketched after this list).
  • Dapr Conversation API supports HTTP and gRPC streaming (as applicable).
  • Examples and docs updated to demonstrate streaming usage.
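
For illustration, one possible shape for the streaming API is a generator that yields incremental chunks. The sketch below is only an assumption about what such an API could look like: `StreamingChatClient`, `ChatChunk`, and `generate_stream` are hypothetical names standing in for whatever DaprChatClient ends up exposing, and the canned token list stands in for chunks arriving from the Dapr Conversation API over an HTTP or gRPC stream.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass
class ChatChunk:
    """One incremental piece of a streamed completion (hypothetical type)."""
    content: str
    done: bool = False


class StreamingChatClient:
    """Illustrative stand-in for DaprChatClient; the real class and
    method names may differ."""

    def generate_stream(self, prompt: str) -> Iterator[ChatChunk]:
        # A real implementation would forward the request to the Dapr
        # Conversation API and yield chunks as they arrive over an
        # HTTP or gRPC stream; here canned tokens stand in for that.
        for token in ["Hello", ", ", "world", "!"]:
            yield ChatChunk(content=token)
        yield ChatChunk(content="", done=True)


if __name__ == "__main__":
    client = StreamingChatClient()
    for chunk in client.generate_stream("Say hello"):
        if not chunk.done:
            # Print tokens as they arrive instead of waiting for the
            # full response, which is the point of this issue.
            print(chunk.content, end="", flush=True)
    print()
```

A callback-based variant could wrap the same generator, invoking a user-supplied per-chunk function for each yielded item, so both mechanisms named in the acceptance criteria could share one underlying implementation.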
