generated from kyma-project/template-repository
Open
Description
Problem
Our Kyma Companion is significantly slower than other AI-based tools such as Perplexity.
Looking at our current workflow:
_start → InitialSummarization → Gatekeeper → Supervisor
→ { Common | KubernetesAgent | KymaAgent } → Finalizer → _end
We see that the Supervisor orchestrates multiple specialized agents (Common, KubernetesAgent, KymaAgent).
At present, these calls run sequentially, which increases end-to-end latency.
Proposal: Optimization Areas
- Parallelization of Agents
  - Invoke Common, KubernetesAgent, and KymaAgent concurrently rather than sequentially.
- Streaming Responses
  - Begin streaming tokens from finalizer back to the user.
  - Improves perceived speed significantly.
- Caching Strategies
  - Cache frequent results from the RAG agent (e.g., generic Kubernetes answers).
- Monitoring & Metrics
  - Track key latency metrics:
    - TPOT (Time per Output Token)
    - Agent-level latency breakdowns (Common, KubernetesAgent, KymaAgent)
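To make the parallelization and agent-level latency ideas above concrete, here is a minimal sketch using only Python's `asyncio`. It is not the Companion's actual code: the three agent functions, their simulated delays, and the `timed` and `run_parallel` helpers are hypothetical stand-ins. It assumes the agents' subtasks are independent of each other; if one agent needs another's output, those calls would still have to run in order.

```python
import asyncio
import time

# Hypothetical stand-ins for the real agents; the sleeps simulate LLM/tool latency.
async def common_agent(query: str) -> str:
    await asyncio.sleep(0.10)
    return f"common: {query}"

async def kubernetes_agent(query: str) -> str:
    await asyncio.sleep(0.20)
    return f"kubernetes: {query}"

async def kyma_agent(query: str) -> str:
    await asyncio.sleep(0.15)
    return f"kyma: {query}"

AGENTS = {
    "Common": common_agent,
    "KubernetesAgent": kubernetes_agent,
    "KymaAgent": kyma_agent,
}

async def timed(name, agent, query):
    """Run one agent and record its wall-clock latency in seconds."""
    start = time.perf_counter()
    result = await agent(query)
    return name, result, time.perf_counter() - start

async def run_parallel(query: str):
    """Dispatch all agents concurrently and collect per-agent latencies."""
    start = time.perf_counter()
    outcomes = await asyncio.gather(
        *(timed(name, agent, query) for name, agent in AGENTS.items())
    )
    total = time.perf_counter() - start
    results = {name: result for name, result, _ in outcomes}
    latencies = {name: latency for name, _, latency in outcomes}
    return results, latencies, total

if __name__ == "__main__":
    results, latencies, total = asyncio.run(run_parallel("why is my pod pending?"))
    for name, latency in latencies.items():
        print(f"{name}: {latency:.3f}s")
    print(f"total (concurrent): {total:.3f}s, sum of agents: {sum(latencies.values()):.3f}s")
```

With concurrent dispatch, the total wall-clock time should approach the latency of the slowest agent rather than the sum of all three, and the per-agent numbers give the breakdown listed under Monitoring & Metrics.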
Steps
- Benchmark current agent call timings across the workflow.
- Prototype parallel agent execution (async agent calls).
- Add token streaming from finalizer.
- Experiment with caching.
- Measure improvements against baseline latency.
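The streaming and measurement steps could be prototyped along these lines. This is a sketch under assumptions: `finalizer_stream` is a hypothetical async generator standing in for the real finalizer, tokens are approximated by whitespace-separated words, and TPOT is computed here as the mean time per token after the first. Time to first token (TTFT) is not listed in this issue; it is included only because it is what streaming improves most directly.

```python
import asyncio
import time
from typing import AsyncIterator

async def finalizer_stream(answer: str) -> AsyncIterator[str]:
    """Hypothetical finalizer that yields one token at a time.

    The sleep simulates per-token generation time.
    """
    for token in answer.split():
        await asyncio.sleep(0.01)
        yield token

async def stream_with_metrics(answer: str):
    """Consume the token stream and compute TTFT and TPOT in seconds."""
    start = time.perf_counter()
    first_token_at = None
    tokens = []
    async for token in finalizer_stream(answer):
        if first_token_at is None:
            first_token_at = time.perf_counter()
        # A real handler would forward the token to the client here.
        tokens.append(token)
    end = time.perf_counter()

    ttft = first_token_at - start if first_token_at is not None else None
    # TPOT: mean time per output token after the first one.
    tpot = (end - first_token_at) / (len(tokens) - 1) if len(tokens) > 1 else None
    return tokens, ttft, tpot

if __name__ == "__main__":
    tokens, ttft, tpot = asyncio.run(stream_with_metrics("the pod is pending"))
    print(f"tokens={len(tokens)} ttft={ttft:.3f}s tpot={tpot:.3f}s")
```

Recording these values before and after each change gives the baseline comparison called for in the last step.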