Optimize Kyma Companion Workflow Latency #804

@tanweersalah

Problem

Kyma Companion is noticeably slower than other AI-based tools such as Perplexity.
Our current workflow looks like this:

_start → InitialSummarization → Gatekeeper → Supervisor 
        → { Common | KubernetesAgent | KymaAgent } → Finalizer → _end

The Supervisor orchestrates multiple specialized agents (Common, KubernetesAgent, KymaAgent).
At present, these agent calls run sequentially, so their latencies add up and increase end-to-end latency.
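
To illustrate why sequential calls add up, here is a minimal sketch that compares sequential and concurrent invocation. The agents are hypothetical stand-ins simulated with `asyncio.sleep`, and the delays are assumed values, not measurements from the Companion.

```python
import asyncio
import time

# Simulated per-agent latencies in seconds (assumed values, not measurements).
DELAYS = {"Common": 0.10, "KubernetesAgent": 0.20, "KymaAgent": 0.15}


async def fake_agent(name: str) -> str:
    """Hypothetical stand-in for an agent call; sleeps to simulate latency."""
    await asyncio.sleep(DELAYS[name])
    return name


async def run_sequential() -> float:
    """Await the agents one after another; total time is roughly the sum."""
    start = time.perf_counter()
    for name in DELAYS:
        await fake_agent(name)
    return time.perf_counter() - start


async def run_concurrent() -> float:
    """Start all agents at once; total time is roughly the slowest agent."""
    start = time.perf_counter()
    await asyncio.gather(*(fake_agent(name) for name in DELAYS))
    return time.perf_counter() - start


if __name__ == "__main__":
    seq = asyncio.run(run_sequential())
    con = asyncio.run(run_concurrent())
    print(f"sequential: {seq:.2f}s, concurrent: {con:.2f}s")
```

This only applies when the agent calls are independent of each other. If one agent needs another's output, those two calls still have to run in order.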

Proposal: Optimization Areas

  1. Parallelization of Agents

    • Invoke Common, KubernetesAgent, and KymaAgent concurrently rather than sequentially.
  2. Streaming Responses

    • Stream tokens from the Finalizer back to the user as they are generated.
    • This significantly improves perceived speed, even if total latency stays the same.
  3. Caching Strategies

    • Cache frequent results from the RAG agent (e.g., generic Kubernetes answers).
  4. Monitoring & Metrics

    • Track key latency metrics:

      • TPOT (Time per Output Token)
      • Agent-level latency breakdowns (Common, KubernetesAgent, KymaAgent)
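
As a starting point for the caching idea above, here is a minimal sketch of an in-memory cache with expiry, keyed by a normalized query string. `TTLCache`, `answer_with_cache`, and `rag_lookup` are hypothetical names for illustration and not part of the Companion codebase.

```python
import time
from typing import Callable, Optional


class TTLCache:
    """Tiny in-memory cache with per-entry expiry (illustrative only)."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # normalized query -> (expires_at, answer)

    @staticmethod
    def _key(query: str) -> str:
        # Normalize case and whitespace so trivially different queries share an entry.
        return " ".join(query.lower().split())

    def get(self, query: str) -> Optional[str]:
        key = self._key(query)
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, answer = entry
        if self._clock() >= expires_at:
            # Entry is stale: drop it and report a miss.
            del self._entries[key]
            return None
        return answer

    def put(self, query: str, answer: str) -> None:
        self._entries[self._key(query)] = (self._clock() + self._ttl, answer)


def answer_with_cache(query: str, cache: TTLCache, rag_lookup: Callable[[str], str]) -> str:
    """Return a cached answer if present, otherwise call the (hypothetical) RAG lookup."""
    cached = cache.get(query)
    if cached is not None:
        return cached
    answer = rag_lookup(query)
    cache.put(query, answer)
    return answer
```

A cache like this probably only suits cluster-independent questions. Answers that depend on live cluster state would need a different key or no caching at all.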

Steps

  • Benchmark current agent call timings across the workflow.
  • Prototype parallel agent execution (async agent calls).
  • Add token streaming from the Finalizer.
  • Experiment with caching.
  • Measure improvements against baseline latency.
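
For the benchmarking and streaming steps, one possible way to measure a token stream is sketched below. It computes TPOT plus time to first token (TTFT), which the issue does not list but which is the number that streaming mainly improves. `fake_token_stream` is a hypothetical stand-in for a streaming Finalizer.

```python
import asyncio
import time
from typing import AsyncIterator, List


async def fake_token_stream(tokens: List[str], delay: float) -> AsyncIterator[str]:
    """Hypothetical streaming Finalizer; yields one token every `delay` seconds."""
    for token in tokens:
        await asyncio.sleep(delay)
        yield token


async def measure_stream(stream: AsyncIterator[str]) -> dict:
    """Consume a token stream and compute token count, TTFT, and TPOT."""
    start = time.perf_counter()
    first_token_at = None
    count = 0
    async for _ in stream:
        count += 1
        if first_token_at is None:
            first_token_at = time.perf_counter()
    end = time.perf_counter()
    if first_token_at is None:
        # Empty stream: nothing to measure.
        return {"tokens": 0, "ttft": 0.0, "tpot": 0.0}
    ttft = first_token_at - start
    # TPOT here is the average gap per output token after the first one.
    tpot = (end - first_token_at) / (count - 1) if count > 1 else 0.0
    return {"tokens": count, "ttft": ttft, "tpot": tpot}


if __name__ == "__main__":
    stats = asyncio.run(measure_stream(fake_token_stream(["a", "b", "c", "d"], 0.02)))
    print(stats)
```

Running the same measurement before and after each change gives a baseline to compare against.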
