This README provides guidance for using the Kagenti project with Llama Stack.

The model can be served either in-cluster or external to the cluster. At minimum, the scenario described here requires OpenShift AI with Llama Stack capabilities enabled.
Ensure the RHOAI operator is installed through OperatorHub.
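If it is not already installed, the operator can also be subscribed from the CLI. The following is a minimal sketch; the channel, namespace, and operator name are assumptions and should be verified against your cluster's OperatorHub catalog:

```yaml
# Sketch of an OperatorHub subscription for RHOAI
# (assumption: verify channel and operator name for your cluster)
apiVersion: v1
kind: Namespace
metadata:
  name: redhat-ods-operator
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: rhods-operator
  namespace: redhat-ods-operator
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: rhods-operator
  namespace: redhat-ods-operator
spec:
  channel: stable            # assumption: confirm the available channel
  name: rhods-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace
```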
An example model deployment is located at `kubernetes/llama3.2-3b/`.
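Assuming the folder contains plain YAML manifests, it can be applied directly; adjust the target namespace to match your environment:

```shell
# Apply the example model deployment
# (assumption: plain manifests; the namespace may be set in the files themselves)
oc apply -f kubernetes/llama3.2-3b/
```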
By default, OpenShift AI does not initialize Llama Stack. When deploying the default DataScienceCluster, modify the YAML to enable the `llamastackoperator` component:

```yaml
llamastackoperator:
  managementState: Managed
```

The operator will become ready once Llama Stack is initialized.
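The same change can be made from the CLI; a sketch, assuming the DataScienceCluster resource is named `default-dsc` (check the actual name with `oc get datasciencecluster`):

```shell
# Enable the Llama Stack operator on an existing DataScienceCluster
# (assumption: the resource is named "default-dsc")
oc patch datasciencecluster default-dsc --type merge \
  -p '{"spec":{"components":{"llamastackoperator":{"managementState":"Managed"}}}}'
```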
To validate that Llama Stack is ready and available, check for LlamaStackDistribution resources and the operator pod:

```shell
$ oc get llsd -n default
No resources found in default namespace.

$ oc get po -n redhat-ods-applications | grep llama
llama-stack-k8s-operator-controller-manager-64554d8c6f-6hkp5   1/1     Running   0          4h33m
```

An example Llama Stack deployment exists in `kubernetes/llama-stack-dist/`. Modify `VLLM_URL` and `INFERENCE_MODEL` to match your environment before deploying:
```yaml
env:
  - name: INFERENCE_MODEL
    value: "llama32-3b"
  - name: VLLM_URL
    value: "https://llama32-3b.serving.svc.cluster.local/v1"
```
The Llama Stack endpoint can be tested with a port-forward and a curl request. In one terminal, forward the service port:

```shell
kubectl port-forward -n serving svc/lsd-llama32-3b-service 8321:8321
```

Then, in another terminal, send a chat completion request:

```shell
curl -X POST http://localhost:8321/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "llama32-3b",
"messages": [
{
"role": "user",
"content": "Hello, how are you?"
}
],
"max_tokens": 100,
"stream": false
}'
```

Kagenti is a Kubernetes-based control plane for AI agents. It provides a framework-neutral, scalable, and secure platform for deploying and orchestrating AI agents.
Prerequisites:

- Kagenti installed on the cluster (see the Kagenti installation guide)
- LlamaStack endpoint deployed and accessible (see sections above)
Once Kagenti is installed, access the UI:
- URL: `https://kagenti-ui-kagenti-system.apps.<cluster-domain>/`
- Credentials: check with your cluster administrator (default: `temp-admin` / auto-generated password)
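If the exact URL is not known, the hostname can be read from the UI's OpenShift route; a sketch, assuming the route is named `kagenti-ui` in the `kagenti-system` namespace:

```shell
# Print the Kagenti UI hostname (assumption: route name "kagenti-ui")
oc get route kagenti-ui -n kagenti-system -o jsonpath='{.spec.host}'
```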
Kagenti supports deploying agents that use any OpenAI-compatible LLM endpoint, including LlamaStack.
For a complete, tested deployment with MCP tools, use the `kubernetes/kagenti-llamastack-poc` folder:
```shell
cd kubernetes/kagenti-llamastack-poc/scripts
chmod +x *.sh
./01-setup.sh            # Setup namespace & permissions
./02-deploy-agent.sh     # Build and deploy agent
./03-deploy-tools.sh     # Deploy MCP tools (weather, calculator)
./04-patch-mcp-urls.sh   # Connect tools to agent
./05-test.sh             # Test everything
```

See `kubernetes/kagenti-llamastack-poc/README.md` for full documentation, including workarounds and troubleshooting.
- Access the Kagenti UI: `https://kagenti-ui-kagenti-system.apps.llama.octo-emerging.redhataicoe.com/`
- Navigate to "Import New Agent"
- Configure the agent:
  - Namespace: `kagenti-system` (or create a new one)
  - Deployment Method: Build from source
  - Repository URL: `https://github.com/kagenti/agent-examples`
  - Subfolder: `a2a/generic_agent`
  - Protocol: A2A
- Add environment variables:

  | Variable | Value |
  |---|---|
  | `LLM_MODEL` | `vllm-inference/llama32-3b` |
  | `LLM_API_BASE` | `http://lsd-llama32-3b-service.serving.svc.cluster.local:8321/v1` |
  | `LLM_API_KEY` | `dummy` |
  | `MCP_TRANSPORT` | `streamable_http` |

- Click "Build New Agent" and wait for the deployment to complete.
For manual deployment, see the YAML manifests in `kubernetes/kagenti-llamastack-poc/agent/`.
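For reference, the environment block of the agent container would look roughly like the following. This is a sketch assembled from the values above, not the exact manifest from the repository:

```yaml
# Illustrative env block for the agent container
# (assumption: see the repo manifests for the authoritative version)
env:
  - name: LLM_MODEL
    value: "vllm-inference/llama32-3b"
  - name: LLM_API_BASE
    value: "http://lsd-llama32-3b-service.serving.svc.cluster.local:8321/v1"
  - name: LLM_API_KEY
    value: "dummy"
  - name: MCP_TRANSPORT
    value: "streamable_http"
```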
- In the Kagenti UI, navigate to Agent Catalog
- Find your deployed agent and click View Details
- Use the chat interface at the bottom to test:
Hello! Can you tell me about yourself?
The agent connects to LlamaStack using these settings:
| Setting | Value |
|---|---|
| Service URL | `http://lsd-llama32-3b-service.serving.svc.cluster.local:8321/v1` |
| Model ID | `vllm-inference/llama32-3b` |
| API Format | OpenAI-compatible (`/v1/chat/completions`) |
| Authentication | None required (set `LLM_API_KEY=dummy`) |
If the agent fails to deploy or respond:
- Check that LlamaStack is running:

  ```shell
  oc get pods -n serving | grep llama
  oc get llsd -n serving
  ```

- Test the LlamaStack endpoint:

  ```shell
  oc exec -n serving deployment/lsd-llama32-3b -- curl -s -X POST \
    http://localhost:8321/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{"model": "vllm-inference/llama32-3b", "messages": [{"role": "user", "content": "Hello"}], "max_tokens": 20}'
  ```
- Check the agent pod logs:

  ```shell
  oc logs -n kagenti-system -l app=<agent-name> --tail=100
  ```
- Verify network connectivity:

  ```shell
  oc exec -n kagenti-system deployment/<agent-deployment> -- curl -s http://lsd-llama32-3b-service.serving.svc.cluster.local:8321/v1/models
  ```
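If the agent image does not include curl, a throwaway debug pod can be used instead; a sketch using a public curl image (assumption: the cluster can pull `curlimages/curl`):

```shell
# Run a temporary pod to test connectivity to the Llama Stack service
oc run curl-test --rm -it --restart=Never -n kagenti-system \
  --image=curlimages/curl -- \
  curl -s http://lsd-llama32-3b-service.serving.svc.cluster.local:8321/v1/models
```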