This Helm chart enables the streamlined deployment of LLMs served with vLLM, together with an integrated OpenTelemetry Collector sidecar. It provides real-time observability by scraping Prometheus-formatted metrics from each model server and exporting them to platforms such as Dynatrace.
Key Features
- Deploy LLM models using KServe's InferenceService
- Automatically inject an OpenTelemetry Collector as a sidecar for each model
- Scrape and process Prometheus metrics from the model
- Export metrics using OTLP over HTTP to Dynatrace or any OTLP-compatible backend
- Secure integration using Kubernetes secrets for API tokens and endpoints
- Configure otel-secrets with your Dynatrace endpoint and API_TOKEN.

Once the secret is configured, deploy the Helm charts.
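The otel-secrets secret might look like the following. The key names, namespace, and endpoint path are illustrative assumptions, not taken from the chart; check the chart's values for the names it actually expects:

```yaml
# Hypothetical layout of the otel-secrets secret.
# Key names and namespace are assumptions; adjust to match the chart.
apiVersion: v1
kind: Secret
metadata:
  name: otel-secrets
  namespace: dynatrace
type: Opaque
stringData:
  endpoint: https://<tenant>.live.dynatrace.com/api/v2/otlp   # Dynatrace OTLP ingest endpoint
  API_TOKEN: <your Dynatrace API token>                       # token with OTLP ingest scope
```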
Prerequisites
- GPU nodes
- NFD Operator and NVIDIA GPU Operator installed - https://docs.nvidia.com/datacenter/cloud-native/openshift/latest/install-nfd.html
- OpenShift cluster with OpenShift AI configured
- oc CLI configured
Installation

cd Dynatrace/deploy/helm
make install NAMESPACE=dynatrace LLM=llama-3-2-3b-instruct LLM_TOLERATION="nvidia.com/gpu"
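The LLM_TOLERATION value presumably renders into the model pod spec so it can be scheduled on tainted GPU nodes. A sketch of the resulting toleration, assuming the common NVIDIA taint convention (the operator and effect below are assumptions, not taken from the chart):

```yaml
# Hypothetical toleration rendered from LLM_TOLERATION="nvidia.com/gpu".
tolerations:
  - key: nvidia.com/gpu   # taint key commonly applied to GPU nodes
    operator: Exists      # tolerate the taint regardless of its value
    effect: NoSchedule
```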
This deploys:
- the llama-3-2-3b-instruct model as a KServe InferenceService
- an OpenTelemetry Collector sidecar container as part of the LLM deployment

The sidecar exports the vLLM metrics to Dynatrace.
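The scrape-and-export pipeline can be sketched as an OpenTelemetry Collector configuration like the one below. The scrape port and the environment variable names wired from the secret are assumptions, not taken from the chart:

```yaml
# Hypothetical Collector sidecar config: scrape vLLM's Prometheus
# metrics and forward them over OTLP/HTTP to Dynatrace.
# Port 8080 and the DT_ENDPOINT/DT_API_TOKEN variables are assumptions.
receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: vllm
          scrape_interval: 15s
          static_configs:
            - targets: ["localhost:8080"]   # vLLM's /metrics endpoint in the same pod

exporters:
  otlphttp:
    endpoint: ${env:DT_ENDPOINT}            # e.g. https://<tenant>.live.dynatrace.com/api/v2/otlp
    headers:
      Authorization: "Api-Token ${env:DT_API_TOKEN}"

service:
  pipelines:
    metrics:
      receivers: [prometheus]
      exporters: [otlphttp]
```

Because the collector runs as a sidecar, it can reach the model server on localhost and no extra Service or NetworkPolicy is needed for scraping.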