Comprehensive demos and examples for the Envoy AI Gateway
Showcasing how to deploy, configure, and use AI Gateway features in Kubernetes environments
- Multi-Provider Support - Route traffic to OpenAI, AWS Bedrock, Azure OpenAI, and more
- Token-Based Rate Limiting - Advanced rate limiting based on AI tokens, not just requests
- Provider Fallback - Automatic failover between AI providers for reliability
- OpenAI-Compatible API - Drop-in replacement for OpenAI API clients
- Built on Envoy - Leverages battle-tested Envoy Proxy technology
- Observability - Rich metrics, tracing, and logging for AI workloads
- Kubernetes Native - Designed for cloud-native environments
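Because the gateway exposes an OpenAI-compatible API, a chat completion can be sent to it directly with `curl`. This is a sketch that assumes the setup used in the demos: the gateway port-forwarded to `localhost:8080` (via `task port-forward`) and the demo model name `qwen3`:

```shell
# Request body in the OpenAI chat-completions format; "qwen3" is the demo model name.
BODY='{"model":"qwen3","messages":[{"role":"user","content":"Hello, gateway!"}]}'

# Send it through the gateway (assumes `task port-forward` exposes localhost:8080).
curl -s --max-time 5 http://localhost:8080/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d "$BODY" \
  || echo "gateway not reachable (run 'task setup-all' and 'task port-forward' first)"
```

Existing OpenAI SDK clients can likewise be pointed at the gateway by changing only their base URL.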
- Complete Demo Environments: Ready-to-run demos with automated setup and testing
- Infrastructure Automation: Taskfile-based automation for cluster setup and management
- CI/CD Integration: GitHub Actions workflows for automated testing and validation
- Production-Ready Examples: Real-world configurations and best practices
```
├── demos/                            # Individual demo environments
│   ├── 01-getting-started/           # Basic Envoy AI Gateway setup with LLM-D simulator
│   └── 02-usage-based-rate-limiting/ # Advanced token-based rate limiting
├── scripts/                          # Automation scripts for setup and management
├── .github/workflows/                # CI/CD workflows for automated testing
└── Taskfile.yml                      # Main automation tasks
```
Each demo includes its own comprehensive README with detailed setup instructions, configuration options, and usage examples. Always refer to the individual demo README for complete guidance.
A comprehensive introduction to Envoy AI Gateway featuring:
- LLM-D Inference Simulator as a lightweight AI backend
- Qwen3 model configured in echo mode for testing
- Complete API endpoints (chat, models, streaming)
- Automated testing suite with GitHub Actions integration
- Performance tuning (10 ms time-to-first-token, 20 ms inter-token latency)
Read the full demo README for step-by-step instructions and detailed configuration.
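The chat, models, and streaming endpoints listed above can each be exercised with `curl` once the gateway is reachable; the following sketch assumes `localhost:8080` (via `task port-forward`) and the echo-mode `qwen3` model:

```shell
# List the models the gateway exposes.
curl -s --max-time 5 http://localhost:8080/v1/models \
  || echo "gateway not reachable (run 'task port-forward' first)"

# Streaming: the same chat endpoint with "stream": true returns SSE chunks.
BODY='{"model":"qwen3","stream":true,"messages":[{"role":"user","content":"Hi"}]}'
curl -s -N --max-time 5 http://localhost:8080/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d "$BODY" \
  || echo "gateway not reachable (run 'task port-forward' first)"
```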
Advanced token-based rate limiting for AI workloads featuring:
- Token-based rate limiting with different quotas per model (qwen3: 50/hour, gpt-4: 1000/hour, gpt-3.5-turbo: 100/hour)
- Per-user and per-model enforcement using `x-user-id` and `x-ai-eg-model` headers
- Automatic token tracking from LLM responses with input/output/total token metrics
- Raw metrics collection via `task metrics` with Prometheus-compatible output
- Rate limit enforcement with 429 status codes and comprehensive testing
Read the full demo README for usage-based rate limiting setup and metrics analysis.
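A rough way to see the limiter in action is to replay one request under a single user identity until the quota is spent. This is a hypothetical probe: the user id `alice` and the `localhost:8080` port-forward are assumptions, and `x-ai-eg-model` is typically populated by the gateway from the request's `model` field, so only `x-user-id` is supplied here:

```shell
BODY='{"model":"qwen3","messages":[{"role":"user","content":"hi"}]}'

# Replay the call as the same user; once the hourly token quota for qwen3
# is exhausted, the gateway should start answering 429 instead of 200.
for i in 1 2 3; do
  code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 \
    -H 'Content-Type: application/json' \
    -H 'x-user-id: alice' \
    -d "$BODY" \
    http://localhost:8080/v1/chat/completions) || code=000
  echo "request $i -> HTTP $code"
done
```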
Before running any demos, ensure you have:
- Taskfile - Task runner for automation. Install with:

  ```shell
  sh -c "$(curl --location https://taskfile.dev/install.sh)" -- -d -b /usr/local/bin/
  ```

- `kind` - Kubernetes in Docker (installed automatically)
- `kubectl` - Kubernetes CLI (installed automatically)
- `helm` - Kubernetes package manager (installed automatically)
- `jq` - JSON processor (recommended for testing)
- Docker - Container runtime
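A quick preflight loop (tool names taken from the list above) shows which prerequisites are already on the `PATH`; anything reported missing besides Taskfile and Docker is installed automatically by the setup tasks:

```shell
# Report which prerequisite tools are already installed.
for tool in task kind kubectl helm jq docker; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: found ($(command -v "$tool"))"
  else
    echo "$tool: missing"
  fi
done
```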
Set up the complete Envoy AI Gateway environment:
```shell
task setup-all
```
This will:
- Install all required dependencies (kind, helm, kubectl)
- Create a kind cluster with proper configuration
- Install Envoy Gateway (latest version)
- Install Envoy AI Gateway (latest version)
- Configure AI Gateway integration
- Verify the complete installation
Jump directly into a demo:
```shell
cd demos/01-getting-started
# Read the demo README first for detailed instructions
cat README.md
task setup
```
Important: Each demo has its own README with specific setup instructions, configuration details, and usage examples. Always check the demo's README before running tasks.
View all available tasks:
```shell
task --list
```
The following environment variables can be customized in `Taskfile.yml`:

- `CLUSTER_NAME` (default: `envoy-ai-gateway-demo`)
- `KIND_VERSION` (default: `v0.29.0`)
- `ENVOY_GATEWAY_VERSION` (default: `v0.0.0-latest`)
- `ENVOY_AI_GATEWAY_VERSION` (default: `v0.0.0-latest`)
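In Taskfile syntax such defaults normally live in a top-level `vars:` block, so changing them is a one-line edit. A hypothetical excerpt, with key names matching the variables above:

```yaml
# Illustrative excerpt of Taskfile.yml; values are the documented defaults.
vars:
  CLUSTER_NAME: envoy-ai-gateway-demo
  KIND_VERSION: v0.29.0
  ENVOY_GATEWAY_VERSION: v0.0.0-latest
  ENVOY_AI_GATEWAY_VERSION: v0.0.0-latest
```

Depending on how the Taskfile templates these values, they may also be overridable from the environment or the `task` command line; check `Taskfile.yml` itself before relying on that.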
- `task setup-all` - Complete environment setup from scratch
- `task create-cluster` - Create kind cluster with Envoy Gateway
- `task install-envoy-gateway` - Install Envoy Gateway only
- `task install-envoy-ai-gateway` - Install Envoy AI Gateway only
- `task verify-installation` - Verify all components are running
- `task port-forward` - Port forward to access gateway (localhost:8080)
- `task status` - Check status of all components
- `task logs` - View logs from AI Gateway components
- `task cleanup` - Remove all resources and cluster
- `task reset` - Reset environment for fresh start
- `task cleanup` - Remove kind cluster and clean up
- `task logs-envoy-gateway` - View Envoy Gateway logs
- `task logs-ai-gateway` - View AI Gateway logs
- `task port-forward` - Port forward to access gateway (localhost:8080)
- `task verify-installation` - Verify installation status
- Kind cluster creation fails

  ```shell
  # Ensure Docker is running and has sufficient resources
  docker info
  task cleanup && task create-cluster
  ```

- Gateway installation fails

  ```shell
  # Verify cluster readiness
  kubectl get nodes
  kubectl get pods -A
  task verify-installation
  ```

- Port forwarding issues

  ```shell
  # Kill existing port forwards and restart
  pkill -f "kubectl.*port-forward"
  task port-forward
  ```

- Demo-specific issues

  ```shell
  # Check demo logs and status
  cd demos/01-getting-started
  task logs
  task test
  ```
- Task Status: `task --status <task-name>`
- Component Logs: `kubectl logs -n envoy-gateway-system -l app=envoy-gateway`
- Installation Check: `task verify-installation`
- Demo Diagnostics: Each demo includes comprehensive logging and testing
- Create a directory: `demos/<demo-name>/`
- Include the required files:
  - `README.md` - Comprehensive demo documentation
  - `Taskfile.yml` - Demo-specific automation tasks
  - Kubernetes manifests and configurations
- Add a GitHub Actions workflow: `.github/workflows/demo-<name>.yml`
- Test thoroughly with `task test`
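The scaffolding steps above can be scripted; a minimal sketch in which the demo name `03-my-demo` is a placeholder:

```shell
# Scaffold the demo directory with the required files.
demo="demos/03-my-demo"
mkdir -p "$demo"
touch "$demo/README.md" "$demo/Taskfile.yml"

# Confirm the layout before adding manifests and the GitHub Actions workflow.
ls -1 "$demo"
```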
- Documentation: Each demo should be self-contained with clear README
- Automation: Use Taskfile for all setup, testing, and cleanup operations
- Testing: Include comprehensive test suites with error diagnostics
- CI/CD: Add GitHub Actions workflows for automated validation
- Resource Management: Ensure proper cleanup and resource limits
- Fork the repository
- Create a feature branch: `git checkout -b feature/new-demo`
- Develop your demo following the established patterns
- Test locally: `task setup-all && cd demos/<your-demo> && task test`
- Ensure CI passes: Check GitHub Actions workflows
- Submit a pull request with detailed description
- Envoy AI Gateway - Official documentation and guides
- Envoy Gateway - Core Envoy Gateway project
- Taskfile - Task runner documentation
- Kind - Kubernetes in Docker
- LLM-D Inference Simulator - Lightweight AI backend for testing
Ready to get started? Jump into the 01-getting-started demo or run `task setup-all` to set up the complete environment!
- Envoy AI Gateway Docs - Complete documentation and guides
- Getting Started Guide - Quick start tutorial
- Basic Usage - Core concepts and examples
- LLM Provider Integrations - Supported AI services
- Release Notes - Latest updates and features
- GitHub Repository - Source code and issues
- Slack Community - Join the conversation
- Weekly Community Meetings - Thursdays
- GitHub Discussions - Community Q&A
- Envoy Gateway - Core gateway functionality
- Envoy Proxy - The underlying proxy technology
- LLM-D Inference Simulator - Lightweight testing backend