Skip to content

Releases: envoyproxy/ai-gateway

v0.3.0-rc1

15 Aug 04:15
e33a5f3
Compare
Choose a tag to compare
v0.3.0-rc1 Pre-release
Pre-release

Release candidate for v0.3.0!

helm install aieg oci://registry-1.docker.io/envoyproxy/ai-gateway-helm --version v0.3.0-rc1 --namespace envoy-ai-gateway-system --create-namespace

v0.2.1

10 Jun 04:01
36fc248
Compare
Choose a tag to compare

Overview

This patch release v0.2.1 includes fixes to aws signing issue for large request body.

Commits

  • backport: do not sign content-length for AWS (#697)

v0.2.0

05 Jun 21:20
4b5ebbd
Compare
Choose a tag to compare

Envoy AI Gateway v0.2.x

June 5, 2025

Envoy AI Gateway v0.2.0 builds upon the solid foundation of v0.1.0 with focus on expanding provider ecosystem support, improving reliability and performance through architectural changes, and enterprise-grade authentication support for Azure OpenAI.

Azure OpenAI Integration Sidecar Architecture Performance Improvements CLI Tools Model Failover and Retry Certificate Manager Integration

✨ New Features

Azure OpenAI Integration

  • Full Azure OpenAI Support
    • Complete integration with Azure OpenAI services with request/response transformation for the unified OpenAI compatiple completions API.
  • Upstream Authentication for Azure Enterprise Integration
    • Support for accessing Azure via OIDC tokens and Entra ID for enterprise-grade authentication for secure and compliant upstream authentication.
  • Enterprise Proxy URL Support for Azure Authentication
    • Enhanced Azure authentication with proxy URL configuration options for enterprise proxy support.
  • Flexible Token Providers
    • Generalized token provider architecture supporting both client secret and federated token flows

Architecture Improvements

  • Sidecar and UDS External Processor
    • Switched to sidecar deployment model with Unix Domain Sockets for improved performance and resource efficiency
  • Enhanced ExtProc Buffer Limits
    • Increased external processor buffer limits from 32 KiB to 50 MiB for larger AI requests. Users can now configure CPU and memory resource limits via filterConfig.externalProcessor.resources for better resource management.
  • Multiple AIGatewayRoute Support
    • Support for multiple AIGatewayRoute resources per gateway, removing the previous single-route limitation. This enables better organization, scalability, and management of complex routing configurations across teams.
  • Certificate Manager Integration
    • Integrated cert-manager for automated TLS certificate provisioning and rotation for the mutating webhook server that injects AI Gateway sidecar containers into Envoy Gateway pods. This enables enterprise-grade certificate management, eliminating manual certificate handling and improving security.

Cross-Backend Failover and Retry

  • Provider Fallback Logic
    • Priority-based failover system that automatically routes traffic to lower priority AI providers as higher priority endpoints become unhealthy, ensuring high availability and fault tolerance.
  • Backend Retry Support
    • Configurable retry policies for improved reliability and resilience against AI provider transient failures. Features include exponential backoff with jitter, configurable retry triggers (5xx errors, connection failures, rate limiting), customizable retry counts and timeouts, and integration with Envoy Gateway's BackendTrafficPolicy.
  • Weight-Based Routing
    • Enhanced backend routing with weighted traffic distribution, enabling gradual rollouts, cost optimization, and A/B testing across multiple AI providers

Enhanced CLI Tools

  • aigw run Command

    • New CLI command for local development and testing of Envoy AI Gateway resources.
  • Configuration Translation

    • aigw translate for translating Envoy AI Gateway Resources to Envoy Gateway and Kubernetes CRDs.

🔗 API Updates

  • AIGatewayRoute Metadata: Added ownedBy and createdAt fields for better resource tracking.
  • Backend Configuration: Moved Backend configuration back to RouteRule for improved flexibility.
  • OIDC Field Types: Specific typing for OIDC-related configuration fields.
  • Weight Type Changes: Updated Weight field type to match Gateway API specifications.

Deprecations

  • AIServiceBackend.Timeouts: Deprecated in favor of more granular timeout configuration.

🐛 Bug Fixes

  • ExtProc Image Syncing: Fixed issue where external processor image wouldn't sync properly.
  • Router Weight Validation: Fixed negative weight validation in routing logic.
  • Content Body Handling: Fixed empty content body issues causing AWS validation errors.
  • First Match Routing: Fixed router logic to ensure first match wins as expected.

⚠️ Breaking Changes

  • Sidecar Architecture: The switch to sidecar and UDS model may require configuration updates for existing deployments.
  • API Field Changes: Some API fields have been moved or renamed - see migration guide for details. Please review the migration guide for details.
  • Timeout Configuration: Deprecated timeout fields require migration to new configuration format.
  • Routing to Kubernetes Services: Routing to Kubernetes services is not supported in Envoy AI Gateway v0.2.0. This is a known limitation and will be addressed in a future release.

📖 Upgrade Guidance

For users upgrading from v0.1.x to v0.2.0:

  1. Review usage of any deprecated API fields (particularly AIServiceBackend.Timeouts).
  2. Update deployment configurations if using custom replica configurations - the replicas field in AIGatewayFilterConfigExternalProcessor is now deprecated due to the new sidecar architecture.
  3. Remove routing to Kubernetes services - currently, Envoy AI Gateway does not support routing to Kubernetes services. This is a known limitation and will be addressed in a future release.

📦 Dependencies Versions

  • Go 1.24.2 - Updated to latest Go version for improved performance and security.
  • Envoy Gateway v1.4 - Built on Envoy Gateway for proven data plane capabilities.
  • Envoy v1.34 - Leveraging Envoy Proxy's battle-tested networking capabilities.
  • Gateway API v1.3 - Support for latest Gateway API specifications.

🙏 Acknowledgements

This release represents the collaborative effort of our growing community. Special thanks to contributors from Tetrate, Bloomberg, Google, and our independent contributors who made this release possible through their code contributions, testing, feedback, and community participation.

There are those who engage in conversations, provide feedback, and contribute to the project in other ways than code, and we appreciate them greatly. Ideas, suggestions, and feedback are always welcome.

🔮 What's Next (beyond v0.2)

We're already working on exciting features:

  • Google Gemini & Vertex Integration
  • Anthropic Integration
  • Support for the Gateway API Inference Extension
  • Endpoint picker support for Pod routing
  • What else do you want to see? Get involved and open an issue and let us know!

v0.2.0-rc3

05 Jun 21:12
4b5ebbd
Compare
Choose a tag to compare
v0.2.0-rc3 Pre-release
Pre-release

Release candidate

v0.2.0-rc1

02 Jun 22:27
0cd003e
Compare
Choose a tag to compare
v0.2.0-rc1 Pre-release
Pre-release

Release candidate

v0.1.5

04 Jul 17:22
30b4783
Compare
Choose a tag to compare

Overview

This patch release v0.1.5 introduces a fix to syncing the ext proc image.

Commits

extproc: set extProcImage before potentially returning #515

v0.1.4

20 Mar 20:37
d00bf8c
Compare
Choose a tag to compare

Overview

This patch release v0.1.4 introduces fix to aws validation error when assistant content is empty.

Commits

  • translator: skip adding content if assistant content string is empty (#508)

v0.1.3

14 Mar 20:00
2cc198a
Compare
Choose a tag to compare

Overview

This patch release v0.1.3 includes fixes to chat completion streaming and openai assistant content type, and adds genai metrics.

Commits

  • extproc: add genai metrics to track token usage and latency (#459)
  • extproc: properly stream chat completions (#468)
  • feat: openai: allow assistant content to be string (#486)

v0.1.2

05 Mar 23:43
90facee
Compare
Choose a tag to compare

Overview

This patch release v0.1.2 includes fixes to BackendSecurityPolicy requeue bug, docker hub changes, and extproc image tag sync.

Commits

  • controller: fix syncing extproc image tag (#447)
  • controller: update rotate function to return expiration time (#455)
  • ci: update dockerhub user (#441)

v0.1.1

28 Feb 18:42
3ea774e
Compare
Choose a tag to compare

Overview

This patch release v0.1.1 includes fixes to AWS tooling call, new integration tests for AWS tools, and minor GitHub action validation.

Commits

e2e: add agent test using tool - #426
chore: allows backport prefix in commit title - #443