Description:
A BackendTrafficPolicy with Local Rate Limits whose clientSelectors use type RegularExpression breaks the overall gateway configuration and leaves every app attached to the gateway broken.
We configured Envoy Gateway with GatewayClasses in mergedGateway mode and have hundreds of Gateway, xRoute and BackendTrafficPolicy objects attached to a single set of Envoy Proxies. We recently started encountering a very strange bug that made apps return 503 errors.
Further investigation showed that the issue recurs periodically (several times a day at around the same times, likely when EG resyncs configs to the Envoy Proxies), and that the following log entries are produced by the envoy-gateway pods just before the apps enter the broken state:
[2026-01-12 09:51:02.672][1][warning][config] [source/common/protobuf/message_validator_impl.cc:64] Unknown field: type envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit(root) with unknown field set {18}
[2026-01-12 09:51:02.694][1][warning][config] [source/extensions/config_subscription/grpc/grpc_subscription_impl.cc:138] gRPC config for type.googleapis.com/envoy.config.listener.v3.Listener rejected: Error adding/updating listener(s) mwam-aimee-swe-test/<gateway>-merged/<httproute>-http: Protobuf message (type envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit(root) with unknown field set {18}) has unknown fields
[2026-01-12 09:51:02.694][1][warning][config] [source/extensions/config_subscription/grpc/delta_subscription_state.cc:296] delta config for type.googleapis.com/envoy.config.listener.v3.Listener rejected: Error adding/updating listener(s) mwam-aimee-swe-test/<gateway>-merged/<httproute>-http: Protobuf message (type envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit(root) with unknown field set {18}) has unknown fields
2026-01-12T09:51:02.697Z DEBUG xds cache/snapshotcache.go:368 handling v3 xDS resource request, response_nonce 4, nodeID envoy-merged-clusterip-eg-fe86af92-659ff4dcfc-t9c2l, node_version v1.33.1, resource_names_subscribe [], resource_names_unsubscribe [], type_url type.googleapis.com/envoy.config.listener.v3.Listener, errorCode 13, errorMessage Error adding/updating listener(s) <namespace>/<gateway>-merged/<httproute>-http: Protobuf message (type envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit(root) with unknown field set {18}) has unknown fields
2026-01-12T09:51:02.697Z ERROR xds cache/snapshotcache.go:376 Envoy rejected the last update with code 13 and message Error adding/updating listener(s) <namespace>/<gateway>-merged/<httproute>-http: Protobuf message (type envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit(root) with unknown field set {18}) has unknown fields
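When triaging many such entries, it can help to pull out which listener was rejected and which proto field numbers Envoy did not recognise. The following is a minimal Python sketch that assumes the message wording matches the sample lines above; the pattern and the `parse_rejection` helper are illustrative and not part of Envoy or Envoy Gateway, and the log format is not guaranteed to be stable across versions.

```python
import re

# Pattern derived from the sample log lines in this report (an assumption,
# not a documented or stable Envoy log format).
REJECTION = re.compile(
    r"Error adding/updating listener\(s\) (?P<listener>\S+): "
    r"Protobuf message \(type (?P<proto>[\w.]+)\(root\) "
    r"with unknown field set \{(?P<fields>[\d, ]+)\}\)"
)


def parse_rejection(line):
    """Return (listener, proto_type, [field numbers]), or None if no match."""
    m = REJECTION.search(line)
    if m is None:
        return None
    # Field numbers are printed as a comma-separated set, e.g. {18}.
    fields = [int(f) for f in m.group("fields").split(",")]
    return m.group("listener"), m.group("proto"), fields
```

Grouping the results by listener would show whether the rejected listeners are the ones that carry the rate limit policy or unrelated ones, as observed in this report.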
What's peculiar is that the referenced HTTPRoute has no BTP attached, nor does it have any rate limits configured. We identified that only 5 BTPs included the following config:
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: BackendTrafficPolicy
metadata:
  labels:
    app: offending-app
  name: offending-app
  namespace: some-namespace
spec:
  rateLimit:
    local:
      rules:
      - clientSelectors:
        - headers:
          - invert: true
            name: User-Agent
            type: RegularExpression
            value: kube-probe/.*
          - invert: true
            name: User-Agent
            type: RegularExpression
            value: Prometheus/.*
          - invert: true
            name: User-Agent
            type: RegularExpression
            value: Blackbox Exporter/.*
        limit:
          requests: 12
          unit: Second
    type: Local
  targetRefs:
  - group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: offending-app
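To find which policies in a cluster carry this pattern, the manifests can be scanned for header selectors of type RegularExpression. The sketch below is only an illustration: it assumes each policy has already been parsed into a Python dict (for example from JSON output of `kubectl get`), and the `regex_header_selectors` helper is hypothetical, not part of any Envoy Gateway tooling.

```python
def regex_header_selectors(policy):
    """Yield (rule_index, header) for each local rate limit header selector
    of type RegularExpression in a parsed BackendTrafficPolicy dict."""
    rules = (
        policy.get("spec", {})
        .get("rateLimit", {})
        .get("local", {})
        .get("rules", [])
    )
    for i, rule in enumerate(rules):
        for selector in rule.get("clientSelectors", []):
            for header in selector.get("headers", []):
                if header.get("type") == "RegularExpression":
                    yield i, header


# Example input mirroring one selector from the manifest above.
example = {
    "spec": {
        "rateLimit": {
            "type": "Local",
            "local": {
                "rules": [
                    {
                        "clientSelectors": [
                            {
                                "headers": [
                                    {
                                        "invert": True,
                                        "name": "User-Agent",
                                        "type": "RegularExpression",
                                        "value": "kube-probe/.*",
                                    }
                                ]
                            }
                        ],
                        "limit": {"requests": 12, "unit": "Second"},
                    }
                ]
            },
        }
    }
}
matches = list(regex_header_selectors(example))
```

Policies for which the helper yields at least one match are candidates for the behaviour described in this report.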
So the BTP above breaks the configuration for all apps by making Envoy Gateway generate incorrect or incomplete config for the Envoy Proxies.
As a workaround, restarting Envoy Gateway (killing its pods) helps.
Repro steps:
Create a BTP as described above and attach it to an HTTPRoute and Gateway whose GatewayClass has mergedGateway set to true. mergedGateway may not be necessary, but it amplifies the impact and makes the issue harder to debug, because the configs of otherwise working apps get broken.
Environment:
Envoy Gateway v1.6.1 as well as v1.5.4
Logs:
In the description