Local Rate Limit breaks Gateway configuration resulting in apps returning 503 #7957

@sy-be

Description:
A BackendTrafficPolicy (BTP) with a local rate limit whose clientSelectors use header matches of type RegularExpression breaks the overall gateway configuration, which in turn breaks every app attached to the gateway.

We run Envoy Gateway with GatewayClasses in mergedGateway mode and have hundreds of Gateway, xRoute, and BackendTrafficPolicy objects attached to a single set of Envoy Proxies. We recently started encountering a very strange bug that makes apps return 503 errors.

Further investigation revealed that the issue recurs periodically (several times a day at around the same times, likely when EG resyncs configs to the Envoy Proxies), and that the envoy-gateway pods produce these log entries just before the apps enter the broken state:

[2026-01-12 09:51:02.672][1][warning][config] [source/common/protobuf/message_validator_impl.cc:64] Unknown field: type envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit(root) with unknown field set {18}
[2026-01-12 09:51:02.694][1][warning][config] [source/extensions/config_subscription/grpc/grpc_subscription_impl.cc:138] gRPC config for type.googleapis.com/envoy.config.listener.v3.Listener rejected: Error adding/updating listener(s) mwam-aimee-swe-test/<gateway>-merged/<httproute>-http: Protobuf message (type envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit(root) with unknown field set {18}) has unknown fields
[2026-01-12 09:51:02.694][1][warning][config] [source/extensions/config_subscription/grpc/delta_subscription_state.cc:296] delta config for type.googleapis.com/envoy.config.listener.v3.Listener rejected: Error adding/updating listener(s) mwam-aimee-swe-test/<gateway>-merged/<httproute>-http: Protobuf message (type envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit(root) with unknown field set {18}) has unknown fields
2026-01-12T09:51:02.697Z	DEBUG	xds	cache/snapshotcache.go:368	handling v3 xDS resource request, response_nonce 4, nodeID envoy-merged-clusterip-eg-fe86af92-659ff4dcfc-t9c2l, node_version v1.33.1, resource_names_subscribe [], resource_names_unsubscribe [], type_url type.googleapis.com/envoy.config.listener.v3.Listener, errorCode 13, errorMessage Error adding/updating listener(s) <namespace>/<gateway>-merged/<httproute>-http: Protobuf message (type envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit(root) with unknown field set {18}) has unknown fields
2026-01-12T09:51:02.697Z	ERROR	xds	cache/snapshotcache.go:376	Envoy rejected the last update with code 13 and message Error adding/updating listener(s) <namespace>/<gateway>-merged/<httproute>-http: Protobuf message (type envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit(root) with unknown field set {18}) has unknown fields

What's peculiar is that the referenced HTTPRoute has no BTP attached and no rate limits configured. We identified only 5 BTPs that included the following config:

apiVersion: gateway.envoyproxy.io/v1alpha1
kind: BackendTrafficPolicy
metadata:
  labels:
    app: offending-app
  name: offending-app
  namespace: some-namespace
spec:
  rateLimit:
    local:
      rules:
        - clientSelectors:
            - headers:
                - invert: true
                  name: User-Agent
                  type: RegularExpression
                  value: kube-probe/.*
                - invert: true
                  name: User-Agent
                  type: RegularExpression
                  value: Prometheus/.*
                - invert: true
                  name: User-Agent
                  type: RegularExpression
                  value: Blackbox Exporter/.*
          limit:
            requests: 12
            unit: Second
    type: Local
  targetRefs:
    - group: gateway.networking.k8s.io
      kind: HTTPRoute
      name: offending-app

So the BTP above breaks the config for all apps by making Envoy Gateway generate an incorrect or incomplete config for the Envoy Proxies.

As a workaround, restarting Envoy Gateway (killing its pods) helps.

Repro steps:
Create a BTP as described above and attach it to an HTTPRoute on a Gateway whose GatewayClass has mergedGateway set to true. mergedGateway may not be necessary, but it amplifies the problem and makes it harder to debug, because the configs of otherwise working apps get broken too.
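
For anyone trying to reproduce this, the surrounding objects could look roughly like the sketch below. This is not our production config: all names (`merged-proxy-config`, `merged-eg`, `repro-gateway`, the `offending-app` backend Service and its port) are hypothetical, and it assumes a default install in the `envoy-gateway-system` namespace. As far as I know, the EnvoyProxy field behind what I call "mergedGateway" is spelled `mergeGateways`; please check it against the API reference for your version.

```yaml
# Hypothetical repro scaffolding; names, namespaces, and ports are assumptions.
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyProxy
metadata:
  name: merged-proxy-config
  namespace: envoy-gateway-system
spec:
  # Merge all Gateways of the GatewayClass onto one set of Envoy Proxies.
  mergeGateways: true
---
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: merged-eg
spec:
  controllerName: gateway.envoyproxy.io/gatewayclass-controller
  parametersRef:
    group: gateway.envoyproxy.io
    kind: EnvoyProxy
    name: merged-proxy-config
    namespace: envoy-gateway-system
---
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: repro-gateway
  namespace: some-namespace
spec:
  gatewayClassName: merged-eg
  listeners:
    - name: http
      protocol: HTTP
      port: 80
---
# The HTTPRoute that the BackendTrafficPolicy targets via targetRefs.
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: offending-app
  namespace: some-namespace
spec:
  parentRefs:
    - name: repro-gateway
  rules:
    - backendRefs:
        - name: offending-app   # hypothetical backend Service
          port: 8080
```

Adding a second Gateway and HTTPRoute without any BTP under the same GatewayClass should show whether unrelated routes are affected as well.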

Environment:
Envoy Gateway v1.6.1 as well as v1.5.4

Logs:
In the description

Labels: kind/bug
