Backend Config Snippet and Server Lost During Ingress Recreation/Update #768

@mdecalf

Description

When recreating an Ingress resource (for example, creating a duplicate with a different name and then deleting the original) or during a mass pod restart, the HAProxy Ingress Controller intermittently loses both the haproxy.org/backend-config-snippet annotation content and the real servers during the configuration reload. This leaves backends with only disabled dummy servers and an empty config-snippet section, causing service unavailability even when failover servers are configured.

This issue seems similar to:

#699
#702

I have had this issue for more than 9 months, at least since version 3.1.0.

I have not been able to find the cause in the controller code.

Environment

  • HAProxy Ingress Controller Version: v3.1.15
  • HAProxy Ingress Controller chart Version: 1.47.3
  • Kubernetes Version: 1.34
  • HAProxy Ingress Deployment Type: DaemonSet

Ingress Configuration Example

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: service-ingress
  namespace: application-namespace
  annotations:
    cert-manager.io/cluster-issuer: selfsigned
    haproxy.org/backend-config-snippet: |
      option httpchk
      http-check send meth GET uri /health ver HTTP/1.1 hdr Host api.example.com
      option allbackups
      server backup-dc1-01 backup-dc1-01.external.svc.cluster.local:443 backup ssl verify none
      server backup-dc1-02 backup-dc1-02.external.svc.cluster.local:443 backup ssl verify none
      server backup-dc2-01 backup-dc2-01.external.svc.cluster.local:443 backup ssl verify none
      server backup-dc2-02 backup-dc2-02.external.svc.cluster.local:443 backup ssl verify none
spec:
  ingressClassName: haproxy
  rules:
  - host: api.example.com
    http:
      paths:
      - path: /api/v1
        pathType: Prefix
        backend:
          service:
            name: api-service
            port:
              number: 9876
  tls:
  - hosts:
    - api.example.com
    secretName: api-tls

Expected Behavior

When an Ingress is deleted and replaced with a new Ingress having identical rules and annotations (including haproxy.org/backend-config-snippet), the HAProxy backend configuration should maintain the custom backend snippet with health checks and failover servers.

Expected backend configuration:

backend namespace_svc_service-name_http
  mode http
  balance roundrobin
  option forwardfor
  no option abortonclose
  default-server check
  ###_config-snippet_### BEGIN
  option httpchk
  http-check send meth GET uri /health ver HTTP/1.1 hdr Host api.example.com
  option allbackups
  server backup-server-01 backup-server-01.external.svc.cluster.local:443 backup ssl verify none
  server backup-server-02 backup-server-02.external.svc.cluster.local:443 backup ssl verify none
  ###_config-snippet_### END
  server SRV_1 10.0.0.1:9876 check weight 100
  server SRV_2 10.0.0.2:9876 check weight 100

Actual Behavior

After the Ingress recreation and deletion sequence, the backend configuration intermittently loses the custom snippet content, leaving only disabled dummy servers with an empty config snippet section:

backend namespace_svc_service-name_http
  mode http
  balance roundrobin
  option forwardfor
  no option abortonclose
  default-server check
  ###_config-snippet_### BEGIN
  ###_config-snippet_### END
  server SRV_1 127.0.0.1:1 disabled
  server SRV_2 127.0.0.1:1 disabled
  server SRV_3 127.0.0.1:1 disabled

Another odd thing: sometimes the haproxy.cfg inside the pod contains the correct backend, but HAProxy does not seem to reload it, so the stats show 0 active and 0 backup servers.
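
A rough way to check whether the running process matches the file on disk is to compare the generated haproxy.cfg with the runtime server state. This is only a sketch: the pod/namespace names are placeholders, and the config path, runtime socket path, and availability of socat in the image are assumptions that may not match your deployment.

  # Placeholder names; adjust to your deployment
  NS=haproxy-controller
  POD=haproxy-ingress-abcde

  # Backend as written to disk by the controller
  kubectl -n "$NS" exec "$POD" -- \
    grep -A 20 '^backend namespace_svc_service-name_http' /etc/haproxy/haproxy.cfg

  # Server state as seen by the running HAProxy process (runtime API)
  kubectl -n "$NS" exec "$POD" -- \
    sh -c 'echo "show servers state" | socat stdio /var/run/haproxy-runtime-api.sock'

If the file shows the backup servers but "show servers state" does not, the reload (rather than the config generation) is the part that went wrong.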

Steps to Reproduce

I see it in different scenarios, typically during a batch apply touching a large number of Ingresses in the same update (changing the path on 10+ Ingresses, for example) or during a mass pod restart.

A fairly consistent way to reproduce it is the following:

  1. Create an Ingress resource with haproxy.org/backend-config-snippet annotation containing health checks and backup servers

  2. Ensure the backing service has pods that may become temporarily unreachable

  3. Clone the Ingress with a different name but identical rules and annotations:

kubectl get ingress original-ingress -n namespace -o yaml | \
  sed 's/original-ingress/temporary-ingress/g' | \
  sed '/^  resourceVersion:/d' | \
  sed '/^  uid:/d' | \
  sed '/^  creationTimestamp:/d' | \
  sed '/^  generation:/d' | \
  sed '/^  selfLink:/d' | \
  kubectl apply -f -

  4. Delete the original Ingress:

kubectl -n namespace delete ingress original-ingress

  5. Inspect the HAProxy configuration after the reload (see the command sketch after this list)

  6. The issue occurs randomly, but reproducibly after multiple attempts
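
For step 5, a minimal inspection sketch, grepping around the config-snippet markers shown above (the pod/namespace names are placeholders and the config path assumes the default image layout):

  kubectl -n haproxy-controller exec haproxy-ingress-abcde -- \
    grep -B 10 -A 2 '###_config-snippet_### END' /etc/haproxy/haproxy.cfg

An empty section between the BEGIN and END markers indicates the annotation content was dropped.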

Additional Context

The issue appears to be a race condition or state-management problem during the configuration reload triggered by Ingress resource changes. The backend snippet is correctly applied during the initial Ingress creation, but may be lost during subsequent reconfigurations while the backing pods are restarting.

Logs

Nothing relevant in the logs, except that I see:

Backup Server xxx-api_http/xxx-ingress-01 is DOWN, reason: Layer7 wrong status, code: 503, info: "Service Unavailable", check duration: 293ms. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.

And there is no corresponding:

Backup Server xxx-api_http/xxx-ingress-01 is UP, reason: Layer7 check passed, code: 204, check duration: 41ms. 2 active and 4 backup servers online. 0 sessions requeued, 0 total in queue.

which I would expect to see after the change.
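
To watch for that missing recovery line after recreating the Ingress, something like the following can be used (the namespace and label selector are guesses at the chart's defaults and may need adjusting):

  kubectl -n haproxy-controller logs -l app.kubernetes.io/name=kubernetes-ingress -f | grep 'Backup Server'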

Workaround

Restarting the HAProxy pods brings everything back, both the regular servers and the backups.
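
Since the controller runs as a DaemonSet, the restart I use is roughly the following (the namespace and DaemonSet name are placeholders for whatever the chart generated):

  kubectl -n haproxy-controller rollout restart daemonset/haproxy-kubernetes-ingress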
