Flux hangs waiting on rolling out kustomizations #5601

@engnatha

Describe the bug

We have found that chaining Kustomizations that wait on other Kustomizations, which in turn wait on resources to be reconciled, substantially slows down the controllers. There is no obvious CPU or memory bottleneck when this behavior is observed.

Steps to reproduce

Following the recommended structure for a monorepo, our team started to develop a pattern where we'd do something like

apps
  foo
    init
      configs.yaml
      kustomization.yaml
    meta
      meta.yaml
    service
      deployment.yaml
      kustomization.yaml

An example meta.yaml file here would look like

apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: foo-init
  namespace: flux-system
spec:
  targetNamespace: foo
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./apps/foo/init
  interval: 5m
  timeout: 2m
  prune: true
  wait: true
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: foo-service
  namespace: flux-system
spec:
  targetNamespace: foo
  sourceRef:
    kind: GitRepository
    name: flux-system
  dependsOn:
    - name: foo-init
  path: ./apps/foo/service
  interval: 5m
  timeout: 2m
  prune: true
  wait: true

Then at a high level we'd maintain an apps.yaml file closer to our flux system component that would look like

apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: foo-app
  namespace: flux-system
spec:
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./apps/foo/meta
  interval: 5m
  timeout: 2m
  prune: true
  wait: true

We found that having this wait: true in the top-level apps.yaml spec was substantially locking up our reconciliation cycles. Removing these checks substantially improved performance while changing nothing about the downstream resources that were reconciled. The Flux Prometheus metrics show this dramatic change in behavior. On 10/13/2025 at 12:00 PM PDT, the manifests were updated to remove these extra wait settings. It takes some time for the system to fully pick up the changes, but after that, reconciliation loops drop back to milliseconds rather than being pegged at the 5 minute timeouts from before.

Image
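
For reference, the change described above can be sketched roughly as follows. This is a minimal sketch, assuming the only edit was turning off wait on the top-level foo-app Kustomization in apps.yaml; the exact manifests after the change are not included in this report. Since spec.wait is optional and defaults to false, the line could equally be deleted; it is kept here only to make the difference visible.

```yaml
# Hypothetical apps.yaml after the change: identical to the original
# foo-app Kustomization, except that it no longer waits on the
# child Kustomizations it applies.
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: foo-app
  namespace: flux-system
spec:
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./apps/foo/meta
  interval: 5m
  timeout: 2m
  prune: true
  wait: false  # was true; assumed to be the only setting that changed
```

Under this assumption, foo-init and foo-service in meta.yaml keep their own wait: true and dependsOn settings, so the ordering between init and service is still expressed at that level.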

Expected behavior

In a healthy cluster, resources should be able to wait transitively as much as they want.

Screenshots and recordings

No response

OS / Distro

N/A

Flux version

v2.7.0

Flux check

► checking prerequisites
✗ flux 2.7.0 <2.7.2 (new CLI version is available, please upgrade)
✔ Kubernetes 1.33.5-gke.1080000 >=1.32.0-0
► checking version in cluster
✔ distribution: flux-v2.7.0
✔ bootstrapped: true
► checking controllers
✔ helm-controller: deployment ready
► ghcr.io/fluxcd/helm-controller:v1.4.0
✔ kustomize-controller: deployment ready
► ghcr.io/fluxcd/kustomize-controller:v1.7.0
✔ notification-controller: deployment ready
► ghcr.io/fluxcd/notification-controller:v1.7.1
✔ source-controller: deployment ready
► ghcr.io/fluxcd/source-controller:v1.7.0
► checking crds
✔ alerts.notification.toolkit.fluxcd.io/v1beta3
✔ buckets.source.toolkit.fluxcd.io/v1
✔ externalartifacts.source.toolkit.fluxcd.io/v1
✔ gitrepositories.source.toolkit.fluxcd.io/v1
✔ helmcharts.source.toolkit.fluxcd.io/v1
✔ helmreleases.helm.toolkit.fluxcd.io/v2
✔ helmrepositories.source.toolkit.fluxcd.io/v1
✔ kustomizations.kustomize.toolkit.fluxcd.io/v1
✔ ocirepositories.source.toolkit.fluxcd.io/v1
✔ providers.notification.toolkit.fluxcd.io/v1beta3
✔ receivers.notification.toolkit.fluxcd.io/v1
✔ all checks passed

Git provider

GitHub Enterprise

Container Registry provider

Self-hosted Mirror

Additional context

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct
