fluxcd works incorrectly with statefulsets that include openshift image trigger #5616

@drook

Describe the bug

Environment:
We are running FluxCD v2.3.0 on OpenShift (OKD) 4.15.0-0.okd-2024-02-23-163410.

Suppose we have the following k8s StatefulSet:

kind: StatefulSet
apiVersion: apps/v1
metadata:
  annotations:
    image.openshift.io/triggers: '[{"from":{"kind":"ImageStreamTag","name":"clm-runtime:latest","namespace":"eco-platform-stage"},"fieldPath":"spec.template.spec.containers[?(@.name==\"runtime-inst\")].image","pause":"true"}]'
  name: runtime-inst
  labels:
    app: runtime-inst
    app.kubernetes.io/component: runtime-inst
    app.kubernetes.io/name: runtime-inst
spec:
  serviceName: runtime-inst
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 25%
      maxSurge: 25%
  replicas: 2
  revisionHistoryLimit: 3
  selector:
    matchLabels:
      app: runtime-inst
  template:
    metadata:
      labels:
        app: runtime-inst
    spec:
      containers:
        - name: runtime-inst
          image: clm-runtime:latest
          ports:
            - containerPort: 8080
              protocol: TCP
            - containerPort: 8000
              protocol: TCP
          env:
            - name: RUNTIME_INSTANCE_NAME
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.name
            - name: IS_RUNTIME_GATEWAY
              value: '0'
          envFrom:
            - configMapRef:
                name: runtime-config
            - secretRef:
                name: runtime-secrets
          resources:
            limits:
              cpu: 900m
              memory: 900Mi
            requests:
              cpu: 600m
              memory: 700Mi
          imagePullPolicy: IfNotPresent
      restartPolicy: Always

The trigger defined in the annotation rewrites image: clm-runtime:latest to its digest form, image: clm-runtime@sha256:. The next Flux reconciliation sees the difference and reverts the field to image: clm-runtime:latest, which restarts a pod. The trigger then changes the image back to the sha256 form, and we enter an infinite cycle.

Oddities:

  • Only the last pod restarts.
  • This happens only with StatefulSets; Deployments work fine.

Workaround:
Either disable the trigger or pause reconciliation.
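
One way to pause reconciliation, sketched here as an assumption rather than what we actually run: suspend the whole Kustomization with spec.suspend, or skip only the StatefulSet with the kustomize-controller annotation kustomize.toolkit.fluxcd.io/reconcile: disabled. The Kustomization name and namespace below are hypothetical, and both documents are fragments rather than complete manifests.

```yaml
# Option 1: suspend the Kustomization that applies the StatefulSet.
# The name and namespace are hypothetical placeholders.
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: runtime-inst
  namespace: flux-system
spec:
  suspend: true
  # ...remaining spec fields unchanged
---
# Option 2: leave the Kustomization running and skip only this object.
# Typically set on the in-cluster object (for example with kubectl annotate).
kind: StatefulSet
apiVersion: apps/v1
metadata:
  name: runtime-inst
  annotations:
    kustomize.toolkit.fluxcd.io/reconcile: disabled
# ...remaining fields unchanged
```

Either option also stops Flux from applying legitimate Git changes to the affected objects, so this is a stopgap rather than a fix.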

Steps to reproduce

  1. Install Flux.
  2. Create a Kustomization that includes a StatefulSet with an image change trigger.
  3. Watch the last pod restart every minute.
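
For step 2, a Kustomization along these lines should be enough to reproduce the problem. This is a sketch: the name, path, and GitRepository reference are hypothetical and not copied from our setup, and the 1m interval is assumed from the restart frequency in step 3.

```yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: runtime-inst          # hypothetical name
  namespace: flux-system
spec:
  interval: 1m                # assumed: matches the once-a-minute restarts
  path: ./deploy              # hypothetical path containing the StatefulSet manifest
  prune: true
  targetNamespace: eco-platform-stage
  sourceRef:
    kind: GitRepository
    name: platform-repo       # hypothetical source name
```

The directory at path would contain the StatefulSet manifest shown above, including its image.openshift.io/triggers annotation.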

Expected behavior

Flux should not treat the sha256 digest form of the image as a significant difference for a StatefulSet, just as it does not for Deployments.

Screenshots and recordings

Image

OS / Distro

N/A

Flux version

v2.3.0

Flux check

flux check
► checking prerequisites
✗ flux 2.5.1 <2.7.3 (new CLI version is available, please upgrade)
✗ Kubernetes version v1.28.2-3580+6216ea1e51a212-dirty does not match >=1.30.0-0
► checking version in cluster
✔ distribution: flux-v2.3.0
✔ bootstrapped: false
► checking controllers
✔ helm-controller: deployment ready
► ghcr.io/fluxcd/helm-controller:v1.0.1
✔ image-automation-controller: deployment ready
► ghcr.io/fluxcd/image-automation-controller:v0.38.0
✔ image-reflector-controller: deployment ready
► ghcr.io/fluxcd/image-reflector-controller:v0.32.0
✔ kustomize-controller: deployment ready
► ghcr.io/fluxcd/kustomize-controller:v1.3.0
✔ notification-controller: deployment ready
► ghcr.io/fluxcd/notification-controller:v1.3.0
✔ source-controller: deployment ready
► ghcr.io/fluxcd/source-controller:v1.3.0
► checking crds
✔ alerts.notification.toolkit.fluxcd.io/v1beta3
✔ buckets.source.toolkit.fluxcd.io/v1beta2
✔ gitrepositories.source.toolkit.fluxcd.io/v1
✔ helmcharts.source.toolkit.fluxcd.io/v1
✔ helmreleases.helm.toolkit.fluxcd.io/v2
✔ helmrepositories.source.toolkit.fluxcd.io/v1
✔ imagepolicies.image.toolkit.fluxcd.io/v1beta2
✔ imagerepositories.image.toolkit.fluxcd.io/v1beta2
✔ imageupdateautomations.image.toolkit.fluxcd.io/v1beta2
✔ kustomizations.kustomize.toolkit.fluxcd.io/v1
✔ ocirepositories.source.toolkit.fluxcd.io/v1beta2
✔ providers.notification.toolkit.fluxcd.io/v1beta3
✔ receivers.notification.toolkit.fluxcd.io/v1
✗ check failed

Git provider

Gitlab

Container Registry provider

Openshift integrated registry

Additional context

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct
