
Karpenter scaling a disproportionately high number of nodes #2123

Open
@Raj-Popat

Description

Observed Behavior:
Karpenter attempts to bring up a disproportionate number of nodes for pending pods.
While scheduling a single pending pod that uses node affinity, pod anti-affinity, and topology spread constraints, Karpenter tries to create ~10 nodes.

Expected Behavior: I already understand that Karpenter treats preferred anti-affinity as required, so in the worst case it should scale up nodes equal to the deployment's replica count and then consolidate the empty ones, but bringing up 10 nodes for a single pod is simply not acceptable.
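If it helps narrow this down, below is a minimal diagnostic variant of the pod template's scheduling constraints (not part of the original reproduction): the preferred podAntiAffinity is dropped and only the nodeSelectorTerm that the NodePool below can actually satisfy is kept, so a change in node count with this variant would point at the preferred-as-required anti-affinity handling rather than the node affinity or topology spread.

affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: node-role.xxx.xxx.io/stable
          operator: In
          values:
          - "true"
# podAntiAffinity removed for this test; topologySpreadConstraints left unchanged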

Reproduction Steps (Please include YAML):
Deployment definition

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  namespace: default
  labels:
    app: nginx
    app-id: f12345678
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
        app-id: f12345678
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: node-role.xxx.xxx.io/worker
                operator: In
                values:
                - "true"
              - key: xxx.xxx.io/provisioner
                operator: NotIn
                values:
                - karpenter
              - key: compute-optimized
                operator: In
                values:
                - "true"
            - matchExpressions:
              - key: node-role.xxx.xxx.io/stable
                operator: In
                values:
                - "true"
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app-id
                  operator: In
                  values:
                  - f12345678
                - key: app
                  operator: In
                  values:
                  - nginx
              topologyKey: kubernetes.io/hostname
            weight: 50
      containers:
      - name: nginx
        image: nginx:latest
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 80
          name: http
          protocol: TCP
        resources:
          limits:
            memory: 512Mi
          requests:
            cpu: 100m
            memory: 128Mi
        livenessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 30
          periodSeconds: 10
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 3
        readinessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 5
          periodSeconds: 10
          timeoutSeconds: 1
          successThreshold: 1
          failureThreshold: 3
      tolerations:
      - key: xxx.xxx.io/pool
        operator: Equal
        value: stable
      - key: node.kubernetes.io/not-ready
        operator: Exists
        effect: NoExecute
        tolerationSeconds: 300
      - key: node.kubernetes.io/unreachable
        operator: Exists
        effect: NoExecute
        tolerationSeconds: 300
      topologySpreadConstraints:
      - labelSelector:
          matchLabels:
            app: nginx
            app-id: f12345678
        maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: DoNotSchedule
      - labelSelector:
          matchLabels:
            app: nginx
            app-id: f12345678
        maxSkew: 3
        topologyKey: kubernetes.io/hostname
        whenUnsatisfiable: DoNotSchedule
      securityContext:
        fsGroup: 1001
      serviceAccountName: default
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      terminationGracePeriodSeconds: 30

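One detail worth keeping in mind when reading the two manifests together: the first nodeSelectorTerm above excludes nodes with xxx.xxx.io/provisioner=karpenter, while the NodePool below stamps exactly that label onto every node it creates, so only the second term (node-role.xxx.xxx.io/stable: "true") can be satisfied by Karpenter-provisioned capacity.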
NodePool definition

apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  labels:
    xxx.io/stage: dev
  name: stable-multi-az-xxlarge
spec:
  disruption:
    budgets:
    - nodes: 100%
      reasons:
      - Empty
      - Drifted
    - nodes: "10"
    consolidateAfter: 2m
    consolidationPolicy: WhenEmptyOrUnderutilized
  limits:
    cpu: "80"
    memory: 320Gi
  template:
    metadata:
      labels:
        acquia.io/stage: dev
        compute-optimized: "true"
        xxx.xxx.io/ami-family: AL2
        xxx.xxx.io/cpu-type: mixed
        xxx.xxx.io/owned: "true"
        xxx.xxx.io/provisioner: karpenter
        node-role.xxx.xxx.io/stable: "true"
    spec:
      expireAfter: 720h
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: stable-multi-az-xxlarge
      requirements:
      - key: node.kubernetes.io/instance-type
        operator: In
        values:
        - t3a.2xlarge
        - t3.2xlarge
      - key: kubernetes.io/arch
        operator: In
        values:
        - amd64
      - key: kubernetes.io/os
        operator: In
        values:
        - linux
      - key: karpenter.sh/capacity-type
        operator: In
        values:
        - on-demand
      startupTaints:
      - effect: NoSchedule
        key: xxx.xxx.io/wgcni-not-ready
        value: "true"
      taints:
      - effect: NoSchedule
        key: xxx.xxx.io/pool
        value: stable
      terminationGracePeriod: 1h

Screenshot: Karpenter creating 10 nodes for 1 replica

Versions:

  • Chart Version: 1.0.5
  • Karpenter Version: 1.2.0
  • Kubernetes Version (kubectl version): 1.31
