
Improving the default PDB implementation. #8780


Open
wants to merge 65 commits into main
Conversation

naemono
Contributor

@naemono naemono commented Jul 31, 2025

Resolves #2936

What is this change?

This adds an Enterprise feature that allows the ECK operator to create multiple PDBs mapping to grouped roles within nodeSets (ideally one PDB per nodeSet of role 'X'). For non-Enterprise users the behavior remains the same: a single default PDB for the whole cluster, allowing 1 disruption when the cluster is green.

Examples

Simple example (pdb per nodeset with single roles)

# Manifest
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: cluster-name
spec:
  version: 8.12.1
  nodeSets:
    - name: masters
      count: 3
      config:
        node.roles: ["master"]
    - name: data
      count: 3
      config:
        node.roles: ["data", "ingest"]

# PDBs created
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  labels:
    common.k8s.elastic.co/type: elasticsearch
    elasticsearch.k8s.elastic.co/cluster-name: cluster-name
  name: cluster-name-es-default-master
spec:
  maxUnavailable: 1
  selector:
    matchExpressions:
      - key: elasticsearch.k8s.elastic.co/cluster-name
        operator: "In"
        values: ["cluster-name"]
      - key: elasticsearch.k8s.elastic.co/statefulset-name
        operator: "In"
        values: ["cluster-name-es-masters"]
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  labels:
    common.k8s.elastic.co/type: elasticsearch
    elasticsearch.k8s.elastic.co/cluster-name: cluster-name
  name: cluster-name-es-default-data
spec:
  maxUnavailable: 1
  selector:
    matchExpressions:
      - key: elasticsearch.k8s.elastic.co/cluster-name
        operator: "In"
        values: ["cluster-name"]
      - key: elasticsearch.k8s.elastic.co/statefulset-name
        operator: "In"
        values: ["cluster-name-es-data"]

Complex example (mixed roles across nodeSets)

# Manifest
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: cluster-name
spec:
  version: 8.12.1
  nodeSets:
    - name: masters
      count: 3
      config:
        node.roles: ["master", "data", "ingest"]
    - name: data-warm
      count: 3
      config:
        node.roles: ["data_warm"]
    - name: data-cold
      count: 3
      config:
        node.roles: ["data_cold"]
    - name: data-frozen
      count: 3
      config:
        node.roles: ["data_frozen"]
    - name: coordinating
      count: 1
      config:
        node.roles: []

# PDBs created
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  labels:
    common.k8s.elastic.co/type: elasticsearch
    elasticsearch.k8s.elastic.co/cluster-name: cluster-name
  name: cluster-name-es-default-master
spec:
  maxUnavailable: 1
  selector:
    matchExpressions:
      - key: elasticsearch.k8s.elastic.co/cluster-name
        operator: "In"
        values: ["cluster-name"]
      - key: elasticsearch.k8s.elastic.co/statefulset-name
        operator: "In"
        # All of these StatefulSets share data* roles, so they must be grouped together.
        values: ["cluster-name-es-masters", "cluster-name-es-data-warm", "cluster-name-es-data-cold"]
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  labels:
    common.k8s.elastic.co/type: elasticsearch
    elasticsearch.k8s.elastic.co/cluster-name: cluster-name
  name: cluster-name-es-default-data_frozen
spec:
  maxUnavailable: 1
  selector:
    matchExpressions:
      - key: elasticsearch.k8s.elastic.co/cluster-name
        operator: "In"
        values: ["cluster-name"]
      - key: elasticsearch.k8s.elastic.co/statefulset-name
        operator: "In"
        values: ["cluster-name-es-data-frozen"]
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  labels:
    common.k8s.elastic.co/type: elasticsearch
    elasticsearch.k8s.elastic.co/cluster-name: cluster-name
  name: cluster-name-es-default-coord
spec:
  maxUnavailable: 1
  selector:
    matchExpressions:
      - key: elasticsearch.k8s.elastic.co/cluster-name
        operator: "In"
        values: ["cluster-name"]
      - key: elasticsearch.k8s.elastic.co/statefulset-name
        operator: "In"
        values: ["cluster-name-es-coordinating"]

Implementation notes for review

  • I admittedly went back and forth on the implementation, specifically on how to group StatefulSets by their associated roles. My first instinct was a recursive-style algorithm (traverse a StatefulSet's roles, find all other StatefulSets sharing one of those roles, then traverse those StatefulSets' roles in turn), which I didn't want to bring into this code base, so I investigated other options. This is essentially finding connected components in a graph, which led me to the current implementation: build an adjacency list, then use depth-first search to collect the connected "groups" of StatefulSets. I keep feeling there's a simpler way to do this grouping, so suggestions are strongly welcome in this area. resolved in review process
  • See the note about PDB name overlaps. I'd love comments on whether it's an actual concern, and on potential paths to resolve it. resolved in the review process.

TODO

  • full testing in a cluster
  • unit tests
  • Documentation (I will open a separate PR in our docs repository for the public facing changes)
  • Test older versions of Elasticsearch with the old "roles" style.

Testing

With a yellow cluster, the data PDB works as expected (0 disruptions allowed), while the coordinating and master PDBs each allow a single disruption.

❯ kc get es -n elastic
NAME       HEALTH   NODES   VERSION   PHASE   AGE
pdb-test   yellow   6       9.0.4     Ready   6d22h

❯ kc get pdb -n elastic
NAME                               MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
pdb-test-es-default-coordinating   N/A             1                 1                     6d22h
pdb-test-es-default-data           N/A             0                 0                     6d22h
pdb-test-es-default-master         N/A             1                 1                     6d22h

Forcing the cluster red, things are also acting as expected and allowing 0 disruptions:

❯ kc get es -n elastic
NAME       HEALTH   NODES   VERSION   PHASE             AGE
pdb-test   red      4       9.0.4     ApplyingChanges   6d22h

❯ kc get pdb -n elastic
NAME                               MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
pdb-test-es-default-coordinating   N/A             0                 0                     6d22h
pdb-test-es-default-data           N/A             0                 0                     6d22h
pdb-test-es-default-master         N/A             0                 0                     6d22h

naemono added 18 commits July 23, 2025 13:39
Adjusting tests
Optimize disruption func.
Fixing tests
Adding additional documentation.
@naemono naemono added the >feature Adds or discusses adding a feature to the product label Jul 31, 2025
@prodsecmachine
Collaborator

prodsecmachine commented Jul 31, 2025

🎉 Snyk checks have passed. No issues have been found so far.

security/snyk check is complete. No issues have been found.

license/snyk check is complete. No issues have been found.

@naemono naemono requested a review from Copilot July 31, 2025 15:29

@Copilot Copilot AI left a comment


Pull Request Overview

This PR adds Enterprise-licensed role-specific PodDisruptionBudgets (PDBs) to improve cluster operations for ECK operator users. For non-enterprise users, the default single PDB behavior remains unchanged.

  • Introduces role-based PDB grouping where StatefulSets sharing Elasticsearch node roles are grouped into the same PDB
  • Uses graph algorithms (adjacency list + DFS) to identify connected role groups
  • Maintains health-aware disruption limits based on specific role requirements

Reviewed Changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.

Summary per file:
  • pkg/controller/elasticsearch/pdb/roles.go: Core implementation for role-specific PDB creation and reconciliation
  • pkg/controller/elasticsearch/pdb/roles_test.go: Comprehensive test coverage for role-specific PDB functionality
  • pkg/controller/elasticsearch/pdb/dfs.go: Graph algorithms for grouping StatefulSets by shared roles
  • pkg/controller/elasticsearch/pdb/dfs_test.go: Test coverage for grouping algorithms
  • pkg/controller/elasticsearch/pdb/reconcile.go: Updated main reconcile function with enterprise license check
  • pkg/controller/elasticsearch/pdb/reconcile_test.go: Updated test helper function and legacy test fix
  • pkg/controller/common/statefulset/fixtures.go: Extended test fixtures to support additional Elasticsearch node roles
Comments suppressed due to low confidence (1)

pkg/controller/elasticsearch/pdb/roles.go:81

  • There is an extra tab character at the end of the test name string, which should be removed for consistency.
		// Determine the roles for this group

@naemono
Contributor Author

naemono commented Jul 31, 2025

buildkite test this

naemono added 2 commits July 31, 2025 11:52
naemono added 2 commits August 8, 2025 08:43
// allowedDisruptionsForRole returns the maximum number of pods that can be disrupted for a given role.
func allowedDisruptionsForRole(
	es esv1.Elasticsearch,
	role esv1.NodeRole,
Collaborator


I think this logic needs to take all roles in the group into account.

Contributor Author


Yes, I think this is the case in certain scenarios, which I recently made changes to cover. I'd like your input on whether I've covered the needed cases though @pebrc , thanks.


// Determine the most conservative role to use when determining the maxUnavailable setting.
// If group has no roles, it's a coordinating ES role.
primaryRole := getPrimaryRoleForPDB(groupRoles)
Collaborator


Why don't we flip the first two priorities on L37 so that the role most sensitive to disruption comes first? I know I argued that masters should be first, but if we want to drive the logic that determines the allowed disruptions from that priority list as well, I think I would change my mind.

Contributor Author


I've adjusted this list of priority roles, but are you wanting specific changes to how we are determining the allowed disruptions for each type of role?
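A priority-driven primary-role pick, as discussed in this thread, might look roughly like the following. The exact role list and the map-based signature are illustrative assumptions, not the code under review:

```go
package main

import "fmt"

// rolePriority orders roles from most to least sensitive to disruption, per
// the review discussion: data first, since data nodes require green health
// before a disruption is allowed, while most other roles tolerate yellow.
var rolePriority = []string{"data", "master", "ingest", "ml", "transform"}

// getPrimaryRoleForPDB picks the most conservative role in a group to drive
// the PDB's name and its maxUnavailable rules. An empty role set means a
// coordinating-only node.
func getPrimaryRoleForPDB(roles map[string]bool) string {
	for _, r := range rolePriority {
		if roles[r] {
			return r
		}
	}
	return "coordinating"
}

func main() {
	fmt.Println(getPrimaryRoleForPDB(map[string]bool{"master": true, "data": true})) // data
	fmt.Println(getPrimaryRoleForPDB(map[string]bool{}))                             // coordinating
}
```

With this ordering, a nodeSet carrying both master and data roles is governed by the stricter data rules, which is the direction the reviewer argues for above.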

review updates.
Add unit tests.
Fix tests issue found by linter.
Collaborator

@pebrc pebrc left a comment


I started looking again but was not able to finish in the time I had. Will come back to this.

// getPrimaryRoleForPDB returns the primary role from a set of roles for PDB naming and grouping.
// Data roles are most restrictive (require green health), so they take priority.
// All other roles have similar disruption rules (require yellow+ health).
func getPrimaryRoleForPDB(roles sets.Set[esv1.NodeRole]) esv1.NodeRole {
Collaborator


I don't think we need this function? groupBySharedRoles gives you the statefulsets already grouped by their primary roles?

Comment on lines +403 to +407
for _, sts := range statefulSets {
	if isSensitiveToDisruptions(sts) && commonsts.GetReplicas(sts) == 1 {
		return 0
	}
}
Collaborator

@pebrc pebrc Aug 18, 2025


This logic cannot be per StatefulSet, imo. It needs to take the desired number of pods of a certain Elasticsearch role into account across all relevant StatefulSets. Relating to the out-of-band discussion: there needs to be a reasonable expectation of availability that we can protect with a PDB to warrant 0 allowed disruptions. E.g. at least two data nodes of each (non-frozen) data role type (so that there is a place for primary and replica shards to go), at least three master nodes, etc. (so that the two remaining can take over while the third is disrupted).

Protecting single-node tiers with a PDB with 0 allowed disruptions would mean preventing any k8s upgrade or eviction from proceeding forever, making the k8s cluster unmanageable, imo.
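A minimal sketch of this comment's point, summing desired replicas per role across the whole group rather than checking each StatefulSet in isolation. The role names, thresholds, and map-based signature are illustrative assumptions, not the merged implementation:

```go
package main

import "fmt"

// allowedDisruptions returns how many pods in a PDB group may be disrupted,
// based on the total desired replicas per role across all StatefulSets in
// the group. Thresholds follow the review comment: at least three masters
// (two remaining can keep a quorum while one is disrupted) and at least two
// nodes per non-frozen data tier (somewhere for primary and replica shards
// to go).
func allowedDisruptions(replicasByRole map[string]int32) int32 {
	minForOneDisruption := map[string]int32{
		"master":    3,
		"data":      2,
		"data_hot":  2,
		"data_warm": 2,
		"data_cold": 2,
	}
	for role, total := range replicasByRole {
		if required, sensitive := minForOneDisruption[role]; sensitive && total < required {
			// Below the availability floor there is nothing a PDB can
			// usefully protect; the review argues a permanent 0 here would
			// block node drains indefinitely, so this is a point of debate.
			return 0
		}
	}
	return 1
}

func main() {
	fmt.Println(allowedDisruptions(map[string]int32{"master": 3, "data": 3})) // 1
	fmt.Println(allowedDisruptions(map[string]int32{"master": 3, "data": 1})) // 0
}
```

The key design choice is that `replicasByRole` is computed across every StatefulSet in the group, so a role spread over several nodeSets is judged by its aggregate count.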

Labels
>feature Adds or discusses adding a feature to the product
Development

Successfully merging this pull request may close these issues.

PDB per Node type (Master, Data, Ingest)
4 participants