Improving the default PDB implementation. #8780
Conversation
Adjusting tests Signed-off-by: Michael Montgomery <[email protected]>
Optimize disruption func. Signed-off-by: Michael Montgomery <[email protected]>
Fixing tests Signed-off-by: Michael Montgomery <[email protected]>
Adding additional documentation. Signed-off-by: Michael Montgomery <[email protected]>
🎉 Snyk checks have passed. No issues have been found so far. ✅ security/snyk check is complete. ✅ license/snyk check is complete.
Pull Request Overview
This PR adds Enterprise-licensed role-specific PodDisruptionBudgets (PDBs) to improve cluster operations for ECK operator users. For non-enterprise users, the default single PDB behavior remains unchanged.
- Introduces role-based PDB grouping where StatefulSets sharing Elasticsearch node roles are grouped into the same PDB
- Uses graph algorithms (adjacency list + DFS) to identify connected role groups
- Maintains health-aware disruption limits based on specific role requirements
Reviewed Changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.
Summary per file:

| File | Description |
|---|---|
| pkg/controller/elasticsearch/pdb/roles.go | Core implementation for role-specific PDB creation and reconciliation |
| pkg/controller/elasticsearch/pdb/roles_test.go | Comprehensive test coverage for role-specific PDB functionality |
| pkg/controller/elasticsearch/pdb/dfs.go | Graph algorithms for grouping StatefulSets by shared roles |
| pkg/controller/elasticsearch/pdb/dfs_test.go | Test coverage for grouping algorithms |
| pkg/controller/elasticsearch/pdb/reconcile.go | Updated main reconcile function with enterprise license check |
| pkg/controller/elasticsearch/pdb/reconcile_test.go | Updated test helper function and legacy test fix |
| pkg/controller/common/statefulset/fixtures.go | Extended test fixtures to support additional Elasticsearch node roles |
Comments suppressed due to low confidence (1)
pkg/controller/elasticsearch/pdb/roles.go:81
- There is an extra tab character at the end of the test name string, which should be removed for consistency.
// Determine the roles for this group
buildkite test this
// allowedDisruptionsForRole returns the maximum number of pods that can be disrupted for a given role.
func allowedDisruptionsForRole(
	es esv1.Elasticsearch,
	role esv1.NodeRole,
I think this logic needs to take all roles in the group into account.
Yes, I think this is the case in certain scenarios, which I recently made changes to cover. I'd like your input on whether I've covered the needed cases though, @pebrc. Thanks.
// Determine the most conservative role to use when determining the maxUnavailable setting.
// If group has no roles, it's a coordinating ES role.
primaryRole := getPrimaryRoleForPDB(groupRoles)
Why don't we flip the first two priorities on L37 so that the role most sensitive to disruption comes first? I know I argued that masters should be first, but if we want to drive the logic that determines the allowed disruptions from that priority list as well, I think I would change my mind.
I've adjusted this list of priority roles, but are you wanting specific changes to how we are determining the allowed disruptions for each type of role?
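To make the discussion concrete, here is a minimal sketch of what driving the allowed disruptions from a role priority list could look like, assuming the rules stated elsewhere in this PR (data roles require green health, other roles tolerate yellow). The type and function names are simplified stand-ins, not the PR's actual code.

```go
package pdb

// NodeRole and Health are simplified stand-ins for the operator's own types.
type NodeRole string
type Health string

const (
	DataRole   NodeRole = "data"
	MasterRole NodeRole = "master"
	Green      Health   = "green"
	Yellow     Health   = "yellow"
)

// rolePriority orders roles from most to least sensitive to disruption,
// so the allowed disruptions for a group can be derived from the list.
var rolePriority = []NodeRole{DataRole, MasterRole}

// allowedDisruptions derives maxUnavailable for a role group from cluster
// health: data only tolerates disruption on green, other prioritized roles
// tolerate yellow. Name and signature are illustrative assumptions.
func allowedDisruptions(groupRoles map[NodeRole]bool, health Health) int32 {
	for _, role := range rolePriority {
		if !groupRoles[role] {
			continue
		}
		if role == DataRole {
			if health == Green {
				return 1
			}
			return 0
		}
		// Non-data prioritized roles (e.g. master) tolerate yellow health.
		if health == Green || health == Yellow {
			return 1
		}
		return 0
	}
	// No prioritized role in the group (e.g. coordinating-only nodes).
	if health == Green || health == Yellow {
		return 1
	}
	return 0
}
```

With this shape, flipping the order of rolePriority directly changes which health threshold applies to a mixed group, which is the trade-off discussed above.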
review updates. Signed-off-by: Michael Montgomery <[email protected]>
Add unit tests. Signed-off-by: Michael Montgomery <[email protected]>
Fix tests issue found by linter. Signed-off-by: Michael Montgomery <[email protected]>
I started looking again but was not able to finish in the time I had. Will come back to this.
// getPrimaryRoleForPDB returns the primary role from a set of roles for PDB naming and grouping.
// Data roles are most restrictive (require green health), so they take priority.
// All other roles have similar disruption rules (require yellow+ health).
func getPrimaryRoleForPDB(roles sets.Set[esv1.NodeRole]) esv1.NodeRole {
I don't think we need this function? groupBySharedRoles gives you the StatefulSets already grouped by their primary roles?
for _, sts := range statefulSets {
	if isSensitiveToDisruptions(sts) && commonsts.GetReplicas(sts) == 1 {
		return 0
	}
}
This logic cannot be per StatefulSet, imo. It needs to take the desired number of pods of a given Elasticsearch role into account across all relevant StatefulSets. Relating to the discussion out of band: there needs to be a reasonable expectation of availability that we can protect with a PDB to warrant 0 allowed disruptions, e.g. at least two data nodes of each (non-frozen) data role type (so that there is a place for primary and replica shards to go), and at least three master nodes (so that the two remaining can take over while the third is disrupted).
Protecting single-node tiers with a PDB with 0 allowed disruptions would mean preventing any k8s upgrade or eviction from ever proceeding, making the k8s cluster unmanageable, imo.
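To make the suggestion concrete, a minimal sketch of counting desired pods per role across all relevant StatefulSets; the types, helper names, and thresholds are illustrative assumptions drawn from the comment above, not code from this PR:

```go
package pdb

// statefulSet is a simplified stand-in for the operator's StatefulSet view.
type statefulSet struct {
	replicas int32
	roles    []string
}

// desiredNodesPerRole sums the desired replicas per role over every
// StatefulSet in the group, rather than looking at each one in isolation.
func desiredNodesPerRole(statefulSets []statefulSet) map[string]int32 {
	counts := map[string]int32{}
	for _, sts := range statefulSets {
		for _, role := range sts.roles {
			counts[role] += sts.replicas
		}
	}
	return counts
}

// meetsAvailabilityExpectation reports whether a role has enough redundancy
// that 0 allowed disruptions is meaningful: e.g. at least two nodes per
// (non-frozen) data role so primaries and replicas have somewhere to go, and
// at least three masters so two remain while one is disrupted.
func meetsAvailabilityExpectation(counts map[string]int32, role string) bool {
	switch role {
	case "master":
		return counts[role] >= 3
	default:
		return counts[role] >= 2
	}
}
```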
Resolves #2936
What is this change?
This adds an Enterprise feature that allows the ECK operator to create multiple PDBs that map to grouped roles within nodeSets (ideally a PDB per nodeSet of role 'X'). For non-enterprise customers, the behavior remains the same: a single default PDB for the whole cluster, allowing 1 disruption when the cluster is green.
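A minimal sketch of the license gate described above, assuming simplified names and types (the real entry point lives in pkg/controller/elasticsearch/pdb/reconcile.go, and its exact signature may differ):

```go
package pdb

// Elasticsearch and StatefulSetList are simplified stand-ins for the
// operator's own types; they exist only to make this sketch self-contained.
type Elasticsearch struct{ Name string }
type StatefulSetList []string

// Reconcile chooses between the legacy single-PDB behavior and the new
// role-grouped PDBs based on the license level.
func Reconcile(es Elasticsearch, statefulSets StatefulSetList, enterpriseEnabled bool) error {
	if !enterpriseEnabled {
		// Non-enterprise: keep the single default PDB for the whole
		// cluster, allowing 1 disruption when the cluster is green.
		return reconcileDefaultPDB(es, statefulSets)
	}
	// Enterprise: one PDB per group of StatefulSets that share node roles.
	return reconcileRoleGroupedPDBs(es, statefulSets)
}

// Placeholders standing in for the two reconciliation paths.
func reconcileDefaultPDB(es Elasticsearch, statefulSets StatefulSetList) error { return nil }

func reconcileRoleGroupedPDBs(es Elasticsearch, statefulSets StatefulSetList) error { return nil }
```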
Examples
Simple example (pdb per nodeset with single roles)
Complex example (mixed roles across nodeSets)
Implementation notes for review
- I admittedly went back and forth on the implementation, specifically on the method of grouping StatefulSets by their associated roles. When initially thinking through how to group these nodeSets together, I landed on a recursive-style algorithm (traverse a StatefulSet's roles, find all other StatefulSets sharing each role, then traverse those StatefulSets' roles), which I didn't want to bring into this code base, so I began investigating other options for the grouping. I recalled the idea of connected points on a graph, which led to the current implementation: building an adjacency list and then using depth-first search to build the connected "groups" of StatefulSets. I keep feeling that there is a simpler method of doing this grouping, so suggestions are strongly welcome in this area (a rough sketch of the idea follows after these notes). Resolved in the review process.
- See the note about PDB name overlaps. I'd love some comments on whether it's an actual concern, and potential paths to resolve it. Resolved in the review process.

TODO
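To illustrate the grouping approach described in the first note, here is a rough, self-contained sketch of the adjacency-list + DFS idea; the types and helper names are simplified assumptions, not the PR's exact code:

```go
package pdb

// sharesRole reports whether two role lists have at least one role in common.
func sharesRole(a, b []string) bool {
	seen := map[string]bool{}
	for _, r := range a {
		seen[r] = true
	}
	for _, r := range b {
		if seen[r] {
			return true
		}
	}
	return false
}

// groupBySharedRoles returns the indices of StatefulSets grouped into
// connected components, where an edge means two StatefulSets share a role:
// build an adjacency list, then run a depth-first search per unvisited node.
func groupBySharedRoles(roles [][]string) [][]int {
	n := len(roles)
	adj := make([][]int, n)
	for i := 0; i < n; i++ {
		for j := i + 1; j < n; j++ {
			if sharesRole(roles[i], roles[j]) {
				adj[i] = append(adj[i], j)
				adj[j] = append(adj[j], i)
			}
		}
	}

	visited := make([]bool, n)
	var groups [][]int
	var dfs func(node int, group *[]int)
	dfs = func(node int, group *[]int) {
		visited[node] = true
		*group = append(*group, node)
		for _, next := range adj[node] {
			if !visited[next] {
				dfs(next, group)
			}
		}
	}
	for i := 0; i < n; i++ {
		if !visited[i] {
			var group []int
			dfs(i, &group)
			groups = append(groups, group)
		}
	}
	return groups
}
```

For example, nodeSets with roles [master], [master, data], and [ingest] would end up with the first two in one group and the third on its own.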
Testing
With a yellow cluster, the data PDB works as expected (0 disruptions allowed), while the coordinating and master PDBs allow a single disruption.
Forcing the cluster red, things also act as expected, allowing 0 disruptions: