Deprecate Leader for life based leader election #117

varshaprasad96 · 2023-07-10T16:46:44Z

Feature Request

Is your feature request related to a problem? Please describe.
Currently, this repository contains a leader-for-life election model, which ensures that a single leader is elected for life in an HA (high availability) setup.

To briefly describe what leader for life and leader for lease approaches are:
(1) Leader for life: A leader pod is selected and remains the leader for life, until it is garbage collected. This ensures there is only one leader at any instant.
(2) Leader for lease: The leader holds a lease that it must renew periodically, at an interval that can be tuned. When the current leader is unable to renew the lease, a new leader is elected.

More details on both of these approaches are available here.

Approach (2) is implemented upstream in client-go and has been scaffolded by default for the past 30+ releases of the SDK, since it is set up as part of creating the manager in controller-runtime. It provides faster leader election, less downtime, and recovery from disconnected or frozen node failures. However, it does not eliminate the split-brain scenario, where more than one leader is active at the same instant.

Approach (1), on the other hand, was developed long before leader for lease was implemented upstream. Though it solves the split-brain scenario, it guarantees neither recovery from node failure nor fast recovery. We also have issues integrating it with controller-runtime (#48). It is neither maintained nor used as widely as leader for lease.

Here is a detailed comment explaining the preference for (2) over (1): users would rather have faster recovery, even with an intermittent split-brain scenario, than an implementation that does not guarantee fast recovery.

Describe the solution you'd like
Since the leader-for-life approach is neither widely used nor able to work seamlessly with controller-runtime, it is better to adopt a well-tested upstream library than to depend on what is currently available here as an option.

The solution for this is:

(1) To deprecate and remove leader for life in future releases of Operator SDK.
(2) To bring this up upstream (in controller-runtime), for easier integration. This has already been raised upstream (FR: Add leader-for-life implementation for leader-election kubernetes-sigs/controller-runtime#1963), but it received no response.

everettraven mentioned this issue Apr 18, 2024

Bump k8s to 1.29 #163

Closed
