Description
We recently ran into a problem where the OnRevokedEvent was lost when using spring-cloud-kubernetes-fabric8-leader.
Our setup consists of three Kubernetes pods, one of which is the leader. While our configuration is certainly not perfect (leadership is sometimes revoked while a pod is still running), it mostly works.
During the recent incident, the old leader's leadership was revoked because its readiness check failed, but the pod didn't shut down. Shortly after, another pod was elected as the new leader and started doing the leader-specific work. Meanwhile, the old leader never received the OnRevokedEvent and kept acting as if it were still the leader.
As a result, we had two leaders running until somebody noticed and manually shut down the "phantom-leader".
Sadly, I can't reproduce the problem, and the logs don't show any anomalies. Is this a known problem? If so, how can I prevent it from happening?