Description
We recently spent several hours debugging why Cluster.Bootstrap wasn't forming a cluster in Kubernetes. The issue turned out to be a hostname mismatch between the self contact point and what Kubernetes Discovery was returning, but the logs didn't make this clear.
Current confusing logs:
When bootstrap fails to match the self node, you see repeated messages like:
[INFO] Located service members based on: [Lookup(drawtogether, management, tcp)]: [ResolvedTarget(10.1.2.138, 8558, 10.1.2.138)], filtered to [10.1.2.138:8558]
[INFO] Contact point [akka.tcp://DrawTogether@drawtogether-0.drawtogether:5055] returned [0] seed-nodes []
[INFO] Discovered [1] contact points, confirmed [0], which is less than the required [3], retrying
The problem: the self contact point was using the hostname drawtogether-0.drawtogether (http://drawtogether-0.drawtogether:8558/), but Discovery was returning the IP address 10.1.2.138 (10.1.2.138:8558). The HostMatches check failed, but no log message indicates WHY the match failed or WHAT the self contact point hostname was.
The only place the self contact point appears is:
[INFO] Using self contact point address: http://drawtogether-0.drawtogether:8558/
But this is logged very early during startup, making it easy to miss when looking at bootstrap coordination logs.
Suggested improvements:
- In SelfAwareJoinDecider, when HostMatches fails, log a DEBUG or INFO message showing:
  - The self hostname being tested
  - The target hostname/IP from discovery
  - Why the match failed (exact comparison failed, IP address extraction failed, etc.)
Example:
[DEBUG] Self contact point hostname 'drawtogether-0.drawtogether' does not match discovered target '10.1.2.138' (IP extraction yielded '')
- In BootstrapCoordinator, when filtering resolved targets, log which targets were filtered out and why (e.g., "excluding self because HostMatches returned true" vs "keeping target because it's not self").
- Consider logging Http.HostName separately from the full SelfBaseUri, so users can see exactly what hostname Akka.Management is using for self-identification:
[INFO] Akka.Management HTTP endpoint bound to 0.0.0.0:8558, advertising as hostname 'drawtogether-0.drawtogether'
[INFO] Using self contact point address: http://drawtogether-0.drawtogether:8558/
These logging improvements would have immediately revealed our configuration issue (we were explicitly setting Http.HostName to the StatefulSet DNS name instead of letting it auto-detect the pod IP).
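To make the first two suggestions concrete, here is a minimal, language-neutral sketch in Python of a host comparison that reports a reason along with its result, and a filter step that records why each target was kept or excluded. It is not Akka.Management's actual HostMatches or BootstrapCoordinator code. The names `ResolvedTarget`, `parse_ip`, `describe_host_match`, and `filter_targets` are hypothetical, and the matching rule (case-insensitive string comparison, then comparison of parsed IP literals) is an assumption made for illustration.

```python
import ipaddress
from typing import List, NamedTuple, Optional, Tuple


class ResolvedTarget(NamedTuple):
    """Hypothetical stand-in for a target returned by discovery."""
    host: str
    port: int


def parse_ip(value: str) -> Optional[str]:
    """Return the normalized IP literal, or None if value is not an IP address."""
    try:
        return str(ipaddress.ip_address(value))
    except ValueError:
        return None


def describe_host_match(self_host: str, target_host: str) -> Tuple[bool, str]:
    """Compare two hosts and return (matched, reason) suitable for a log line."""
    # Assumed rule 1: case-insensitive exact comparison.
    if self_host.lower() == target_host.lower():
        return True, f"exact match on '{self_host}'"
    # Assumed rule 2: both sides are IP literals that normalize to the same address.
    self_ip = parse_ip(self_host)
    if self_ip is not None and self_ip == parse_ip(target_host):
        return True, f"IP match on '{self_ip}'"
    return False, (
        f"Self contact point hostname '{self_host}' does not match "
        f"discovered target '{target_host}' "
        f"(IP extraction yielded '{self_ip or ''}')"
    )


def filter_targets(
    self_host: str, self_port: int, targets: List[ResolvedTarget]
) -> Tuple[List[ResolvedTarget], List[str]]:
    """Drop the self node from targets and explain every decision."""
    kept: List[ResolvedTarget] = []
    log_lines: List[str] = []
    for t in targets:
        matched, reason = describe_host_match(self_host, t.host)
        if matched and t.port == self_port:
            log_lines.append(f"Excluding {t.host}:{t.port} as self ({reason})")
        else:
            kept.append(t)
            why = reason if not matched else (
                f"port {t.port} differs from self port {self_port}"
            )
            log_lines.append(f"Keeping {t.host}:{t.port}: {why}")
    return kept, log_lines
```

Calling `filter_targets("drawtogether-0.drawtogether", 8558, [ResolvedTarget("10.1.2.138", 8558)])` keeps the target and produces a log line naming both hosts, which is the information that was missing while we were debugging.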