Replies: 1 comment
This is not specific to k3s. The logic in question is built into Kubernetes.
🛠 Background
I’m running a K3s cluster version v1.31.6+k3s1 (6ab750f) with Flannel as the CNI (vxlan backend) and SQLite as the default datastore. Each Node gets a /24 Pod CIDR (e.g., 10.42.X.0/24) from the default 10.42.0.0/16 cluster CIDR. My setup includes an on-demand Desktop Node that joins and leaves the cluster frequently (simulating edge device behavior—think IoT or intermittent connections).
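For anyone following along, this is how I compare what the control plane assigned to the Node with what Flannel is actually using (the node name is a placeholder, and the subnet.env path is my assumption based on the default k3s Flannel setup):

```sh
# Pod CIDR(s) the control plane assigned to the Node
kubectl get node <desktop-node> -o jsonpath='{.spec.podCIDR}{"\n"}{.spec.podCIDRs}{"\n"}'

# On the Node itself: the subnet Flannel is configured with
# (path assumed from the default k3s Flannel setup)
cat /run/flannel/subnet.env
```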
Here’s what I’ve noticed:
Every time the Desktop Node rejoins, its Pod IP allocation keeps creeping up.
Example: svclb-* Pods go from 10.42.3.2 → 10.42.3.5 over time, without reusing old IPs right away.
The Node sticks to its assigned CIDR (10.42.3.0/24 so far), but I’m worried about what happens when all 256 IPs in this /24 get exhausted.
This matters because I’m using Tailscale with advertise-routes=10.42.3.0/24 to route traffic to these Pods (see the sketch just after this list).
If the CIDR changes, I’d have to update the routes manually, which isn’t sustainable for a dynamic setup.
🔹 So far, K3s has NOT changed my /24 allocation, but I want to confirm its behavior once all available Pod IPs are used.
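For context, this is roughly how the route is advertised on the Desktop Node (standard Tailscale flag; the subnet value is just my current /24):

```sh
# Advertise the Node's Pod /24 into the tailnet so Pods are reachable from
# other Tailscale machines (the route also needs to be approved on the
# Tailscale admin side, as far as I know)
sudo tailscale up --advertise-routes=10.42.3.0/24
```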
🚨 Problem
If the /24 CIDR (e.g., 10.42.3.0/24) runs out of IPs:
Will K3s allocate a new /24 subnet (like 10.42.8.0/24) for this Node?
If it does, Tailscale’s static advertise-routes=10.42.3.0/24 becomes invalid, and Pods on the new subnet won’t be reachable until I update the routes manually.
For edge deployments with on-demand Nodes (frequent join/leave cycles), this could break connectivity and add operational overhead.
Pod IPs keep incrementing without immediate reuse, which makes me wonder how K3s (or Flannel) handles this under the hood.
🔹 Currently, I haven’t hit this limit yet, but I’d like to understand K3s’s behavior before it becomes a problem (rough check sketched below).
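Side note: the rough check I use to see how close the /24 is to exhaustion only counts live Pods, not any stale reservations, so treat it as an approximation (node name is a placeholder):

```sh
# Count distinct Pod IPs currently in use on the Desktop Node
kubectl get pods -A --field-selector spec.nodeName=<desktop-node> \
  -o custom-columns=IP:.status.podIP --no-headers | sort -u | wc -l
```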
🔍 My Wild Guess
Here’s a random thought (just me guessing, don’t take it too seriously XD):
Could this be related to SQLite not cleaning up old Pod IP records?
Maybe when the Node goes offline, the IPs it used stay “reserved” in the database, so Flannel skips them and picks new ones instead.
This might also be related to Flannel’s Lease expiration mechanism—does it delay IP reuse?
I don’t know if this theory is right, but I’m curious whether the datastore or lease settings play a role in how Pod IPs are allocated over time (I’ve sketched below how I’d check).
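In case it helps, here’s how I’d poke at this theory. As far as I can tell, the default Flannel setup hands out Pod IPs via the host-local IPAM plugin, which keeps per-IP reservation files on the Node rather than in the datastore; the path below assumes the default k3s network name (cbr0):

```sh
# One file per reserved IP, plus a last-reserved-IP marker
# (path assumes the default k3s/Flannel network name "cbr0")
ls -l /var/lib/cni/networks/cbr0/

# Each file contains the ID of the container holding that reservation
cat /var/lib/cni/networks/cbr0/10.42.3.5
```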
🔍 Key Questions
1️⃣ What happens when a Node’s /24 CIDR is fully allocated?
Does K3s reuse old IPs or grab a new /24 from 10.42.0.0/16?
2️⃣ Is there a built-in mechanism to recycle Pod IPs for on-demand Nodes that leave and rejoin?
3️⃣ Can I force K3s/Flannel to reuse IPs within the same /24 instead of expanding the CIDR?
4️⃣ Would tweaking Flannel config (e.g., lease management) or K3s CIDR settings (e.g., --cluster-cidr) help control this behavior? (See the flag sketch below for what I mean.)
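For question 4️⃣, these are the knobs I had in mind, sketched with example values only (I’m assuming --kube-controller-manager-arg is the right way to pass node-cidr-mask-size through k3s):

```sh
# Example only: widen the per-Node allocation so a single Node gets more IPs.
# --cluster-cidr is a documented k3s server flag; node-cidr-mask-size belongs
# to kube-controller-manager and is passed through here.
k3s server \
  --cluster-cidr=10.42.0.0/16 \
  --kube-controller-manager-arg=node-cidr-mask-size=22
```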
📌 Related Observations
I’ve checked similar issues like #5885 and #4682, where folks hit IP exhaustion and had to clean up CNI files manually (roughly what I sketch below).
But those don’t focus on on-demand Nodes or explain the long-term behavior.
This feels critical for edge/IoT setups where Nodes aren’t always online.
I’m not planning to dig into the code myself, just throwing this out there for insights!
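For completeness, my understanding of the manual cleanup mentioned in those issues is something like the following, done while the Node is cordoned/drained and has no Pods running (again, the path and network name are my assumptions from the default k3s layout):

```sh
# Remove stale host-local reservations for the Node's /24
# (only safe with no Pods running on the Node)
sudo rm /var/lib/cni/networks/cbr0/10.42.3.*
```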
📌 How to Reproduce
1️⃣ Set up a single-node K3s with Flannel and SQLite.
2️⃣ Add an on-demand Node (e.g., a Desktop), run some Pods.
3️⃣ Shut it down, restart, and rejoin the cluster a few times.
4️⃣ Watch the Pod IPs (kubectl get pods -o wide): do they keep going up? (Commands sketched below.)
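Roughly the commands behind these steps (the install lines are the standard k3s script; server IP, token, and node name are placeholders):

```sh
# 1. Single-node k3s server with the default Flannel + SQLite setup
curl -sfL https://get.k3s.io | sh -

# 2. Join the on-demand Desktop Node as an agent
curl -sfL https://get.k3s.io | K3S_URL=https://<server-ip>:6443 K3S_TOKEN=<token> sh -

# 3. Shut the Desktop Node down, restart, and let it rejoin a few times.

# 4. Check whether Pod IPs on it keep climbing
kubectl get pods -A -o wide --field-selector spec.nodeName=<desktop-node>
```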
📌 Expected Behavior
Pod IPs should be reused within the assigned /24, keeping things predictable for external routing (like Tailscale).
📌 Current Behavior
IPs increment without reuse, potentially exhausting the /24 and risking a new CIDR allocation.
Any thoughts or clarifications would be awesome—thanks for the help! 🙌