Prioritize lower DeviceNumber in ipv4 assignment #3300
What type of PR is this?
Improvement
Which issue does this PR fix?
I didn't see a related github issue; I've provided context here:
Currently, (*DataStore).AssignPodIPv4Address() iterates through the eniPool map when searching for an IP address to assign to the pod. Because map iteration order is random, this means that new assignments will tend to be distributed evenly across all ENIs with sufficient available IPs.
When WARM_ENI_TARGET is set (and not {WARM,MINIMUM}_IP_TARGET), IPs are only returned to the subnet when the entire ENI can be removed, which can only happen when it's completely unused.
At work, our workloads mean that even distribution of new assignments makes it very unlikely for any ENI to become unused, once added. This in turn means that ipamd almost never returns IPs to the subnet, causing us to have many more IPs allocated than can ever be used in practice.
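To make the effect concrete, here is a toy model, not ipamd code. It compares an even spread of pods across ENIs (a deterministic stand-in for what random map iteration tends toward) with filling the lowest-indexed ENI first, and counts how many ENIs end up completely unused. The function names, the pod count, and the per-ENI capacity are all assumptions made for illustration.

```go
package main

import "fmt"

// fullyUnused counts ENIs with zero assigned IPs. Per the description above,
// only such ENIs can be removed when WARM_ENI_TARGET is in effect.
func fullyUnused(assigned []int) int {
	n := 0
	for _, a := range assigned {
		if a == 0 {
			n++
		}
	}
	return n
}

// spread places pods round-robin across ENIs. This is a hypothetical,
// deterministic approximation of the even distribution that random
// iteration produces over many assignments.
func spread(pods, enis int) []int {
	assigned := make([]int, enis)
	for i := 0; i < pods; i++ {
		assigned[i%enis]++
	}
	return assigned
}

// pack fills the lowest-indexed ENI up to capacity before using the next one.
func pack(pods, enis, capacity int) []int {
	assigned := make([]int, enis)
	for i := 0; i < enis && pods > 0; i++ {
		n := pods
		if n > capacity {
			n = capacity
		}
		assigned[i] = n
		pods -= n
	}
	return assigned
}

func main() {
	// Illustrative numbers only: 6 pods, 3 ENIs, 10 IPs per ENI.
	s := spread(6, 3)
	p := pack(6, 3, 10)
	fmt.Println("spread:", s, "removable ENIs:", fullyUnused(s))
	fmt.Println("packed:", p, "removable ENIs:", fullyUnused(p))
}
```

With these numbers the even spread leaves every ENI partly used, while packing leaves two ENIs empty and therefore eligible for removal.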
What does this PR do / Why do we need it?
This PR changes the iteration order over ENIs in (*DataStore).AssignPodIPv4Address() from random to in order of increasing device number.
This is worthwhile because ipamd is otherwise unlikely to free IPs under the behavior from WARM_ENI_TARGET.
That said, there are potentially good reasons to maintain randomness here, either by default or opt-in. For example, random assignment to ENIs may better distribute load, helping with throughput.
Please let me know if you'd like to see this made configurable - I wanted to start simple for now.
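The ordering change can be sketched roughly as follows. This is a simplified illustration and not the actual diff: the ENI struct, its fields, and the helpers sortedByDeviceNumber and assignIP are hypothetical stand-ins for the real datastore types. The idea is to copy the map values into a slice, sort by device number, and take the first ENI that still has a free IP.

```go
package main

import (
	"fmt"
	"sort"
)

// ENI is a hypothetical, simplified stand-in for ipamd's per-ENI state.
type ENI struct {
	ID           string
	DeviceNumber int
	FreeIPs      []string
}

// sortedByDeviceNumber returns the ENIs in the pool ordered by increasing
// device number, replacing the random order of a direct map range.
func sortedByDeviceNumber(pool map[string]*ENI) []*ENI {
	enis := make([]*ENI, 0, len(pool))
	for _, e := range pool {
		enis = append(enis, e)
	}
	sort.Slice(enis, func(i, j int) bool {
		return enis[i].DeviceNumber < enis[j].DeviceNumber
	})
	return enis
}

// assignIP hands out the first free IP from the lowest-numbered ENI that
// still has one. ok is false when no ENI has a free IP.
func assignIP(pool map[string]*ENI) (eniID, ip string, ok bool) {
	for _, e := range sortedByDeviceNumber(pool) {
		if len(e.FreeIPs) > 0 {
			ip = e.FreeIPs[0]
			e.FreeIPs = e.FreeIPs[1:]
			return e.ID, ip, true
		}
	}
	return "", "", false
}

func main() {
	// Example addresses and IDs are made up for illustration.
	pool := map[string]*ENI{
		"eni-c": {ID: "eni-c", DeviceNumber: 2, FreeIPs: []string{"10.0.2.10"}},
		"eni-a": {ID: "eni-a", DeviceNumber: 0, FreeIPs: []string{"10.0.0.10"}},
		"eni-b": {ID: "eni-b", DeviceNumber: 1, FreeIPs: []string{"10.0.1.10"}},
	}
	for {
		id, ip, ok := assignIP(pool)
		if !ok {
			break
		}
		fmt.Println(id, ip)
	}
}
```

Sorting on every assignment costs O(n log n) in the number of ENIs, which is small per node. A configurable variant could fall back to the plain map range when randomness is preferred.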
Testing done on this change
Output from
go test $(go list ./... | grep -v '/amazon-vpc-cni-k8s/test/integration/')
Will this PR introduce any new dependencies?
No
Will this break upgrades or downgrades? Has updating a running cluster been tested?
I don't expect it will cause issues with upgrades/downgrades, although we haven't yet tried it on a running cluster.
Does this change require updates to the CNI daemonset config files to work?
No
Does this PR introduce any user-facing change?
This PR intends to have a user-visible effect, but shouldn't have any change in functionality. I'll leave it up to your judgement.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.