-
Notifications
You must be signed in to change notification settings - Fork 782
Enhanced Subnet Discovery #3319
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
5eccde1
to
9b00b7d
Compare
47077a1
to
36fde46
Compare
40fe808
to
d48bfb8
Compare
// When subnet discovery is disabled, check if primary subnet is excluded | ||
excluded, checkErr := cache.isPrimarySubnetExcluded() | ||
if checkErr != nil { | ||
// If we can't determine exclusion status, log warning and proceed | ||
log.Warnf("Failed to check if primary subnet is excluded: %v. Proceeding with ENI creation attempt.", checkErr) | ||
} else if excluded { | ||
// Primary subnet is explicitly excluded | ||
err = errors.New("primary subnet is tagged with kubernetes.io/role/cni=0 and subnet discovery is disabled - no valid subnets available for ENI creation") | ||
log.Error(err.Error()) | ||
return "", err | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Deals with misconfiguration scenario where customer tags subnet for exclusion but does not have subnet discovery enabled.
// Even in fallback, check if primary subnet is excluded | ||
excluded, checkErr := cache.isPrimarySubnetExcluded() | ||
if checkErr != nil { | ||
log.Warnf("Failed to check if primary subnet is excluded: %v. Proceeding with ENI creation attempt.", checkErr) | ||
} else if excluded { | ||
// Primary subnet is explicitly excluded | ||
err = errors.New("primary subnet is tagged with kubernetes.io/role/cni=0 - no valid subnets available for ENI creation") | ||
log.Error(err.Error()) | ||
return "", err | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dealing with the case where describe subnet call fails, but customer has also opted to exclude the primary subnet.
e3939b6
to
3b148b7
Compare
// Skip excluded ENIs when calculating available IPs for pod allocation | ||
if eni.IsExcludedForPodIPs { | ||
continue | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We skip the excluded ENI (which would the Primary ENI in this case practically speaking) from IP stat calculation.
Doing it here would automatically take care of warm target calculation in other places such as isDatastorePoolTooLow
, shouldRemoveExtraENIs
.
239365a
to
62e7ac9
Compare
f0faefa
to
3c0f314
Compare
ca35915
to
dc96a75
Compare
dc67e7f
to
21a4854
Compare
pkg/awsutils/awsutils.go
Outdated
if err != nil { | ||
awsAPIErrInc("DiscoverCustomSecurityGroups", err) | ||
log.Warnf("Failed to discover custom security groups: %v", err) | ||
log.Info("Falling back to using primary security groups for ENIs in secondary subnets") | ||
|
||
// Clear custom security groups cache to trigger fallback behavior | ||
cache.customSecurityGroups.Set([]string{}) | ||
|
||
// Apply primary security groups to ENIs in secondary subnets as fallback | ||
return cache.applyFallbackSecurityGroups(store) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fallback to primary security group and show logs to customer if we fail to describe any secondary security groups.
We do the same fallback for primary subnets.
13fbe14
to
77d99e0
Compare
add testing doc here
add integration test screenshot here
add edge cases here
What type of PR is this?
Which issue does this PR fix?:
What does this PR do / Why do we need it?:
Testing done on this change:
Will this PR introduce any new dependencies?:
Will this break upgrades or downgrades? Has updating a running cluster been tested?:
Does this change require updates to the CNI daemonset config files to work?:
Does this PR introduce any user-facing change?:
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.