Skip to content

Enhanced Subnet Discovery #3319

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 6 commits into
base: master
Choose a base branch
from
Draft

Conversation

dshehbaj
Copy link
Member

@dshehbaj dshehbaj commented Jun 20, 2025

add testing doc here

add integration test screenshot here

add edge cases here

What type of PR is this?

Which issue does this PR fix?:

What does this PR do / Why do we need it?:

Testing done on this change:

Will this PR introduce any new dependencies?:

Will this break upgrades or downgrades? Has updating a running cluster been tested?:

Does this change require updates to the CNI daemonset config files to work?:

Does this PR introduce any user-facing change?:


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@dshehbaj dshehbaj force-pushed the shehbaj/enhanced-subnet-discovery branch 10 times, most recently from 5eccde1 to 9b00b7d Compare June 23, 2025 07:08
@dshehbaj dshehbaj changed the title pkg awsutils: check for primary/secondary subnet exclusion Enhanced Subnet Discovery Jun 23, 2025
@dshehbaj dshehbaj force-pushed the shehbaj/enhanced-subnet-discovery branch 19 times, most recently from 47077a1 to 36fde46 Compare June 30, 2025 06:10
@dshehbaj dshehbaj force-pushed the shehbaj/enhanced-subnet-discovery branch 4 times, most recently from 40fe808 to d48bfb8 Compare July 1, 2025 05:12
Comment on lines +1019 to +1179
// When subnet discovery is disabled, check if primary subnet is excluded
excluded, checkErr := cache.isPrimarySubnetExcluded()
if checkErr != nil {
// If we can't determine exclusion status, log warning and proceed
log.Warnf("Failed to check if primary subnet is excluded: %v. Proceeding with ENI creation attempt.", checkErr)
} else if excluded {
// Primary subnet is explicitly excluded
err = errors.New("primary subnet is tagged with kubernetes.io/role/cni=0 and subnet discovery is disabled - no valid subnets available for ENI creation")
log.Error(err.Error())
return "", err
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deals with misconfiguration scenario where customer tags subnet for exclusion but does not have subnet discovery enabled.

Comment on lines +973 to +1122
// Even in fallback, check if primary subnet is excluded
excluded, checkErr := cache.isPrimarySubnetExcluded()
if checkErr != nil {
log.Warnf("Failed to check if primary subnet is excluded: %v. Proceeding with ENI creation attempt.", checkErr)
} else if excluded {
// Primary subnet is explicitly excluded
err = errors.New("primary subnet is tagged with kubernetes.io/role/cni=0 - no valid subnets available for ENI creation")
log.Error(err.Error())
return "", err
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dealing with the case where describe subnet call fails, but customer has also opted to exclude the primary subnet.

@dshehbaj dshehbaj force-pushed the shehbaj/enhanced-subnet-discovery branch from e3939b6 to 3b148b7 Compare July 1, 2025 05:59
Comment on lines +846 to +850
// Skip excluded ENIs when calculating available IPs for pod allocation
if eni.IsExcludedForPodIPs {
continue
}

Copy link
Member Author

@dshehbaj dshehbaj Jul 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We skip the excluded ENI (which would the Primary ENI in this case practically speaking) from IP stat calculation.

Doing it here would automatically take care of warm target calculation in other places such as isDatastorePoolTooLow, shouldRemoveExtraENIs.

@dshehbaj dshehbaj force-pushed the shehbaj/enhanced-subnet-discovery branch from 239365a to 62e7ac9 Compare July 1, 2025 06:44
@dshehbaj dshehbaj force-pushed the shehbaj/enhanced-subnet-discovery branch from f0faefa to 3c0f314 Compare July 1, 2025 07:05
@dshehbaj dshehbaj force-pushed the shehbaj/enhanced-subnet-discovery branch 4 times, most recently from ca35915 to dc96a75 Compare July 1, 2025 08:44
@dshehbaj dshehbaj force-pushed the shehbaj/enhanced-subnet-discovery branch from dc67e7f to 21a4854 Compare July 1, 2025 08:53
Comment on lines 562 to 572
if err != nil {
awsAPIErrInc("DiscoverCustomSecurityGroups", err)
log.Warnf("Failed to discover custom security groups: %v", err)
log.Info("Falling back to using primary security groups for ENIs in secondary subnets")

// Clear custom security groups cache to trigger fallback behavior
cache.customSecurityGroups.Set([]string{})

// Apply primary security groups to ENIs in secondary subnets as fallback
return cache.applyFallbackSecurityGroups(store)
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fallback to primary security group and show logs to customer if we fail to describe any secondary security groups.

We do the same fallback for primary subnets.

@dshehbaj dshehbaj force-pushed the shehbaj/enhanced-subnet-discovery branch from 13fbe14 to 77d99e0 Compare July 1, 2025 09:41
@dshehbaj dshehbaj marked this pull request as ready for review July 1, 2025 10:26
@dshehbaj dshehbaj requested a review from a team as a code owner July 1, 2025 10:26
@dshehbaj dshehbaj marked this pull request as draft July 1, 2025 10:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant