Unnecessary access to v1.Pod leads to timeout. #4955

@qin-nz


What happened:

My config:

        --source=service
        --service-type-filter=LoadBalancer

My Kubernetes cluster has so many pods that the apiserver cannot return the full pod list within 60 seconds.

external-dns fails at startup with:

time="2024-12-16T12:43:02Z" level=info msg="Created Kubernetes client https://10.0.0.1:443"
time="2024-12-16T12:44:02Z" level=fatal msg="failed to sync *v1.Pod: context deadline exceeded"

What you expected to happen:

Because I specify source=service and service-type-filter=LoadBalancer, external-dns should never need to query the API for v1.Pod.

In practice, NewServiceSource calls waitForCacheSync, which waits for the pod informer to list all pods regardless of service-type-filter.

func NewServiceSource(ctx context.Context, kubeClient kubernetes.Interface, namespace, annotationFilter string, fqdnTemplate string, combineFqdnAnnotation bool, compatibility string, publishInternal bool, publishHostIP bool, alwaysPublishNotReadyAddresses bool, serviceTypeFilter []string, ignoreHostnameAnnotation bool, labelSelector labels.Selector, resolveLoadBalancerHostname bool) (Source, error) {
	tmpl, err := parseTemplate(fqdnTemplate)
	if err != nil {
		return nil, err
	}
	// Use shared informers to listen for add/update/delete of services/pods/nodes in the specified namespace.
	// Set resync period to 0, to prevent processing when nothing has changed
	informerFactory := kubeinformers.NewSharedInformerFactoryWithOptions(kubeClient, 0, kubeinformers.WithNamespace(namespace))
	serviceInformer := informerFactory.Core().V1().Services()
	endpointsInformer := informerFactory.Core().V1().Endpoints()
	podInformer := informerFactory.Core().V1().Pods()
	nodeInformer := informerFactory.Core().V1().Nodes()
	// Add default resource event handlers to properly initialize informer.
	serviceInformer.Informer().AddEventHandler(
		cache.ResourceEventHandlerFuncs{
			AddFunc: func(obj interface{}) {
			},
		},
	)
	endpointsInformer.Informer().AddEventHandler(
		cache.ResourceEventHandlerFuncs{
			AddFunc: func(obj interface{}) {
			},
		},
	)
	podInformer.Informer().AddEventHandler(
		cache.ResourceEventHandlerFuncs{
			AddFunc: func(obj interface{}) {
			},
		},
	)
	nodeInformer.Informer().AddEventHandler(
		cache.ResourceEventHandlerFuncs{
			AddFunc: func(obj interface{}) {
			},
		},
	)
	informerFactory.Start(ctx.Done())
	// wait for the local cache to be populated.
	if err := waitForCacheSync(context.Background(), informerFactory); err != nil {
		return nil, err
	}
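
One possible direction, sketched with the standard library only: decide which informers to register from the service type filter before starting the factory. The function name informersFor and the mapping from service type to resource kind are assumptions made for illustration, not external-dns's actual logic; the assumption is that only ClusterIP (headless) and NodePort services need pod, endpoints, or node data.

```go
package main

import (
	"fmt"
	"sort"
)

// informersFor returns the resource kinds whose informers would have to be
// started for the given --service-type-filter values.
// Hypothetical helper: the mapping below is an assumption, not external-dns code.
func informersFor(serviceTypeFilter []string) []string {
	// Services are always needed by a service source.
	needed := map[string]bool{"services": true}

	// An empty filter means every service type is processed.
	if len(serviceTypeFilter) == 0 {
		serviceTypeFilter = []string{"ClusterIP", "NodePort", "LoadBalancer", "ExternalName"}
	}

	for _, t := range serviceTypeFilter {
		switch t {
		case "ClusterIP":
			// Assumed: headless services resolve targets via endpoints and pods.
			needed["endpoints"] = true
			needed["pods"] = true
		case "NodePort":
			// Assumed: NodePort targets come from nodes, plus pods for local traffic policy.
			needed["nodes"] = true
			needed["pods"] = true
		}
	}

	kinds := make([]string, 0, len(needed))
	for k := range needed {
		kinds = append(kinds, k)
	}
	sort.Strings(kinds) // deterministic order for callers and tests
	return kinds
}

func main() {
	fmt.Println(informersFor([]string{"LoadBalancer"}))
	fmt.Println(informersFor(nil))
}
```

With a filter of LoadBalancer only, this sketch asks for the services informer alone, so the pod list would never be requested.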

The cache sync in waitForCacheSync then times out because the timeout is hard-coded to 60 seconds:

ctx, cancel := context.WithTimeout(ctx, 60*time.Second)

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • External-DNS version (use external-dns --version): v0.14.2
  • DNS provider: rfc2136
  • Others:

**Related issues**:

Metadata

Labels

help wanted: Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines.
kind/bug: Categorizes issue or PR as related to a bug.
