[BUG] Datadog Agent Network Monitoring prevents AWS VPC CNI interface cleanup in EKS - causes pod IPv4 connectivity failures #41350

## **Problem Description**

The Datadog Agent with **network monitoring enabled** prevents proper cleanup of network interfaces (veth pairs) when pods are deleted in AWS EKS clusters, leading to massive interface accumulation and intermittent IPv4 connectivity failures for pods.

## **Environment Details**

- **Platform:** Amazon EKS
- **Kubernetes Version:** 1.33
- **Node OS:** Bottlerocket OS 1.42.0 (aws-k8s-1.33)
- **Cluster IP address family:** IPv6
- **Instance Type:** c7i.4xlarge
- **AWS VPC CNI Version:** v1.19.6-eksbuild.1
- **Datadog Agent Version:** 7.70.2
- **Datadog Agent Configuration:** Network monitoring enabled

## **Technical Symptoms Discovered**

### **Expected Behavior:**
- Node with 35 active pods should have ~35-70 veth interfaces (one pair per pod)
- When pods are deleted, AWS VPC CNI should clean up corresponding veth interfaces

### **Actual Behavior:**
- Node with 35 active pods accumulated **2000+ veth interfaces**  
- Stale veth interfaces persist after pod deletion

### **Connectivity Impact:**
- **Intermittent IPv4 connectivity failures** for new pods
- Pods cannot reach external services (DNS resolution works, but TCP connections fail)
- IPv6 connectivity remains unaffected
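To make the IPv4/IPv6 difference easy to reproduce from inside an affected pod, a minimal sketch along these lines could be used. It assumes `curl` is available in the pod image; the function name `check_tcp` and the target URL are illustrative choices and were not part of the original investigation.

```bash
#!/usr/bin/env bash
# check_tcp FAMILY URL
# FAMILY is 4 or 6; curl is forced to use that address family only.
# Prints one line stating whether the request succeeded.
check_tcp() {
  local family="$1" url="$2"
  if curl "-${family}" --silent --output /dev/null --connect-timeout 5 "$url"; then
    echo "IPv${family}: ok"
  else
    # In the else branch, $? still holds curl's exit status.
    echo "IPv${family}: FAILED (curl exit $?)"
  fi
}
```

For example, running `check_tcp 4 https://example.com; check_tcp 6 https://example.com` in a pod on an affected node would be expected to show the IPv4 line failing intermittently while the IPv6 line succeeds, if the symptoms above hold.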

```bash
# Healthy node (without issue):
$ ip addr show | grep ": veth" | wc -l
2  # Expected: matches number of active pods

# Affected node (with Datadog network monitoring):
$ ip addr show | grep ": veth" | wc -l  
2000+  # Problem: massive interface accumulation
```

Observations so far:

1. **Without Datadog network monitoring:** CNI cleanup works properly
2. **With Datadog network monitoring enabled:** veth interfaces accumulate indefinitely
3. **Hypothesis:** Network monitoring hooks prevent AWS VPC CNI from properly cleaning up network interfaces during pod deletion

## **Steps to Reproduce**

1. Deploy AWS EKS cluster with Bottlerocket nodes
2. Install Datadog Agent with network monitoring enabled
3. Deploy and delete pods repeatedly over several days
4. Monitor interface count: `ip addr show | grep ": veth" | wc -l`
5. Observe interface count growing and not decreasing when pods are deleted
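Steps 4 and 5 can be turned into a timestamped log so the growth is visible over days. This is a rough sketch, assuming a shell with iproute2 on the node (on Bottlerocket this would typically mean the admin container); the helper names `count_veth` and `sample_veth_count` are made up for illustration.

```bash
#!/usr/bin/env bash
# Count veth interfaces in `ip addr show` output read from stdin.
# Uses the same ": veth" pattern as the manual check in this report.
count_veth() {
  # grep -c prints 0 but exits non-zero when nothing matches; ignore that.
  grep -c ': veth' || true
}

# Print one sample: a UTC timestamp followed by the current veth count.
sample_veth_count() {
  printf '%s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$(ip addr show | count_veth)"
}
```

Running `while true; do sample_veth_count; sleep 300; done` on a node while pods are being created and deleted gives a series that should stay roughly flat on a healthy node and climb on an affected one.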

Disabling Datadog network monitoring resolved the issue immediately: once the agent was terminated, the stale network interfaces were deleted.
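For anyone wanting to try the same workaround, one possible way to turn the feature off is sketched below. It assumes the agent was installed with the official Datadog Helm chart from a repo added under the name `datadog`; the release name and namespace defaults are placeholders, and this report does not state how the agent was actually deployed. The helper only prints the command so it can be reviewed before running.

```bash
#!/usr/bin/env bash
# disable_npm_cmd [RELEASE] [NAMESPACE]
# Prints (does not run) a helm command that disables network monitoring.
# RELEASE and NAMESPACE default to "datadog"; both are placeholder values.
disable_npm_cmd() {
  local release="${1:-datadog}" namespace="${2:-datadog}"
  printf 'helm upgrade %s datadog/datadog --namespace %s --reuse-values --set datadog.networkMonitoring.enabled=false\n' \
    "$release" "$namespace"
}
```

Calling `disable_npm_cmd my-release monitoring` prints the command for that release; after applying it and letting the agent pods restart, the veth count on affected nodes can be rechecked.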

Please let me know what additional diagnostic information would be helpful for investigating this issue.
