Description
Hello,
I have an issue with pod traffic targeting external endpoints (e.g. google.com) timing out because DNS is not resolved properly.
More specifically, it seems SNAT is not in place, so my pods never get the response back.
Expected Behavior
Pod external traffic is SNAT-ed (masqueraded).
Current Behavior
Pods can't send traffic to public IP endpoints.
Steps to Reproduce (for bugs)
- Running a home lab cluster with eBPF enabled; Calico installed with the defaults:
CALICO_VERSION="3.31.3"
https://raw.githubusercontent.com/projectcalico/calico/v$CALICO_VERSION/manifests/operator-crds.yaml
https://raw.githubusercontent.com/projectcalico/calico/v$CALICO_VERSION/manifests/tigera-operator.yaml
https://raw.githubusercontent.com/projectcalico/calico/v$CALICO_VERSION/manifests/custom-resources-bpf.yaml
- custom-resources-bpf.yaml updated with ipPools.cidr set to 10.244.0.0/16 to match kubeadm init's --pod-network-cidr.
- Deploy a test nginx, crictl exec -it into it, and try curl -L google.com; it fails with:
root@nginx-deployment-54bb44699-48ts9:/# curl -L google.com
curl: (6) Could not resolve host: google.com
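For reference, the custom-resources-bpf.yaml edit described above was along these lines (field names per the operator's Installation CRD; the surrounding defaults may differ from what's shown):

```yaml
# custom-resources-bpf.yaml (excerpt) - only ipPools.cidr was changed
apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
  name: default
spec:
  calicoNetwork:
    linuxDataplane: BPF
    ipPools:
      - blockSize: 26
        cidr: 10.244.0.0/16   # changed to match kubeadm init's --pod-network-cidr
        encapsulation: VXLAN
        natOutgoing: Enabled
        nodeSelector: all()
```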
Context
Pods essentially have no access to the external/public Internet.
Tigera operator reports all resources as ready ✔️
NAME AVAILABLE PROGRESSING DEGRADED SINCE
apiserver True False False 103m
calico True False False 39m
goldmane True False False 28m
ippools True False False 107m
kubeproxy-monitor True False False 107m
whisker True False False 102m
CoreDNS up and running ✔️
kubectl get pods -n kube-system -l k8s-app=kube-dns
NAME READY STATUS RESTARTS AGE
coredns-587b887f6f-n9zsz 1/1 Running 1 (30m ago) 57m
coredns-587b887f6f-vmn9r 1/1 Running 1 (30m ago) 57m
Node to pod, pod to pod and service to pod all work ✔️
kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 118m
nginx-service ClusterIP 10.104.200.193 <none> 80/TCP 107m
curl 10.104.200.193
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
html { color-scheme: light dark; }
body { width: 35em; margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif; }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
...
curl -L google.com works in the host/system shell and in Docker-based containers ✔️
CoreDNS reports errors ❌
kubectl logs -n kube-system deploy/coredns
Found 2 pods, using pod/coredns-587b887f6f-n9zsz
maxprocs: Leaving GOMAXPROCS=16: CPU quota undefined
.:53
[INFO] plugin/reload: Running configuration SHA512 = 1b226df79860026c6a52e67daa10d7f0d57ec5b023288ec00c5e05f93523c894564e15b91770d3a07ae1cfbe861d15b37d4a0027e69c546ab112970993a3b03b
CoreDNS-1.13.1
linux/amd64, go1.25.2, 1db4568
[ERROR] plugin/errors: 2 1458823090375742670.8608639235073720918. HINFO: read udp 10.244.206.152:53973->195.29.247.161:53: i/o timeout
[ERROR] plugin/errors: 2 1458823090375742670.8608639235073720918. HINFO: read udp 10.244.206.152:58302->195.29.247.162:53: i/o timeout
[ERROR] plugin/errors: 2 1458823090375742670.8608639235073720918. HINFO: read udp 10.244.206.152:45678->195.29.247.162:53: i/o timeout
[ERROR] plugin/errors: 2 1458823090375742670.8608639235073720918. HINFO: read udp 10.244.206.152:43656->195.29.247.162:53: i/o timeout
[ERROR] plugin/errors: 2 1458823090375742670.8608639235073720918. HINFO: read udp 10.244.206.152:40956->195.29.247.162:53: i/o timeout
...
The ISP's DNS server 195.29.247.161 can't answer back to 10.x.x.x addresses; no SNAT ❌
sudo tcpdump -i any udp port 53 -n
tcpdump: data link type LINUX_SLL2
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes
13:25:14.439619 cali50d19c830a9 In IP 10.244.206.152.57987 > 195.29.247.161.53: 49524+ NS? . (17)
13:25:14.471654 calib39cd32d877 In IP 10.244.206.153.37939 > 195.29.247.162.53: 36685+ NS? . (17)
13:25:14.834065 cali50d19c830a9 In IP 10.244.206.152.47250 > 195.29.247.162.53: 24035+ NS? . (17)
13:25:14.878193 calib39cd32d877 In IP 10.244.206.153.35148 > 195.29.247.161.53: 53973+ NS? . (17)
13:25:15.941229 cali50d19c830a9 In IP 10.244.206.152.44558 > 195.29.247.161.53: 40457+ NS? . (17)
As shown in the tcpdump log, outbound UDP 53 packets retain the Pod IP 10.244.206.152 when egressing the node. This indicates that the Calico eBPF data plane is failing to perform SNAT (Masquerade) despite natOutgoing: true being set in the IPPool.
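The check in that paragraph can be automated: if a packet heading to a destination outside the pod CIDR still carries a pod-CIDR source, masquerade did not happen. A small self-contained sketch (unmasqueraded_sources is a hypothetical helper; the sample lines are taken from the tcpdump output above):

```python
import ipaddress
import re

POD_CIDR = ipaddress.ip_network("10.244.0.0/16")

# Matches the "IP SRC.PORT > DST.PORT:" part of a tcpdump line.
PACKET_RE = re.compile(r"IP (\d+\.\d+\.\d+\.\d+)\.\d+ > (\d+\.\d+\.\d+\.\d+)\.\d+:")

def unmasqueraded_sources(capture_lines):
    """Return pod-CIDR source IPs seen heading to destinations outside the pod CIDR."""
    leaks = set()
    for line in capture_lines:
        m = PACKET_RE.search(line)
        if not m:
            continue
        src, dst = (ipaddress.ip_address(g) for g in m.groups())
        if src in POD_CIDR and dst not in POD_CIDR:
            leaks.add(str(src))
    return sorted(leaks)

capture = [
    "13:25:14.439619 cali50d19c830a9 In IP 10.244.206.152.57987 > 195.29.247.161.53: 49524+ NS? . (17)",
    "13:25:14.471654 calib39cd32d877 In IP 10.244.206.153.37939 > 195.29.247.162.53: 36685+ NS? . (17)",
]
print(unmasqueraded_sources(capture))  # ['10.244.206.152', '10.244.206.153']
```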
I have tried replacing forward . /etc/resolv.conf in CoreDNS with e.g. 1.1.1.1, but the outcome is the same.
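For completeness, that CoreDNS change was of this shape (standard Corefile forward plugin syntax; the rest of the Corefile was left at the kubeadm defaults, which may differ slightly from what's shown):

```
.:53 {
    errors
    health
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        pods insecure
        fallthrough in-addr.arpa ip6.arpa
    }
    forward . 1.1.1.1   # was: forward . /etc/resolv.conf
    cache 30
    loop
    reload
}
```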
kubectl get felixconfiguration default -o yaml
apiVersion: projectcalico.org/v3
kind: FelixConfiguration
metadata:
annotations:
operator.tigera.io/bpfEnabled: "true"
creationTimestamp: "2026-01-06T10:24:39Z"
generation: 1
name: default
resourceVersion: "6026"
uid: 46711f8c-f919-48b2-a02e-d90cc26c578c
spec:
bpfConnectTimeLoadBalancing: TCP
bpfEnabled: true
bpfExternalServiceMode: Tunnel
bpfHostNetworkedNATWithoutCTLB: Enabled
bpfLogLevel: ""
floatingIPs: Disabled
healthPort: 9099
logSeverityScreen: Info
nftablesMode: Enabled
reportingInterval: 0s
vxlanPort: 4789
vxlanVNI: 4096
kubectl get ippools default-ipv4-ippool -o yaml
apiVersion: projectcalico.org/v3
kind: IPPool
metadata:
creationTimestamp: "2026-01-06T10:24:37Z"
generation: 1
labels:
app.kubernetes.io/managed-by: tigera-operator
name: default-ipv4-ippool
resourceVersion: "8945"
uid: ba9a2550-5ad1-4fbd-9f4d-a1b6969df1d3
spec:
allowedUses:
- Workload
- Tunnel
assignmentMode: Automatic
blockSize: 26
cidr: 10.244.0.0/16
ipipMode: Never
natOutgoing: true
nodeSelector: all()
vxlanMode: Always
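One sanity check on the IPPool above: the un-NATed source addresses from the capture really do fall inside this natOutgoing pool (if they did not, Calico would rightly skip masquerade). A stdlib-only check of the pool/block math:

```python
import ipaddress

pool = ipaddress.ip_network("10.244.0.0/16")   # cidr from the IPPool spec above
block_size = 26                                # blockSize from the IPPool spec above

# The /26 allocation block that one of the captured pod IPs belongs to:
pod_ip = ipaddress.ip_address("10.244.206.152")  # source seen in the tcpdump output
block = ipaddress.ip_network(f"{pod_ip}/{block_size}", strict=False)

print(block)                               # 10.244.206.128/26
print(block.subnet_of(pool))               # True: the pod IP is inside the natOutgoing pool
print(2 ** (block_size - pool.prefixlen))  # 1024 /26 blocks available in the /16
```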
Your Environment
- Calico version: 3.31.3
- Calico dataplane: bpf
- Orchestrator version: kubernetes 1.35.0
- Operating System and version: Linux 6.14.0-37-generic, Ubuntu 24.04.3 LTS
Happy to provide more details, thank you in advance.