When resolving DNS names that are backed by a large pool of IP addresses (e.g. on AWS), DNS servers usually return only a limited, random subset of the pool (say, 8 IP addresses). A subsequent query returns a different random subset.
Currently, the DNS cache is implemented so that each DNS reply completely overwrites the cache entry for that hostname. This can lead to a race condition between concurrent DNS queries. When two pods query the same hostname in parallel, the reply to the second pod's query may arrive after the reply to the first pod's query but before the first pod actually sends its request to the target IP. By then that IP may already have been overwritten in the DNS cache, and therefore also removed from the firewall ruleset, so the request from the first pod fails.
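To make the sequence of events concrete, here is a minimal model of the race. It is only a sketch: the `cache` dict, the function names, the hostname, and the IP addresses are all made up for illustration and are not taken from the actual implementation.

```python
# Hypothetical model: an overwrite-style DNS cache that doubles as the
# firewall allow-list. Names and addresses are illustrative assumptions.
cache = {}  # hostname -> set of currently allowed IPs


def on_dns_reply_overwrite(hostname, ips):
    """Replace the whole entry, as the current cache is described to do."""
    cache[hostname] = set(ips)


def is_allowed(hostname, ip):
    """Return True if the IP is still present in the entry for hostname."""
    return ip in cache.get(hostname, set())


# Pod A resolves the name and picks an IP from its reply.
on_dns_reply_overwrite("example.internal", ["10.0.0.1", "10.0.0.2"])
pod_a_target = "10.0.0.1"

# Pod B's reply arrives before pod A connects and carries a different subset.
on_dns_reply_overwrite("example.internal", ["10.0.0.3", "10.0.0.4"])

# Pod A now connects to an IP that is no longer in the allow-list.
pod_a_allowed = is_allowed("example.internal", pod_a_target)
```

In this model `pod_a_allowed` ends up `False`, even though pod A received `10.0.0.1` in a valid DNS reply only moments earlier.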
A possible solution would be:
Store each IP in the DNS cache with its associated TTL value, add new IPs from subsequent replies instead of replacing the whole entry, and age out old IPs when they exceed their TTL.
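A rough sketch of what such a cache could look like is shown below. It is not the project's code: the class name `DNSCache`, its methods, and the injectable `clock` parameter are assumptions made for illustration. Each IP keeps its own expiry time, new replies are merged into the existing entry, and expired IPs are dropped on lookup.

```python
import time


class DNSCache:
    """Hypothetical per-IP TTL cache; a sketch, not the real implementation."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable so expiry can be tested
        self._entries = {}       # hostname -> {ip: expiry timestamp}

    def add_reply(self, hostname, records):
        """Merge a DNS reply given as an iterable of (ip, ttl_seconds)."""
        now = self._clock()
        ips = self._entries.setdefault(hostname, {})
        for ip, ttl in records:
            expiry = now + ttl
            # Add new IPs; for known IPs keep whichever expiry is later.
            ips[ip] = max(ips.get(ip, expiry), expiry)

    def lookup(self, hostname):
        """Return the set of unexpired IPs, aging out expired ones."""
        now = self._clock()
        ips = self._entries.get(hostname, {})
        for ip in [ip for ip, exp in ips.items() if exp <= now]:
            del ips[ip]
        if not ips:
            self._entries.pop(hostname, None)
        return set(ips)
```

With this approach, the reply to a second pod's query only adds addresses to the entry, so an IP handed to the first pod stays allowed until its own TTL runs out. A real implementation would presumably also need to derive the firewall ruleset from `lookup` and decide whether to allow a short grace period beyond the TTL.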