💡PROPOSAL: session-sharing egress NAT #302

@ymmt2005

Description

What

Although Coil implements a highly available egress NAT, the connection
tracking state is lost when one of the egress NAT Pods goes away.

Linux tracks connection state in netfilter's conntrack tables, which can be read and
edited via netlink. There is even a daemon, conntrackd, that exports and
synchronizes conntrack data between two servers.

With this capability, Coil can keep egress NAT connections alive across Pod restarts.
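
To make the synchronization idea concrete, here is a minimal in-memory sketch in Go. It does not talk to netlink or conntrackd; `FlowKey`, `Entry`, `Table`, and `Merge` are hypothetical names chosen for illustration, and a real implementation would read and write kernel conntrack entries instead of a map.

```go
package main

// FlowKey identifies a tracked connection by its 5-tuple.
// Hypothetical type: real conntrack entries carry more fields.
type FlowKey struct {
	Proto            uint8
	SrcIP, DstIP     string
	SrcPort, DstPort uint16
}

// Entry is a simplified stand-in for one conntrack entry.
type Entry struct {
	State   string // e.g. "ESTABLISHED"
	Expires int64  // Unix seconds at which the entry times out
}

// Table is an in-memory stand-in for a kernel conntrack table.
type Table map[FlowKey]Entry

// Merge copies the peer's live entries into t, keeping whichever copy
// of a flow expires later. It returns how many entries were added or
// updated.
func (t Table) Merge(peer Table, now int64) int {
	changed := 0
	for k, pe := range peer {
		if pe.Expires <= now {
			continue // already timed out on the peer
		}
		if cur, ok := t[k]; ok && cur.Expires >= pe.Expires {
			continue // local copy is at least as fresh
		}
		t[k] = pe
		changed++
	}
	return changed
}
```

A standby Pod would call `Merge` with the table exported by the active Pod, so that it already knows the established flows when it takes over.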

How

To switch all connections from one NAT pod to another, Coil has to do a few things.

  • The new Pod should take over the global IP address of the old Pod.
  • Coil should stop advertising the global IP on the node of the old Pod and start it on the node of the new Pod.

This means that Coil should not assign the global IP address to the Pod.
Instead, Coil should assign a normal cluster-internal IP address to NAT Pods
and give them extra global IP addresses for NAT use. Those global IP addresses
float between NAT Pods, so we can call them floating addresses.
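
A rough sketch of that split between the Pod address and the floating addresses, in Go. The type and function names (`NATPod`, `NewNATPod`, `Takeover`) are assumptions made for this example and are not part of Coil.

```go
package main

import (
	"errors"
	"net/netip"
)

// NATPod models an egress NAT Pod: one cluster-internal address that
// identifies the Pod, plus zero or more floating addresses used for NAT.
type NATPod struct {
	Name     string
	PodIP    netip.Addr
	Floating []netip.Addr
}

// NewNATPod parses the given addresses and builds a NATPod.
func NewNATPod(name, podIP string, floating ...string) (*NATPod, error) {
	ip, err := netip.ParseAddr(podIP)
	if err != nil {
		return nil, err
	}
	p := &NATPod{Name: name, PodIP: ip}
	for _, s := range floating {
		a, err := netip.ParseAddr(s)
		if err != nil {
			return nil, err
		}
		p.Floating = append(p.Floating, a)
	}
	return p, nil
}

// Takeover moves every floating address from old to next.
// Both Pods keep their own PodIP; only the NAT addresses move.
func Takeover(old, next *NATPod) error {
	if old == nil || next == nil || old == next {
		return errors.New("takeover needs two distinct Pods")
	}
	next.Floating = append(next.Floating, old.Floating...)
	old.Floating = nil
	return nil
}
```

Because the floating addresses are separate from `PodIP`, the new Pod can be scheduled and become ready with its own address before any NAT traffic is switched over.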

Below is a summary of the necessary changes.
A detailed design doc is still needed.

  • Define a pool of floating addresses for egress NAT.
  • Assign floating addresses to egress NAT Pods and program routing.
  • Reprogram routing when the owner of a floating address changes.
    • One idea is to change the Service endpoints.
    • Another idea is to get rid of Service for egress Pods and program routing in each client Pod.
  • Appropriately advertise floating addresses for the current owner Pods.
  • Implement some fast health-checking for failed Pods.
    • VRRP and BFD are commonly used, but any protocol would work.
  • Synchronize the conntrack state between egress NAT Pods.
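
The ownership and health-checking items above could fit together roughly as follows. This is only a sketch under stated assumptions: it counts missed heartbeat intervals, loosely in the spirit of a BFD-style detector, and does not implement VRRP or BFD. `Failover` and all of its methods are hypothetical names, and the actual route reprogramming and address advertisement are left out.

```go
package main

import "sort"

// Failover tracks which Pod owns each floating address and how many
// consecutive heartbeat intervals each Pod has missed.
type Failover struct {
	maxMissed int
	missed    map[string]int    // Pod name -> consecutive missed intervals
	owner     map[string]string // floating address -> owning Pod name
}

// NewFailover returns a tracker that declares a Pod failed after
// maxMissed consecutive intervals without a heartbeat.
func NewFailover(maxMissed int) *Failover {
	return &Failover{
		maxMissed: maxMissed,
		missed:    map[string]int{},
		owner:     map[string]string{},
	}
}

// Track registers a Pod (for example a standby) for health checking.
func (f *Failover) Track(pod string) {
	if _, ok := f.missed[pod]; !ok {
		f.missed[pod] = 0
	}
}

// Assign records pod as the owner of addr and starts tracking the Pod.
func (f *Failover) Assign(addr, pod string) {
	f.Track(pod)
	f.owner[addr] = pod
}

// Heartbeat resets the missed-interval counter of a Pod.
func (f *Failover) Heartbeat(pod string) { f.missed[pod] = 0 }

// Owner returns the current owner of addr, or "" if unassigned.
func (f *Failover) Owner(addr string) string { return f.owner[addr] }

// healthyPod picks the healthy Pod with the smallest name so that the
// choice is deterministic.
func (f *Failover) healthyPod() (string, bool) {
	best, found := "", false
	for pod, n := range f.missed {
		if n < f.maxMissed && (!found || pod < best) {
			best, found = pod, true
		}
	}
	return best, found
}

// Tick advances one interval: every tracked Pod is charged one miss,
// and addresses owned by failed Pods move to a healthy Pod. It returns
// the moved addresses in sorted order; the caller would then reprogram
// routing and advertisement for each of them.
func (f *Failover) Tick() []string {
	for pod := range f.missed {
		f.missed[pod]++
	}
	var moved []string
	for addr, pod := range f.owner {
		if f.missed[pod] < f.maxMissed {
			continue
		}
		if next, ok := f.healthyPod(); ok {
			f.owner[addr] = next
			moved = append(moved, addr)
		}
	}
	sort.Strings(moved)
	return moved
}
```

In this sketch, healthy Pods call `Heartbeat` once per interval and a controller calls `Tick`; how short the interval can be made determines how quickly a failed Pod is detected.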

Checklist

  • Finish implementation of the issue
  • Test all functions
  • Have enough logs to trace activities
  • Notify developers of necessary actions
