Description
Kube-OVN Version
v1.12.28
Kubernetes Version
v1.31.2
Operation-system/Kernel Version
TencentOS Server 4.2
6.6.47-12.tl4.x86_64
Description
This issue covers two problems.
I have a cluster with 10 nodes, 260 subnets in 1 VPC, and ~5k ports. Today I discovered that ovs-ovn on some nodes had been killed due to OOM, so I increased the memory limit and restarted kube-ovn-controller.
After the restart, the Work Queue Latency has stayed at a very high level (>10 min).
In the logs, the controller is continuously performing "add policy route" operations at a VERY SLOW pace (approximately 1-3 seconds per entry). This is the first problem.
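
To put a number on the pace, something like the following could be run against the saved controller log. It is only a rough sketch: the function name policy_route_avg_gap is made up for illustration, it assumes klog-style lines whose second field is HH:MM:SS.microseconds, and it assumes the log does not cross midnight. The average also includes any idle time between bursts, so it is an upper bound on the per-entry cost rather than an exact figure.

```shell
# Print the average gap (in seconds) between consecutive "add policy route" lines.
# Reads log text on stdin. Assumes field 2 is HH:MM:SS.micro (klog format)
# and that all lines fall on the same day.
policy_route_avg_gap() {
  grep 'add policy route' \
    | awk '{
        split($2, t, ":")                       # t[1]=HH, t[2]=MM, t[3]=SS.micro
        now = t[1] * 3600 + t[2] * 60 + t[3]    # seconds since midnight
        if (NR > 1) { total += now - prev; n++ }
        prev = now
      }
      END { if (n > 0) printf "%.3f\n", total / n }'
}
```

Usage: policy_route_avg_gap < 2.log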

I understand that after a restart, kube-ovn-controller needs to traverse all 10 nodes and 260 subnets, so I expected about 2600 (10 nodes × 260 subnets) add policy route operations.
[root@vm-master-1 a]# cat 2.log | grep 'add policy route' | wc -l
3558
However, after waiting for a long time, the count far exceeded that number, and there appear to be many duplicate operations (same node, same subnet, but executed twice). This is the second problem.
[root@vm-master-1 a]# cat 2.log | grep 'add policy route' | grep 'net.a'
I1212 17:05:13.323328 7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.node.1_ip4, action reroute, extrenalID map[node:node-1 subnet:net-a vendor:kube-ovn]
I1212 17:11:55.754750 7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.node.2_ip4, action reroute, extrenalID map[node:node-2 subnet:net-a vendor:kube-ovn]
I1212 17:14:03.134696 7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.node.4_ip4, action reroute, extrenalID map[node:node-4 subnet:net-a vendor:kube-ovn]
I1212 17:19:21.932002 7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.node.3_ip4, action reroute, extrenalID map[node:node-3 subnet:net-a vendor:kube-ovn]
I1212 17:21:50.341122 7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.vm.master.1_ip4, action reroute, extrenalID map[node:vm-master-1 subnet:net-a vendor:kube-ovn]
I1212 17:23:18.262599 7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.node.9_ip4, action reroute, extrenalID map[node:node-9 subnet:net-a vendor:kube-ovn]
I1212 17:31:00.166875 7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.node.5_ip4, action reroute, extrenalID map[node:node-5 subnet:net-a vendor:kube-ovn]
I1212 17:33:28.833554 7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.node.8_ip4, action reroute, extrenalID map[node:node-8 subnet:net-a vendor:kube-ovn]
I1212 17:34:44.164926 7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.node.1_ip4, action reroute, extrenalID map[node:node-1 subnet:net-a vendor:kube-ovn]
I1212 17:34:46.367902 7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.node.2_ip4, action reroute, extrenalID map[node:node-2 subnet:net-a vendor:kube-ovn]
I1212 17:34:49.141808 7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.node.3_ip4, action reroute, extrenalID map[node:node-3 subnet:net-a vendor:kube-ovn]
I1212 17:34:51.828600 7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.vm.master.1_ip4, action reroute, extrenalID map[node:vm-master-1 subnet:net-a vendor:kube-ovn]
I1212 17:34:54.537169 7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.node.4_ip4, action reroute, extrenalID map[node:node-4 subnet:net-a vendor:kube-ovn]
I1212 17:34:57.445022 7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.node.8_ip4, action reroute, extrenalID map[node:node-8 subnet:net-a vendor:kube-ovn]
I1212 17:35:00.533108 7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.node.5_ip4, action reroute, extrenalID map[node:node-5 subnet:net-a vendor:kube-ovn]
I1212 17:35:03.406251 7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.node.6_ip4, action reroute, extrenalID map[node:node-6 subnet:net-a vendor:kube-ovn]
I1212 17:35:06.634137 7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.node.7_ip4, action reroute, extrenalID map[node:node-7 subnet:net-a vendor:kube-ovn]
I1212 17:35:09.738121 7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.node.9_ip4, action reroute, extrenalID map[node:node-9 subnet:net-a vendor:kube-ovn]
I1212 17:36:34.646594 7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.node.6_ip4, action reroute, extrenalID map[node:node-6 subnet:net-a vendor:kube-ovn]
I1212 17:44:43.345692 7 subnet.go:2524] add policy route for router: ovn-cluster, match ip4.src == $net.a.node.7_ip4, action reroute, extrenalID map[node:node-7 subnet:net-a vendor:kube-ovn]
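
To check how widespread the duplication is beyond net-a, the (node, subnet) pairs can be counted across the whole log. The sketch below is an illustration, not something from Kube-OVN itself: the function name dup_policy_routes is hypothetical, and it assumes the map[node:... subnet:... vendor:...] layout shown in the lines above.

```shell
# List (node, subnet) pairs that appear more than once in "add policy route" lines.
# Reads log text on stdin; prints "count node subnet" per duplicated pair.
# Assumes the externalID map is logged as map[node:<node> subnet:<subnet> vendor:...].
dup_policy_routes() {
  grep 'add policy route' \
    | sed -n 's/.*map\[node:\([^ ]*\) subnet:\([^ ]*\) vendor:.*/\1 \2/p' \
    | sort | uniq -c \
    | awk '$1 > 1 { print $1, $2, $3 }'
}
```

Usage: dup_policy_routes < 2.log | wc -l gives the number of pairs that were processed more than once.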
Now I'm unable to create new subnets, so I plan to wait overnight and check again the next day to see if the operations have completed. If more information is needed, please contact me.
Steps To Reproduce
/
Current Behavior
/
Expected Behavior
/