Commit 7ebbd8e

kanakajonsmock authored and committed
Add ethtool link prop. Disable TX offload (unless --keep-veth-offload)
Add a new ethtool property that can be either a string or a string array. The syntax of each ethtool property is basically the ethtool command line without the "devname". So for the ethtool command:

    ethtool --offload eth0 rx off

the link equivalent would be:

    {dev: eth0, ethtool: ["--offload rx off"], ...}

Disable TX offload for veth container interfaces unless `--keep-veth-offload` is enabled. Veth interfaces are a little unusual in that they have the TX offload setting enabled by default. However, this is basically ignored (veth interfaces don't have hardware to offload to). Usually this is fine and is the fastest setting. However, if you have some sort of software networking in the middle of veth links, then the kernel will lose track of the fact that it does not need to validate checksums, and having TX offload enabled will cause the veth interface at the far end to drop packets due to bad checksums (because they were never computed).

The safer behavior is to disable TX offload for veth links by default so that the kernel will do software checksum generation even for veth links. This changes the default behavior. For most testing/simulation situations it should be a non-issue, but it could have performance implications for high-throughput scenarios. For this reason the `--keep-veth-offload` parameter is provided to restore the previous default behavior.

If you want to keep the original behavior but disable TX offload for certain veth interfaces, you can use `--keep-veth-offload` and then add `ethtool: "--offload tx off"` for the specific interfaces that should have TX offload disabled.
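The default described above can be sketched in shell. This is only an illustration of the rule (prepend `--offload tx off` for veth links unless the old default is kept), not the conlink implementation; the function name `effective_ethtool` and its argument order are hypothetical.

```shell
# Hypothetical helper (not part of conlink): print the effective ethtool
# settings for one link, one per line.
#   $1   - link type (e.g. veth, vlan)
#   $2   - "true" if --keep-veth-offload was given, anything else otherwise
#   $3.. - ethtool settings configured on the link (may be none)
effective_ethtool() {
  local type=$1 keep=$2 setting
  shift 2
  # veth links get TX offload disabled first unless the old default is kept
  if [ "${type}" = "veth" ] && [ "${keep}" != "true" ]; then
    echo "--offload tx off"
  fi
  # user settings follow, so they are applied after the default
  for setting in "$@"; do
    echo "${setting}"
  done
}

# veth link with one user setting and the new default behavior
effective_ethtool veth false "--offload rx off"
```

With `keep` set to `true`, only the settings configured on the link are printed, which matches the "keep the original behavior but disable TX offload on selected interfaces" case.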
1 parent bafef51 commit 7ebbd8e

5 files changed: +39 -14 lines changed

README.md (+7)

@@ -146,6 +146,7 @@ The following table describes the link properties:
  | mode | 5 | string | | virt intf mode |
  | vlanid | vlan | number | | VLAN ID |
  | forward | veth | strings 6 8 | | forward conlink ports 7 |
+ | ethtool | veth | strings 8 | | ethtool settings |

  - 1 - veth, dummy, vlan, ipvlan, macvlan, ipvtap, macvtap
  - 2 - defaults to outer compose service
@@ -188,6 +189,12 @@ For publicly publishing a port, the conlink container needs to be on
  a docker network and the `conlink_port` should match the target port
  of a docker published port (for the conlink container).

+ For the `ethtool` property, refer to the `ethtool` man page. The
+ syntax for each ethtool setting is basically the ethtool command line
+ arguments without the "devname. So the equivalent of the ethtool
+ command `ethtool --offload eth0 rx off` would be link configuration
+ `{dev: eth0, ethtool: ["--offload rx off"], ...}`.
+
  ### Bridges

  The bridge settings currently only support the "mode" setting. If
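
The mapping that the new README paragraph describes (an ethtool setting is the command line minus the device name) can be sketched as a small shell function. The helper name `ethtool_cmdline` is hypothetical and only prints the command; it does not run ethtool.

```shell
# Hypothetical helper, not part of conlink or link-add.sh: rebuild the
# full ethtool command line from a device name and a link-style setting.
ethtool_cmdline() {
  local dev=$1 setting=$2 arg opts
  # the first word is the ethtool option, the rest are its parameters
  arg=${setting%% *}
  case "${setting}" in
    *" "*) opts=${setting#* } ;;
    *)     opts= ;;
  esac
  # the device name goes between the option and its parameters
  echo "ethtool ${arg} ${dev}${opts:+ ${opts}}"
}

ethtool_cmdline eth0 "--offload rx off"
# prints: ethtool --offload eth0 rx off
```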

examples/test7-compose.yaml (+1)

@@ -29,6 +29,7 @@ services:
  mac: 00:0a:0b:0c:0d:01
  mtu: 4111
  netem: "rate 10mbit delay 40ms"
+ ethtool: "--offload rx off"
  - bridge: s2
  ip: 100.0.1.1/16
  dev: eth1

link-add.sh (+16 -10)

@@ -42,7 +42,10 @@ usage () {
  echo >&2 " --remote REMOTE - Remote address for geneve/vxlan types"
  echo >&2 " --vni VNI - Virtual Network Identifier for geneve/vxlan types"
  echo >&2 ""
- echo >&2 " --netem NETEM - tc qdisc netem OPTIONS (man 8 netem) (can repeat)"
+ echo >&2 " --netem NETEM - tc qdisc netem OPTIONS (can repeat)"
+ echo >&2 " (man 8 netem)"
+ echo >&2 " --ethtool 'ARG OPTS' - ethtool ARG INTF0 OPTS (can repeat)"
+ echo >&2 " (man 8 ethtool)"
  echo >&2 " --nat TARGET - Stateless NAT traffic to/from TARGET"
  echo >&2 " (in primary/PID0 netns)"
  echo >&2 ""
@@ -58,16 +61,16 @@ setup_if() {
  local IF=$1 NS=$2 MAC=$3 IP=$4 MTU=$5 ROUTES=$6 routes=
  echo >&2 "ROUTES: ${ROUTES}"
  while read rt; do
- [ "${rt}" ] && routes="${routes}\nroute add ${rt} dev ${IF}"
- done < <(echo -e "${ROUTES}")
+ routes="${routes}route add ${rt} dev ${IF}\n"
+ done < <(echo -en "${ROUTES}")

  info "Setting ${IP:+IP ${IP}, }${MAC:+MAC ${MAC}, }${MTU:+MTU ${MTU}, }${ROUTES:+ROUTES '${ROUTES//$'\n'/,}', }up state"
  ip -netns ${NS} --force -b - <<EOF
  ${IP:+addr add ${IP} dev ${IF}}
  ${MAC:+link set dev ${IF} address ${MAC}}
  ${MTU:+link set dev ${IF} mtu ${MTU}}
  link set dev ${IF} up
- $(echo -e "${routes}")
+ $(echo -en "${routes}")
  EOF
  }

@@ -82,7 +85,7 @@ IPTABLES() {
  VERBOSE=${VERBOSE:-}
  PID1=${PID1:-<SELF>} IF1=${IF1:-eth0}
  IP0= IP1= MAC0= MAC1= ROUTES0= ROUTES1= MTU=
- MODE= VLANID= REMOTE= VNI= NETEM= NAT=
+ MODE= VLANID= REMOTE= VNI= NETEM= NAT= ETHTOOL=
  positional=
  while [ "${*}" ]; do
  param=$1; OPTARG=$2
@@ -94,10 +97,10 @@ while [ "${*}" ]; do
  --ip1) IP1="${OPTARG}"; shift ;;
  --mac|--mac0) MAC0="${OPTARG}"; shift ;;
  --mac1) MAC1="${OPTARG}"; shift ;;
- --route|--route0) ROUTES0="${ROUTES0}\n${OPTARG}"; shift ;;
- --route1) ROUTES1="${ROUTES1}\n${OPTARG}"; shift ;;
+ --route|--route0) ROUTES0="${ROUTES0}${OPTARG}\n"; shift ;;
+ --route1) ROUTES1="${ROUTES1}${OPTARG}\n"; shift ;;
  --mtu) MTU="${OPTARG}"; shift ;;
-
+ --ethtool) ETHTOOL="${ETHTOOL}${OPTARG}\n"; shift ;;
  --mode) MODE="${OPTARG}"; shift ;;
  --vlanid) VLANID="${OPTARG}"; shift ;;

@@ -111,8 +114,6 @@ while [ "${*}" ]; do
  esac
  shift
  done
- ROUTES0="${ROUTES0#\\n}"
- ROUTES1="${ROUTES1#\\n}"
  set -- ${positional}
  TYPE=$1 PID0=$2 IF0=$3

@@ -195,6 +196,11 @@ if [ "${NETEM}" ]; then
  tc -netns ${NS0} qdisc add dev ${IF0} root netem ${NETEM}
  fi

+ while read arg opts; do
+ info "Applying ethtool ${arg} ${IF0} ${opts} (in ${NS0})"
+ ip netns exec ${NS0} ethtool ${arg} ${IF0} ${opts}
+ done < <(echo -en "${ETHTOOL}")
+
  if [ "${NAT}" ]; then
  info "Adding NAT rule to ${NAT}"
  IPTABLES ${NS0} PREROUTING -t nat -i ${IF0} -j DNAT --to-destination ${NAT}
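
The accumulate-then-replay pattern used for `--ethtool` in the link-add.sh hunks above can be tried without root as a dry run. The sketch below only prints the commands that would be run. The names `add_ethtool` and `dry_run` and the values `eth0` and `ns0` are hypothetical, and `printf '%b'` with a pipe is swapped in for the script's `echo -en` and process substitution so the sketch also runs in a plain POSIX shell.

```shell
# Dry-run sketch: nothing here calls ethtool or ip.
IF0=eth0 NS0=ns0   # hypothetical interface and namespace names
ETHTOOL=

# Each --ethtool occurrence appends its argument plus a literal "\n",
# mirroring the option parsing in link-add.sh
add_ethtool() { ETHTOOL="${ETHTOOL}${1}\n"; }

add_ethtool "--offload tx off"
add_ethtool "--offload rx off"

# printf '%b' expands the "\n" separators; read then splits each line
# into the ethtool option (arg) and its remaining parameters (opts)
dry_run() {
  printf '%b' "${ETHTOOL}" | while read -r arg opts; do
    echo "ip netns exec ${NS0} ethtool ${arg} ${IF0} ${opts}"
  done
}

dry_run
```

Because every entry ends with the separator instead of starting with it, an empty `ETHTOOL` produces no loop iterations, which is likely why the same hunks drop the old leading-`\n` stripping for routes.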

schema.yaml (+3)

@@ -53,6 +53,9 @@ properties:
  netem:
  oneOf: [{type: string},
  {type: array, items: {type: string}}]
+ ethtool:
+ oneOf: [{type: string},
+ {type: array, items: {type: string}}]

  bridges:
  type: array

src/conlink/core.cljs (+12 -4)

@@ -32,6 +32,8 @@ General Options:
  [default: auto] [env: CONLINK_BRIDGE_MODE]
  --default-mtu MTU Default link MTU (for non *vlan types)
  [default: 65535]
+ --keep-veth-offload Do not add '--offload tx off' as the first
+ ethtool setting for container veth interfaces
  --network-file NETWORK-FILE... Network config file
  --compose-file COMPOSE-FILE... Docker compose file with network config
  --compose-project NAME Docker compose project name for resolving
@@ -57,7 +59,8 @@ General Options:
  " --system-id=random --no-mlockall --delete-bridges"))

  (def VLAN-TYPES #{:vlan :macvlan :macvtap :ipvlan :ipvtap})
- (def LINK-ADD-OPTS [:ip :mac :route :mtu :nat :netem :mode :vlanid :remote :vni])
+ (def LINK-ADD-OPTS [:ip :mac :route :mtu :nat :netem :ethtool
+ :mode :vlanid :remote :vni])
  (def INTF-MAX-LEN 15)
  (def DOCKER-INTF "DOCKER-ETH0")

@@ -112,9 +115,9 @@ General Options:
  - mac: random MAC starting with first octet of 'c2'
  - mtu: --default-mtu (for non *vlan type)
  - base: :conlink for veth type, :host for *vlan types, :local otherwise"
- [{:as link :keys [type base bridge ip route forward netem]} bridges opts]
+ [{:as link :keys [type bridge ip route forward netem ethtool]} bridges opts]
  (let [{:keys [docker-eth0? docker-eth0-address]} @ctx
- {:keys [default-mtu]} opts
+ {:keys [default-mtu keep-veth-offload]} opts
  type (keyword (or type "veth"))
  dev (get link :dev "eth0")
  mac (get link :mac (random-mac))
@@ -126,12 +129,17 @@ General Options:
  route (if (string? route) [route] route)
  forward (if (string? forward) [forward] forward)
  netem (if (string? netem) [netem] netem)
+ ethtool-pre (if (and (= :veth type) (not keep-veth-offload))
+ ["--offload tx off"]
+ [])
+ ethtool (into ethtool-pre (if (string? ethtool) [ethtool] ethtool))
  link (merge
  link
  {:type type
  :dev dev
  :base base
- :mac mac}
+ :mac mac
+ :ethtool ethtool}
  (when bridge
  {:bridge bridge})
  (when (not (VLAN-TYPES type))
