Connectivity between peers not establishing while using password(unable to decrypt TCP msg) #3579
Description
What you expected to happen?
Encryption among the peers is not establishing
What happened?
[root@ip-172-31-31-31 bin]# ./weave status connections
-> 172.31.7.55:6783 failed read tcp4 172.31.31.31:45224->172.31.7.55:6783: read: connection reset by peer, retry: 2019-01-07 21:53:10.408201032 +0000 UTC m=+24.508360179
-> 172.31.5.162:6783 failed Unable to decrypt TCP msg, retry: 2019-01-07 21:53:11.333699597 +0000 UTC m=+25.433858745
How to reproduce it?
I have deployed Weave net on my kubernetes using the below commands:(Cloud: AWS, Kubernetes using "kubeadm init" )
$echo "s3cr3tp4ssw0rd" > /var/lib/weave/weave-passwd
$ kubectl create secret -n kube-system generic weave-passwd --from-file=/var/lib/weave/weave-passwd
$ kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')&password-secret=weave-passwd"
$kubectl exec -n kube-system weave-net-bl2t9 -c weave -- /home/weave/weave --local status Version: 2.5.0 (up to date; next check at 2019/01/08 02:50:14)
Service: router
Protocol: weave 1..2
Name: 1a:07:85:a5:40:3f(ip-172-31-5-162.us-east-2.compute.internal)
Encryption: enabled
PeerDiscovery: enabled
Targets: 2
Connections: 2 (1 established, 1 failed)
Peers: 2 (with 2 established connections)
TrustedSubnets: none
Service: ipam
Status: ready
Range: 10.32.0.0/12
DefaultSubnet: 10.32.0.0/12
[root@ip-172-31-7-55 lib]# kubectl exec -n kube-system weave-net-bl2t9 -c weave -- /home/weave/weave --local status peers
1a:07:85:a5:40:3f(ip-172-31-5-162.us-east-2.compute.internal)
-> 172.31.7.55:6783 b2:8a:35:56:2a:7b(ip-172-31-7-55.us-east-2.compute.internal) established
b2:8a:35:56:2a:7b(ip-172-31-7-55.us-east-2.compute.internal)
<- 172.31.5.162:38442 1a:07:85:a5:40:3f(ip-172-31-5-162.us-east-2.compute.internal) established
172.31.7.55 is k8 Master & 172.31.5.162 is my k8 node.. I do see the connection is established and encryption is enabled.
I have a 3rd machine which is not part of cluster, i'm trying to add that host as peer to my k8 cluster, so that my 3rd machine can communicate with pods inside the k8 cluster
On 3rd machine tried, weave launch --password s3cr3tp4ssw0rd 172.31.7.55 172.31.5.162
The connection status
[root@ip-172-31-31-31 bin]# ./weave status connections
-> 172.31.7.55:6783 failed read tcp4 172.31.31.31:45224->172.31.7.55:6783: read: connection reset by peer, retry: 2019-01-07 21:53:10.408201032 +0000 UTC m=+24.508360179
-> 172.31.5.162:6783 failed Unable to decrypt TCP msg, retry: 2019-01-07 21:53:11.333699597 +0000 UTC m=+25.433858745
Anything else we need to know?
Cloud Provider: AWS, Installed kubernetes with Kubeadm init.
The same scenario works fine if i don't use password while installing weave net on Kubernetes.
Versions:
[root@ip-172-31-7-55 ec2-user]# kubectl version
Client Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.1", GitCommit:"eec55b9ba98609a46fee712359c7b5b365bdd920", GitTreeState:"clean", BuildDate:"2018-12-13T10:39:04Z", GoVersion:"go1.11.2", Compiler:"gc", Platform:"linux/amd64"}
$ weave version(on kubernetes cluster)
[root@ip-172-31-7-55 ec2-user]# kubectl exec -n kube-system weave-net-bl2t9 -c weave -- /home/weave/weave --local version
weave 2.5.0
$ weave version(on external machine which is not part of cluster)
[root@ip-172-31-31-31 bin]# weave version
weave script 2.5.0
weave 2.5.0
$ docker version
[root@ip-172-31-7-55 ec2-user]# docker version
Client:
Version: 18.06.1-ce
$ uname -a
[root@ip-172-31-7-55 ec2-user]# uname -a
Linux ip-172-31-7-55.us-east-2.compute.internal 3.10.0-957.1.3.el7.x86_64 #1 SMP Thu Nov 15 17:36:42 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
[root@ip-172-31-7-55 ec2-user]#
Logs:
$ docker logs weave
or, if using Kubernetes:
$ kubectl logs -n kube-system <weave-net-pod> weave
[root@ip-172-31-7-55 ec2-user]# kubectl logs -n kube-system weave-net-bl2t9 weave
DEBU: 2019/01/09 17:32:16.464373 [kube-peers] Checking peer "1a:07:85:a5:40:3f" against list &{[{b2:8a:35:56:2a:7b ip-172-31-7-55.us-east-2.compute.internal} {1a:07:85:a5:40:3f ip-172-31-5-162.us-east-2.compute.internal}]}
INFO: 2019/01/09 17:32:16.622696 Command line options: map[db-prefix:/weavedb/weave-net docker-api: ipalloc-init:consensus=2 metrics-addr:0.0.0.0:6782 port:6783 ipalloc-range:10.32.0.0/12 name:1a:07:85:a5:40:3f datapath:datapath host-root:/host http-addr:127.0.0.1:6784 nickname:ip-172-31-5-162.us-east-2.compute.internal no-dns:true conn-limit:100 expect-npc:true]
INFO: 2019/01/09 17:32:16.622762 weave 2.5.0
INFO: 2019/01/09 17:32:16.895685 Bridge type is bridged_fastdp
INFO: 2019/01/09 17:32:16.895708 Communication between peers via untrusted networks is encrypted.
INFO: 2019/01/09 17:32:17.026082 Our name is 1a:07:85:a5:40:3f(ip-172-31-5-162.us-east-2.compute.internal)
INFO: 2019/01/09 17:32:17.026143 Launch detected - using supplied peer list: [172.31.5.162 172.31.7.55]
INFO: 2019/01/09 17:32:17.042085 Checking for pre-existing addresses on weave bridge
INFO: 2019/01/09 17:32:17.069078 [allocator 1a:07:85:a5:40:3f] Initialising with persisted data
INFO: 2019/01/09 17:32:17.069149 Sniffing traffic on datapath (via ODP)
INFO: 2019/01/09 17:32:17.087375 ->[172.31.5.162:6783] attempting connection
INFO: 2019/01/09 17:32:17.087591 ->[172.31.7.55:6783] attempting connection
INFO: 2019/01/09 17:32:17.087810 ->[172.31.5.162:48691] connection accepted
INFO: 2019/01/09 17:32:17.103378 ->[172.31.5.162:48691|1a:07:85:a5:40:3f(ip-172-31-5-162.us-east-2.compute.internal)]: connection shutting down due to error: cannot connect to ourself
INFO: 2019/01/09 17:32:17.103500 ->[172.31.7.55:6783] error during connection attempt: dial tcp4 :0->172.31.7.55:6783: connect: connection refused
INFO: 2019/01/09 17:32:17.103688 ->[172.31.5.162:6783|1a:07:85:a5:40:3f(ip-172-31-5-162.us-east-2.compute.internal)]: connection shutting down due to error: cannot connect to ourself
INFO: 2019/01/09 17:32:17.109161 Listening for HTTP control messages on 127.0.0.1:6784
INFO: 2019/01/09 17:32:17.109390 Listening for metrics requests on 0.0.0.0:6782
INFO: 2019/01/09 17:32:17.615539 [kube-peers] Added myself to peer list &{[{b2:8a:35:56:2a:7b ip-172-31-7-55.us-east-2.compute.internal} {1a:07:85:a5:40:3f ip-172-31-5-162.us-east-2.compute.internal}]}
DEBU: 2019/01/09 17:32:17.623279 [kube-peers] Nodes that have disappeared: map[]
10.44.0.0
172.31.5.162
172.31.7.55
DEBU: 2019/01/09 17:32:17.752525 registering for updates for node delete events
INFO: 2019/01/09 17:32:19.792914 ->[172.31.7.55:6783] attempting connection
INFO: 2019/01/09 17:32:19.793932 ->[172.31.7.55:6783] error during connection attempt: dial tcp4 :0->172.31.7.55:6783: connect: connection refused
INFO: 2019/01/09 17:32:21.620021 ->[172.31.7.55:6783] attempting connection
INFO: 2019/01/09 17:32:21.620810 ->[172.31.7.55:6783] error during connection attempt: dial tcp4 :0->172.31.7.55:6783: connect: connection refused
INFO: 2019/01/09 17:32:27.513054 ->[172.31.7.55:6783] attempting connection
INFO: 2019/01/09 17:32:27.513697 ->[172.31.7.55:6783] error during connection attempt: dial tcp4 :0->172.31.7.55:6783: connect: connection refused
INFO: 2019/01/09 17:32:31.610389 ->[172.31.7.55:6783] attempting connection
INFO: 2019/01/09 17:32:31.611315 ->[172.31.7.55:6783] error during connection attempt: dial tcp4 :0->172.31.7.55:6783: connect: connection refused
INFO: 2019/01/09 17:32:39.978738 ->[172.31.7.55:6783] attempting connection
INFO: 2019/01/09 17:32:39.979409 ->[172.31.7.55:6783] error during connection attempt: dial tcp4 :0->172.31.7.55:6783: connect: connection refused
INFO: 2019/01/09 17:32:52.366322 ->[172.31.7.55:6783] attempting connection
INFO: 2019/01/09 17:32:52.366971 ->[172.31.7.55:6783] error during connection attempt: dial tcp4 :0->172.31.7.55:6783: connect: connection refused
INFO: 2019/01/09 17:33:05.257310 ->[172.31.7.55:36581] connection accepted
INFO: 2019/01/09 17:33:05.270237 ->[172.31.7.55:36581|b2:8a:35:56:2a:7b(ip-172-31-7-55.us-east-2.compute.internal)]: connection ready; using protocol version 2
INFO: 2019/01/09 17:33:05.270378 overlay_switch ->[b2:8a:35:56:2a:7b(ip-172-31-7-55.us-east-2.compute.internal)] using fastdp
INFO: 2019/01/09 17:33:05.270404 ->[172.31.7.55:36581|b2:8a:35:56:2a:7b(ip-172-31-7-55.us-east-2.compute.internal)]: connection added (new peer)
INFO: 2019/01/09 17:33:05.270811 Setting up IPsec between 1a:07:85:a5:40:3f(ip-172-31-5-162.us-east-2.compute.internal) and b2:8a:35:56:2a:7b(ip-172-31-7-55.us-east-2.compute.internal)
INFO: 2019/01/09 17:33:05.271479 ipsec: InitSALocal: 172.31.7.55 -> 172.31.5.162 :6784 0x50edbb71
INFO: 2019/01/09 17:33:05.475358 fastdp ->[172.31.7.55:6784|b2:8a:35:56:2a:7b(ip-172-31-7-55.us-east-2.compute.internal)]: IPSec init SA remote
INFO: 2019/01/09 17:33:05.475407 ipsec: InitSARemote: 172.31.5.162 -> 172.31.7.55 :6784 0xa4e45d2a
INFO: 2019/01/09 17:33:05.477284 ->[172.31.7.55:36581|b2:8a:35:56:2a:7b(ip-172-31-7-55.us-east-2.compute.internal)]: connection fully established
INFO: 2019/01/09 17:33:05.479690 EMSGSIZE on send, expecting PMTU update (IP packet was 60052 bytes, payload was 60044 bytes)
INFO: 2019/01/09 17:33:05.480649 sleeve ->[172.31.7.55:6783|b2:8a:35:56:2a:7b(ip-172-31-7-55.us-east-2.compute.internal)]: Effective MTU verified at 8915
INFO: 2019/01/09 17:33:05.575684 Discovered remote MAC aa:ef:ad:0c:d2:cd at b2:8a:35:56:2a:7b(ip-172-31-7-55.us-east-2.compute.internal)
INFO: 2019/01/09 17:33:05.722914 Discovered remote MAC b2:8a:35:56:2a:7b at b2:8a:35:56:2a:7b(ip-172-31-7-55.us-east-2.compute.internal)
INFO: 2019/01/09 17:33:05.928096 Discovered remote MAC b2:78:6c:90:e6:5d at b2:8a:35:56:2a:7b(ip-172-31-7-55.us-east-2.compute.internal)
INFO: 2019/01/09 17:33:06.188884 Discovered remote MAC 02:d0:ec:62:39:d3 at b2:8a:35:56:2a:7b(ip-172-31-7-55.us-east-2.compute.internal)
INFO: 2019/01/09 17:33:09.432256 Discovered remote MAC 1a:32:d9:d8:4e:46 at b2:8a:35:56:2a:7b(ip-172-31-7-55.us-east-2.compute.internal)
INFO: 2019/01/09 17:53:31.315785 ->[172.31.31.31:36157] connection accepted
INFO: 2019/01/09 17:53:31.318902 ->[172.31.31.31:36157] connection shutting down due to error during handshake: Unable to decrypt TCP msg
INFO: 2019/01/09 17:53:33.133544 ->[172.31.31.31:34837] connection accepted
INFO: 2019/01/09 17:53:33.134817 ->[172.31.31.31:34837] connection shutting down due to error during handshake: Unable to decrypt TCP msg
INFO: 2019/01/09 17:53:35.501722 ->[172.31.31.31:39123] connection accepted
INFO: 2019/01/09 17:53:35.503059 ->[172.31.31.31:39123] connection shutting down due to error during handshake: Unable to decrypt TCP msg
INFO: 2019/01/09 17:53:42.198403 ->[172.31.31.31:41634] connection accepted
INFO: 2019/01/09 17:53:42.199743 ->[172.31.31.31:41634] connection shutting down due to error during handshake: Unable to decrypt TCP msg
<!-- (If output is long, please consider a Gist.) -->
<!-- Anything interesting or unusual output by the below, potentially relevant, commands?
$ journalctl -u docker.service --no-pager
$ journalctl -u kubelet --no-pager
$ kubectl get events
-->
## Network:
<!-- If your problem has anything to do with one network endpoint not being able to contact another, please run the following commands -->
$ ip route
$ ip -4 -o addr
$ sudo iptables-save