Rejected connection, Error EOF when using etcdctl with multiple endpoints #12426

Closed
thechristschn opened this issue Oct 27, 2020 · 2 comments

@thechristschn

I have a three-node etcd cluster that works fine in general. But when I run etcdctl member list with multiple endpoints, the etcd nodes log the following warning at irregular intervals:

{"level":"warn","ts":"2020-10-27T08:38:25.033Z","caller":"embed/config_logging.go:198","msg":"rejected connection","remote-addr":"10.23.247.41:40772","server-name":"","error":"EOF"}

The error doesn't occur on every execution. With two endpoints, the message still appears sometimes. With a single endpoint, it never shows up, regardless of which endpoint I use. The order of the endpoints doesn't matter either. The etcdctl command itself returns successfully in every case.

The etcd cluster has no data and no other load. The only command executed against the cluster is etcdctl member list. If I stop my tests, the log stays clean.

If I enable debug logging, I get additional messages, which I assume are expected:

"/etcdserverpb.Cluster/MemberList","request count":-1,"request size":-1,"response count":-1,"response size":-1,"request content":""}

I know there are similar issues on this topic, but I think none of them covers this specific problem well enough, or they were closed without a fix. The TLS setup itself seems to work perfectly fine; otherwise I would expect deterministic error messages. All three nodes themselves are working without any problems.

I know about #9165, but that is more about logs from "caller":"grpclog/grpclog.go:51" and I don't think it covers this problem.

My best guess is that one node responds faster than the others and etcdctl closes the connections to the remaining ones. If the timing is bad, the error is logged.

Environment:
OS: Debian Buster
etcd Version: 3.5.0-pre
Git SHA: eee8dec (master from yesterday)
Go Version: go1.15.3
Go OS/Arch: linux/amd64

How to reproduce:
I created three etcd nodes on different virtual machines. I used certificates created by following this manual: etcd/hack/tls-setup/README.md

The startup command for the first node:

--initial-advertise-peer-urls=https://10.88.105.183:2380 \
--listen-peer-urls=https://10.88.105.183:2380 \
--listen-client-urls=https://10.88.105.183:2379,https://127.0.0.1:2379 \
--advertise-client-urls=https://10.88.105.183:2379 \
--initial-cluster-token=etcd-cluster-1 \
--initial-cluster=infra0=https://10.88.105.183:2380,infra1=https://10.88.104.50:2380,infra2=https://10.88.105.173:2380 \
--initial-cluster-state=new \
--client-cert-auth=true \
--cert-file=infra0.pem \
--key-file=infra0-key.pem \
--trusted-ca-file=ca.pem \
--peer-client-cert-auth=true \
--peer-cert-file=peer-infra0.pem \
--peer-key-file=peer-infra0-key.pem \
--peer-trusted-ca-file=../certs/ca.pem

Bash loop run from a separate machine (not one of the etcd servers), using the client certificate that was created for the first node:

while true; do ./etcdctl --debug --endpoints https://10.88.105.183:2379,https://10.88.104.50:2379,https://10.88.105.173:2379 --cert infra0.pem --key infra0-key.pem --cacert ca.pem member list; sleep 1; done

Resulting logs on the three etcd nodes from 08:38:00 to 08:40:00 with the above loop.
First Node:

{"level":"warn","ts":"2020-10-27T08:38:26.108Z","caller":"embed/config_logging.go:198","msg":"rejected connection","remote-addr":"10.23.247.41:40780","server-name":"","error":"EOF"}
{"level":"warn","ts":"2020-10-27T08:38:57.907Z","caller":"embed/config_logging.go:198","msg":"rejected connection","remote-addr":"10.23.247.41:40940","server-name":"","error":"EOF"}
{"level":"warn","ts":"2020-10-27T08:39:06.332Z","caller":"embed/config_logging.go:198","msg":"rejected connection","remote-addr":"10.23.247.41:40986","server-name":"","error":"EOF"}
{"level":"warn","ts":"2020-10-27T08:39:25.435Z","caller":"embed/config_logging.go:198","msg":"rejected connection","remote-addr":"10.23.247.41:41094","server-name":"","error":"EOF"}
{"level":"warn","ts":"2020-10-27T08:39:26.510Z","caller":"embed/config_logging.go:198","msg":"rejected connection","remote-addr":"10.23.247.41:41098","server-name":"","error":"EOF"}
{"level":"warn","ts":"2020-10-27T08:39:50.794Z","caller":"embed/config_logging.go:198","msg":"rejected connection","remote-addr":"10.23.247.41:41234","server-name":"","error":"EOF"}
{"level":"warn","ts":"2020-10-27T08:39:56.128Z","caller":"embed/config_logging.go:198","msg":"rejected connection","remote-addr":"10.23.247.41:41262","server-name":"","error":"EOF"}

Second Node:

{"level":"warn","ts":"2020-10-27T08:38:20.860Z","caller":"embed/config_logging.go:198","msg":"rejected connection","remote-addr":"10.23.247.41:53218","server-name":"","error":"EOF"}
{"level":"warn","ts":"2020-10-27T08:38:25.033Z","caller":"embed/config_logging.go:198","msg":"rejected connection","remote-addr":"10.23.247.41:53242","server-name":"","error":"EOF"}
{"level":"warn","ts":"2020-10-27T08:38:26.108Z","caller":"embed/config_logging.go:198","msg":"rejected connection","remote-addr":"10.23.247.41:53248","server-name":"","error":"EOF"}
{"level":"warn","ts":"2020-10-27T08:38:45.231Z","caller":"embed/config_logging.go:198","msg":"rejected connection","remote-addr":"10.23.247.41:53346","server-name":"","error":"EOF"}
{"level":"warn","ts":"2020-10-27T08:38:48.414Z","caller":"embed/config_logging.go:198","msg":"rejected connection","remote-addr":"10.23.247.41:53362","server-name":"","error":"EOF"}
{"level":"warn","ts":"2020-10-27T08:38:57.907Z","caller":"embed/config_logging.go:198","msg":"rejected connection","remote-addr":"10.23.247.41:53412","server-name":"","error":"EOF"}
{"level":"warn","ts":"2020-10-27T08:39:01.003Z","caller":"embed/config_logging.go:198","msg":"rejected connection","remote-addr":"10.23.247.41:53432","server-name":"","error":"EOF"}
{"level":"warn","ts":"2020-10-27T08:39:07.403Z","caller":"embed/config_logging.go:198","msg":"rejected connection","remote-addr":"10.23.247.41:53468","server-name":"","error":"EOF"}
{"level":"warn","ts":"2020-10-27T08:39:16.963Z","caller":"embed/config_logging.go:198","msg":"rejected connection","remote-addr":"10.23.247.41:53520","server-name":"","error":"EOF"}
{"level":"warn","ts":"2020-10-27T08:39:25.435Z","caller":"embed/config_logging.go:198","msg":"rejected connection","remote-addr":"10.23.247.41:53568","server-name":"","error":"EOF"}
{"level":"warn","ts":"2020-10-27T08:39:26.510Z","caller":"embed/config_logging.go:198","msg":"rejected connection","remote-addr":"10.23.247.41:53572","server-name":"","error":"EOF"}
{"level":"warn","ts":"2020-10-27T08:39:34.895Z","caller":"embed/config_logging.go:198","msg":"rejected connection","remote-addr":"10.23.247.41:53618","server-name":"","error":"EOF"}
{"level":"warn","ts":"2020-10-27T08:39:56.128Z","caller":"embed/config_logging.go:198","msg":"rejected connection","remote-addr":"10.23.247.41:53732","server-name":"","error":"EOF"}

Third Node:

{"level":"warn","ts":"2020-10-27T08:38:20.860Z","caller":"embed/config_logging.go:198","msg":"rejected connection","remote-addr":"10.23.247.41:52228","server-name":"","error":"EOF"}
{"level":"warn","ts":"2020-10-27T08:38:48.414Z","caller":"embed/config_logging.go:198","msg":"rejected connection","remote-addr":"10.23.247.41:52374","server-name":"","error":"EOF"}
{"level":"warn","ts":"2020-10-27T08:39:01.003Z","caller":"embed/config_logging.go:198","msg":"rejected connection","remote-addr":"10.23.247.41:52444","server-name":"","error":"EOF"}
{"level":"warn","ts":"2020-10-27T08:39:06.332Z","caller":"embed/config_logging.go:198","msg":"rejected connection","remote-addr":"10.23.247.41:52472","server-name":"","error":"EOF"}
{"level":"warn","ts":"2020-10-27T08:39:07.403Z","caller":"embed/config_logging.go:198","msg":"rejected connection","remote-addr":"10.23.247.41:52474","server-name":"","error":"EOF"}
{"level":"warn","ts":"2020-10-27T08:39:16.963Z","caller":"embed/config_logging.go:198","msg":"rejected connection","remote-addr":"10.23.247.41:52526","server-name":"","error":"EOF"}
{"level":"warn","ts":"2020-10-27T08:39:23.300Z","caller":"embed/config_logging.go:198","msg":"rejected connection","remote-addr":"10.23.247.41:52566","server-name":"","error":"EOF"}
{"level":"warn","ts":"2020-10-27T08:39:50.794Z","caller":"embed/config_logging.go:198","msg":"rejected connection","remote-addr":"10.23.247.41:52716","server-name":"","error":"EOF"}

@thechristschn changed the title from "Rejected connection, Error EOF with multiple endpoints" to "Rejected connection, Error EOF when using etcdctl with multiple endpoints" on Oct 27, 2020

@arcreigh

I think there is a larger issue at hand. Are you on 3.4.13?
I can't get my cluster to communicate at all, not even over HTTP. Localhost does not respond to any queries even though the port is listening, and etcdctl times out.

@stale

stale bot commented Jan 25, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.
