
Investigate/document noisy gRPC logs #9165

Closed
gyuho opened this issue Jan 18, 2018 · 22 comments
gyuho commented Jan 18, 2018

Quite a few people have asked about these logs (related: etcd-io/etcdlabs#281 and #8822).

2017-11-10 18:02:24.981969 I | etcdserver/api/v3rpc: transport: http2Server.HandleStreams failed to read frame: read tcp 127.0.0.1:2391->127.0.0.1:43020: read: connection reset by peer
2017-11-10 18:02:24.982017 I | etcdserver/api/v3rpc: transport: http2Server.HandleStreams failed to read frame: read tcp 127.0.0.1:2397->127.0.0.1:37926: read: connection reset by peer
2017-11-10 18:02:26.985608 I | etcdserver/api/v3rpc: transport: http2Server.HandleStreams failed to read frame: read tcp 127.0.0.1:2391->127.0.0.1:43088: read: connection reset by peer

In most cases, this logging comes from the gRPC side. In old versions of etcd the logging format is overridden with capnslog, which confuses a lot of people.

Check whether it still happens (e.g. a 2-member cluster with another member joining), and clearly document what it is.

xiang90 commented Jan 18, 2018

SaranBalaji90 commented Jan 18, 2018

We did some investigation around this, and this is what I have so far. (I found a grpc-go GitHub issue that moves these logs to a different level in their latest version, but I'm not sure why we are seeing multiple connections being initiated/terminated.)

Issue
etcd logs contain quite a few lines that say "http2Server.HandleStreams failed to read frame."

When does this happen?
When we initiate a client connection through https://github.com/coreos/etcd/tree/master/clientv3. This doesn't happen all the time, and only very rarely on clusters of size 1 or 2 (at least in my testing).

Environment:
etcd Version: 3.0.17
OS/Arch: linux/amd64
initial cluster size: 3
clientv3 built from the latest code (the gRPC versions used by clientv3 and the etcd server are different, I think).

Log content:
api: enabled capabilities for version 2.3
2018-01-18 21:08:42.291730 I | v3rpc/grpc: transport: http2Server.HandleStreams failed to read frame: read tcp 127.0.0.1:2379->127.0.0.1:57382: read: connection reset by peer
2018-01-18 21:08:42.300708 I | v3rpc/grpc: transport: http2Server.HandleStreams failed to read frame: read tcp 127.0.0.1:2379->127.0.0.1:57406: read: connection reset by peer
2018-01-18 21:08:42.338178 I | v3rpc/grpc: transport: http2Server.HandleStreams failed to read frame: read tcp 127.0.0.1:2379->127.0.0.1:57508: read: connection reset by peer
[... ~50 similar "connection reset by peer" lines omitted ...]
2018-01-18 21:08:43.524769 I | v3rpc/grpc: transport: http2Server.HandleStreams failed to read frame: read tcp 127.0.0.1:2379->127.0.0.1:32904: read: connection reset by peer
2018-01-18 21:08:43.629359 I | rafthttp: peer d98998e6a7a13831 became active
2018-01-18 21:08:43.629392 I | rafthttp: established a TCP streaming connection with peer d98998e6a7a13831 (stream Message reader)
2018-01-18 21:08:43.629973 I | rafthttp: established a TCP streaming connection with peer d98998e6a7a13831 (stream MsgApp v2 reader)
2018-01-18 21:08:43.631156 I | rafthttp: established a TCP streaming connection with peer d98998e6a7a13831 (stream MsgApp v2 writer)
2018-01-18 21:08:43.631357 I | rafthttp: established a TCP streaming connection with peer d98998e6a7a13831 (stream Message writer)
2018-01-18 21:08:43.680305 I | v3rpc/grpc: transport: http2Server.HandleStreams failed to read frame: read tcp 127.0.0.1:2379->127.0.0.1:33370: read: connection reset by peer
2018-01-18 21:08:43.681609 I | v3rpc/grpc: transport: http2Server.HandleStreams failed to read frame: read tcp 127.0.0.1:2379->127.0.0.1:33374: read: connection reset by peer
2018-01-18 21:08:43.713480 I | v3rpc/grpc: transport: http2Server.HandleStreams failed to read frame: read tcp 127.0.0.1:2379->127.0.0.1:33478: read: connection reset by peer
[... ~40 similar "connection reset by peer" lines omitted ...]
2018-01-18 21:08:45.154819 I | v3rpc/grpc: transport: http2Server.HandleStreams failed to read frame: read tcp 127.0.0.1:2379->127.0.0.1:37866: read: connection reset by peer
membership: updated the cluster version from 2.3 to 3.0

How is the client connection initiated?
cli, err := clientv3.New(clientv3.Config{
	Endpoints:   []string{"$endPoint"},
	DialTimeout: 3 * time.Second,
})

Which API did you try?
/etcdserverpb.Cluster/MemberList (from tcpdump)

What was used to investigate?
tcpdump, strace

tcpdump content (this excerpt covers just one connection):
Cmd executed: "tcpdump -qxn -i lo port 2379 -s 0 -w /tmp/tcpdump.log"

40864 5.691315 127.0.0.1 127.0.0.1 TCP 74 33872 → 2379 [SYN] Seq=0 Win=43690 Len=0 MSS=65495 SACK_PERM=1 TSval=4294896844 TSecr=0 WS=128
40865 5.691323 127.0.0.1 127.0.0.1 TCP 74 2379 → 33872 [SYN, ACK] Seq=0 Ack=1 Win=43690 Len=0 MSS=65495 SACK_PERM=1 TSval=4294896844 TSecr=4294896844 WS=128
40866 5.691332 127.0.0.1 127.0.0.1 TCP 66 33872 → 2379 [ACK] Seq=1 Ack=1 Win=43776 Len=0 TSval=4294896844 TSecr=4294896844
40870 5.691442 127.0.0.1 127.0.0.1 HTTP2 90 Magic
40871 5.691447 127.0.0.1 127.0.0.1 TCP 66 2379 → 33872 [ACK] Seq=1 Ack=25 Win=43776 Len=0 TSval=4294896844 TSecr=4294896844
40872 5.691456 127.0.0.1 127.0.0.1 HTTP2 75 SETTINGS
40873 5.691461 127.0.0.1 127.0.0.1 TCP 66 2379 → 33872 [ACK] Seq=1 Ack=34 Win=43776 Len=0 TSval=4294896844 TSecr=4294896844
40874 5.691513 127.0.0.1 127.0.0.1 HTTP2 75 SETTINGS
40875 5.691530 127.0.0.1 127.0.0.1 HTTP2 167 HEADERS, DATA
40876 5.691535 127.0.0.1 127.0.0.1 TCP 66 33872 → 2379 [ACK] Seq=135 Ack=10 Win=43776 Len=0 TSval=4294896844 TSecr=4294896844
40877 5.691572 127.0.0.1 127.0.0.1 HTTP2 75 SETTINGS
40878 5.691581 127.0.0.1 127.0.0.1 HTTP2 79 WINDOW_UPDATE
40879 5.691662 127.0.0.1 127.0.0.1 HTTP2 130 HEADERS
40880 5.691686 127.0.0.1 127.0.0.1 TCP 66 33872 → 2379 [ACK] Seq=144 Ack=87 Win=43776 Len=0 TSval=4294896844 TSecr=4294896844
40881 5.691735 127.0.0.1 127.0.0.1 HTTP2 75 SETTINGS
40885 5.691888 127.0.0.1 127.0.0.1 TCP 66 33872 → 2379 [RST, ACK] Seq=144 Ack=96 Win=43776 Len=0 TSval=4294896844 TSecr=4294896844

In Frame 40875
Header: :path: /etcdserverpb.Cluster/MemberList
Name Length: 5
Name: :path
Value Length: 32
Value: /etcdserverpb.Cluster/MemberList
Representation: Literal Header Field with Incremental Indexing - Indexed Name
Index: 5

In Frame 40879
Header: grpc-status: 14
Name Length: 11
Name: grpc-status
Value Length: 2
Value: 14
Representation: Literal Header Field with Incremental Indexing - New Name
Header: grpc-message: etcdserver: not capable
Name Length: 12
Name: grpc-message
Value Length: 23
Value: etcdserver: not capable
Representation: Literal Header Field with Incremental Indexing - New Name

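For reference, the grpc-status: 14 trailer captured above is the canonical UNAVAILABLE code, which is what makes the client treat "etcdserver: not capable" as a retryable failure. A minimal lookup sketch (the grpcCodeName map below is illustrative, not a real gRPC API):

```go
package main

import "fmt"

// grpcCodeName maps a few gRPC status codes (as carried in the
// grpc-status response trailer) to their canonical names; 14 is
// the one seen in the capture above.
var grpcCodeName = map[int]string{
	0:  "OK",
	4:  "DEADLINE_EXCEEDED",
	13: "INTERNAL",
	14: "UNAVAILABLE",
}

func main() {
	// The trailer pair in frame 40879 was:
	//   grpc-status: 14, grpc-message: "etcdserver: not capable"
	fmt.Println(grpcCodeName[14])
}
```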
Strace
Cmd executed: strace -p PID -f -ttT -s200 -o /tmp/stracktrace.log
On the client side we saw close() being called, but the server side received ECONNRESET. Not sure why this happened; we still have to dig into this.

xiang90 commented Jan 18, 2018

@SaranBalaji90

If you leave the cluster idle without any client interaction, will you still see the error logging? I am asking because the broken streams might be caused by the internal grpc-gateway.

@SaranBalaji90

We don't see these logs when the cluster is idle.

But @eswarbala pointed out this issue #7065.

I'm invoking the MemberList API in clientv3, and for all these clusters the 3.0 API is enabled only after the third node joins. Until then we will see these logs, because grpc-go might be retrying after getting status 14, which says the service is unavailable.

Does this make sense on why this is happening?

gyuho commented Jan 19, 2018

grpc-go might be retrying since it got status 14 which says service is unavailable.

What was this clientv3 version? Retry logic has changed a lot since 3.0.0.

@SaranBalaji90

I built clientv3 from whatever we have here (https://github.com/coreos/etcd/tree/master/clientv3); go get github.com/... should fetch the latest version.

gyuho commented Jan 19, 2018

Actually, for this specific API (MemberList), both 3.0 and the latest (3.3+) retry on the "etcdserver: not capable" error. So it is expected that we will see a bunch of those errors until the 3.0 API is enabled.

xiang90 commented Jan 19, 2018

But why would the retry trigger errors like read: connection reset by peer?

gyuho commented Jan 19, 2018

The simplest solution would be to fetch the /version endpoint until the server sets the cluster version:

// not ready
{"etcdserver":"3.0.0","etcdcluster":"not_decided"}

// ready
{"etcdserver":"3.0.0","etcdcluster":"3.0.0"}
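The suggestion above can be sketched as a small readiness check; versionInfo and clusterReady are hypothetical names, and in practice the body would come from a GET against the server's /version endpoint:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// versionInfo mirrors the /version response shown above.
type versionInfo struct {
	ServerVersion  string `json:"etcdserver"`
	ClusterVersion string `json:"etcdcluster"`
}

// clusterReady reports whether the server has decided a cluster
// version, i.e. the v3 API is enabled and "not capable" errors
// should stop.
func clusterReady(body []byte) bool {
	var v versionInfo
	if err := json.Unmarshal(body, &v); err != nil {
		return false
	}
	return v.ClusterVersion != "" && v.ClusterVersion != "not_decided"
}

func main() {
	fmt.Println(clusterReady([]byte(`{"etcdserver":"3.0.0","etcdcluster":"not_decided"}`))) // false
	fmt.Println(clusterReady([]byte(`{"etcdserver":"3.0.0","etcdcluster":"3.0.0"}`)))       // true
}
```

A caller would poll this in a loop (with a small sleep) before creating the clientv3 client.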

xiang90 commented Jan 19, 2018

Also, it seems the retry is pretty aggressive. For the not capable error at least, we should not retry that aggressively, since we know it happens because the cluster is not fully up. Retrying immediately will not resolve it; this is different from a network error, where an immediate retry has a high chance of just working.

gyuho commented Jan 19, 2018

But why would the retry trigger errors like read: connection reset by peer?

Probably from creating/closing a client for every unary RPC call? That error message shouldn't appear from retries on the same connection, but I haven't investigated this further.

xiang90 commented Jan 19, 2018

Closing a client should not generate that error... the TCP connection should be closed cleanly.

gyuho commented Jan 19, 2018

According to grpc/grpc-go#1062 (comment), it's expected, given that we close the gRPC connection properly.

xiang90 commented Jan 19, 2018

OK. It seems gRPC already logs that error at debug level.

From our side, we can retry not capable less aggressively, and we do not need to switch to another endpoint for the not capable error either.

gyuho commented Jan 19, 2018

Good idea. Opened another issue to track it.

GlenDC commented Jan 23, 2018

It would also be great if we could set the etcd logger without it overwriting the global gRPC logger, as this gets really confusing and doesn't make any sense. Right now we see logs from non-etcd gRPC connections tagged with our etcd database tag, all because the global gRPC logger is overwritten when setting the logger via etcd.

Why not simply allow setting the logger that is used as middleware? That way the global loggers aren't affected. Or perhaps this is already supported and I simply have to create my etcd client differently? If not, feel free to give me some pointers and I wouldn't mind adding this support to etcd.

The logging middleware I was talking about can be found here: https://godoc.org/github.com/grpc-ecosystem/go-grpc-middleware/logging. By default it only supports gRPC and Zap, but nothing stops you from allowing a user to provide any generic logger, I would think.

@SaranBalaji90

Also, one thing I haven't investigated yet is why we see a reset on the server side when we think we close the client properly (per grpc/grpc-go#1062 (comment), we should see a similar message whenever a client connection is closed, but that doesn't look like the case).

paulcaskey commented Jun 18, 2020

This is still an issue and should not be closed, I think: #10364 ("Zap logger prints connection reset by peer but capnslog doesn't").

It's going to be noticed more now that v3.4 warns if you DON'T use the "zap" logger and claims the older method is deprecated. I upgraded from 3.3 to 3.4, switched to zap as told, and now my logs are flooded with this message.

Can't this error just be logged at level=debug instead of level=warn?

@paulcaskey

The 3.4 FAQ also needs updating on this. Both of these sentences in the answer are now wrong: it's logging at "warn" level, not debug.

Only old versions of gRPC log this. etcd >=v3.2.13 by default log this with DEBUG level, thus only visible with --debug flag enabled.

gyuho commented Jun 18, 2020

@paulcaskey Could you create a new issue? Thanks! Nvm. It wasn't closed. Yes, we will take a look at this.

gyuho added this to the etcd-v3.5 milestone (Jun 18, 2020)
@dilyevsky

We're seeing this in etcd v3.4.7, more often under heavy load.

@serathius
Member

Stream logs were disabled.
