Investigate/document noisy gRPC logs #9165
Comments
We did some investigation around this; this is what I have so far. (I found a grpc-go GitHub issue that moves these logs to a different level in their latest version, but I'm not sure why we are seeing multiple connections being initiated/terminated.)

Issue
When does this happen?
Environment:
Log content:
How is the client connection initiated? Which API did you try?
What was used to investigate?

Tcpdump content (this pasted content is just one packet):
40864 5.691315 127.0.0.1 127.0.0.1 TCP 74 33872 → 2379 [SYN] Seq=0 Win=43690 Len=0 MSS=65495 SACK_PERM=1 TSval=4294896844 TSecr=0 WS=128
In Frame 40875
In Frame 40879

Strace
If you leave the cluster idle without any client interaction, will you still see the error logging? I am asking since the broken streams might be caused by the internal grpc-gateway.
We don't see these logs when the cluster is idle. But @eswarbala pointed out issue #7065. I'm invoking the Member list API in clientv3, and for all these clusters, API 3.0 is enabled only after the third node joins. Until then we will see these logs, because grpc-go might be retrying after getting status 14, which says the service is unavailable. Does this make sense as an explanation for why this is happening?
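For reference, a minimal sketch of that kind of probe with clientv3 (the endpoint and timeouts are assumptions; the error check uses grpc-go's standard status/codes packages):

	package main

	import (
		"context"
		"log"
		"time"

		"github.com/coreos/etcd/clientv3"
		"google.golang.org/grpc/codes"
		"google.golang.org/grpc/status"
	)

	func main() {
		// Assumed endpoint, for illustration only.
		cli, err := clientv3.New(clientv3.Config{
			Endpoints:   []string{"127.0.0.1:2379"},
			DialTimeout: 5 * time.Second,
		})
		if err != nil {
			log.Fatal(err)
		}
		defer cli.Close()

		ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
		defer cancel()

		resp, err := cli.MemberList(ctx)
		if err != nil {
			// Until the cluster version is decided, the server can answer
			// with status 14 (codes.Unavailable); the client-side retries
			// are what produce the connect/disconnect log noise above.
			if status.Code(err) == codes.Unavailable {
				log.Printf("cluster not ready yet: %v", err)
				return
			}
			log.Fatal(err)
		}
		log.Printf("members: %v", resp.Members)
	}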
What was this clientv3 version? Retry logic has changed a lot since 3.0.0.
I built clientv3 from whatever we have here (https://github.com/coreos/etcd/tree/master/clientv3); go get github.com/... should fetch the latest version.
Actually, for this specific API (…)
But why would the retry trigger errors like (…)?
Simplest solution would be to fetch (…) and check whether the cluster version has been decided:

// not ready
{"etcdserver":"3.0.0","etcdcluster":"not_decided"}

// ready
{"etcdserver":"3.0.0","etcdcluster":"3.0.0"}
Also, it seems that the retry is pretty aggressive. For (…)
Probably from creating/closing a client for every unary RPC call? That error message shouldn't appear from retries on the same connection, but I haven't investigated this further.
Closing a client should not generate that error... the TCP connection should be closed cleanly.
According to grpc/grpc-go#1062 (comment), it's expected, given that we close the gRPC connection properly.
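To illustrate the create/close-per-call hypothesis, here is a sketch of the pattern that avoids it: create one client and reuse it for every call (endpoint and key are placeholders):

	package main

	import (
		"context"
		"log"
		"time"

		"github.com/coreos/etcd/clientv3"
	)

	func main() {
		// Create the client once and share it; clientv3 multiplexes all
		// RPCs over its underlying gRPC connection.
		cli, err := clientv3.New(clientv3.Config{
			Endpoints:   []string{"127.0.0.1:2379"},
			DialTimeout: 5 * time.Second,
		})
		if err != nil {
			log.Fatal(err)
		}
		// Close once at shutdown. Opening and closing a client around every
		// unary call tears down the TCP connection each time, which is the
		// likely source of the per-request transport logs discussed above.
		defer cli.Close()

		for i := 0; i < 3; i++ {
			ctx, cancel := context.WithTimeout(context.Background(), time.Second)
			_, err := cli.Get(ctx, "foo")
			cancel()
			if err != nil {
				log.Printf("get failed: %v", err)
			}
		}
	}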
OK, it seems gRPC already emits that error at debug level. From our side, we can retry (…)
Good idea. Opened another issue to track it.
It would also be great if we could set the etcd logger without it setting the global gRPC logger, as this gets really confusing. Right now we see logs coming from non-etcd gRPC connections tagged with our etcd database tag, all because the global gRPC logger is overwritten when setting the logger via etcd. Why not simply allow setting a logger that is used as middleware? That way the global loggers aren't affected. Or perhaps this is already supported and I simply have to create my etcd client differently? If not, feel free to provide me with some pointers and I wouldn't mind contributing this support to etcd. The logging middleware I was talking about can be found here: https://godoc.org/github.com/grpc-ecosystem/go-grpc-middleware/logging. By default they only support gRPC and Zap, but nothing stops you from letting a user provide any generic logger, I would think.
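One way to get per-client logging today, without touching the process-global gRPC logger, is to pass a client interceptor through clientv3.Config.DialOptions. A hedged sketch (the interceptor here is hand-rolled for illustration; the go-grpc-middleware interceptors linked above could be dropped in the same way):

	package main

	import (
		"context"
		"log"
		"os"
		"time"

		"github.com/coreos/etcd/clientv3"
		"google.golang.org/grpc"
	)

	// logUnary logs each unary RPC through the caller's own logger instead
	// of the global gRPC logger.
	func logUnary(l *log.Logger) grpc.UnaryClientInterceptor {
		return func(ctx context.Context, method string, req, reply interface{},
			cc *grpc.ClientConn, invoker grpc.UnaryInvoker, opts ...grpc.CallOption) error {
			start := time.Now()
			err := invoker(ctx, method, req, reply, cc, opts...)
			l.Printf("rpc=%s dur=%s err=%v", method, time.Since(start), err)
			return err
		}
	}

	func main() {
		logger := log.New(os.Stderr, "etcd-client ", log.LstdFlags)
		cli, err := clientv3.New(clientv3.Config{
			Endpoints: []string{"127.0.0.1:2379"}, // assumed endpoint
			// Scopes the logging to this client's connection only.
			DialOptions: []grpc.DialOption{
				grpc.WithUnaryInterceptor(logUnary(logger)),
			},
		})
		if err != nil {
			log.Fatal(err)
		}
		defer cli.Close()
	}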
Also, one thing I haven't investigated yet is why we see a reset signal on the server side when we think we close the client properly (per grpc/grpc-go#1062 (comment), we should see a similar message whenever a client connection is closed, but that doesn't appear to be the case).
This is still an issue; I think it should not be closed: #10364. The zap logger prints (…). It's going to be noticed more now that v3.4 warns if you DON'T use the "zap" logger and claims the older method is deprecated. I upgraded from 3.3 to 3.4, switched to zap as told, and now my logs are flooded with this message. Can't this error just be logged at level=debug instead of level=warn?
The 3.4 FAQ also needs updating on this: both of the sentences in its answer are now wrong. It's logging at "warn" level, not debug.
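Until the level is changed upstream, one workaround is to raise etcd's own log level. A sketch using the embed package, assuming etcd v3.4+, where embed.Config exposes Logger and LogLevel (the CLI equivalent would be --logger=zap --log-level=error):

	package main

	import (
		"log"

		"go.etcd.io/etcd/embed"
	)

	func main() {
		cfg := embed.NewConfig()
		cfg.Dir = "default.etcd" // assumed data dir
		cfg.Logger = "zap"
		// "error" suppresses the warn-level transport messages that flood
		// the logs under the default level.
		cfg.LogLevel = "error"

		e, err := embed.StartEtcd(cfg)
		if err != nil {
			log.Fatal(err)
		}
		defer e.Close()

		<-e.Server.ReadyNotify()
		log.Println("etcd is ready")
		select {} // serve until killed
	}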
@paulcaskey We're seeing this in etcd v3.4.7, more often under heavy load. Stream logs were disabled.
Quite a few people have asked about these logs (related: etcd-io/etcdlabs#281 and #8822). In most cases, this is logging from the gRPC side. The logging format is overridden with capnslog in old versions of etcd, which confuses a lot of people.

Check if it still happens (e.g. 2 members and another member joining), and clearly document what it is.