clickhousemetricswrite re-uses closed connections under load #448

Open
Bewalticus opened this issue Nov 7, 2024 · 2 comments

Comments

@Bewalticus

When our Docker standalone setup is under load and the metrics export hits "context deadline exceeded" errors from the DB, the clickhousemetricswrite exporter also starts re-using closed connections:

signoz-otel-collector  | {"level":"info","ts":1730974515.0031528,"caller":"internal/retry_sender.go:118","msg":"Exporting failed. Will retry the request after interval.","kind":"exporter","data_type":"metrics","name":"clickhousemetricswrite","error":"context deadline exceeded","interval":"3.98555793s"}
signoz-otel-collector  | {"level":"info","ts":1730974523.9898968,"caller":"internal/retry_sender.go:118","msg":"Exporting failed. Will retry the request after interval.","kind":"exporter","data_type":"metrics","name":"clickhousemetricswrite","error":"read: read tcp 172.27.0.6:60724->172.27.0.5:9000: use of closed network connection","errorVerbose":"read:\n    github.com/ClickHouse/ch-go/proto.(*Reader).ReadFull\n        /home/runner/go/pkg/mod/github.com/!sig!noz/[email protected]/proto/reader.go:62\n  - read tcp 172.27.0.6:60724->172.27.0.5:9000: use of closed network connection","interval":"7.631739029s"}
signoz-otel-collector  | {"level":"info","ts":1730974536.6228452,"caller":"internal/retry_sender.go:118","msg":"Exporting failed. Will retry the request after interval.","kind":"exporter","data_type":"metrics","name":"clickhousemetricswrite","error":"read: read tcp 172.27.0.6:47404->172.27.0.5:9000: use of closed network connection","errorVerbose":"read:\n    github.com/ClickHouse/ch-go/proto.(*Reader).ReadFull\n        /home/runner/go/pkg/mod/github.com/!sig!noz/[email protected]/proto/reader.go:62\n  - read tcp 172.27.0.6:47404->172.27.0.5:9000: use of closed network connection","interval":"9.057163987s"}
signoz-otel-collector  | {"level":"info","ts":1730974550.6819594,"caller":"internal/retry_sender.go:118","msg":"Exporting failed. Will retry the request after interval.","kind":"exporter","data_type":"metrics","name":"clickhousemetricswrite","error":"read: read tcp 172.27.0.6:36234->172.27.0.5:9000: use of closed network connection","errorVerbose":"read:\n    github.com/ClickHouse/ch-go/proto.(*Reader).ReadFull\n        /home/runner/go/pkg/mod/github.com/!sig!noz/[email protected]/proto/reader.go:62\n  - read tcp 172.27.0.6:36234->172.27.0.5:9000: use of closed network connection","interval":"19.462073548s"}

This effectively blocks ingestion of metrics. Restarting the collector resolves it, but that is not feasible in production.

@srikanthccv
Member

Restarting is not the solution to the problem. The reasons this happens are explained in #247 (comment).

@mannprerak2

Facing a similar issue.

@srikanthccv Do you think reducing the send_batch_size in the batch processor before the clickhousemetricswrite exporter would help here?

I don't see CPU ever going above 80% (averaging 50%).
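
For reference, the change being asked about would look roughly like the fragment below. This is only a sketch: the numeric values are illustrative assumptions, not taken from this thread, and the `otlp` receiver name is a placeholder for whatever receivers the pipeline already uses. Nothing in this thread confirms that a smaller batch avoids the closed-connection errors.

```yaml
processors:
  batch:
    # Illustrative values only (assumption); the upstream default
    # for send_batch_size is 8192.
    send_batch_size: 2000
    # Upper bound on batch size; must be >= send_batch_size when set.
    send_batch_max_size: 2500
    # Flush a partial batch after this long even if it is not full.
    timeout: 5s

service:
  pipelines:
    metrics:
      receivers: [otlp]        # placeholder receiver (assumption)
      processors: [batch]
      exporters: [clickhousemetricswrite]
```

Smaller batches mean each insert carries less data, so a single write may be less likely to exceed the deadline, at the cost of more round trips to the DB.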
