some distributed_logs_v2 queries get low performance #6700

Open
kobecal opened this issue Dec 23, 2024 · 1 comment

Comments

@kobecal
Contributor

kobecal commented Dec 23, 2024

In what situation are you experiencing subpar performance?

With the same query builder params, some queries in my dashboard time out after upgrading to the new ClickHouse schema. Here is an example.

How to reproduce

My query builder params:
{
  "start": 1734315554000,
  "end": 1734920354000,
  "step": 1980,
  "variables": {},
  "formatForWeb": false,
  "compositeQuery": {
    "queryType": "builder",
    "panelType": "graph",
    "fillGaps": false,
    "builderQueries": {
      "A": {
        "dataSource": "logs",
        "queryName": "A",
        "aggregateOperator": "count",
        "aggregateAttribute": { "dataType": "", "id": "------false", "isColumn": false, "isJSON": false, "key": "", "type": "" },
        "timeAggregation": "rate",
        "spaceAggregation": "sum",
        "functions": [],
        "filters": {
          "items": [
            {
              "id": "535d4714",
              "key": { "dataType": "string", "id": "service.name--string--resource--true", "isColumn": true, "isJSON": false, "key": "service.name", "type": "resource" },
              "op": "=",
              "value": "my-service"
            }
          ],
          "op": "AND"
        },
        "expression": "B",
        "disabled": true,
        "stepInterval": 43200,
        "having": [],
        "limit": null,
        "orderBy": [],
        "groupBy": [
          { "dataType": "string", "isColumn": true, "isJSON": false, "key": "event.name", "type": "tag", "id": "event.name--string--tag--true" }
        ],
        "legend": "",
        "reduceTo": "avg"
      }
    }
  }
}

Generated ClickHouse query with schema v1 (distributed_logs):

SELECT
    toStartOfInterval(fromUnixTimestamp64Nano(timestamp), INTERVAL 43200 SECOND) AS ts,
    attribute_string_event$$name as event.name,
    toFloat64(count(*)) as value
from signoz_logs.distributed_logs
where (timestamp >= 1734315554000000000 AND timestamp <= 1734920354000000000)
    AND resource_string_service$$name = 'my-service'
    AND attribute_string_event$$name_exists=true
group by event.name,ts
order by value DESC

Generated ClickHouse query with schema v2 (distributed_logs_v2):

SELECT
    toStartOfInterval(fromUnixTimestamp64Nano(timestamp), INTERVAL 43200 SECOND) AS ts,
    attribute_string_event$$name as event.name,
    toFloat64(count(*)) as value
from signoz_logs.distributed_logs_v2
where (timestamp >= 1734315554000000000 AND timestamp <= 1734920354000000000)
    AND (ts_bucket_start >= 1734315554 AND ts_bucket_start <= 1734920354)
    AND attribute_string_event$$name_exists=true
    AND (resource_fingerprint GLOBAL IN (
        SELECT fingerprint
        FROM signoz_logs.distributed_logs_v2_resource
        WHERE (seen_at_ts_bucket_start >= 1734315554)
            AND (seen_at_ts_bucket_start <= 1734920354)
            AND simpleJSONExtractString(labels, 'service.name') = 'my-service'
            AND labels like '%service.name%my-service%'
    ))
group by event.name,ts
order by value DESC
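
For anyone comparing the two queries, this small sketch shows how the time bounds in their WHERE clauses line up with the `start`, `end`, and `stepInterval` values in the params above. The helper name `time_bounds` is made up for illustration. The arithmetic only reproduces the numbers visible in this issue and is not taken from the query builder's source.

```python
# Values copied from the query builder params in this issue.
START_MS = 1734315554000
END_MS = 1734920354000
STEP_INTERVAL_S = 43200


def time_bounds(start_ms: int, end_ms: int) -> dict:
    """Hypothetical helper: derive the bounds that appear in the generated WHERE clauses."""
    return {
        # `timestamp` is compared in nanoseconds in both the v1 and v2 queries.
        "timestamp_ns": (start_ms * 1_000_000, end_ms * 1_000_000),
        # `ts_bucket_start` and `seen_at_ts_bucket_start` are compared in seconds (v2 only).
        "ts_bucket_start_s": (start_ms // 1000, end_ms // 1000),
    }


bounds = time_bounds(START_MS, END_MS)
window_s = (END_MS - START_MS) // 1000
intervals = window_s // STEP_INTERVAL_S

print(bounds)
print(window_s, intervals)
```

The window is 604800 seconds (7 days), which is 14 step intervals of 43200 seconds. Both queries cover the same time range, so any difference in rows scanned would come from the extra `ts_bucket_start` and `resource_fingerprint` conditions rather than from the window itself.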

This issue is easy to reproduce because the query generation strategy is incorrect.

@srikanthccv
Member

> This issue is easy to reproduce because the query generation strategy is incorrect.

There is no issue with the generation strategy. Please share how long the query took earlier and how many rows it scanned. Share the same numbers for the v2 query, but run the queries directly from the ClickHouse console.
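
One possible way to collect those numbers after running both queries from the console is to look them up in ClickHouse's `system.query_log` table, which records `query_duration_ms`, `read_rows`, and `read_bytes` per query. The sketch below only builds the lookup statement. The function name `query_log_lookup` and the search fragment are illustrative assumptions. It also assumes query logging is enabled. On a multi-node cluster the log is kept per node, so the lookup may need to run on the node that served the query.

```python
def query_log_lookup(needle: str, limit: int = 5) -> str:
    """Hypothetical helper: build a system.query_log lookup for queries containing `needle`."""
    # Escape backslashes and single quotes so the fragment is a valid SQL string literal.
    escaped = needle.replace("\\", "\\\\").replace("'", "\\'")
    return (
        "SELECT event_time, type, query_duration_ms, read_rows, read_bytes,\n"
        "       substring(query, 1, 120) AS query_head\n"
        "FROM system.query_log\n"
        # Include failed queries too, since the dashboard queries reportedly time out.
        "WHERE type IN ('QueryFinish', 'ExceptionWhileProcessing')\n"
        f"  AND position(query, '{escaped}') > 0\n"
        # Skip lookups against the log itself, which also contain the fragment.
        "  AND position(query, 'system.query_log') = 0\n"
        "ORDER BY event_time DESC\n"
        f"LIMIT {limit}"
    )


# A fragment shared by the v1 and v2 queries above; query_head tells them apart.
print(query_log_lookup("attribute_string_event$$name"))
```

Paste the printed statement into the ClickHouse console after running the v1 and v2 queries. The `query_head` column shows whether a row belongs to `distributed_logs` or `distributed_logs_v2`.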
