The difference between the raw metrics and downsampling metrics #7800

Open
anarcher opened this issue Oct 7, 2024 · 1 comment

anarcher commented Oct 7, 2024

Thanos, Prometheus and Golang version used:
thanos:0.36.1

Object Storage Provider: S3

What happened:
The raw metrics and the downsampled metrics differ, as shown below. (I couldn't see any particular issues in compaction.) Is there a reason for this difference? Is there a specific area I should check?
image

image

For kube_pod_info, the following "skip series" warning was logged:

ts=2024-10-06T11:38:49.185869258Z caller=streamed_block_writer.go:116 level=warn msg="empty chunks happened, skip series" series="{__cluster__='prod-kr-a-k8s', __name__='kube_pod_info', __replica__='prometheus-agent-k8s-thanos-0', cluster='prod-kr-a', container='kube-rbac-proxy-main', created_by_kind='Workflow', created_by_name='sync-ehr-1727999700', env='prod', host_ip='10.128.91.30', host_network='false', instance='10.128.72.3:8443', job='kube-state-metrics', namespace='katalog', node='ip-10-128-91-30.ap-northeast-2.compute.internal', pod='sync-ehr-1727999700-hook-621784931', pod_ip='10.128.91.196', priority_class='default', prometheus='addon-monitoring/agent-k8s-thanos', region='kr', role='service', uid='839e9db9-035d-4c4b-854a-e6862a7ece28'}"

Running thanos tools bucket verify does not report any issues for the downsampled block:

thanos tools bucket verify --objstore.config-file=./cfg/thanos-p01.yaml --id=01J9B6AC49SWMBZRE5G4Q333EK --issues=index_known_issues
ts=2024-10-06T12:45:59.363563Z caller=factory.go:53 level=info msg="loading bucket configuration"
ts=2024-10-06T12:45:59.36678Z caller=verify.go:138 level=info verifiers=index_known_issues msg="Starting verify task"
ts=2024-10-06T12:45:59.366804Z caller=index_issue.go:33 level=info verifiers=index_known_issues verifier=index_known_issues msg="started verifying issue" with-repair=false
ts=2024-10-06T12:46:03.995076Z caller=fetcher.go:623 level=info component=block.BaseFetcher msg="successfully synchronized block metadata" duration=4.628019583s duration_ms=4628 cached=1438 returned=244 partial=0
ts=2024-10-06T13:18:32.289597Z caller=index_issue.go:75 level=info verifiers=index_known_issues verifier=index_known_issues msg="verified issue" with-repair=false
ts=2024-10-06T13:18:32.295277Z caller=verify.go:157 level=info verifiers=index_known_issues msg="verify task completed"
ts=2024-10-06T13:18:32.377922Z caller=main.go:174 level=info msg=exiting

What you expected to happen:
The raw data and the downsampled data should show a similar trend.

How to reproduce it (as minimally and precisely as possible):

Full logs to relevant components:

Anything else we need to know:


dosubot bot commented Oct 7, 2024

Hello, @anarcher! I'm here to help with any bugs, questions, or contributions you have. Let's tackle this issue together!

The difference between the raw and downsampled metrics in Thanos could be caused by empty chunks in the series, as the "empty chunks happened, skip series" warning for the kube_pod_info series suggests. During downsampling, Thanos skips empty XOR chunks, which can lead to discrepancies between raw and downsampled data [1].

To address this, you should check for the presence of empty chunks in the kube_pod_info series and ensure that the data is correctly populated before downsampling. This might involve verifying the data ingestion process or checking for any issues in the data source that could lead to empty chunks being created. Additionally, ensure there are no overlapping blocks with the same external labels and time periods, as this can also affect the downsampling process [2].
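
One way to gauge how many series were skipped is to tally the "empty chunks happened, skip series" warnings per metric name. The following is a minimal Python sketch, assuming the logs keep the logfmt shape shown in the issue (a `msg="..."` field and a `series="{...}"` field containing `__name__='...'`). The helper name `count_skipped_series` is hypothetical and not part of Thanos.

```python
import re
from collections import Counter

# Marker taken from the warning quoted in the issue.
SKIP_MSG = 'msg="empty chunks happened, skip series"'
# Assumption: the metric name appears as __name__='...' inside the series field.
NAME_RE = re.compile(r"__name__='([^']*)'")


def count_skipped_series(lines):
    """Count "skip series" warnings per metric name in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        if SKIP_MSG not in line:
            continue  # not a skip-series warning
        match = NAME_RE.search(line)
        # Lines without a recognizable __name__ label are grouped together.
        counts[match.group(1) if match else "<unknown>"] += 1
    return counts
```

Passing an open log file to the helper (any iterable of lines works) and printing `counts.most_common()` shows whether the warning is limited to a few short-lived kube_pod_info series or affects many metrics, which may help narrow down where the gap between raw and downsampled data comes from.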

To continue talking to Dosu, mention @dosu.

anarcher changed the title from "difference between the raw metrics and downsampling metrics" to "The difference between the raw metrics and downsampling metrics" on Oct 7, 2024