Thanos, Prometheus and Golang version used:
thanos, version 0.37.1 (branch: HEAD, revision: e0812e2f46f81af3324686d910d885d8f2751d46)
build user: root@1294db3510d8
build date: 20241204-08:25:27
go version: go1.23.3
platform: linux/amd64
tags: netgo
(this happened at least until 0.36.1, but I've upgraded to 0.37.1 today)
Object Storage Provider: FILESYSTEM
What happened:
I basically see the following:
where, as a blind guess, the "doubled chunks" (at the same level) may be the same issue as #7488, which was fixed by #7492.
But I haven't been able to confirm that yet.
I do however also get:
Where you can see that every few time spans, the:
level 4, with resolutions 0, 300000 and 3600000
are "shifted" to:
level 5, with resolutions 0, 300000 and 3600000
One of the sources is from sidecar (the non-alternating one, 1 in the lower image); the other is via receive (the non-alternating one, 2 in the lower image).
What you expected to happen:
Well, not alternating?
Anything else we need to know:
Well, I (still) regularly suffer from #7197, i.e. for some reason, compact seems to need more and more space to do its job (despite no additional metrics/instances being scraped).
Things are described there in more detail, but in short, compact uses more and more space for no good reason, then fills the filesystem, doesn't even properly exit with an error, and I have to manually clean everything up (and add extra space).
Maybe something with the levels gets messed up when that happens?
Regarding the alternating chunk levels you are experiencing, the resolutions '0', '300000', and '3600000' are values in milliseconds and relate to different levels of data downsampling in Thanos. Resolution '0' is raw data, '300000' is downsampled to 5-minute intervals, and '3600000' is downsampled to 1-hour intervals [2].
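To make the resolution values concrete: they are durations in milliseconds. The snippet below is only a small sketch of that mapping; the helper name `describe_resolution` is made up for this example and is not part of Thanos.

```python
from datetime import timedelta


def describe_resolution(resolution_ms: int) -> str:
    """Return a human-readable label for a resolution given in milliseconds."""
    if resolution_ms == 0:
        # Resolution 0 means raw, non-downsampled samples.
        return "raw"
    # timedelta renders as H:MM:SS for durations below one day.
    return str(timedelta(milliseconds=resolution_ms))


for res in (0, 300000, 3600000):
    print(res, "->", describe_resolution(res))
```

With the three values from the report this labels them as raw, 5 minutes and 1 hour.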
The issue with alternating levels between '4' and '5' might be related to how Thanos handles compaction and transitions between levels. Thanos uses criteria such as block size, count, and parent range filling to determine when to transition between compaction levels. It also handles gaps and overlaps carefully to avoid improper compaction [3].
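One way to check the alternation outside the UI is to read the `meta.json` of each block directly, which is straightforward with a FILESYSTEM bucket. The following is a rough sketch, not an official tool: it assumes the usual block layout of `<ULID>/meta.json` and the `compaction.level`, `thanos.labels` and `thanos.downsample.resolution` fields, and the function names are invented for this example. Blocks are grouped by external labels and resolution, since the `source` field of already compacted blocks no longer tells sidecar and receive apart.

```python
import json
from pathlib import Path


def load_block_metas(bucket_dir):
    """Read every <block>/meta.json found directly under a filesystem bucket directory."""
    metas = []
    for meta_path in sorted(Path(bucket_dir).glob("*/meta.json")):
        with meta_path.open() as fh:
            metas.append(json.load(fh))
    return metas


def level_timeline(metas):
    """Group blocks by (external labels, resolution); list (minTime, maxTime, level) in time order."""
    groups = {}
    for meta in metas:
        thanos = meta.get("thanos", {})
        labels = tuple(sorted(thanos.get("labels", {}).items()))
        resolution = thanos.get("downsample", {}).get("resolution", 0)
        level = meta.get("compaction", {}).get("level")
        groups.setdefault((labels, resolution), []).append(
            (meta["minTime"], meta["maxTime"], level)
        )
    # Sort each group by time range only, so a missing level cannot break the ordering.
    return {
        key: sorted(rows, key=lambda row: (row[0], row[1]))
        for key, rows in groups.items()
    }
```

Calling `level_timeline(load_block_metas("/path/to/bucket"))` (the path is a placeholder) gives one time-ordered list per stream and resolution, so a 4/5/4/5 pattern should be visible in the third element of each tuple.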
The problem with increasing space usage during compaction could be linked to known issues such as overlapping blocks or large total index sizes, which can prevent proper compaction and lead to increased space usage [4].
The 'sidecar' and 'receive' components have distinct roles in data ingestion and storage, which might affect how data is compacted. The 'sidecar' uploads blocks from Prometheus to object storage, while the 'receive' ingests samples from remote Prometheus instances. Both components upload data to object storage for the compactor to process, but they do not perform compaction themselves [5][6].
To address these issues, you might want to investigate the specific configurations and interactions between these components and the compactor, as well as any potential overlaps or large index sizes in your data blocks.
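As a starting point for the overlap check, here is a minimal sketch that flags blocks with overlapping time ranges. It assumes the block time range is half-open, i.e. `[minTime, maxTime)`, and that it is run on blocks of a single stream (same external labels and resolution), because blocks from different streams covering the same time are expected. `find_overlaps` is a name made up for this example.

```python
def find_overlaps(blocks):
    """Return pairs of block IDs whose half-open [min_time, max_time) ranges overlap.

    blocks: iterable of (ulid, min_time, max_time) tuples from one stream.
    """
    ordered = sorted(blocks, key=lambda b: (b[1], b[2]))
    overlaps = []
    for i, (ulid_a, _, max_a) in enumerate(ordered):
        for ulid_b, min_b, _ in ordered[i + 1:]:
            if min_b >= max_a:
                # Sorted by min_time, so no later block can overlap block a.
                break
            overlaps.append((ulid_a, ulid_b))
    return overlaps
```

Two blocks with an identical time range are reported as a pair as well, which may be what the "doubled" entries at the same level correspond to, though that is only a guess.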
Environment:
PRETTY_NAME="Debian GNU/Linux 12 (bookworm)"
NAME="Debian GNU/Linux"
VERSION_ID="12"
VERSION="12 (bookworm)"
VERSION_CODENAME=bookworm
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"