Is it expected that Thanos Compactor has consistently high Read IOPS/Throughput? #8177
Right now we are running into a problem where the Compactor is maxing out our configured EFS throughput on read operations. As far as we can tell, this is happening essentially 24/7.

Part of me thinks this is due to the compactor regularly crashing silently; however, I have not found any evidence of that. I do not get any of the expected error or warn logs, nor do I ever see the halt metric at 1 in Grafana. The only indication that something is wrong is that, according to Grafana/Prometheus, the compactor keeps going "down".

Is the above normal? If so, then our throughput limits must be set too low. Is there any documentation on the expected minimum EFS/disk throughput?
Replies: 2 comments
Hey @irgama! I'm here to help you with any bugs, questions, or contributions. Let's see what's going on with Thanos Compactor together.

There isn't specific documentation on the expected minimum EFS or disk throughput for Thanos Compactor. However, consistently high read IOPS and throughput could be influenced by several factors, including the configuration of the compactor and the performance of the underlying storage system.

To diagnose silent crashes or high read operations, you can monitor specific metrics and logs provided by Thanos Compactor. Key metrics include [...]

Additionally, Thanos Compactor interacts with several dependencies that could affect its performance, such as object storage and metadata fetchers. Any latency or issues with these services can impact the compactor's performance [1].

If the compactor is going "down" intermittently without logging errors, it might be worth checking the network and context management settings, as well as the concurrency settings, which can influence performance [1].

Given the high utilization you're experiencing, it might be necessary to review and potentially increase your EFS throughput limits to accommodate the compactor's demands.
For anyone reading this in the future: my problem was that the Thanos container was repeatedly restarting and appearing to fail silently. There were no error logs in either Kubernetes or the container.

I have since moved the Thanos container to EBS storage and raised the default limits from 100m CPU and 100MiB of memory to 1 CPU and 2Gi. That has allowed the container to work.

Also, if you want to check whether the Thanos compactor is working: if, in Grafana, "thanos_compact_todo_compaction_blocks" never goes down, it is not working.
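The "never goes down" check can also be applied outside a dashboard. Below is a minimal sketch, assuming you have already exported the values of "thanos_compact_todo_compaction_blocks" from Prometheus as a plain list of numbers in time order. The function names and the `min_samples` threshold are hypothetical, chosen only for illustration; they are not part of Thanos or Prometheus.

```python
from typing import Sequence


def backlog_ever_decreases(samples: Sequence[float]) -> bool:
    """Return True if any sample is lower than the sample just before it."""
    return any(later < earlier for earlier, later in zip(samples, samples[1:]))


def compactor_looks_stuck(samples: Sequence[float], min_samples: int = 3) -> bool:
    """Heuristic from the post: a backlog that never goes down means no progress.

    `samples` are values of thanos_compact_todo_compaction_blocks in time order.
    `min_samples` is an assumed threshold to avoid judging from too little data.
    """
    if len(samples) < min_samples:
        return False  # not enough data to say anything
    if max(samples) == 0:
        return False  # nothing was ever queued, so there is nothing to compact
    return not backlog_ever_decreases(samples)
```

For example, `compactor_looks_stuck([5, 5, 6, 8])` reports a stuck compactor, while a series such as `[5, 4, 6]` does not, because the backlog dropped at least once. Choose a time window long enough to cover at least one expected compaction run; a short window can make a healthy compactor look stuck.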