[Receive] New Tenant is not queryable by receiver #7892
Comments
I found a similar open issue that might be related to your problem: "Thanos query don't query all metrics on a receiver". In that issue, the solution was to configure the querier to query all pods in the receiver ring through a headless service, specifying each pod's endpoint in the configuration rather than the service endpoint. Additionally, they adjusted the […]

For your specific issue, it seems the new tenant is not being registered for querying without a restart. The Thanos receiver should detect and register new tenants dynamically, with no restart required, by provisioning an independent TSDB for each new tenant detected via the tenant HTTP header [1]. If this is not happening, it might be a regression or a configuration issue in the latest main branch. You might want to check that the configuration flags related to tenant registration and queryability are correctly set, such as […]
This incident occurred again today. After digging deeper, we found that it can be reproduced when multiple tenants are added simultaneously, which triggers a race condition introduced by this PR: #7782
I have a fix in #7941, which includes a unit test that reproduces the race condition.
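For readers unfamiliar with this class of bug, here is a minimal sketch of how adding several tenants at once can leave one of them invisible to queries. It is an illustration only, not the Thanos code: `TenantRegistry`, `publish_from`, and `add_tenant` are hypothetical names, and the thread does not say whether the race from #7782 has this exact shape. The sketch assumes a copy-on-write tenant map where a writer copies the current map, adds its tenant, and swaps the new map in.

```python
import threading


class TenantRegistry:
    """Hypothetical copy-on-write tenant registry; not the Thanos implementation."""

    def __init__(self):
        self._tenants = {}              # snapshot that queries would read
        self._lock = threading.Lock()

    def snapshot(self):
        return self._tenants

    def publish_from(self, base, tenant):
        # Unsafe on its own: extends a snapshot that may already be stale,
        # then swaps it in, discarding anything published in between.
        updated = dict(base)
        updated[tenant] = f"tsdb-{tenant}"
        self._tenants = updated

    def add_tenant(self, tenant):
        # Safe: the read, copy, and swap happen as one step under the lock.
        with self._lock:
            self.publish_from(self._tenants, tenant)


# Deterministic replay of the bad interleaving: both writers read the
# same empty snapshot before either one publishes.
racy = TenantRegistry()
base_a = racy.snapshot()
base_b = racy.snapshot()
racy.publish_from(base_a, "tenant-a")
racy.publish_from(base_b, "tenant-b")   # overwrites the map containing tenant-a
lost_update = sorted(racy.snapshot())

# With the lock, concurrent registrations all survive.
safe = TenantRegistry()
threads = [
    threading.Thread(target=safe.add_tenant, args=(f"tenant-{i}",))
    for i in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
registered = sorted(safe.snapshot())

print(lost_update)       # ['tenant-b']
print(len(registered))   # 8
```

In the unsafe replay, "tenant-a" is created but missing from the map that readers see, which would match the reported symptom of head metrics appearing for a tenant that cannot be queried.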
We are testing the latest Thanos main branch and found a regression that did not exist in v0.36.
In a running Thanos receiver cluster, we start a new tenant called "eng-host-networking". The TSDB head metric for that tenant starts to appear, but none of the tenant's metrics are queryable until the receiver cluster is restarted.
How to repro:
prometheus_tsdb_head_series{tenant="<new tenant>"}
Thanos, Prometheus and Golang version used:
Thanos: v0.37.0-dev
Golang: v1.23
Object Storage Provider:
What happened:
What you expected to happen:
How to reproduce it (as minimally and precisely as possible):
Full logs to relevant components:
Anything else we need to know: