Closed
Description
TSDB doesn't allow appending samples whose timestamp is older than the last block cut from the head. A block is cut from the head up to 50% of the block range behind the max timestamp within the head, and the default block range period is 2h, so the blocks storage rejects any sample whose timestamp is more than 1h older than the most recent timestamp in the head.
Let's consider this scenario:
- Multiple Prometheus servers remote writing to the same Cortex tenant
- Some Prometheus servers stop remote writing to Cortex (for any reason, e.g. a networking issue) and fall more than 1h behind
- When these Prometheus servers come back online, Cortex discards any of their samples whose timestamp is more than 1h old, because the max timestamp in the TSDB head is close to "now" (the healthy Prometheus servers never stopped writing series) while the recovering ones are still catching up on samples older than 1h
We recently had an outage in our staging environment which triggered this condition and we should find a way to solve it.
@bwplotka You may be interested, given I think this issue affects Thanos receive too.
Submitted by: pracucci
Cortex Issue Number: 2366