
tsdb: SeriesIterator.Seek() is vaguely defined #5871

@free


Context: I was looking into cleaning up the iterator code because I'm trying to add a separate SeekBefore() method. It would return the value at time t as defined by Prometheus, without always parsing and iterating over a whole chunk covering the previous 5 minutes (which is what happens when the looked-up value is toward the beginning of a chunk).
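To make the intended semantics concrete, here is a minimal sketch of "the value at time t": the most recent sample at or before t that is no older than the lookback window. The `sample` type, the `seekBefore` function and its `lookback` parameter are hypothetical names used only for illustration and are not part of tsdb. A sorted slice stands in for decoded chunk data, so this shows the result SeekBefore() should produce, not the chunk-level optimization.

```go
package main

import "sort"

// sample is a hypothetical (timestamp, value) pair; tsdb's own types differ.
type sample struct {
	t int64
	v float64
}

// seekBefore returns the last sample whose timestamp lies in [t-lookback, t],
// or false if there is none. samples must be sorted by timestamp.
func seekBefore(samples []sample, t, lookback int64) (sample, bool) {
	// Index of the first sample with a timestamp strictly greater than t.
	i := sort.Search(len(samples), func(i int) bool { return samples[i].t > t })
	if i == 0 {
		return sample{}, false // every sample is after t
	}
	s := samples[i-1]
	if s.t < t-lookback {
		return sample{}, false // newest candidate is older than the lookback window
	}
	return s, true
}
```

With samples at timestamps 10, 20 and 30, `seekBefore(samples, 25, 300)` yields the sample at 20, while `seekBefore(samples, 5, 300)` and `seekBefore(samples, 400, 300)` both report that no value exists.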

However, I ran into some issues regarding how chunked series and iterators should actually work. There are a few gray areas that are only partially covered by comments or test code, so I could not work out what the intended behavior is:

  1. blockQuerier and the resulting chunkedSeries and chunkSeriesIterator all have mint and maxt fields, which (along with (*chunkSeriesIterator).Seek() implementation details) suggest that iterators should only return samples with timestamps mint <= t <= maxt. However, there are bugs (such as the one I'm attempting to fix in prometheus-junkyard/tsdb#327, "Seek() shouldn't return true when past maxt. Also add some tests to s…") and even confusing/confused unit tests, which appear to create an iterator with mint = 5, maxt = 8 and expect (*chunkSeriesIterator).Seek() to return a value with timestamp 3.

    Are mint and maxt supposed to be "hard" limits on the respective iterators or only on which chunks are selected?

  2. I gather (from the implementation again) that SeriesIterator.Seek() is only supposed to seek forward (which doesn't work as expected, as it will also seek backward within the same chunk). Is this assumption correct?
  3. Is Seek() supposed to always advance the iterator or should a second Seek() call with the same timestamp as the immediately preceding one leave the iterator positioned where it was before?
  4. Is it actually a good idea for a Seek() with a timestamp t < mint to ever succeed? If so, a poorly defined series, e.g. one covering only [now - 10m, now], will "successfully" produce a value of the series at now - 1y equal to the first value after now - 10m.
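For the sake of discussion, the questions above can be pinned down with a toy iterator that picks one answer to each. This is only a sketch of one possible reading, not the current tsdb behavior: `sample`, `boundedIterator` and `newBoundedIterator` are hypothetical names, and a sorted slice replaces real chunks. The choices made here are: mint and maxt are hard limits (1), Seek() never moves backward (2), repeating a Seek() leaves the position unchanged (3), and a Seek() before mint is clamped to mint (4), where returning false would be the obvious alternative.

```go
package main

// sample is a hypothetical (timestamp, value) pair, not a tsdb type.
type sample struct {
	t int64
	v float64
}

// boundedIterator is a toy stand-in for chunkSeriesIterator.
type boundedIterator struct {
	samples    []sample // sorted by timestamp
	cur        int      // index of the current sample; -1 before the first Seek
	mint, maxt int64
}

func newBoundedIterator(samples []sample, mint, maxt int64) *boundedIterator {
	return &boundedIterator{samples: samples, cur: -1, mint: mint, maxt: maxt}
}

// Seek positions the iterator at the first sample with timestamp >= t that
// also lies within [mint, maxt]. It reports whether such a sample exists.
func (it *boundedIterator) Seek(t int64) bool {
	if t < it.mint {
		t = it.mint // question 4: clamp instead of failing (an assumption)
	}
	// Start from the current sample, so that Seek never moves backward
	// (question 2) and a repeated Seek(t) does not advance (question 3).
	i := it.cur
	if i < 0 {
		i = 0
	}
	for i < len(it.samples) && it.samples[i].t < t {
		i++
	}
	// Question 1: maxt is treated as a hard limit on returned samples.
	if i >= len(it.samples) || it.samples[i].t > it.maxt {
		it.cur = len(it.samples) // exhausted; later Seeks also fail
		return false
	}
	it.cur = i
	return true
}

// At returns the current sample. It is only valid after Seek returned true.
func (it *boundedIterator) At() (int64, float64) {
	s := it.samples[it.cur]
	return s.t, s.v
}
```

Under this reading, an iterator over timestamps 1 through 10 with mint = 5 and maxt = 8 answers Seek(3) with the sample at timestamp 5 rather than 3, and Seek(9) returns false.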

I'm glad to do the work (including properly documenting the code) regardless of what the answers are; I'd just like to get some "authoritative" input. Thanks.
