Skip to content

distributor: Expermental support for producing ingest-storage records in format V2 #12060

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 21 commits into
base: main
Choose a base branch
from

Conversation

alexweav
Copy link
Contributor

@alexweav alexweav commented Jul 11, 2025

What this PR does

This PR adds experimental support for ingest-storage.kafka.producer-record-version=2. Consumer side support exists already, this allows toggling producers to write the new format.

This implements translation of RW1.0 to the new RW2.0-esque warpstream wire format.

We are not considering the format 100% stabilized, still. We continue to reserve the right to make backward-incompatible changes to the format, particularly the common symbols space.

In order to keep the PR size reasonable (it's already big), this PR omits a few key optimizations from main...alexweav/experiment-rw-v2 and focuses more on the plain implementation of the format. Namely, we don't yet pool the RW2.0 WriteRequest fields yet, similar to PreallocTimeseries, nor do we use yoloStrings when marshalling. These will be saved for a future PR.

The symbols table is an exception - it has been optimized to be significantly faster than the Prometheus implementation, with near-zero allocations.

Which issue(s) this PR fixes or relates to

contrib github.com/grafana/mimir-squad/issues/2253

Checklist

  • Tests updated.
  • Documentation added.
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]. If changelog entry is not needed, please add the changelog-not-needed label to the PR.
  • about-versioning.md updated with experimental features.

}

func BenchmarkSymbolizer(b *testing.B) {
b.Run("prom symbolizer: 10k labels unique values", func(b *testing.B) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Individual sends from distributors to ingesters typically have 7 series, each with say 20 labels, so 280 strings total. That's the sort of ball-park we should be benchmarking.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree on few series per per-partition write request. Let's say max_samples_per_send is 2000 and requests are full. Then we have per-partition write requests within this range:

  • A tenant with shard size = 1: 2000 symbols in the per-partition write request
  • A tenant with shard size = all ingesters: with 100 partitions that's 200 samples per per-partition write request, with 300 partitions (which is pretty large scale) that's 7, the number Bryan mentioned

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha, thank you for the context.

I want to move the benchmark up a level, using one benchmark to cover the whole flow for writes including both marshalling and unmarshalling. When I do that, I'll also make the request look more realistic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated this benchmark to be more realistic (a case for both bryan's and marco's scenarios). It's also not coupled to the implementation so much anymore, thus easier to use with benchstat and iterate on.

@alexweav alexweav force-pushed the alexweav/rwv2-symbolizer branch from e4581bb to 66c02a9 Compare July 17, 2025 22:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants