distributor: Experimental support for producing ingest-storage records in format V2 #12060

alexweav · 2025-07-11T22:31:15Z

What this PR does

This PR adds experimental support for ingest-storage.kafka.producer-record-version=2. Consumer-side support already exists; this change lets producers be toggled to write the new format.
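As a rough illustration of what accepting the new value could look like, here is a minimal sketch of flag registration and validation. Only the flag name comes from this PR; the `producerConfig` type, the function names, the default of 1, and the accepted set {1, 2} are assumptions made for the example, not Mimir's actual code.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
)

// producerConfig is a hypothetical holder for the producer setting.
type producerConfig struct {
	RecordVersion int
}

var errUnsupportedVersion = errors.New("unsupported producer record version")

// registerFlags wires the setting to the flag name used in this PR.
// The default value of 1 is an assumption for this sketch.
func (c *producerConfig) registerFlags(fs *flag.FlagSet) {
	fs.IntVar(&c.RecordVersion, "ingest-storage.kafka.producer-record-version", 1,
		"Record format version written by producers (2 is experimental).")
}

// validate accepts only the versions this sketch knows about (assumed: 1 and 2).
func (c *producerConfig) validate() error {
	switch c.RecordVersion {
	case 1, 2:
		return nil
	default:
		return fmt.Errorf("%w: %d", errUnsupportedVersion, c.RecordVersion)
	}
}

// parseProducerConfig parses args and validates the resulting config.
func parseProducerConfig(args []string) (producerConfig, error) {
	var cfg producerConfig
	fs := flag.NewFlagSet("sketch", flag.ContinueOnError)
	cfg.registerFlags(fs)
	if err := fs.Parse(args); err != nil {
		return cfg, err
	}
	return cfg, cfg.validate()
}
```

Calling `parseProducerConfig([]string{"-ingest-storage.kafka.producer-record-version=2"})` would return a config with `RecordVersion` set to 2, while an unknown version is rejected at validation time.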

This implements translation of RW1.0 to the new RW2.0-esque warpstream wire format.

We do not yet consider the format fully stabilized. We continue to reserve the right to make backward-incompatible changes to it, particularly to the common symbols space.

To keep the PR size reasonable (it's already big), this PR omits a few key optimizations from main...alexweav/experiment-rw-v2 and focuses on the plain implementation of the format. Namely, we don't yet pool the RW2.0 WriteRequest fields, similar to PreallocTimeseries, nor do we use yoloStrings when marshalling. These will be saved for a future PR.

The symbols table is an exception - it has been optimized to be significantly faster than the Prometheus implementation, with near-zero allocations.
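For readers unfamiliar with the idea, the following is a minimal sketch of a pooled string-interning table, not the implementation in this PR. All type and function names are hypothetical. It follows the Remote Write 2.0 convention that the empty string sits at index 0, and it leaves out the offset and common-symbols handling that this PR adds.

```go
package main

import "sync"

// symbolTable interns strings and hands out uint32 references into symbols.
type symbolTable struct {
	refs    map[string]uint32
	symbols []string
}

// symbolTablePool lets tables (and their backing map and slice) be reused
// across requests to avoid reallocating them each time.
var symbolTablePool = sync.Pool{
	New: func() interface{} {
		return &symbolTable{
			refs:    make(map[string]uint32, 256),
			symbols: make([]string, 0, 256),
		}
	},
}

// getSymbolTable returns an empty table whose index 0 is the empty string.
func getSymbolTable() *symbolTable {
	tbl := symbolTablePool.Get().(*symbolTable)
	tbl.reset()
	return tbl
}

// putSymbolTable returns the table to the pool. The caller must no longer
// use tbl.symbols after this call, e.g. only call it after serialization.
func putSymbolTable(tbl *symbolTable) {
	symbolTablePool.Put(tbl)
}

func (s *symbolTable) reset() {
	for k := range s.refs {
		delete(s.refs, k)
	}
	s.symbols = s.symbols[:0]
	s.symbolize("")
}

// symbolize returns the reference for str, adding it to the table if needed.
func (s *symbolTable) symbolize(str string) uint32 {
	if ref, ok := s.refs[str]; ok {
		return ref
	}
	ref := uint32(len(s.symbols))
	s.symbols = append(s.symbols, str)
	s.refs[str] = ref
	return ref
}

// symbolizeLabels appends name/value reference pairs for each label to buf.
func (s *symbolTable) symbolizeLabels(pairs [][2]string, buf []uint32) []uint32 {
	for _, p := range pairs {
		buf = append(buf, s.symbolize(p[0]), s.symbolize(p[1]))
	}
	return buf
}
```

The design point worth noting is ownership: the pooled slice stays valid only until `putSymbolTable` is called, so in a flow like the one in this PR it would be returned only once the request has been serialized.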

Which issue(s) this PR fixes or relates to

contrib github.com/grafana/mimir-squad/issues/2253

Checklist

Tests updated.
Documentation added.
CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]. If changelog entry is not needed, please add the changelog-not-needed label to the PR.
about-versioning.md updated with experimental features.

bboreham · 2025-07-15T15:29:12Z

pkg/mimirpb/symbols_test.go

+}
+
+func BenchmarkSymbolizer(b *testing.B) {
+	b.Run("prom symbolizer: 10k labels unique values", func(b *testing.B) {


Individual sends from distributors to ingesters typically have 7 series, each with say 20 labels, so 280 strings total. That's the sort of ball-park we should be benchmarking.

Agreed on few series per per-partition write request. Let's say max_samples_per_send is 2000 and requests are full. Then per-partition write requests fall within this range:

A tenant with shard size = 1: 2000 symbols in the per-partition write request

A tenant with shard size = all ingesters: with 100 partitions that's 20 samples per per-partition write request (2000 / 100); with 300 partitions (which is pretty large scale) that's about 7, the number Bryan mentioned
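The range above comes from dividing a full send by the number of partitions it is spread across. A small sketch of that arithmetic, assuming samples are spread evenly across partitions and rounding up (the function name is hypothetical):

```go
package main

// samplesPerPartition estimates how many samples land in each per-partition
// write request when a full send of maxSamplesPerSend samples is spread
// evenly across shardPartitions partitions. The result is rounded up.
func samplesPerPartition(maxSamplesPerSend, shardPartitions int) int {
	if shardPartitions <= 0 {
		return 0
	}
	return (maxSamplesPerSend + shardPartitions - 1) / shardPartitions
}
```

With max_samples_per_send at 2000, one partition gives 2000 and 300 partitions give 7, matching the two ends of the range.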

Gotcha, thank you for the context.

I want to move the benchmark up a level, using one benchmark to cover the whole flow for writes including both marshalling and unmarshalling. When I do that, I'll also make the request look more realistic.

I've updated this benchmark to be more realistic (with a case for each of Bryan's and Marco's scenarios). It's also less coupled to the implementation now, which makes it easier to use with benchstat and iterate on.
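A benchmark shaped along these lines might look like the sketch below. It is not the benchmark from this PR: the helper names are hypothetical, the map-based `mapIntern` is only a stand-in for whichever symbolizer is under test, and treating the 2000-sample case as 2000 series with one sample each is an assumption. The small case mirrors the 7 series with 20 labels (280 strings) mentioned earlier.

```go
package main

import (
	"fmt"
	"testing"
)

// requestStrings builds the label name and value strings for a synthetic
// write request. Names repeat across series; values are unique per series.
func requestStrings(series, labelsPerSeries int) []string {
	out := make([]string, 0, series*labelsPerSeries*2)
	for s := 0; s < series; s++ {
		for l := 0; l < labelsPerSeries; l++ {
			out = append(out, fmt.Sprintf("label_%d", l))
			out = append(out, fmt.Sprintf("value_%d_%d", s, l))
		}
	}
	return out
}

// internFunc is the only thing the benchmark knows about an implementation:
// it interns all strings and returns the number of distinct symbols.
type internFunc func(strs []string) int

// mapIntern is a naive stand-in implementation used for illustration.
func mapIntern(strs []string) int {
	seen := make(map[string]uint32, len(strs))
	for _, s := range strs {
		if _, ok := seen[s]; !ok {
			seen[s] = uint32(len(seen))
		}
	}
	return len(seen)
}

// benchmarkIntern runs one sub-benchmark per request shape.
func benchmarkIntern(b *testing.B, intern internFunc) {
	scenarios := []struct {
		name   string
		series int
	}{
		{"7 series x 20 labels", 7},
		{"2000 series x 20 labels", 2000},
	}
	for _, sc := range scenarios {
		strs := requestStrings(sc.series, 20)
		b.Run(sc.name, func(b *testing.B) {
			b.ReportAllocs()
			for i := 0; i < b.N; i++ {
				intern(strs)
			}
		})
	}
}
```

In a `_test.go` file this would be driven by something like `func BenchmarkIntern(b *testing.B) { benchmarkIntern(b, mapIntern) }`. Because the sub-benchmark names depend only on the request shape, results from two implementations can be compared directly with benchstat.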

bboreham reviewed Jul 15, 2025


alexweav added 21 commits July 17, 2025 17:48

First take on symbolizer

2e74d12

Pool symbol slices

455aef0

convert request skeleton, samples, histograms, labels

329e4af

Exemplars support

79183e2

Metadata support

57803f1

work in terms of WriteRequest

67789df

v2 serializer

168c677

Build a separate req instead of destroying the given one

cf4cfba

Implement support for offset on producer side

8447c57

resolve common symbols

b07004d

unit tests for offset and common symbols

12001c4

use prom functions to avoid builtins on type alias issues

2d4abd2

tighten tests, fix intermittently failing assertion

188405e

Basic benchmarks for serialization

cdcae9b

license headers

987e34d

More reusable, realistic symbols table benchmarks

7344ad5

Get rid of cachedSymbols optimization that is never hit outside of tests

4fefa3a

make ownership way more clear, fix freed slices being used after returned to pools

2c4bf1c

re-use the slice when we are actually done with it, post serialization

a3e7f43

another missing license header

d3ebcdf

Allow record version 2 to pass arg validation

66c02a9

alexweav force-pushed the alexweav/rwv2-symbolizer branch from e4581bb to 66c02a9 Compare July 17, 2025 22:49
