Skip to content

ingest-storage: Add a common symbols table to record format V2 #11566

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Jun 10, 2025

Conversation

alexweav
Copy link
Contributor

What this PR does

Note that record format V2 is still considered unstable.

This PR introduces a shared table of frequently-used symbols. This starts to fill in the "reserved" symbol space from 0 to 64 in record format V2, that is never used by an ordinary request.

This means that producers need not send certain frequently symbols in the per-request symbol table at all - they can instead send the shared symbol and trust that the consumer will understand it.

Note that the contents of the table are subject to change as we do more analysis. As a first pass, it's populated with various frequently-used symbols from a test cell. The symbols chosen are based on a combination of length and frequency in sampled data, as well as cleaned to represent symbols from Mimir or software that Mimir is frequently used with in practice (e.g. Kubernetes ecosystem or other Grafana Labs projects).

Once V2 is considered stabilized in the future, we cannot change the symbols table further without incrementing the version. Most likely this symbols table will change before stabilization.

Which issue(s) this PR fixes or relates to

Rel https://github.com/grafana/mimir-squad/issues/2253

Checklist

  • Tests updated.
  • Documentation added.
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX].
  • about-versioning.md updated with experimental features.

@alexweav alexweav marked this pull request as ready for review May 27, 2025 22:22
@alexweav alexweav requested a review from a team as a code owner May 27, 2025 22:22
Copy link
Collaborator

@pracucci pracucci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work!

"<aggregated>",
"le",
"component",
"cortex_request_duration_seconds_bucket",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am surprised to see that this is more frequent than pod. Is this sorted by frequency? I didn't check how do we encode the references to the symbol table, but I guess they're varints and we should aim for a Huffman-like coding here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not sorted by frequency, this list is categorized by label purpose. I've also scrubbed it of any symbols that were too tenant-specific. It's not just a flat Top-N at all, rather just a first pass. This is where your idea of an operator that configures the symbol table per cell can gain some savings. 😄

@colega
Copy link
Contributor

colega commented May 30, 2025

I tried a similar thing in an internal project in the past (sorry OSS folks for the internal link 🙏 there's nothing interesting you're missing there) and I found that (obviously) the table that works for one cluster, doesn't work so well for another one.

So I thought that IMO the table should be configurable and the message should include the version of the table to use, with both read and write clients knowing multiple versions.

Something like uint64 to identify the table, taking a dumb unix millis timestamp of when it was defined to refer to it.

This could be, of course, a later extension with this hardcoded table being the table 0 but I also think it wouldn't be such a big effort to have that defined from the beginning.

In my mind there would be later a runtime analyzer (on the consumer side) that would suggest from time to time a new table (through logs in OSS, or an API endpoint to automate this in our cloud offering) which would be just appended to the list of known tables. My suggestion of using an uint64 for that is to be able to keep only N last versions and easily identify which are the older ones that aren't meant to be found in the records anymore (if symbol table is older than your kafka retention, you don't need to keep it anymore).

Anyway, just sharing my previous ideas because I saw this PR in my email.

@alexweav
Copy link
Contributor Author

alexweav commented Jun 3, 2025

Thanks for the input @colega! The versioning problem is exactly why I originally leaned away from making the symbols table configurable. We need to be very careful to not roll out a changed symbols table to a distributor before all the ingesters might understand it. Right now I'm leveraging the record format version header we currently send, but allowing multiple entire versions to live in the config together was not something I considered 😉 I do quite like the idea of an operator that's constantly tuning the label set to the best one for the current cell.

The main drawback I predict is that we've already de-duplicated the label symbols within each request in #11406. In RW2, each unique string is only sent a maximum of 1 time per request anyway. Under that format, a shared symbolization only saves us a tiny handful of bytes in the symbol table per request to begin with. If a symbol is 20 bytes long, we only really save 20 bytes on that request regardless of how often that symbol is used.

This means that I don't think the savings from this change will be astronomical - it's only a small, add-on optimization beyond what we're already doing. Maybe the savings don't justify the complexity? We'll need to test more in dev to be sure though.

@alexweav alexweav force-pushed the alexweav/shared-symbols-table branch from 4baf1fc to ecb8270 Compare June 10, 2025 20:18
@alexweav
Copy link
Contributor Author

@colega I'm going to merge it, but I'm not considering this done just yet. I expect to tune the symbols table further and iterate on the idea. Right now we haven't marked the V2 as "stable" so we can change it based on testing feedback. I linked your comment to the overall issue to we remember to explore it further.

@alexweav alexweav merged commit 6120c65 into main Jun 10, 2025
30 checks passed
@alexweav alexweav deleted the alexweav/shared-symbols-table branch June 10, 2025 21:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants