-
Notifications
You must be signed in to change notification settings - Fork 595
ingest-storage: Add a common symbols table to record format V2 #11566
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work!
"<aggregated>", | ||
"le", | ||
"component", | ||
"cortex_request_duration_seconds_bucket", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am surprised to see that this is more frequent than pod
. Is this sorted by frequency? I didn't check how do we encode the references to the symbol table, but I guess they're varints and we should aim for a Huffman-like coding here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not sorted by frequency, this list is categorized by label purpose. I've also scrubbed it of any symbols that were too tenant-specific. It's not just a flat Top-N at all, rather just a first pass. This is where your idea of an operator that configures the symbol table per cell can gain some savings. 😄
I tried a similar thing in an internal project in the past (sorry OSS folks for the internal link 🙏 there's nothing interesting you're missing there) and I found that (obviously) the table that works for one cluster, doesn't work so well for another one. So I thought that IMO the table should be configurable and the message should include the version of the table to use, with both read and write clients knowing multiple versions. Something like uint64 to identify the table, taking a dumb unix millis timestamp of when it was defined to refer to it. This could be, of course, a later extension with this hardcoded table being the table In my mind there would be later a runtime analyzer (on the consumer side) that would suggest from time to time a new table (through logs in OSS, or an API endpoint to automate this in our cloud offering) which would be just appended to the list of known tables. My suggestion of using an Anyway, just sharing my previous ideas because I saw this PR in my email. |
Thanks for the input @colega! The versioning problem is exactly why I originally leaned away from making the symbols table configurable. We need to be very careful to not roll out a changed symbols table to a distributor before all the ingesters might understand it. Right now I'm leveraging the record format version header we currently send, but allowing multiple entire versions to live in the config together was not something I considered 😉 I do quite like the idea of an operator that's constantly tuning the label set to the best one for the current cell. The main drawback I predict is that we've already de-duplicated the label symbols within each request in #11406. In RW2, each unique string is only sent a maximum of 1 time per request anyway. Under that format, a shared symbolization only saves us a tiny handful of bytes in the symbol table per request to begin with. If a symbol is 20 bytes long, we only really save 20 bytes on that request regardless of how often that symbol is used. This means that I don't think the savings from this change will be astronomical - it's only a small, add-on optimization beyond what we're already doing. Maybe the savings don't justify the complexity? We'll need to test more in dev to be sure though. |
4baf1fc
to
ecb8270
Compare
@colega I'm going to merge it, but I'm not considering this done just yet. I expect to tune the symbols table further and iterate on the idea. Right now we haven't marked the V2 as "stable" so we can change it based on testing feedback. I linked your comment to the overall issue to we remember to explore it further. |
What this PR does
Note that record format V2 is still considered unstable.
This PR introduces a shared table of frequently-used symbols. This starts to fill in the "reserved" symbol space from 0 to 64 in record format V2, that is never used by an ordinary request.
This means that producers need not send certain frequently symbols in the per-request symbol table at all - they can instead send the shared symbol and trust that the consumer will understand it.
Note that the contents of the table are subject to change as we do more analysis. As a first pass, it's populated with various frequently-used symbols from a test cell. The symbols chosen are based on a combination of length and frequency in sampled data, as well as cleaned to represent symbols from Mimir or software that Mimir is frequently used with in practice (e.g. Kubernetes ecosystem or other Grafana Labs projects).
Once V2 is considered stabilized in the future, we cannot change the symbols table further without incrementing the version. Most likely this symbols table will change before stabilization.
Which issue(s) this PR fixes or relates to
Rel https://github.com/grafana/mimir-squad/issues/2253
Checklist
CHANGELOG.md
updated - the order of entries should be[CHANGE]
,[FEATURE]
,[ENHANCEMENT]
,[BUGFIX]
.about-versioning.md
updated with experimental features.