-
Notifications
You must be signed in to change notification settings - Fork 64
Add optimizer for extracting projection labels #550
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
This commit adds a logical optimizer for extracting projection labels from a query. Signed-off-by: Filip Petkovski <[email protected]>
projection = Projection{ | ||
Labels: slices.Clone(e.Grouping), | ||
Include: !e.Without, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May I know how your scanner is implemented when it is a without? Does it return series hash label in this case?
This is what I implemented as a mock storage which does projection and always return series hash when projection is available. So maybe there is some difference here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When it is without
, it returns the hash over all labels, and the labels which are not in the Labels
slice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you need to include series hash in the grouping label for without? So that it can be removed after the aggregation.
Or you have another place to remove series hash label?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll need to take a look at the details, will post a bit later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe we dont need to include because the hash will be just an implementation detail of the storage. We use it for things like: partitioning, map series to iterators, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We include the hash label only when Include is true. We also have an explicit select for the series hash label for some binary joins as shown in tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we don't have hash label when without then how do we pass the duplicate labels check in the engine? Do we have to disable it.
I am trying to run some correctness tests and saw some issues based on my setup. I assume something wrong in my mock storage implementation. Maybe you can try to add a similar correctness test using a mock storage and see if there is any correctness issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We disabled those duplicate checks, and afaik prometheus has a feature to do that now as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. But I can still see some correctness issues after disabling duplicate check so maybe you can take a closer look
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's disable them here too - with a hash column we won't get issues though since we never drop it i think
slices.Sort(projection.Labels) | ||
projection.Labels = slices.Compact(projection.Labels) | ||
e.Projection = projection | ||
projection = Projection{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you also need to change MergeSelect or just modify the scanner to get it work with Projection? Like the selects you merged might have different projection labels and you need to fetch both and project only required labels for each select
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm this is a good question, we do not use MergeSelect
anymore, but I think we should modify it to be a union of all selectors that it merges.
This commit adds a logical optimizer for extracting projection labels from a query.