Description
I have been thinking what makes an expression computable in a remote engine and to be reassembled in the root engine. I think the following might be a sound algorithm that enables us to push down even more expressions.
Problem
Assume we have remote engines that are partitioned by a label P
. Imagine an expression like topk(10, sum by (P, instance) (X))
and assume that instance
is a high-cardinality label. In the current implementation we would have to evaluate sum by (P, instance) (X)
remotely and fetch all values of instance into the root engine to compute the surrounding topk
even though we would not need most of them, clearly we can evaluate topk
per engine and only coalesce the top 10 values for the inner expression since we dont have any overlap. But the algorithm right now doesnt know that.
Observation
Imagine we are traversing the AST of a PromQL expression from the bottom up. Assume we have expression nodes P and C, where P is the parent of node C. If the expression at node C still has its original P
label I want to conjecture that we can evaluate C in the remote engines and how we coalesce in the root engine depends on the parent node P. If P is nil and we are at the root we just join the series together, if its an aggregation we do the aggregation in the root engine, etc.
What is meant with the heuristic that expression node C is having the original P
label still? I think we can define this recursively. In the base case we are at a vector selector, by definition it will have P
(disregarding constant expressions and absent functions right now). If we are a node that has children and all children have the P
label still then we also have it if we are not a relabel_replace
function that overwrites it, if we are not aggregating it away (i.e. sum without(P) (X)
would lose P) or if we are not a binary expression that doesnt have P
in its matching labels (i.e. A * ignoring (P) B
would lose P). In all other cases unless im forgetting something we should still have the partition label P and should be computable in a remote engine.
Suggestion
We rewrite the distribution algorithm by checking recursively if we keep the partition labels P
, if so we traverse up until we cannot anymore. Once a parent P would lose the label we stop recursion and evaluate the expression henceforth in the root engine, we coalesce depending on the parent node P.
Is there anything obiously wrong with this approach?
Edit: this should also hold if P
is multiple labels that form a logical partition of remote engines.