-
Notifications
You must be signed in to change notification settings - Fork 25.3k
[ES|QL] Support implicit casting of aggregate_metric_double #129108
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
bfb2043
to
707c465
Compare
c4cc58e
to
da37671
Compare
025e131
to
cc2645c
Compare
This commit adds support for implicit casting of aggregate_metric_double when present with other numerics for a limited set of aggregation functions: - Max / MaxOverTime - Min / MinOverTime - Sum / SumOverTime - Count / CountOverTime - Avg / AvgOverTime Attempting to use fields mapped to aggregate_metric_double in one index but some other numeric in another index in any other context will still require explicit casting with ToAggregateMetricDouble
cc2645c
to
b6b041c
Compare
x-pack/plugin/src/yamlRestTest/resources/rest-api-spec/test/esql/40_tsdb.yml
Outdated
Show resolved
Hide resolved
) | ||
) | ||
); | ||
// TODO: verify the numbers are accurate |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've manually verified that the values produced make sense, but I'm not sure how to go about verifying within the IT test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah tricky.. maybe it's easy to track and min of mins and max of maxes, and verify just that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm depending on the query I'm not sure that would work, anything involving sum/_over_time would go out of that range, and then throwing negative numbers into the mix I think would throw everything off quite a bit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, so only (min|max)(min|max_over_time) would work. One option is to track the min and and max value overall, and verify that they get returned if we have one of those combinations.
...downsample/src/internalClusterTest/java/org/elasticsearch/xpack/downsample/DownsampleIT.java
Show resolved
Hide resolved
...downsample/src/internalClusterTest/java/org/elasticsearch/xpack/downsample/DownsampleIT.java
Show resolved
Hide resolved
var resp = esqlCommand( | ||
"TS " | ||
+ dataStreamName | ||
+ " | STATS min = sum(min_over_time(cpu)), max = sum(max_over_time(cpu)) by cluster, bucket(@timestamp, 1 hour)" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So cool to see this working.. well done!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider randomizing the inner and outer agg.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the current integration test I decided to have it loop through some known combinations for a single aggregation, right now the 2nd bug I listed would cause some combinations (avg_over_time + sum/count_over_time) to fail so I decided to not randomize it.
But it might be good to have some random testing that generates a random number of these aggregations to run in the same query once that bug gets figured out
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/Analyzer.java
Outdated
Show resolved
Hide resolved
|
||
// test _over_time commands with implicit casting of aggregate_metric_double | ||
for (String innerCommand : List.of("min_over_time", "max_over_time", "avg_over_time", "count_over_time")) { | ||
for (String outerCommand : List.of("min", "max", "sum", "count")) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider using randomFrom
instead of checking all combinations in each run.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, deferring to Nhat to approve who understands the planner much better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've left some comments but the approach looks good. Thanks Larisa!
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/Analyzer.java
Outdated
Show resolved
Hide resolved
int alreadyAddedUnionFieldAttributes = unionFieldAttributes.size(); | ||
plan = plan.transformExpressionsOnly(e -> { | ||
if (e instanceof Avg avg) { | ||
return substituteSurrogates(avg); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we avoid this special rule? Also, now that we have SumOverTime
and CountOverTime
, can we implement SurrogateExpression for AvgOverTime so it can be treated similarly to Avg?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to substitute surrogates here/before this rule for Avg/AvgOverTime because they get split into 2 new aggregations that both require different convert functions, and I wasn't able to figure out how to get the engine to recognize that it needs to make both of these conversions individually without establishing them this early.
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/Analyzer.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/Analyzer.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/src/yamlRestTest/resources/rest-api-spec/test/esql/40_tsdb.yml
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have some minor comments, but LGTM. Great work, thanks Larisa!
...in/esql/src/main/java/org/elasticsearch/xpack/esql/planner/EsPhysicalOperationProviders.java
Show resolved
Hide resolved
@@ -28,7 +28,7 @@ private TypeConverter(String evaluatorName, ExpressionEvaluator convertEvaluator | |||
this.convertEvaluator = convertEvaluator; | |||
} | |||
|
|||
public static TypeConverter fromConvertFunction(AbstractConvertFunction convertFunction) { | |||
public static TypeConverter fromScalarFunction(EsqlScalarFunction convertFunction) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we revert this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we can actually because FromAggregateMetricDouble
implements ConvertFunction
but it doesn't extend AbstractConvertFunction
- and the reason I didn't change FromAggregateMetricDouble
to extend it was because it requires 2 inputs but AbstractConvertFunction
extends UnaryScalarFunction
...downsample/src/internalClusterTest/java/org/elasticsearch/xpack/downsample/DownsampleIT.java
Outdated
Show resolved
Hide resolved
…erter and remove limit from IT test
Pinging @elastic/es-analytical-engine (Team:Analytics) |
Pinging @elastic/es-storage-engine (Team:StorageEngine) |
This reverts (parts of) commit af7a60b.
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/action/EsqlCapabilities.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/action/EsqlCapabilities.java
Outdated
Show resolved
Hide resolved
Expression fn = (Expression) convert; | ||
List<Expression> children = new ArrayList<>(fn.children()); | ||
children.set(0, resolvedAttr); | ||
Expression e = ((Expression) convert).replaceChildren(children); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did we make this change? AbstractConvertFunction contains only a single child, the field. Or do we have something that does not extend that class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. I see FromAggregateMetricDouble does not extend AbstractConvertFunction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ToDouble from aggregate_metric_double has two parameters: field and metric.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @limotova!
I’m wondering where I can find some aggregate function tests that take union-typed fields as input, that involves mixed aggregate_metric_double
and the other regular numeric typed fields (such as int
, long
, double
, etc.)? I was looking for some examples that show what didn’t work without this PR and what works with it, to help me better understand the change. Having some positive AnalyzerTests
or CsvTests
will be really helpful for debugging purposes.
aborted.set(Boolean.TRUE); | ||
return aggFunc; | ||
} | ||
if (aggFunc instanceof Avg || aggFunc instanceof AvgOverTime) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any chance this surrogate()
could be handled by SubstituteSurrogateAggregations
in LogicalPlanOptimizer
? As it seems like it could be separated from implicit casting, these two functions are not conversion functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is related to the 2nd bug in the PR description, I am hoping to remove it once a better fix is found, but for now unless you substitute surrogates at this stage, a query containing both avg_over_time + sum_over_time or count_over_time will fail
count_over_time
becomes sum_over_time(agg_metric.count_metric)
when fed an aggregate_metric_double,
And similarly avg_over_time
becomes div(sum_over_time(agg_metric), count_over_time(agg_metric))
, which then becomes div(sum_over_time(agg_metric.sum_metric), sum_over_time(agg_metric.count_metric))
And at the InsertFieldExtraction phase (specifically here), it drops one of the sum_over_time(agg_metric.count_metric)
, (which seems correct? Though I don't fully understand the code here)
But then it seems something else later on expects the reference to not be dropped, because the error that appears is a block casting error, where it tries to cast one block as another, presumably because it expected some of the blocks in the page to be shifted over by one
But if this substitution is made sooner, then it's possible to give different NameIds when implicitly casting, and then the references aren't dropped (I understand that this is maybe not ideal though)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll add a comment explaining why it's there..!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation @limotova! If there is a (negative)test(that we can debug into) for this particular case, we can take a closer look together to see if there is anything we can do to make it work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll reach out on slack!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed the surrogate()
here thanks to @fang-xing-esql figuring out that it actually was unnecessary (the key for making the queries work with Avg and AvgOverTime was making unique NameIds from Sum/Count/OverTime)
Thank you for taking a look! @fang-xing-esql I also have a commit in a separate branch that works off of this one with 3 tests related to the 2nd bug; they should all produce the same result in theory, though only the |
I'm currently working on adding a dataset with AggregateMetricDouble to the CSV tests and once that is available I'll add in some CSV tests and AnalyzerTests related to this PR and link them back, but for now I'll merge this in how it is |
This commit adds support for implicit casting of aggregate_metric_double
when present with other numerics for a limited set of aggregation
functions:
Attempting to use fields mapped to aggregate_metric_double in one index
but some other numeric in another index in any other context will still
require explicit casting with ToAggregateMetricDouble
Known bugs:
TS my-data-stream | EVAL m = my-mixed-metric | STATS max(m)
does not workIn other words,
avg_over_time
plussum/count_over_time
aggregation, both on an aggregate_metric_double field in the same way (both implicitly casted (FIXED), or both explicitly casted with to_aggregate_metric_double (still broken)), grouped by a keyword field