CSHARP-3222: Add LINQ support for median and percentile accumulators/window functions #1743
adelinowona merged 15 commits into mongodb:main
rstam
left a comment
Initial quick review. Will review more thoroughly after next commit.
Normally we do if (method.IsOneOf(__medianMethods)).
Do we want to do it differently here?
I noticed that pattern, but it felt like too much boilerplate when there's an easier way to do it. Plus I noticed the StandardDeviationMethodsToAggregationExpressionTranslator follows a similar pattern already.
There is some boilerplate in setting up the __medianMethods static field.
There isn't any boilerplate in the if statement. It's roughly the same.
One advantage of using __medianMethods is that it is VERY precise. There is no danger of false hits like there is with the IsMedianMethod approach.
> Plus I noticed the StandardDeviationMethodsToAggregationExpressionTranslator follows a similar pattern already.
Yes. That's older code that predates the newer practice of being more precise.
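To illustrate the precision argument, here is a minimal sketch contrasting a name-based check with a check against an explicit set of MethodInfo objects. Everything in it is hypothetical and only stands in for the driver's code: MedianExtensions, Unrelated, MethodMatching, and this IsOneOf helper are assumptions made for the example, not the driver's actual API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Hypothetical stand-in for the driver's Median extension methods.
public static class MedianExtensions
{
    public static double Median(this IEnumerable<int> source) => 0.0;
    public static double Median(this IEnumerable<double> source) => 0.0;
}

// An unrelated type that happens to define a method with the same name.
public static class Unrelated
{
    public static string Median(string s) => s;
}

public static class MethodMatching
{
    // Explicit set of supported overloads, resolved once via reflection.
    private static readonly MethodInfo[] __medianMethods =
    {
        typeof(MedianExtensions).GetMethod(nameof(MedianExtensions.Median), new[] { typeof(IEnumerable<int>) }),
        typeof(MedianExtensions).GetMethod(nameof(MedianExtensions.Median), new[] { typeof(IEnumerable<double>) }),
    };

    // Hypothetical helper mirroring the IsOneOf call mentioned in the review.
    public static bool IsOneOf(this MethodInfo method, params MethodInfo[] candidates)
        => candidates.Contains(method);

    // Precise: matches only the overloads listed above.
    public static bool IsMedianPrecise(MethodInfo method)
        => method.IsOneOf(__medianMethods);

    // Name-based: matches any method called "Median", including unrelated ones.
    public static bool IsMedianByName(MethodInfo method)
        => method.Name == "Median";
}
```

In this sketch, Unrelated.Median passes the name-based check but fails the precise one, which is the kind of false hit described above.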
This is clever but I don't like the idea of a Dictionary<string, AstExpression>. A class that can represent anything in this way is not easy to work with.
I would prefer you just create two new classes AstMedianAccumulatorExpression and AstPercentileAccumulatorExpression.
That would mirror AstMedianExpression and AstPercentileExpression.
It would also avoid the messiness in the VisitComplexAccumulatorExpression method. Instead we could just have two simple methods VisitMedianAccumulatorExpression and VisitPercentileAccumulatorExpression.
The whole point of the Ast classes is to have type-safe representations of MQL, and the use of a Dictionary<string, AstExpression> throws type-safety out the window.
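A rough sketch of what the two dedicated classes could look like is shown below. The class names AstMedianAccumulatorExpression and AstPercentileAccumulatorExpression come from the comment above; everything else is an assumption made for illustration, including the simplified AstExpression base, the AstFieldExpression helper, the Render method, and holding the percentiles as plain doubles. The driver's real Ast classes are richer than this.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Simplified, hypothetical base type; not the driver's real AstExpression.
public abstract class AstExpression
{
    public abstract string Render();
}

// Hypothetical field reference, rendered as "$path".
public sealed class AstFieldExpression : AstExpression
{
    public AstFieldExpression(string path)
    {
        Path = path ?? throw new ArgumentNullException(nameof(path));
    }

    public string Path { get; }

    public override string Render() => "\"$" + Path + "\"";
}

// One class per accumulator: each argument is a named, typed property.
public sealed class AstMedianAccumulatorExpression : AstExpression
{
    public AstMedianAccumulatorExpression(AstExpression input)
    {
        Input = input ?? throw new ArgumentNullException(nameof(input));
    }

    public AstExpression Input { get; }

    public override string Render()
        => "{ $median : { input : " + Input.Render() + ", method : \"approximate\" } }";
}

public sealed class AstPercentileAccumulatorExpression : AstExpression
{
    public AstPercentileAccumulatorExpression(AstExpression input, IReadOnlyList<double> percentiles)
    {
        Input = input ?? throw new ArgumentNullException(nameof(input));
        Percentiles = percentiles ?? throw new ArgumentNullException(nameof(percentiles));
    }

    public AstExpression Input { get; }

    // Simplification for this sketch: percentiles are constants, not expressions.
    public IReadOnlyList<double> Percentiles { get; }

    public override string Render()
    {
        var p = string.Join(", ", Percentiles.Select(x => x.ToString(CultureInfo.InvariantCulture)));
        return "{ $percentile : { input : " + Input.Render() + ", p : [" + p + "], method : \"approximate\" } }";
    }
}
```

With dedicated classes, a missing or misspelled argument is a compile-time error rather than a missing dictionary key discovered at runtime, and a visitor can expose one small method per class.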
That's fair. I'll revert this then.
I'm still concerned that the return type is double instead of decimal.
Same below.
Discussed on Slack. The Server team currently has $median returning doubles regardless of input type, but there is a ticket to improve accuracy in the future by returning the correct types. So I'll change the return type here to decimal.
Enumerable.Average(IEnumerable<float>) returns float.
Should we also?
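For reference, a small sketch of the return-type convention that Enumerable.Average follows for a few element types. The AverageReturnTypes class and its method names are made up for this example; only the Enumerable.Average overloads themselves are real. Each method compiles only because the declared return type matches what the overload returns.

```csharp
using System;
using System.Linq;

public static class AverageReturnTypes
{
    // Average over float elements returns float.
    public static float OfFloats() => new[] { 1.0f, 2.0f, 4.5f }.Average();

    // Average over int elements returns double.
    public static double OfInts() => new[] { 1, 2, 3 }.Average();

    // Average over decimal elements returns decimal.
    public static decimal OfDecimals() => new[] { 1.0m, 2.0m }.Average();
}
```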
Should the percentiles be decimal also to match?
Should we return float[]?
Should the percentiles be IEnumerable<float> to match?
This PR introduces the capability to calculate the median and percentile of numeric values in the MongoDB aggregation pipeline for $group and $setWindowFields stages.