SERVER-25957 Optimize $filter + $arrayElemAt to avoid scanning entire array#1236
SERVER-25957 Optimize $filter + $arrayElemAt to avoid scanning entire array#1236KevinCybura wants to merge 8 commits intomongodb:masterfrom
Conversation
src/mongo/db/pipeline/expression.cpp
Outdated
| } | ||
|
|
||
| intrusive_ptr<Expression> ExpressionArrayElemAt::optimize() { | ||
| // If ExpressionArrayElemAt is passed an ExpressionFilter as its first arugment |
There was a problem hiding this comment.
Similar to what you did in the ExpressionIndexOf optimization, I think you should first call ExpressionNary::optimize():
mongo/src/mongo/db/pipeline/expression.cpp
Lines 2809 to 2813 in c799c6c
(Also, while you're there, it looks like the comment right below that got wrapped at a strange place, can you fix that?)
src/mongo/db/pipeline/expression.cpp
Outdated
| str::stream() << getOpName() << "'s second argument must be a numeric value," | ||
| << " but is " | ||
| << typeName(indexArg.getType()), | ||
| indexArg.numeric()); |
There was a problem hiding this comment.
I think this check is redundant with the one below. If something is integral it must be numeric. https://github.com/mongodb/mongo/blob/master/src/mongo/db/pipeline/value.cpp#L1010
src/mongo/db/pipeline/expression.cpp
Outdated
| << " a 32-bit integer: " | ||
| << indexArg.coerceToDouble(), | ||
| indexArg.integral()); | ||
| auto ind = indexArg.coerceToInt(); |
src/mongo/db/pipeline/expression.cpp
Outdated
| str::stream() << "Second argument to $slice must be a numeric value," | ||
| << " but is of type: " | ||
| << typeName(secondVal.getType()), | ||
| secondVal.numeric()); |
There was a problem hiding this comment.
Again, I don't think you need the assertion about numeric().
src/mongo/db/pipeline/expression.cpp
Outdated
| if (auto thirdArg = dynamic_cast<ExpressionConstant*>(vpOperand[2].get())) { | ||
| auto thirdVal = thirdArg->getValue(); | ||
|
|
||
| uassert(50799, |
There was a problem hiding this comment.
I think you can condense these three asserts into 'uassertIfNotIntegralAndNonNegative': https://github.com/mongodb/mongo/blob/master/src/mongo/db/pipeline/expression.cpp#L2704
src/mongo/db/pipeline/expression.cpp
Outdated
|
|
||
| int arg3 = thirdVal.coerceToInt(); | ||
| if (arg2 >= 0) { | ||
| // If ExpressionSlice is given three arguments set limit to 'firstArg' + |
There was a problem hiding this comment.
Can you elaborate a little bit more here? Maybe something like If we have three arguments, then we can stop the $filter once we have seen 'firstArg' + 'secondArg' + 1. We need to skip 'firstArg' elements, then include 'secondArg' things after that. For example, if we want to slice starting from 2 with length 3, we skip the first 2, then include 3 after that.
Wait, I think we don't need the +1 here? Is that right?
src/mongo/db/pipeline/expression.h
Outdated
| boost::intrusive_ptr<Expression> _input; | ||
| // The expression determining whether each element should be present in the result array. | ||
| boost::intrusive_ptr<Expression> _filter; | ||
| // For expression ArrayElemAt and Slice we can optimize filter by setting a limit to end the |
There was a problem hiding this comment.
I'd make it clearer that this optimization applies when $filter is being used as an argument to either $slice or $arrayElemAt. Also, feel free to use the $slice name instead of ExpressionSlice here.
|
|
||
| } // namespace Object | ||
|
|
||
| TEST(ExpressionFilter, ExpressionFilterWithASetLimitShouldReturnAArrayNoGreaterThanTheLimit) { |
There was a problem hiding this comment.
Use AnArray instead of AArray.
| auto arrayElemAtSpec = BSON("$arrayElemAt" << BSON_ARRAY(filterSpec << 2)); | ||
|
|
||
| auto expArrayElemAt = ExpressionArrayElemAt::parse(expCtx, arrayElemAtSpec.firstElement(), vps); | ||
| auto xpArrayElemAt = dynamic_cast<ExpressionArrayElemAt*>(expArrayElemAt.get())->optimize(); |
There was a problem hiding this comment.
Use a better variable name, maybe 'optimized'?
| auto val = xpArrayElemAt->evaluate(Document()); | ||
| ASSERT_VALUE_EQ(val, Value(6)); | ||
|
|
||
| auto eA = ExpressionArrayElemAt::parse(expCtx, BSON("$arrayElemAt" << BSON_ARRAY(BSON_ARRAY(1 << 2 <<3 << 4 <<5) << 1)).firstElement(), vps); |
There was a problem hiding this comment.
You can re-use the variable names from before, I think that'd be clearer than two-letter variable names.
There was a problem hiding this comment.
Same with below. Alternatively, you could separate each one into it's own test.
|
Hey @KevinCybura! I was checking through old PR's and saw that SERVER-25957 is actually still open, as this wasn't merged. I am poking that ticket to see where it's at. If you are still looking to get these changes in, can you please resolve the merge conflicts and get this caught up with master? I saw you already signed our contributor's agreement. I know it's been a few years, so it's okay if that's not on the table. (I'm trying to clean up our pull request queue right now) |
No description provided.