CNDB-12651: Fix flaky VectorDistributedTest.rangeRestrictedTest #1834
test/distributed/org/apache/cassandra/distributed/test/sai/VectorDistributedTest.java
Can we have a repeatable run in CI, like running it in a loop in the multiplexer a few hundred times?
Thanks for the review. I have increased the tolerance to 6% and tried to run the test in the multiplexer. However, it doesn't seem to be running the tests: https://jenkins-stargazer.aws.dsinternal.org/view/cc-builds/job/ds-cassandra-build/1120/parameters/ Maybe I'm missing some parameter. I found this Slack thread about how to use the multiplexer. Did you manage to do it?
Yes, I managed as per @jacek-lewandowski 's explanation in that thread. I just took a look at what you submitted, and there was an unneeded space after the first |
Aah, an extra white space! Good catch! I have run the multiplexer for the dtest without failures for Perhaps we should increase that percentage to something as high as 20%, to match the recall? I guess that in the worst-case scenario the number of absent rows can match the ones missing in recall calculations. |
I have just increased the percentage to 20% to match recall, in case the unmatched rows are all missing. Repeated runs are:
I have also inlined There are similar errors across both test classes, but I'd prefer to deal with them separately, since they seem to work slightly differently. For example, some expect a perfect match if we are expecting fewer than ten rows, and 5% if we expect more.
I agree, no need to revert anything, thanks.
+1
Makes sense, considering the recall is 0.8 and we assert on it.
Don't expect an exact, fixed number of results from ANN. This is analogous to the fix applied to the non-distributed version of the test, VectorLocalTest.rangeRestrictedTest, by ae96e0f
…geRestrictedTest to match expected recall. Also, inline searchWithRange, since the previous refactor using beforeAndAfterFlush leaves it with a single caller
ae32f29 to 45a2fe3
✔️ Build ds-cassandra-pr-gate/PR-1834 approved by Butler
Fix VectorDistributedTest.rangeRestrictedTest to not require an exact match in ANN query results. That test is the dtest version of the VectorLocalTest.rangeRestrictedTest utest. The utest was relaxed by this commit to only check for results within 5% of the expected number, given the approximate nature of ANN. However, the dtest was never updated in the same way. This PR applies the same change to the dtest.
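As a rough illustration of the relaxed check discussed in this PR, here is a minimal sketch of a tolerance-based row-count assertion. It is not the code from the PR: the class `AnnResultTolerance` and the methods `minAcceptable` and `withinTolerance` are hypothetical names, the tolerance is passed as an integer percentage (5 for the utest, 20 as settled in the review), and it assumes the query can never return more rows than the expected set.

```java
// Hypothetical helper, not part of the PR: accepts an ANN result count that
// falls short of the expected count by at most a given percentage.
final class AnnResultTolerance {
    private AnnResultTolerance() {}

    // Smallest row count accepted when up to tolerancePercent of the
    // expected rows may be missing. Integer division rounds the number of
    // allowed missing rows down, so the check errs on the strict side.
    static int minAcceptable(int expected, int tolerancePercent) {
        if (expected < 0 || tolerancePercent < 0 || tolerancePercent > 100) {
            throw new IllegalArgumentException("expected must be >= 0 and tolerancePercent in [0, 100]");
        }
        int allowedMissing = expected * tolerancePercent / 100;
        return expected - allowedMissing;
    }

    // True when actual lies in [minAcceptable, expected]. The upper bound is
    // an assumption: the restricted query is not expected to return rows
    // outside the expected set.
    static boolean withinTolerance(int actual, int expected, int tolerancePercent) {
        return actual >= minAcceptable(expected, tolerancePercent) && actual <= expected;
    }
}
```

With a 20% tolerance, which lines up with a recall threshold of 0.8, 100 expected rows would accept any count from 80 to 100. In a test this could be used as `assertTrue(AnnResultTolerance.withinTolerance(rows.size(), expected, 20))`, where `rows` and `expected` stand in for whatever the test already computes.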