-
Notifications
You must be signed in to change notification settings - Fork 21
CNDB-15683: Fix incorrect results when querying mixed AA and EC indexes #2066
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CNDB-15683: Fix incorrect results when querying mixed AA and EC indexes #2066
Conversation
Checklist before you submit for review
|
| */ | ||
| public boolean selects(PrimaryKey key) | ||
| { | ||
| return !indexFeatureSet.isRowAware() || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would seem the bug was introduced when we switched the makeFilter methods in the QueryController as part of #1883. An alternate solution is to use the indexFeatureSet.isRowAware() to determine that we should use the regular clusteringIndexFilter instead of one made up of only the primary keys coming from the iterator.
I agree that removing this check works and that it might be slightly more efficient in some cases, but I worry that it leaves the two makeFilter methods having different sets of logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that using isRowAware in the makeFilter method also works and that was the first thing I tried.
But, there is a caveat - I argue we really don't want this check here anyways. It increases the complexity of the code by introducing additional code path and creates a hidden dependency between selects and the particular filter created by makeFilter. So we have two distant methods where both must take a correct path for the code to be correct. If one does the wrong thing out of sync from another - then the results are incorrect. Also, why would we even want to further process any keys that don't match the clustering filter (regardless of AA / non-AA)?
So here by removing this check I made this method do just one thing and decoupled it from the particular filter used.
Earlier it had two behaviors different between row-aware and non row-aware - this method either checked the filter or not, depending on the "mode" the query was running in. In general such long-distant dependencies make code hard to reason about and introduce subtle bugs. In general I'll insist on avoiding such a programming style, where one method does something partially in hope some other code called some time later fixes or somehow tolerates that.
And btw, the fact we have some inconsistency between the two variants of makeFilter where one does the isRowAware check and another one doesn't is something that concerns me as well. Why do we even have two makeFilter variants? Why do we even need to build our custom filter that filters by keys from the index?
So it's not the question of whether we want this fix or the makeFilter fix, but rather do we want to harden / simplify makeFilter in addition to this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with your reasoning, and I am good with adding a follow up to clean up this part of the code base.
test/unit/org/apache/cassandra/index/sai/cql/NumericIndexMixedVersionTest.java
Outdated
Show resolved
Hide resolved
54fb5d7 to
0fdf370
Compare
When row-aware and non-row-aware indexes are mixed, we now check the clustering index filter for all the keys that have clustering information, i.e. keys coming from the row-aware indexes. Earlier that check was accidentally disabled if at least one non-row-aware index was used by the query. That could cause retrieving rows that do not match the clustering condition of the query.
0fdf370 to
d326725
Compare
|
✔️ Build ds-cassandra-pr-gate/PR-2066 approved by ButlerApproved by Butler |
|
|
||
| flush(); // Force to sstable with EC version | ||
|
|
||
| beforeAndAfterFlush(() -> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we add beforeAndAfterFlushAndCompact ? we recently had problems because we did not test indexes after compaction
I think that it does no harm
I expect that compaction will probably rewrite all the indexes to the same version, that's fine
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I missed this comment but now I can see the branch got merged. Yes, it wouldn't do any harm.
…es (#2066) When row-aware and non-row-aware indexes are mixed, we now check the clustering index filter for all the keys that have clustering information, i.e. keys coming from the row-aware indexes. Earlier that check was accidentally disabled if at least one non-row-aware index was used by the query. That could cause retrieving rows that do not match the clustering condition of the query.
…es (#2066) When row-aware and non-row-aware indexes are mixed, we now check the clustering index filter for all the keys that have clustering information, i.e. keys coming from the row-aware indexes. Earlier that check was accidentally disabled if at least one non-row-aware index was used by the query. That could cause retrieving rows that do not match the clustering condition of the query.
CNDB-15683: Fix incorrect results when querying mixed AA and EC indexes This commit fixes multiple issues with KeyRangeIterator implementations occasionally skipping or emitting duplicate keys when working on a mix of primary keys with empty / non-empty clusterings. This situation is possible while scanning tables with static columns or when some indexes are partition-aware (e.g. version AA) and others have been updated to a row-aware version (e.g. DC or EC). Due to those bugs, users could get incorrect results from SAI queries, e.g. results containing duplicated rows, duplicated partitions or even missing rows. The commit introduces extensive randomized property-based tests for KeyRangeUnionIterator and KeyIntersectionIterator. Previously, the tests did not test for keys with mixed empty/non-empty clusterings. Changes in KeyRangeUnionIterator: KeyRangeUnionIterator merges streams of primary keys in such a way that duplicates are removed. Unfortunately it does not properly account for the fact that if a key with an empty clustering meets a key with a non-empty clustering and the same partition key, we must always return the key with an empty clustering. A key with an empty clustering will always fetch the rows matched by any specific row key for the same partition, but the reverse is not true. The iterator implementation has been modified to always pick the key that matches more rows - a key with empty clustering wins over a key with non-empty clustering. Additionally, once a key with an empty clustering is emitted, no more keys in that partition are emitted. Changes in KeyRangeIntersectionIterator: Due to a very similar problem like in KeyRangeUnionIterator, KeyRangeIntersectionIterator could return either too few or too many keys, when keys with empty clusterings and keys with non-empty clusterings were present in the input key streams. In particular consider 2 input streams A and B with the following keys: A: 0: (1, Clustering.EMPTY) B: 0: (1, 1) 1: (1, 2) Key A.0 matches the whole partition 1. Therefore, the correct result of intersection are both keys of stream B. Unfortunately, the algorithm before this patch would advance both A and B iterators when emitting the first matching key. At the beginning of the second step, the iterator A would be already exhausted and no more keys would be produced. Finally key B.1 would be missing from the results. This patch fixes it by introducing two changes to the intersection algorithm: 1. A key with non-empty clustering wins over a key with empty clustering and same partition. 2. The selected highest key is not consumed while searching for the highest matching key, but that happens only after the search loop finds a match. Then we have more information which iterators would be moved to the next item. Iterators positioned at a key with an empty clustering can be advanced only after we run out of keys with non-empty clustering in the same partition or if there are no other keys with non-empty clustering. This patch also fixes another issue where we could return a less-specific key matching a full partition instead of a key matching one row: A: 0: (1, Clustering.EMPTY) B: 0: (1, 1) In that case the iterator returned a key with empty clustering, which would result in fetching and postfiltering many unnecessary rows. CNDB-15683: Fix incorrect results when querying mixed AA and EC indexes (#2066) When row-aware and non-row-aware indexes are mixed, we now check the clustering index filter for all the keys that have clustering information, i.e. keys coming from the row-aware indexes. Earlier that check was accidentally disabled if at least one non-row-aware index was used by the query. That could cause retrieving rows that do not match the clustering condition of the query. Rebase notes: - includes CNDB-15683
CNDB-15683: Fix incorrect results when querying mixed AA and EC indexes This commit fixes multiple issues with KeyRangeIterator implementations occasionally skipping or emitting duplicate keys when working on a mix of primary keys with empty / non-empty clusterings. This situation is possible while scanning tables with static columns or when some indexes are partition-aware (e.g. version AA) and others have been updated to a row-aware version (e.g. DC or EC). Due to those bugs, users could get incorrect results from SAI queries, e.g. results containing duplicated rows, duplicated partitions or even missing rows. The commit introduces extensive randomized property-based tests for KeyRangeUnionIterator and KeyIntersectionIterator. Previously, the tests did not test for keys with mixed empty/non-empty clusterings. Changes in KeyRangeUnionIterator: KeyRangeUnionIterator merges streams of primary keys in such a way that duplicates are removed. Unfortunately it does not properly account for the fact that if a key with an empty clustering meets a key with a non-empty clustering and the same partition key, we must always return the key with an empty clustering. A key with an empty clustering will always fetch the rows matched by any specific row key for the same partition, but the reverse is not true. The iterator implementation has been modified to always pick the key that matches more rows - a key with empty clustering wins over a key with non-empty clustering. Additionally, once a key with an empty clustering is emitted, no more keys in that partition are emitted. Changes in KeyRangeIntersectionIterator: Due to a very similar problem like in KeyRangeUnionIterator, KeyRangeIntersectionIterator could return either too few or too many keys, when keys with empty clusterings and keys with non-empty clusterings were present in the input key streams. In particular consider 2 input streams A and B with the following keys: A: 0: (1, Clustering.EMPTY) B: 0: (1, 1) 1: (1, 2) Key A.0 matches the whole partition 1. Therefore, the correct result of intersection are both keys of stream B. Unfortunately, the algorithm before this patch would advance both A and B iterators when emitting the first matching key. At the beginning of the second step, the iterator A would be already exhausted and no more keys would be produced. Finally key B.1 would be missing from the results. This patch fixes it by introducing two changes to the intersection algorithm: 1. A key with non-empty clustering wins over a key with empty clustering and same partition. 2. The selected highest key is not consumed while searching for the highest matching key, but that happens only after the search loop finds a match. Then we have more information which iterators would be moved to the next item. Iterators positioned at a key with an empty clustering can be advanced only after we run out of keys with non-empty clustering in the same partition or if there are no other keys with non-empty clustering. This patch also fixes another issue where we could return a less-specific key matching a full partition instead of a key matching one row: A: 0: (1, Clustering.EMPTY) B: 0: (1, 1) In that case the iterator returned a key with empty clustering, which would result in fetching and postfiltering many unnecessary rows. CNDB-15683: Fix incorrect results when querying mixed AA and EC indexes (#2066) When row-aware and non-row-aware indexes are mixed, we now check the clustering index filter for all the keys that have clustering information, i.e. keys coming from the row-aware indexes. Earlier that check was accidentally disabled if at least one non-row-aware index was used by the query. That could cause retrieving rows that do not match the clustering condition of the query. Rebase notes: - includes CNDB-15683
CNDB-15683: Fix incorrect results when querying mixed AA and EC indexes This commit fixes multiple issues with KeyRangeIterator implementations occasionally skipping or emitting duplicate keys when working on a mix of primary keys with empty / non-empty clusterings. This situation is possible while scanning tables with static columns or when some indexes are partition-aware (e.g. version AA) and others have been updated to a row-aware version (e.g. DC or EC). Due to those bugs, users could get incorrect results from SAI queries, e.g. results containing duplicated rows, duplicated partitions or even missing rows. The commit introduces extensive randomized property-based tests for KeyRangeUnionIterator and KeyIntersectionIterator. Previously, the tests did not test for keys with mixed empty/non-empty clusterings. Changes in KeyRangeUnionIterator: KeyRangeUnionIterator merges streams of primary keys in such a way that duplicates are removed. Unfortunately it does not properly account for the fact that if a key with an empty clustering meets a key with a non-empty clustering and the same partition key, we must always return the key with an empty clustering. A key with an empty clustering will always fetch the rows matched by any specific row key for the same partition, but the reverse is not true. The iterator implementation has been modified to always pick the key that matches more rows - a key with empty clustering wins over a key with non-empty clustering. Additionally, once a key with an empty clustering is emitted, no more keys in that partition are emitted. Changes in KeyRangeIntersectionIterator: Due to a very similar problem like in KeyRangeUnionIterator, KeyRangeIntersectionIterator could return either too few or too many keys, when keys with empty clusterings and keys with non-empty clusterings were present in the input key streams. In particular consider 2 input streams A and B with the following keys: A: 0: (1, Clustering.EMPTY) B: 0: (1, 1) 1: (1, 2) Key A.0 matches the whole partition 1. Therefore, the correct result of intersection are both keys of stream B. Unfortunately, the algorithm before this patch would advance both A and B iterators when emitting the first matching key. At the beginning of the second step, the iterator A would be already exhausted and no more keys would be produced. Finally key B.1 would be missing from the results. This patch fixes it by introducing two changes to the intersection algorithm: 1. A key with non-empty clustering wins over a key with empty clustering and same partition. 2. The selected highest key is not consumed while searching for the highest matching key, but that happens only after the search loop finds a match. Then we have more information which iterators would be moved to the next item. Iterators positioned at a key with an empty clustering can be advanced only after we run out of keys with non-empty clustering in the same partition or if there are no other keys with non-empty clustering. This patch also fixes another issue where we could return a less-specific key matching a full partition instead of a key matching one row: A: 0: (1, Clustering.EMPTY) B: 0: (1, 1) In that case the iterator returned a key with empty clustering, which would result in fetching and postfiltering many unnecessary rows. CNDB-15683: Fix incorrect results when querying mixed AA and EC indexes (#2066) When row-aware and non-row-aware indexes are mixed, we now check the clustering index filter for all the keys that have clustering information, i.e. keys coming from the row-aware indexes. Earlier that check was accidentally disabled if at least one non-row-aware index was used by the query. That could cause retrieving rows that do not match the clustering condition of the query. Rebase notes: - includes CNDB-15683
CNDB-15683: Fix incorrect results when querying mixed AA and EC indexes This commit fixes multiple issues with KeyRangeIterator implementations occasionally skipping or emitting duplicate keys when working on a mix of primary keys with empty / non-empty clusterings. This situation is possible while scanning tables with static columns or when some indexes are partition-aware (e.g. version AA) and others have been updated to a row-aware version (e.g. DC or EC). Due to those bugs, users could get incorrect results from SAI queries, e.g. results containing duplicated rows, duplicated partitions or even missing rows. The commit introduces extensive randomized property-based tests for KeyRangeUnionIterator and KeyIntersectionIterator. Previously, the tests did not test for keys with mixed empty/non-empty clusterings. Changes in KeyRangeUnionIterator: KeyRangeUnionIterator merges streams of primary keys in such a way that duplicates are removed. Unfortunately it does not properly account for the fact that if a key with an empty clustering meets a key with a non-empty clustering and the same partition key, we must always return the key with an empty clustering. A key with an empty clustering will always fetch the rows matched by any specific row key for the same partition, but the reverse is not true. The iterator implementation has been modified to always pick the key that matches more rows - a key with empty clustering wins over a key with non-empty clustering. Additionally, once a key with an empty clustering is emitted, no more keys in that partition are emitted. Changes in KeyRangeIntersectionIterator: Due to a very similar problem like in KeyRangeUnionIterator, KeyRangeIntersectionIterator could return either too few or too many keys, when keys with empty clusterings and keys with non-empty clusterings were present in the input key streams. In particular consider 2 input streams A and B with the following keys: A: 0: (1, Clustering.EMPTY) B: 0: (1, 1) 1: (1, 2) Key A.0 matches the whole partition 1. Therefore, the correct result of intersection are both keys of stream B. Unfortunately, the algorithm before this patch would advance both A and B iterators when emitting the first matching key. At the beginning of the second step, the iterator A would be already exhausted and no more keys would be produced. Finally key B.1 would be missing from the results. This patch fixes it by introducing two changes to the intersection algorithm: 1. A key with non-empty clustering wins over a key with empty clustering and same partition. 2. The selected highest key is not consumed while searching for the highest matching key, but that happens only after the search loop finds a match. Then we have more information which iterators would be moved to the next item. Iterators positioned at a key with an empty clustering can be advanced only after we run out of keys with non-empty clustering in the same partition or if there are no other keys with non-empty clustering. This patch also fixes another issue where we could return a less-specific key matching a full partition instead of a key matching one row: A: 0: (1, Clustering.EMPTY) B: 0: (1, 1) In that case the iterator returned a key with empty clustering, which would result in fetching and postfiltering many unnecessary rows. CNDB-15683: Fix incorrect results when querying mixed AA and EC indexes (#2066) When row-aware and non-row-aware indexes are mixed, we now check the clustering index filter for all the keys that have clustering information, i.e. keys coming from the row-aware indexes. Earlier that check was accidentally disabled if at least one non-row-aware index was used by the query. That could cause retrieving rows that do not match the clustering condition of the query. Rebase notes: - includes CNDB-15683
CNDB-15683: Fix incorrect results when querying mixed AA and EC indexes This commit fixes multiple issues with KeyRangeIterator implementations occasionally skipping or emitting duplicate keys when working on a mix of primary keys with empty / non-empty clusterings. This situation is possible while scanning tables with static columns or when some indexes are partition-aware (e.g. version AA) and others have been updated to a row-aware version (e.g. DC or EC). Due to those bugs, users could get incorrect results from SAI queries, e.g. results containing duplicated rows, duplicated partitions or even missing rows. The commit introduces extensive randomized property-based tests for KeyRangeUnionIterator and KeyIntersectionIterator. Previously, the tests did not test for keys with mixed empty/non-empty clusterings. Changes in KeyRangeUnionIterator: KeyRangeUnionIterator merges streams of primary keys in such a way that duplicates are removed. Unfortunately it does not properly account for the fact that if a key with an empty clustering meets a key with a non-empty clustering and the same partition key, we must always return the key with an empty clustering. A key with an empty clustering will always fetch the rows matched by any specific row key for the same partition, but the reverse is not true. The iterator implementation has been modified to always pick the key that matches more rows - a key with empty clustering wins over a key with non-empty clustering. Additionally, once a key with an empty clustering is emitted, no more keys in that partition are emitted. Changes in KeyRangeIntersectionIterator: Due to a very similar problem like in KeyRangeUnionIterator, KeyRangeIntersectionIterator could return either too few or too many keys, when keys with empty clusterings and keys with non-empty clusterings were present in the input key streams. In particular consider 2 input streams A and B with the following keys: A: 0: (1, Clustering.EMPTY) B: 0: (1, 1) 1: (1, 2) Key A.0 matches the whole partition 1. Therefore, the correct result of intersection are both keys of stream B. Unfortunately, the algorithm before this patch would advance both A and B iterators when emitting the first matching key. At the beginning of the second step, the iterator A would be already exhausted and no more keys would be produced. Finally key B.1 would be missing from the results. This patch fixes it by introducing two changes to the intersection algorithm: 1. A key with non-empty clustering wins over a key with empty clustering and same partition. 2. The selected highest key is not consumed while searching for the highest matching key, but that happens only after the search loop finds a match. Then we have more information which iterators would be moved to the next item. Iterators positioned at a key with an empty clustering can be advanced only after we run out of keys with non-empty clustering in the same partition or if there are no other keys with non-empty clustering. This patch also fixes another issue where we could return a less-specific key matching a full partition instead of a key matching one row: A: 0: (1, Clustering.EMPTY) B: 0: (1, 1) In that case the iterator returned a key with empty clustering, which would result in fetching and postfiltering many unnecessary rows. CNDB-15683: Fix incorrect results when querying mixed AA and EC indexes (#2066) When row-aware and non-row-aware indexes are mixed, we now check the clustering index filter for all the keys that have clustering information, i.e. keys coming from the row-aware indexes. Earlier that check was accidentally disabled if at least one non-row-aware index was used by the query. That could cause retrieving rows that do not match the clustering condition of the query. Rebase notes: - includes CNDB-15683



When row-aware and non-row-aware indexes are mixed, we now check
the clustering index filter for all the keys that have clustering
information, i.e. keys coming from the row-aware
indexes. Earlier that check was accidentally disabled
if at least one non-row-aware index was used by the query.
That could cause retrieving rows that do not match
the clustering condition of the query.