@pkolaczk

When row-aware and non-row-aware indexes are mixed, we now check
the clustering index filter for all keys that carry clustering
information, i.e. keys coming from the row-aware indexes.
Previously, that check was accidentally disabled whenever the query
used at least one non-row-aware index, so the query could retrieve
rows that do not match its clustering condition.
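
The idea can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not the code of the patch: `Key`, `ClusteringRange` and `selectAll` are hypothetical stand-ins for the real primary key and clustering index filter types, and a `null` clustering models a key produced by a non-row-aware index. Letting such keys pass through unchecked is also an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class SelectsSketch {
    /** Hypothetical stand-in for a primary key; a null clustering models a key from a non-row-aware index. */
    static final class Key {
        final int partition;
        final Integer clustering;

        Key(int partition, Integer clustering) {
            this.partition = partition;
            this.clustering = clustering;
        }

        @Override
        public String toString() {
            return "(" + partition + ", " + (clustering == null ? "EMPTY" : clustering) + ")";
        }
    }

    /** Hypothetical stand-in for the clustering index filter: an inclusive clustering range. */
    static final class ClusteringRange {
        final int min;
        final int max;

        ClusteringRange(int min, int max) {
            this.min = min;
            this.max = max;
        }

        boolean contains(int clustering) {
            return min <= clustering && clustering <= max;
        }
    }

    /** One behavior only: a key that has a clustering must match the filter, whichever index produced it. */
    static boolean selects(ClusteringRange filter, Key key) {
        return key.clustering == null || filter.contains(key.clustering);
    }

    /** Keeps only the keys accepted by selects, preserving their order. */
    static List<Key> selectAll(ClusteringRange filter, List<Key> keys) {
        List<Key> selected = new ArrayList<>();
        for (Key key : keys)
            if (selects(filter, key))
                selected.add(key);
        return selected;
    }

    public static void main(String[] args) {
        List<Key> keys = Arrays.asList(new Key(1, null), new Key(1, 1), new Key(1, 2));
        // Clustering 1 falls outside [2, 3], so the row key (1, 1) is rejected.
        System.out.println(selectAll(new ClusteringRange(2, 3), keys)); // prints [(1, EMPTY), (1, 2)]
    }
}
```

In this sketch the decision depends only on the key itself, not on which mix of index versions the query uses.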

@github-actions
github-actions bot commented Oct 15, 2025

Checklist before you submit for review

  • This PR adheres to the Definition of Done
  • Make sure there is a PR in the CNDB project updating the Converged Cassandra version
  • Use NoSpamLogger for log lines that may appear frequently in the logs
  • Verify test results on Butler
  • Test coverage for new/modified code is > 80%
  • Proper code formatting
  • Proper title for each commit starting with the project-issue number, like CNDB-1234
  • Each commit has a meaningful description
  • Each commit is not very long and contains related changes
  • Renames, moves and reformatting are in distinct commits
  • All new files should contain the DataStax copyright header instead of the Apache License one

*/
public boolean selects(PrimaryKey key)
{
return !indexFeatureSet.isRowAware() ||
Member

It would seem the bug was introduced when we switched the makeFilter methods in the QueryController as part of #1883. An alternate solution is to use the indexFeatureSet.isRowAware() to determine that we should use the regular clusteringIndexFilter instead of one made up of only the primary keys coming from the iterator.

I agree that removing this check works and that it might be slightly more efficient in some cases, but I worry that it leaves the two makeFilter methods having different sets of logic.

Author

@pkolaczk pkolaczk Oct 16, 2025

I agree that using isRowAware in the makeFilter method also works and that was the first thing I tried.

But there is a caveat: I argue we really don't want this check here anyway. It increases the complexity of the code by introducing an additional code path, and it creates a hidden dependency between selects and the particular filter created by makeFilter. So we have two distant methods that must both take the correct path for the code to be correct. If one does the wrong thing, out of sync with the other, the results are incorrect. Also, why would we even want to further process any keys that don't match the clustering filter (regardless of AA / non-AA)?

So here, by removing this check, I made this method do just one thing and decoupled it from the particular filter used.

Earlier it had two different behaviors for row-aware and non-row-aware indexes: this method either checked the filter or not, depending on the "mode" the query was running in. In general, such long-distance dependencies make code hard to reason about and introduce subtle bugs. I'll insist on avoiding this programming style, where one method does something partially in the hope that some other code called later fixes or somehow tolerates it.

And btw, the fact that we have some inconsistency between the two variants of makeFilter, where one does the isRowAware check and the other doesn't, is something that concerns me as well. Why do we even have two makeFilter variants? Why do we even need to build our custom filter that filters by keys from the index?

So it's not a question of whether we want this fix or the makeFilter fix, but rather whether we want to harden / simplify makeFilter in addition to this.

Member

I agree with your reasoning, and I am good with adding a follow-up to clean up this part of the code base.

@pkolaczk pkolaczk force-pushed the c15683-check-clustering-filter-for-mixed-AA-EC branch from 54fb5d7 to 0fdf370 Compare October 17, 2025 09:26
@pkolaczk pkolaczk force-pushed the c15683-check-clustering-filter-for-mixed-AA-EC branch from 0fdf370 to d326725 Compare October 17, 2025 09:28
@cassci-bot
✔️ Build ds-cassandra-pr-gate/PR-2066 approved by Butler



flush(); // Force to sstable with EC version

beforeAndAfterFlush(() -> {

Can we add beforeAndAfterFlushAndCompact? We recently had problems because we did not test indexes after compaction.
I think that it does no harm.
I expect that compaction will probably rewrite all the indexes to the same version, and that's fine.

Author

I missed this comment but now I can see the branch got merged. Yes, it wouldn't do any harm.

@eolivelli eolivelli merged commit e5a32d3 into main Oct 18, 2025
491 checks passed
@eolivelli eolivelli deleted the c15683-check-clustering-filter-for-mixed-AA-EC branch October 18, 2025 09:38
pkolaczk pushed a commit that referenced this pull request Oct 31, 2025
…es (#2066)

pkolaczk pushed a commit that referenced this pull request Nov 3, 2025
…es (#2066)

michaelsembwever pushed a commit that referenced this pull request Nov 3, 2025
CNDB-15683: Fix incorrect results when querying mixed AA and EC indexes

This commit fixes multiple issues with KeyRangeIterator implementations
occasionally skipping or emitting duplicate keys when working on
a mix of primary keys with empty / non-empty clusterings.
This situation is possible while scanning tables with static columns
or when some indexes are partition-aware (e.g. version AA) and
others have been updated to a row-aware version (e.g. DC or EC).
Due to those bugs, users could get incorrect results from SAI queries,
e.g. results containing duplicated rows, duplicated partitions or
even missing rows.

The commit introduces extensive randomized property-based tests for
KeyRangeUnionIterator and KeyRangeIntersectionIterator. Previously,
the tests did not cover keys with mixed empty/non-empty clusterings.

Changes in KeyRangeUnionIterator:

KeyRangeUnionIterator merges streams of primary keys in such a way that
duplicates are removed. Before this patch, it did not properly account
for the fact that if a key with an empty clustering meets a key
with a non-empty clustering and the same partition key, we must
always return the key with an empty clustering. A key with an empty
clustering will always fetch the rows matched by any specific row
key for the same partition, but the reverse is not true.

The iterator implementation has been modified to always pick the
key that matches more rows - a key with empty clustering wins
over a key with non-empty clustering. Additionally, once a key
with an empty clustering is emitted, no more keys in that partition
are emitted.
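
The union rule described above can be sketched as follows. This is a simplified illustration, not the actual KeyRangeUnionIterator: it merges two already sorted lists instead of streaming iterators, `Key` is a hypothetical stand-in for the real primary key type, a `null` clustering models Clustering.EMPTY, and the sketch assumes that a key with empty clustering sorts before row keys of the same partition.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class UnionSketch {
    /** Hypothetical stand-in for a primary key; a null clustering models Clustering.EMPTY. */
    static final class Key {
        final int partition;
        final Integer clustering;

        Key(int partition, Integer clustering) {
            this.partition = partition;
            this.clustering = clustering;
        }

        @Override
        public String toString() {
            return "(" + partition + ", " + (clustering == null ? "EMPTY" : clustering) + ")";
        }
    }

    /** Assumed ordering: by partition, then empty clustering first, then by clustering value. */
    static int compare(Key a, Key b) {
        if (a.partition != b.partition)
            return Integer.compare(a.partition, b.partition);
        if (a.clustering == null || b.clustering == null)
            return a.clustering == null ? (b.clustering == null ? 0 : -1) : 1;
        return Integer.compare(a.clustering, b.clustering);
    }

    /** Merges two sorted key lists, dropping every key already covered by the last emitted key. */
    static List<Key> union(List<Key> a, List<Key> b) {
        List<Key> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < a.size() || j < b.size()) {
            Key next;
            if (j >= b.size() || (i < a.size() && compare(a.get(i), b.get(j)) <= 0))
                next = a.get(i++);
            else
                next = b.get(j++);

            Key last = out.isEmpty() ? null : out.get(out.size() - 1);
            // Covered: exact duplicate, or the whole partition was already emitted via an empty clustering.
            boolean covered = last != null
                              && last.partition == next.partition
                              && (last.clustering == null || last.clustering.equals(next.clustering));
            if (!covered)
                out.add(next);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Key> a = Arrays.asList(new Key(1, null), new Key(2, 1));
        List<Key> b = Arrays.asList(new Key(1, 1), new Key(1, 2), new Key(2, 1), new Key(2, 3));
        System.out.println(union(a, b)); // prints [(1, EMPTY), (2, 1), (2, 3)]
    }
}
```

In the usage above, partition 1 is emitted once as a whole-partition key and its row keys are suppressed, while partition 2 keeps its distinct row keys.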

Changes in KeyRangeIntersectionIterator:

Due to a problem very similar to the one in KeyRangeUnionIterator,
KeyRangeIntersectionIterator could return either too few or
too many keys when keys with empty clusterings and keys
with non-empty clusterings were present in the input key streams.

In particular, consider two input streams A and B with the following
keys:

A:
0: (1, Clustering.EMPTY)

B:
0: (1, 1)
1: (1, 2)

Key A.0 matches the whole partition 1. Therefore, the correct result
of the intersection is both keys of stream B. Unfortunately, the algorithm
before this patch would advance both the A and B iterators when emitting
the first matching key. At the beginning of the second step,
iterator A would already be exhausted and no more keys would
be produced. As a result, key B.1 would be missing from the results.

This patch fixes it by introducing two changes to the intersection
algorithm:

1. A key with non-empty clustering wins over a key with
empty clustering and same partition.

2. The selected highest key is no longer consumed while searching
for the highest matching key; it is consumed only after the
search loop finds a match. At that point we know more about
which iterators should be moved to the next item. Iterators positioned
at a key with an empty clustering can be advanced only after
we run out of keys with non-empty clustering in the same partition,
or if there are no other keys with non-empty clustering.

This patch also fixes another issue where we could return
a less-specific key matching a full partition instead of a key
matching one row:

A:
0: (1, Clustering.EMPTY)

B:
0: (1, 1)

In that case the iterator returned the key with empty clustering,
which would result in fetching and post-filtering many unnecessary rows.
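
The two intersection rules can be sketched as follows. This is a simplified illustration, not the actual KeyRangeIntersectionIterator: it intersects two already sorted lists instead of streaming iterators, `Key` is a hypothetical stand-in for the real primary key type, a `null` clustering models Clustering.EMPTY, and the sketch assumes each input holds either one key with empty clustering or row keys for a given partition.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class IntersectionSketch {
    /** Hypothetical stand-in for a primary key; a null clustering models Clustering.EMPTY. */
    static final class Key {
        final int partition;
        final Integer clustering;

        Key(int partition, Integer clustering) {
            this.partition = partition;
            this.clustering = clustering;
        }

        @Override
        public String toString() {
            return "(" + partition + ", " + (clustering == null ? "EMPTY" : clustering) + ")";
        }
    }

    /** Intersects two sorted key lists; a row key wins over a whole-partition key of the same partition. */
    static List<Key> intersect(List<Key> a, List<Key> b) {
        List<Key> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < a.size() && j < b.size()) {
            Key ka = a.get(i), kb = b.get(j);
            if (ka.partition != kb.partition) {
                // No match possible yet: advance the side that is behind.
                if (ka.partition < kb.partition) i++; else j++;
            } else if (ka.clustering == null && kb.clustering == null) {
                // Both sides match the whole partition.
                out.add(ka);
                i++;
                j++;
            } else if (ka.clustering == null) {
                // Emit the more specific key and keep A parked on its whole-partition key.
                out.add(kb);
                j++;
            } else if (kb.clustering == null) {
                out.add(ka);
                i++;
            } else {
                int cmp = Integer.compare(ka.clustering, kb.clustering);
                if (cmp == 0) out.add(ka);
                if (cmp <= 0) i++;
                if (cmp >= 0) j++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Key> a = Arrays.asList(new Key(1, null));
        List<Key> b = Arrays.asList(new Key(1, 1), new Key(1, 2));
        System.out.println(intersect(a, b)); // prints [(1, 1), (1, 2)]
    }
}
```

The whole-partition key of A is left behind only once B moves on to another partition, which is why key B.1 from the first example is no longer lost, and why the second example yields the row key rather than the whole partition.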

CNDB-15683: Fix incorrect results when querying mixed AA and EC indexes (#2066)

When row-aware and non-row-aware indexes are mixed, we now check
the clustering index filter for all the keys that have clustering
information, i.e. keys coming from the row-aware
indexes. Earlier that check was accidentally disabled
if at least one non-row-aware index was used by the query.
That could cause retrieving rows that do not match
the clustering condition of the query.

Rebase notes:
 - includes CNDB-15683
michaelsembwever pushed a commit that referenced this pull request Nov 5, 2025
CNDB-15683: Fix incorrect results when querying mixed AA and EC indexes

michaelsembwever pushed a commit that referenced this pull request Nov 5, 2025
CNDB-15683: Fix incorrect results when querying mixed AA and EC indexes

michaelsembwever pushed a commit that referenced this pull request Nov 12, 2025
CNDB-15683: Fix incorrect results when querying mixed AA and EC indexes

michaelsembwever pushed a commit that referenced this pull request Nov 12, 2025
CNDB-15683: Fix incorrect results when querying mixed AA and EC indexes
