Skip to content

Conversation

@michaelsembwever
Copy link
Member

https://github.com/riptano/cndb/issues/16176

Port into main-5.0 commit f80afce

CNDB-16176: CNDB-15919: Optimize SAI NOT queries, push logic into posting list

Fixes: https://github.com/riptano/cndb/issues/15919
Test PR: https://github.com/riptano/cndb/pull/15949

In the original implementation for
https://github.com/datastax/cassandra/pull/820, we introduced the `PrimaryKeyMapIterator` to iterate all primary keys in an sstable and then do an anti-join on the result of an equality query. That design works, but requires some additional reads from disk to get primary keys that are unnecessary.

There are two possible solutions:

1. We can use row ids (either sstable or segment) to do the complement of the resulting posting lists. This will be the most performant, since it avoids object allocations. The main issue with this solution is that it is much more complicated to implement and had unaddressed edge cases.
2. We can use the `primaryKeyFromRowId` that takes primary key bounds and then uses a row id, when rows are from the same sstable. This will be worse that solution 1 because it creates an object per key and requires comparing sstable ids before comparing sstable row ids, but it is a significant improvement over the current solution, which hits disk to load the primary key.

When testing on my local machine and reviewing the JMH benchmarks, I can see that the current solution is about 16x worse than the minimum solution (2) and 32x worse than the optimal (1) solution. Given that the benchmarks in question are highly specific to the use case, I do no think we have sufficient motivation to introduce the exceedingly complex (1) solution.

Note that the ideal solution to 1, that would have much less complexity, is to convert posting lists into a single iterator of sstable row ids, and then to take the complement of them.

…ting lists (#2112)

Fixes: riptano/cndb#15919
Test PR: riptano/cndb#15949

In the original implementation for
#820, we introduced the
`PrimaryKeyMapIterator` to iterate all primary keys in an sstable and
then do an anti-join on the result of an equality query. That design
works, but requires some additional reads from disk to get primary keys
that are unnecessary.

There are two possible solutions:

1. We can use row ids (either sstable or segment) to do the complement
of the resulting posting lists. This will be the most performant, since
it avoids object allocations. The main issue with this solution is that
it is much more complicated to implement and had unaddressed edge cases.
2. We can use the `primaryKeyFromRowId` that takes primary key bounds
and then uses a row id, when rows are from the same sstable. This will
be worse that solution 1 because it creates an object per key and
requires comparing sstable ids before comparing sstable row ids, but it
is a significant improvement over the current solution, which hits disk
to load the primary key.

When testing on my local machine and reviewing the JMH benchmarks, I can
see that the current solution is about 16x worse than the minimum
solution (2) and 32x worse than the optimal (1) solution. Given that the
benchmarks in question are highly specific to the use case, I do no
think we have sufficient motivation to introduce the exceedingly complex
(1) solution.

Note that the ideal solution to 1, that would have much less complexity,
is to convert posting lists into a single iterator of sstable row ids,
and then to take the complement of them.
@github-actions
Copy link

github-actions bot commented Dec 9, 2025

Checklist before you submit for review

  • This PR adheres to the Definition of Done
  • Make sure there is a PR in the CNDB project updating the Converged Cassandra version
  • Use NoSpamLogger for log lines that may appear frequently in the logs
  • Verify test results on Butler
  • Test coverage for new/modified code is > 80%
  • Proper code formatting
  • Proper title for each commit staring with the project-issue number, like CNDB-1234
  • Each commit has a meaningful description
  • Each commit is not very long and contains related changes
  • Renames, moves and reformatting are in distinct commits
  • All new files should contain the DataStax copyright header instead of the Apache License one

@sonarqubecloud
Copy link

sonarqubecloud bot commented Dec 9, 2025

Copy link

@djatnieks djatnieks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we've seen the TestAllowFiltering fail occasionally before

@michaelsembwever
Copy link
Member Author

Yes, TestAllowFiltering passes locally for me.

@michaelsembwever michaelsembwever merged commit 9dd4c28 into main-5.0 Dec 15, 2025
571 of 594 checks passed
@michaelsembwever michaelsembwever deleted the mck-cndb-16176-main-5.0 branch December 15, 2025 09:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants