CNDB-16176: CNDB-15919: Optimize SAI NOT queries, push logic into posting list #2164

michaelsembwever · 2025-12-09T14:15:07Z

https://github.com/riptano/cndb/issues/16176

Port into main-5.0 commit f80afce

CNDB-16176: CNDB-15919: Optimize SAI NOT queries, push logic into posting list

Fixes: https://github.com/riptano/cndb/issues/15919
Test PR: https://github.com/riptano/cndb/pull/15949

In the original implementation for https://github.com/datastax/cassandra/pull/820, we introduced the `PrimaryKeyMapIterator` to iterate all primary keys in an sstable and then do an anti-join on the result of an equality query. That design works, but it requires additional, unnecessary disk reads to fetch primary keys.
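
To make the anti-join concrete, here is a minimal sketch, not the actual SAI code. It assumes both inputs are sorted ascending and free of duplicates, and it uses any `Comparable` key as a stand-in for a primary key. The names `AntiJoinSketch` and `antiJoin` are hypothetical and do not come from the Cassandra codebase.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration only; these names are not from the Cassandra codebase.
final class AntiJoinSketch {
    // Returns every key in allKeys that does not appear in matches.
    // Assumes both lists are sorted ascending, duplicate-free, and contain no nulls.
    static <K extends Comparable<K>> List<K> antiJoin(List<K> allKeys, List<K> matches) {
        List<K> result = new ArrayList<>();
        Iterator<K> excluded = matches.iterator();
        K next = excluded.hasNext() ? excluded.next() : null;
        for (K key : allKeys) {
            // Advance the exclusion side until it reaches or passes the current key.
            while (next != null && next.compareTo(key) < 0) {
                next = excluded.hasNext() ? excluded.next() : null;
            }
            // Keep the key only when the exclusion side does not contain it.
            if (next == null || next.compareTo(key) != 0) {
                result.add(key);
            }
        }
        return result;
    }
}
```

In this sketch every element of `allKeys` is a materialized key object, which mirrors the cost described above: in the real index, producing each of those keys may require a disk read.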

There are two possible solutions:

1. We can use row ids (either sstable or segment) to take the complement of the resulting posting lists. This will be the most performant, since it avoids object allocations. The main issue with this solution is that it is much more complicated to implement and has unaddressed edge cases.
2. We can use the `primaryKeyFromRowId` that takes primary key bounds and then uses a row id when rows are from the same sstable. This will be worse than solution 1 because it creates an object per key and requires comparing sstable ids before comparing sstable row ids, but it is a significant improvement over the current solution, which hits disk to load the primary key.

When testing on my local machine and reviewing the JMH benchmarks, I can see that the current solution is about 16x worse than the minimum solution (2) and 32x worse than the optimal solution (1). Given that the benchmarks in question are highly specific to the use case, I do not think we have sufficient motivation to introduce the exceedingly complex solution (1).

Note that the ideal approach to solution 1, which would be much less complex, is to convert the posting lists into a single iterator of sstable row ids and then take the complement of that iterator.
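
As a rough sketch of that idea, and not the code in this PR, the example below merges several sorted posting lists into one ascending, de-duplicated sequence of row ids and then complements it against the range `[0, maxRowId]`. It assumes posting lists are plain sorted `long[]` arrays of non-negative row ids local to a single sstable. `RowIdComplementSketch`, `union`, `complement`, and `maxRowId` are hypothetical names chosen for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration only; these names are not from the Cassandra codebase.
final class RowIdComplementSketch {
    // Merges sorted posting lists into one ascending, de-duplicated array of row ids.
    static long[] union(List<long[]> postingLists) {
        // Each heap entry is {list index, position}, ordered by the row id it points at.
        PriorityQueue<int[]> heap = new PriorityQueue<>(
            (a, b) -> Long.compare(postingLists.get(a[0])[a[1]], postingLists.get(b[0])[b[1]]));
        int total = 0;
        for (int i = 0; i < postingLists.size(); i++) {
            total += postingLists.get(i).length;
            if (postingLists.get(i).length > 0) heap.add(new int[] {i, 0});
        }
        long[] merged = new long[total];
        int size = 0;
        while (!heap.isEmpty()) {
            int[] head = heap.poll();
            long rowId = postingLists.get(head[0])[head[1]];
            // Skip row ids already emitted by another posting list.
            if (size == 0 || merged[size - 1] != rowId) merged[size++] = rowId;
            // Re-queue the list if it has more row ids left.
            if (++head[1] < postingLists.get(head[0]).length) heap.add(head);
        }
        return Arrays.copyOf(merged, size);
    }

    // Returns the row ids in [0, maxRowId] that are absent from the sorted matches.
    static long[] complement(long[] matches, long maxRowId) {
        long[] out = new long[(int) (maxRowId + 1)];
        int n = 0;
        int m = 0;
        for (long rowId = 0; rowId <= maxRowId; rowId++) {
            if (m < matches.length && matches[m] == rowId) {
                m++; // this row id matched the query, so NOT excludes it
            } else {
                out[n++] = rowId;
            }
        }
        return Arrays.copyOf(out, n);
    }
}
```

Because everything here is a primitive `long`, no per-row key object is created and no sstable id comparison is needed, which is the property that makes the row id approach attractive. A real implementation would presumably stream lazily rather than materialize arrays, and would need to handle the edge cases mentioned above.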


github-actions · 2025-12-09T14:15:26Z

sonarqubecloud · 2025-12-09T16:37:08Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
100.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

cassci-bot · 2025-12-09T17:03:21Z

❌ Build ds-cassandra-pr-gate/PR-2164 rejected by Butler

9 regressions found
See build details here

Found 9 new test failures

| Test | Explanation | Runs | Upstream |
| --- | --- | --- | --- |
| replica_side_filtering_test.TestAllowFiltering.test_count (offheap-bti) | NEW | 🔴 | 0 / 25 |
| o.a.c.cql3.validation.operations.AggregationQueriesTest.testAggregationQueryShouldNotTimeoutWhenItExceedesReadTimeout (compression) | NEW | 🔴 | 2 / 25 |
| o.a.c.distributed.test.DropUDTWithRestartTest.loadCommitLogAndSSTablesWithDroppedColumnTestCC50 | NEW | 🔴 | 22 / 25 |
| o.a.c.distributed.test.DropUDTWithRestartTest.loadCommitLogAndSSTablesWithDroppedColumnTestCassandra41 | NEW | 🔴 | 1 / 25 |
| o.a.c.distributed.test.DropUDTWithRestartTest.loadCommitLogAndSSTablesWithDroppedColumnTestCassandra5 | NEW | 🔴 | 1 / 25 |
| o.a.c.distributed.test.repair.ForceRepairTest.terminated successfully () | NEW | 🔴 | 1 / 25 |
| o.a.c.index.SecondaryIndexManagerTest.testIndexRebuildWhenAddingSStableViaRemoteReload (compression) | NEW | 🔴 | 20 / 25 |
| o.a.c.index.sai.cql.VectorKeyRestrictedOnPartitionTest.partitionRestrictedWidePartitionBqCompressedTest[dc] (compression) | NEW | 🔴 | 0 / 25 |
| o.a.c.metrics.TrieMemtableMetricsTest.testContentionMetrics (compression) | NEW | 🔴 | 9 / 25 |

No known test failures found

djatnieks

I think we've seen TestAllowFiltering fail occasionally before.

michaelsembwever · 2025-12-15T09:20:05Z

Yes, TestAllowFiltering passes locally for me.

michaelsembwever requested a review from michaeljmarshall December 9, 2025 14:15

djatnieks approved these changes Dec 10, 2025


michaelsembwever merged commit 9dd4c28 into main-5.0 Dec 15, 2025
571 of 594 checks passed

michaelsembwever deleted the mck-cndb-16176-main-5.0 branch December 15, 2025 09:20
