Commit 9dd4c28
CNDB-16176: CNDB-15919: Optimize SAI NOT queries, push logic into posting lists (#2112)
Fixes: riptano/cndb#15919
Test PR: riptano/cndb#15949
In the original implementation for #820, we introduced the
`PrimaryKeyMapIterator` to iterate all primary keys in an sstable and
then do an anti-join on the result of an equality query. That design
works, but it requires additional, unnecessary disk reads to load
primary keys.
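As an illustration of the anti-join described above, here is a minimal sketch. It is not the SAI code: `AntiJoinSketch` and `antiJoin` are hypothetical names, and plain `Comparable` values stand in for primary keys. It assumes both inputs are sorted ascending and contain no duplicates.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class AntiJoinSketch {
    // Returns every element of allKeys that does not appear in matches.
    // Assumption: both iterators are sorted ascending and duplicate-free.
    static <T extends Comparable<T>> List<T> antiJoin(Iterator<T> allKeys, Iterator<T> matches) {
        List<T> result = new ArrayList<>();
        T match = matches.hasNext() ? matches.next() : null;
        while (allKeys.hasNext()) {
            T key = allKeys.next();
            // Advance past matches that sort before the current key.
            while (match != null && match.compareTo(key) < 0)
                match = matches.hasNext() ? matches.next() : null;
            // Keep the key only if the equality query did not return it.
            if (match == null || match.compareTo(key) != 0)
                result.add(key);
        }
        return result;
    }
}
```

In the design described above, every element of `allKeys` is a primary key that has to be read from disk, which is the cost this commit aims to reduce.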
There are two possible solutions:
1. We can use row ids (either sstable or segment) to do the complement
of the resulting posting lists. This will be the most performant, since
it avoids object allocations. The main issue with this solution is that
it is much more complicated to implement and had unaddressed edge cases.
2. We can use the `primaryKeyFromRowId` variant that takes primary key
bounds and then relies on the row id when rows are from the same sstable.
This will be worse than solution 1 because it creates an object per key and
requires comparing sstable ids before comparing sstable row ids, but it
is a significant improvement over the current solution, which hits disk
to load the primary key.
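The trade-off between the two options can be sketched side by side. This is a simplified illustration, not the SAI implementation: `NotQuerySketch`, `complement`, and `RowKey` are hypothetical names, the posting list is modeled as a sorted `long[]` within `[0, maxRowId]`, and the fallback comparison for keys from different sstables (a `Supplier` standing in for the disk read) is an assumption.

```java
import java.util.function.Supplier;

public class NotQuerySketch {
    // Solution 1 flavor: complement a sorted, duplicate-free posting list of
    // row ids over [0, maxRowId] using primitives only (no per-key objects).
    static long[] complement(long[] sortedMatches, long maxRowId) {
        long[] out = new long[(int) (maxRowId + 1 - sortedMatches.length)];
        int m = 0, o = 0;
        for (long rowId = 0; rowId <= maxRowId; rowId++) {
            if (m < sortedMatches.length && sortedMatches[m] == rowId)
                m++;                // matched the equality query, so NOT excludes it
            else
                out[o++] = rowId;   // not matched, so it belongs to the result
        }
        return out;
    }

    // Solution 2 flavor: one object per key. Keys from the same sstable are
    // ordered by row id; otherwise the full key is loaded (assumed fallback).
    static final class RowKey implements Comparable<RowKey> {
        final int sstableId;
        final long rowId;
        final Supplier<String> loadKey; // stands in for the disk read

        RowKey(int sstableId, long rowId, Supplier<String> loadKey) {
            this.sstableId = sstableId;
            this.rowId = rowId;
            this.loadKey = loadKey;
        }

        @Override
        public int compareTo(RowKey other) {
            if (sstableId == other.sstableId)
                return Long.compare(rowId, other.rowId); // cheap path, no load
            return loadKey.get().compareTo(other.loadKey.get());
        }
    }
}
```

The first method allocates nothing per row, while the second pays for an object and an sstable id check on every comparison, but it still avoids the load whenever both keys come from the same sstable.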
When testing on my local machine and reviewing the JMH benchmarks, I can
see that the current solution is about 16x worse than the minimum
solution (2) and 32x worse than the optimal (1) solution. Given that the
benchmarks in question are highly specific to the use case, I do not
think we have sufficient motivation to introduce the exceedingly complex
(1) solution.
Note that the ideal version of solution 1, which would be much less
complex, is to convert posting lists into a single iterator of sstable
row ids and then take their complement.

1 parent e054ecf commit 9dd4c28
2 files changed (+99, -1):
- src/java/org/apache/cassandra/index/sai/disk
- test/microbench/org/apache/cassandra/test/microbench/index/sai

Lines changed: 1 addition & 1 deletion
Lines changed: 98 additions & 0 deletions