Skip to content

Conversation

@djatnieks
Copy link

@djatnieks djatnieks commented May 29, 2025

What is the issue

CNDB-14275 Port commits from main to main-5.0

What does this PR fix and why was it fixed

This PR is porting the following commits from main branch to main-5.0 branch.

CNDB-13739 count numRows correctly in SSTable SAI (#1695) cb3ee3d Ruslan Fomkin [email protected] Apr 25, 2025 at 2:06 AM

  • minor conflicts in SSTableIndexWriter due to original usage of TypeUtil.encode being now TypeUtil.asIndexBytes in 5.0

CNDB-13847: Add Token::fromLongValue method to simplify SAI token creation (#1709) 7d52230 Michael Marshall [email protected] Apr 24, 2025 at 7:41 AM

  • No conflicts

CNDB-13848: avoid serializing AlwaysPresentFilter when BF memory limit is reached (#1710) 1458fed Zhao Yang [email protected] Apr 24, 2025 at 1:13 AM

  • Checks for filter instanceOf BloomFilter collapsed into the FilterComponent.save method because 5.0 has done a lot of refactoring here. This covers the original commit changes to SSTableReader.saveBloomFilter, BigTableWriter.flushBf, TrieIndexSSTableReader.recreateBloomFilter, and TrieIndexSSTableWriter.flushBf.
  • AlwaysPresentFilter.toString goes instead to FilterFactory.AlwaysPresentFilter.toString.

CNDB-13625: Implement rerankless vector search through ANN_OPTIONS (#1705) 053fc12 Michael Marshall [email protected] Apr 23, 2025 at 9:54 AM

  • trivial conflict on Tracing.trace statement in CassandraDiskAnn

CNDB-13833: Fix QueryTimeoutTest on JDK22 6c2cd8d Ekaterina Dimitrova [email protected] Apr 23, 2025 at 8:11 AM

  • trivial conflict that I don't remember - it's a one line change

CNDB-13822: Add ANN_OPTION use_pruning (#1704) 9cd80ce Michael Marshall [email protected] Apr 22, 2025 at 9:39 PM

  • Added CassandraRelevantProperties entry:
    SAI_VECTOR_USE_PRUNING_DEFAULT("cassandra.sai.jvector.use_pruning_default", "1000"),

CNDB-13689: use NodeQueue::pushMany to decrease time complexity to build heap (#1693) df81d9a Michael Marshall [email protected] Apr 22, 2025 at 9:17 AM

  • no merge conflicts

michaeljmarshall and others added 7 commits May 28, 2025 14:30
…ild heap (#1693)

### What is the issue
Fixes: riptano/cndb#13689

### What does this PR fix and why was it fixed
This PR utilizes the NodeQueue::pushMany method to decrease the time
complexity required to build the NodeQueue from `O(n log(n))` to `O(n)`.
This is likely only significant for sufficiently large hybrid queries.
For example, we have seen cases of the search producing 400k rows, which
means that we do 400k insertions into these NodeQueue objects.
Fixes riptano/cndb#13822

CNDB test pr riptano/cndb#13826

SAI’s DiskANN/JVector engine currently **always searches with pruning
enabled**, trading recall for latency. That is fine for most workloads,
but:

* **Threshold / bounded ANN queries** can lose matches when pruning
exits early.
* Performance‑testing users need an easy way to turn pruning off to
measure the recall/latency curve.

This patch introduces a per‑query **`use_pruning`** option so users and
operators can choose the trade‑off that suits them.

---

```cql
WITH ann_options = {'use_pruning': true|false}
```

*When omitted we fall back to the node level default (see below).*

* Cluster‑wide default is controlled by the JVM system property:
  ```
  -Dcassandra.sai.jvector.use_pruning_default=<true|false>
  ```
  exposed as `V3OnDiskFormat.JVECTOR_USE_PRUNING_DEFAULT`.
* The property defaults to `true`, preserving existing behaviour.

* Value must be the literal `true` or `false` (case‑insensitive).
* Unknown ANN option keys continue to raise `InvalidRequestException`.

* `Orderer` computes the value of the `usePruning` option by using the
`use_pruning` value if it is not null or the jvm default if it is null
and passes it down to **all** `graph.search()` calls.
* Threshold / bounded ANN queries always pass `use_pruning = false`
because correctness > latency for those paths (this is a net new change,
but it's very minor and might not have any impact on those queries
depending on the jvector implementation)

* We added one flag bit to `ANNOptions` serialization; older nodes
ignore unknown bits, so mixed‑version clusters are fine (though they do
throw an exception for unknown settings)

* Parsing, validation and transport round‑trips (`ANNOptionsTest`).
* Distributed smoke (`ANNOptionsDistributedTest`).
* Recall regression for pruning vs no‑pruning
(`VectorSiftSmallTest.ensureDisablingPruningIncreasesRecall`).
…1705)

### What is the issue
Fixes: riptano/cndb#13625

### What does this PR fix and why was it fixed
When `rerank_k` is non-positive, we will skip reranking where possible.

A minor note: this change is not strictly backwards compatible because
of the original implementation. However, it will fail quickly if the
coordinator accepts the value but the writer does not, and since this is
only a query-time parameter, I propose that we move forward and accept
the net-new value that was previously disallowed.
…ation (#1709)

### What is the issue
Fixes: riptano/cndb#13847

### What does this PR fix and why was it fixed
We already make use of the `Token#getLongValue` method in SAI, which
only works for the Murmur partitioner, so I propose that we add a
symmetric method that allows us to create tokens without converting a
long to a byte buffer back to a long.
The number of rows stored in SSTable's SAI index was incorrect for
analyzed indexes and for indexes on collections as it was counting
posting lists for each term. In non-analyzed or non-collection indexes
there is a term per row, while in analyzed index there can be many terms
in a row. This led to incorrect count of rows.

Fixes counting number of rows stored in SSTable SAI index.

Minor fixes of code warnings in the affected files.
@github-actions
Copy link

Checklist before you submit for review

  • Make sure there is a PR in the CNDB project updating the Converged Cassandra version
  • Use NoSpamLogger for log lines that may appear frequently in the logs
  • Verify test results on Butler
  • Test coverage for new/modified code is > 80%
  • Proper code formatting
  • Proper title for each commit staring with the project-issue number, like CNDB-1234
  • Each commit has a meaningful description
  • Each commit is not very long and contains related changes
  • Renames, moves and reformatting are in distinct commits
  • All new files should contain the DataStax copyright header instead of the Apache License one

@djatnieks djatnieks requested a review from driftx May 29, 2025 16:56
@sonarqubecloud
Copy link

@cassci-bot
Copy link

❌ Build ds-cassandra-pr-gate/PR-1780 rejected by Butler


16 new test failure(s) in 1 builds
See build details here


Found 16 new test failures

Showing only first 15 new test failures

Test Explanation Branch history Upstream history
...ompactionWriter_fiveShards[isReplicaAware=true] regression 🔴
...adCommitLogAndSSTablesWithDroppedColumnTestCC40 regression 🔴
...adCommitLogAndSSTablesWithDroppedColumnTestCC50 regression 🔴
...oadCommitLogAndSSTablesWithDroppedColumnTestDSE regression 🔴
...thRestartTest.testReadingValuesOfDroppedColumns regression 🔴
o.a.c.i.s.c.BM25Test.testCollections regression 🔴
...est.testOrderingSeveralSSTablesWithMapPredicate regression 🔴
...uldTolerateDuplicatedRowIDsAfterMemtableUpdates regression 🔴
...ectorTypeTest.newJVectorOptionsTestVersion2[ca] regression 🔴
...ectorTypeTest.newJVectorOptionsTestVersion2[dc] regression 🔴
...ectorTypeTest.newJVectorOptionsTestVersion4[ca] regression 🔴
...lushingTest.testFlushingLargeStaleMemtableIndex regression 🔴
...cySSTableTest.testVerifyOldDroppedTupleSSTables regression 🔴
o.a.c.t.SSTablePartitionsTest.testDirectory regression 🔴
o.a.c.u.b.BinLogTest.testTruncationReleasesLogS... regression 🔴

No known test failures found

@djatnieks
Copy link
Author

Reviewed the test failures that butler reported and all but one are already known (listed in the 5.0 Test Failures spreadsheet)

The new failure is ShardedMultiWriterTest.testShardedCompactionWriter but this has been previously reported in https://github.com/riptano/cndb/issues/12712 and the stacktraces match, so I don't think any action is needed here

junit.framework.AssertionFailedError: expected:<5> but was:<4>
	at org.apache.cassandra.db.compaction.unified.ShardedMultiWriterTest.testShardedCompactionWriter(ShardedMultiWriterTest.java:106)
	at org.apache.cassandra.db.compaction.unified.ShardedMultiWriterTest.testShardedCompactionWriter_fiveShards(ShardedMultiWriterTest.java:68)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

@djatnieks djatnieks merged commit 29ef34e into main-5.0 May 29, 2025
566 of 581 checks passed
@djatnieks djatnieks deleted the CNDB-14275-20250528-1 branch May 29, 2025 21:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants