CNDB 14808 #1876

Merged 1 commit on Jul 16, 2025

@driftx driftx commented Jul 15, 2025

This ports CNDB-14317 to main-5.0. It is based on CNDB-14791, which is a prerequisite.

Checklist before you submit for review

  • Make sure there is a PR in the CNDB project updating the Converged Cassandra version
  • Use NoSpamLogger for log lines that may appear frequently in the logs
  • Verify test results on Butler
  • Test coverage for new/modified code is > 80%
  • Proper code formatting
  • Proper title for each commit starting with the project-issue number, like CNDB-1234
  • Each commit has a meaningful description
  • Each commit is not very long and contains related changes
  • Renames, moves and reformatting are in distinct commits
  • All new files should contain the DataStax copyright header instead of the Apache License one

…1789)

- **CNDB-14317: Optimize doc freq computation for memtable BM25 queries**
- **Update approximateTotalTermCount() javadoc**

### What is the issue
Fixes riptano/cndb#14317

### What does this PR fix and why was it fixed
We optimize the in-memory BM25 computation by using the trie to get the
number of rows matching a query term. This change removes a memtable
scan-and-analyze pass, replacing it with calls to the index to get the
number of docs. There are no new tests because the change is expected to
maintain the same semantics. I will review the test coverage to verify
that assertion.
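
For readers unfamiliar with the idea, here is a minimal, self-contained sketch of the difference between the two ways of getting a term's document frequency. It is not the code from this PR: the class `DocFreqSketch`, its methods, the `TreeMap` standing in for the index trie, and the trivial analyzer are all hypothetical, and the IDF formula is a commonly used BM25 form assumed for illustration only.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical sketch; none of these names come from the Cassandra codebase.
final class DocFreqSketch
{
    // Stand-in for the index trie: term -> ids of the rows that contain it.
    private final TreeMap<String, Set<Integer>> termToRows = new TreeMap<>();
    // Stand-in for the memtable: the raw text of each row.
    private final List<String> rows = new ArrayList<>();

    void insert(String text)
    {
        int rowId = rows.size();
        rows.add(text);
        for (String term : analyze(text))
            termToRows.computeIfAbsent(term, t -> new HashSet<>()).add(rowId);
    }

    int rowCount()
    {
        return rows.size();
    }

    // Trivial analyzer (an assumption): lowercase, split on non-alphanumerics.
    static Set<String> analyze(String text)
    {
        Set<String> terms = new HashSet<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+"))
            if (!token.isEmpty())
                terms.add(token);
        return terms;
    }

    // Scan-based approach: visit every row and re-analyze its text.
    int docFreqByScan(String term)
    {
        int count = 0;
        for (String text : rows)
            if (analyze(text).contains(term))
                count++;
        return count;
    }

    // Index-based approach: one lookup, no scan and no re-analysis.
    int docFreqByIndex(String term)
    {
        Set<Integer> matches = termToRows.get(term);
        return matches == null ? 0 : matches.size();
    }

    // A commonly used BM25 inverse document frequency (assumed form).
    static double idf(int docFreq, int totalDocs)
    {
        return Math.log(1.0 + (totalDocs - docFreq + 0.5) / (docFreq + 0.5));
    }

    public static void main(String[] args)
    {
        DocFreqSketch sketch = new DocFreqSketch();
        sketch.insert("the quick brown fox");
        sketch.insert("the lazy dog");
        sketch.insert("Quick, quick wins");
        int df = sketch.docFreqByIndex("quick");
        System.out.println("df=" + df + " idf=" + idf(df, sketch.rowCount()));
    }
}
```

Both methods return the same count for any term, which is why the change can be expected to preserve semantics; the index lookup simply avoids touching every row.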
@driftx driftx merged commit 23c854b into main-5.0 Jul 16, 2025
6 of 173 checks passed
@driftx driftx deleted the CNDB-14808 branch July 16, 2025 15:37

@cassci-bot

❌ Build ds-cassandra-pr-gate/PR-1876 rejected by Butler


35 new test failure(s) in 2 builds
See build details here


Found 35 new test failures

Showing only first 15 new test failures

Test Explanation Branch history Upstream history
...lidation.operations.AlterTest-compression_jdk11 regression 🔴🔴
...nQueryShouldNotTimeoutWhenItExceedesReadTimeout regression 🔴🔴
...nglePageReadIsFastButAggregationExceedesTimeout regression 🔴🔴
...adCommitLogAndSSTablesWithDroppedColumnTestCC50 regression 🔴🔴
...oadCommitLogAndSSTablesWithDroppedColumnTestDSE regression 🔴🔴
...thRestartTest.testReadingValuesOfDroppedColumns regression 🔴🔴
...d.t.s.VectorDistributedTest.rangeRestrictedTest regression 🔴🔵
o.a.c.d.t.s.f.FeaturesVersionSupportDBTest.testANN regression 🔴🔴
o.a.c.d.t.s.f.FeaturesVersionSupportDCTest.testANN regression 🔴🔴
o.a.c.d.t.s.f.FeaturesVersionSupportEBTest.testANN regression 🔴🔴
...c.FeaturesVersionSupportTest.testANNSupport[eb] regression 🔴🔴
....FeaturesVersionSupportTest.testGeoDistance[aa] regression 🔴🔴
....FeaturesVersionSupportTest.testGeoDistance[ba] regression 🔴🔴
...cySSTableTest.testVerifyOldDroppedTupleSSTables regression 🔴🔴
...m.TrieMemtableMetricsTest.testContentionMetrics regression 🔴🔵

Found 2 known test failures
