CNDB-13689: use NodeQueue::pushMany to decrease time complexity to build heap #1693

Open: wants to merge 2 commits into base: main

@michaeljmarshall (Member) commented Apr 11, 2025

What is the issue

Fixes: https://github.com/riptano/cndb/issues/13689

What does this PR fix and why was it fixed

This PR uses the NodeQueue::pushMany method to build the NodeQueue in O(n) time instead of O(n log(n)). The difference is likely significant only for sufficiently large hybrid queries. For example, we have seen searches produce 400k rows, which means 400k insertions into these NodeQueue objects.
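For reviewers who want the complexity argument spelled out, here is a minimal, self-contained sketch of the two ways to build a binary min-heap. It is not the jvector NodeQueue code: the class `HeapBuildSketch` and all of its methods are hypothetical names invented for illustration, and it assumes plain float scores rather than the node/score pairs NodeQueue holds. Pushing elements one at a time sifts each new element up, for O(n log(n)) total work, while bulk heapify (Floyd's bottom-up construction) sifts down from the last internal node and does O(n) total work.

```java
import java.util.Arrays;

// Hypothetical illustration only; this is not the jvector NodeQueue implementation.
public class HeapBuildSketch {

    private static void swap(float[] a, int i, int j) {
        float t = a[i];
        a[i] = a[j];
        a[j] = t;
    }

    // Move the element at index i up until its parent is no larger.
    static void siftUp(float[] heap, int i) {
        while (i > 0) {
            int parent = (i - 1) / 2;
            if (heap[parent] <= heap[i]) return;
            swap(heap, parent, i);
            i = parent;
        }
    }

    // Move the element at index i down until both children are no smaller.
    static void siftDown(float[] heap, int size, int i) {
        while (true) {
            int left = 2 * i + 1;
            int right = left + 1;
            int smallest = i;
            if (left < size && heap[left] < heap[smallest]) smallest = left;
            if (right < size && heap[right] < heap[smallest]) smallest = right;
            if (smallest == i) return;
            swap(heap, i, smallest);
            i = smallest;
        }
    }

    // O(n log n): n single insertions, each followed by a sift-up.
    static float[] buildByPushing(float[] scores) {
        float[] heap = new float[scores.length];
        for (int n = 0; n < scores.length; n++) {
            heap[n] = scores[n];
            siftUp(heap, n);
        }
        return heap;
    }

    // O(n): copy everything, then sift down from the last internal node to the root.
    static float[] buildByHeapify(float[] scores) {
        float[] heap = scores.clone();
        for (int i = heap.length / 2 - 1; i >= 0; i--) {
            siftDown(heap, heap.length, i);
        }
        return heap;
    }

    // True if every element is no smaller than its parent.
    static boolean isMinHeap(float[] heap) {
        for (int i = 1; i < heap.length; i++) {
            if (heap[(i - 1) / 2] > heap[i]) return false;
        }
        return true;
    }

    // Pop every element from a copy of the heap, smallest first.
    static float[] drain(float[] heap) {
        float[] work = heap.clone();
        float[] out = new float[work.length];
        int size = work.length;
        for (int k = 0; k < out.length; k++) {
            out[k] = work[0];
            size--;
            work[0] = work[size];
            siftDown(work, size, 0);
        }
        return out;
    }

    public static void main(String[] args) {
        float[] scores = {0.9f, 0.1f, 0.5f, 0.7f, 0.3f, 0.2f};
        System.out.println(Arrays.toString(drain(buildByPushing(scores))));
        System.out.println(Arrays.toString(drain(buildByHeapify(scores))));
    }
}
```

Both builders produce a valid heap over the same elements, so draining either one yields the same sorted order; only the construction cost differs. The internal array layout of the two heaps may differ, which is fine for a priority queue.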

@michaeljmarshall self-assigned this Apr 11, 2025

Checklist before you submit for review

  • Make sure there is a PR in the CNDB project updating the Converged Cassandra version
  • Use NoSpamLogger for log lines that may appear frequently in the logs
  • Verify test results on Butler
  • Test coverage for new/modified code is > 80%
  • Proper code formatting
  • Proper title for each commit starting with the project-issue number, like CNDB-1234
  • Each commit has a meaningful description
  • Each commit is not very long and contains related changes
  • Renames, moves and reformatting are in distinct commits
  • All new files should contain the DataStax copyright header instead of the Apache License one

@michaeljmarshall (Member, Author) commented:

Looks like I caught a bug in the initial implementation. Fix proposed: datastax/jvector#433.

@michaeljmarshall marked this pull request as draft April 11, 2025 21:05
@michaeljmarshall changed the title from "CNDB-13689: SAI: use NodeQueue::pushAll to decrease time complexity to build heap" to "CNDB-13689: use NodeQueue::pushMany to decrease time complexity to build heap" Apr 18, 2025
@michaeljmarshall marked this pull request as ready for review April 18, 2025 15:34
@cassci-bot commented:

❌ Build ds-cassandra-pr-gate/PR-1693 rejected by Butler


3 new test failure(s) in 1 builds


Found 3 new test failures

Test | Explanation | Branch history | Upstream history
...gLegacyIndex.test_sstableloader_with_failing_2i | regression | 🔴 | 🔵🔵🔵🔵🔵🔵🔵
...testWithRangeTombstoneMarkersWithoutCompression | regression | 🔴 | 🔵🔵🔵🔵🔵🔵🔵
o.a.c.u.b.BinLogTest.testTruncationReleasesLogS... | regression | 🔴 | 🔵🔵🔵🔵🔵🔵🔵

Found 1 known test failures
