[WIP] CNDB-15608 port CASSANDRA-18673 to reduce disk usage of row-aware indexes#2122

Open
k-rus wants to merge 26 commits into main from rf-15608-port-reduce-sai-size
@k-rus k-rus commented Nov 13, 2025

It removes the usage of the Trie for the primary key component, which requires additional adjustments. It also might not really be acceptable.
There are many more small issues to fix; see the original issue https://github.com/riptano/cndb/issues/15608

What is the issue

Fixes https://github.com/riptano/cndb/issues/15608

What does this PR fix and why was it fixed

This port brings in the patch from ASF without many changes or refactoring, i.e., I haven't simplified or refactored the original code coming from Apache, so the complexity of the code remains the same as in the original patch and in CC's relevant code.

The patch implements a new disk format for SAI, which changes how the row-aware primary key map is stored in components. The primary key map is split so that the partition key map and the clustering key map are stored in separate components. Both use the Key Store coming from Apache, which replaces the sorted terms structure of row-aware primary key maps. As a result, primary keys are not stored in an ordered way, and partition keys are stored without the token prefix to allow better compression. Clustering keys are sorted lexicographically within a partition.
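As a rough illustration of the split described above, the following in-memory sketch keeps each partition key once and one clustering key per row. It is not the actual on-disk format or the Key Store API: the class `SplitKeyMapSketch`, its methods, the use of strings for keys, and the explicit row-to-partition mapping are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: names and layout are illustrative, not the patch's code.
public final class SplitKeyMapSketch {
    // One entry per partition, stored without any token prefix.
    private final List<String> partitionKeys = new ArrayList<>();
    // Row ID -> partition ID (assumed here to tie the two stores together).
    private final List<Integer> rowToPartition = new ArrayList<>();
    // One entry per row; expected to be sorted within each partition.
    private final List<String> clusteringKeys = new ArrayList<>();

    /** Appends a row and returns its row ID; rows of one partition must arrive consecutively. */
    public int add(String partitionKey, String clusteringKey) {
        int last = partitionKeys.size() - 1;
        if (last < 0 || !partitionKeys.get(last).equals(partitionKey)) {
            partitionKeys.add(partitionKey); // first row of a new partition
            last++;
        }
        rowToPartition.add(last);
        clusteringKeys.add(clusteringKey);
        return clusteringKeys.size() - 1;
    }

    /** Rebuilds the full primary key of a row from the two separate stores. */
    public String primaryKey(int rowId) {
        return partitionKeys.get(rowToPartition.get(rowId)) + ":" + clusteringKeys.get(rowId);
    }

    /** Number of distinct partition keys stored. */
    public int partitionCount() {
        return partitionKeys.size();
    }
}
```

In this sketch a repeated partition key costs only one small integer per row instead of a full copy of the key, which is the intuition behind storing the two maps separately.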

KeyLookup uses a different structure than SortedTerms. This required implementing ceiling and floor methods in those structures, e.g., in the LongArray implementations. My understanding is that the ceiling and floor methods are used for sorting and ANN. This doesn't exist in Apache.
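For readers unfamiliar with the terms, ceiling and floor lookups over a sorted sequence of longs can be sketched with binary search as below. This is a minimal sketch over a plain `long[]`, not the LongArray interface from the patch; the class `SortedLongs` and the method names `ceilingIndex` and `floorIndex` are hypothetical.

```java
// Hypothetical helper: illustrates ceiling/floor semantics on a sorted long[].
public final class SortedLongs {
    private SortedLongs() {}

    /** Index of the smallest element >= target, or -1 if every element is smaller. */
    public static int ceilingIndex(long[] sorted, long target) {
        int lo = 0, hi = sorted.length; // search window is [lo, hi)
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sorted[mid] < target) lo = mid + 1;
            else hi = mid;
        }
        return lo < sorted.length ? lo : -1;
    }

    /** Index of the largest element <= target, or -1 if every element is larger. */
    public static int floorIndex(long[] sorted, long target) {
        int lo = 0, hi = sorted.length; // search window is [lo, hi)
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sorted[mid] <= target) lo = mid + 1;
            else hi = mid;
        }
        return lo - 1; // lo is the first index with an element > target
    }
}
```

With duplicates present, `ceilingIndex` returns the first matching position and `floorIndex` the last, so the pair brackets a range of equal values.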

Because clustering needs special handling, it was necessary to propagate a flag indicating whether the table has clustering columns, together with the clustering comparator, into the index components and the index descriptor, and to store them there.

Other things got in with the port of the patch:

  • Cleanup on file handle creation failure
  • Replace the hasEmptyClustering methods with hasClustering, so that only hasClustering is used consistently

@k-rus k-rus changed the title [WIP] CNDB-15609 test SAI disk size for all versions [WIP] CNDB-15608 port CASSANDRA-18673 to reduce disk usage of row-aware indexes Nov 14, 2025
@k-rus k-rus force-pushed the rf-15608-port-reduce-sai-size branch 3 times, most recently from 0977619 to a95a292 Compare December 8, 2025 15:43
@sonarqubecloud
Quality Gate failed

Failed conditions
4 New Blocker Issues (required ≤ 1)

@k-rus k-rus force-pushed the rf-15608-port-reduce-sai-size branch 2 times, most recently from d996761 to ecfc957 Compare January 22, 2026 16:41
@k-rus k-rus force-pushed the rf-15608-port-reduce-sai-size branch 2 times, most recently from cb56ff5 to 83d23bb Compare February 6, 2026 08:50

k-rus commented Feb 13, 2026

I ran latte's read_range workflow, and the result doesn't show any degradation in performance:

CYCLE LATENCY for run [ms]  ════════════════════════════════════════════════════════════════════════════════════════════════
                            ───────────── A ─────────────  ────────────── B ────────────     Change     P-value  Signif.
          Min                   2.464 ± 1.426                  2.726 ± 1.367                 +10.6%     0.69396       
           25                   5.603 ± 0.759                  5.587 ± 0.160                  -0.3%     0.95937       
           50                   6.156 ± 0.890                  6.185 ± 0.329                  +0.5%     0.95120       
           75                   6.681 ± 1.701                  6.742 ± 0.251                  +0.9%     0.90385       
           90                   7.799 ± 11.201                 7.569 ± 0.929                  -2.9%     0.94156       
           95                   8.970 ± 24.616                 8.716 ± 2.735                  -2.8%     0.97121       
           98                  18.711 ± 24.999                12.394 ± 9.883                 -33.8%     0.44850       
           99                  23.593 ± 24.999                17.859 ± 11.792                -24.3%     0.49810       
           99.9                35.422 ± 24.999                29.884 ± 27.109                -15.6%     0.61899       
           99.99               61.309 ± 24.999                41.484 ± 27.109                -32.3%     0.07564       
          Max                  87.097 ± 24.999                55.476 ± 27.109                -36.3%     0.00488  *    

A is the base reference performance of the current main
B is the performance on this patch

I also ran another, more complicated workflow; the result there wasn't bad, but it wasn't as positive as this one:

CYCLE LATENCY for run [ms]  ════════════════════════════════════════════════════════════════════════════════════════════════
                            ───────────── A ─────────────  ────────────── B ────────────     Change     P-value  Signif.
          Min                   3.731 ± 1.218                  3.580 ± 1.289                  -4.1%     0.72420       
           25                   6.959 ± 0.615                  6.779 ± 0.358                  -2.6%     0.32725       
           50                   7.561 ± 0.896                  7.139 ± 0.451                  -5.6%     0.15915       
           75                   8.765 ± 3.964                  7.799 ± 2.820                 -11.0%     0.58951       
           90                  13.566 ± 44.548                12.026 ± 9.505                 -11.4%     0.92162       
           95                  29.721 ± 82.163                15.884 ± 47.058                -46.6%     0.61302       
           98                  72.417 ± 79.383                29.229 ± 122.529               -59.6%     0.38187       
           99                  80.019 ± 79.383                77.988 ± 122.389                -2.5%     0.96604       
           99.9               108.069 ± 79.383               134.611 ± 122.389               +24.6%     0.59101       
           99.99              135.135 ± 79.383               181.010 ± 122.389               +33.9%     0.35219       
          Max                 187.826 ± 79.383               252.314 ± 122.389               +34.3%     0.19236   

This workflow is not published yet.

@k-rus k-rus force-pushed the rf-15608-port-reduce-sai-size branch 4 times, most recently from c331117 to 996a373 Compare February 20, 2026 13:17
@k-rus k-rus force-pushed the rf-15608-port-reduce-sai-size branch 3 times, most recently from 497e319 to ff9c875 Compare February 26, 2026 14:41

k-rus commented Mar 3, 2026

I ran this nightly build, and it seems that the failed tests are the same as in main.

@k-rus k-rus force-pushed the rf-15608-port-reduce-sai-size branch 2 times, most recently from 7428b38 to f237404 Compare March 16, 2026 09:16
@k-rus k-rus force-pushed the rf-15608-port-reduce-sai-size branch from a67c8d6 to 471cf20 Compare March 16, 2026 10:34

k-rus commented Mar 18, 2026

The majority of the Sonar warnings come from existing code and were not introduced in this PR. One warning asks to refactor a large method, which originates from Apache's patch.
Thus, there are no plans to address them.
