forked from apache/cassandra
-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CNDB-11613 SAI compressed indexes #1474
Open
pkolaczk
wants to merge
10
commits into
main
Choose a base branch
from
c11613-sai-compression
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
pkolaczk
force-pushed
the
c11613-sai-compression
branch
from
January 20, 2025 15:43
b6c7de2
to
cc3605e
Compare
pkolaczk
force-pushed
the
c11613-sai-compression
branch
from
January 21, 2025 12:09
cc3605e
to
93bbdb4
Compare
What was ported: - current compaction throughput measurement by CompactionManager - exposing current compaction throughput in StorageService and CompactionMetrics - nodetool getcompactionthroughput, including tests Not ported: - changes to `nodetool compactionstats`, because that would require porting also the tests which are currently missing in CC and porting those tests turned out to be a complex task without porting the other changes in the CompactionManager API - Code for getting / setting compaction throughput as double
This commit introduces a new AdaptiveCompressor class. AdaptiveCompressor uses ZStandard compression with a dynamic compression level based on the current write load. AdaptiveCompressor's goal is to provide similar write performance as LZ4Compressor for write heavy workloads, but a significantly better compression ratio for databases with a moderate amount of writes or on systems with a lot of spare CPU power. If the memtable flush queue builds up, and it turns out the compression is a significant bottleneck, then the compression level used for flushing is decreased to gain speed. Similarly, when pending compaction tasks build up, then the compression level used for compaction is decreased. In order to enable adaptive compression: - set `-Dcassandra.default_sstable_compression=adaptive` JVM option to automatically select `AdaptiveCompressor` as the main compressor for flushes and new tables, if not overriden by specific options in cassandra.yaml or table schema - set `flush_compression: adaptive` in cassandra.yaml to enable it for flushing - set `AdaptiveCompressor` in Table options to enable it for compaction Caution: this feature is not turned on by default because it may impact read speed negatively in some rare cases. Fixes riptano/cndb#11532
Reduces some overhead of setting up / tearing down those contexts that happened inside the calls to Zstd.compress / Zstd.decompress. Makes a difference with very small chunks. Additionally, added some compression/decompression rate metrics.
Index compression options have been split into key_compression and value_compression so you can write: CREATE INDEX ON tab(v) WITH key_compression = { 'class': 'LZ4Compressor' } AND value_compression = { 'class': 'LZ4Compressor' }
pkolaczk
force-pushed
the
c11613-sai-compression
branch
from
January 27, 2025 16:27
6969bb1
to
5415fd8
Compare
Quality Gate passedIssues Measures |
❌ Build ds-cassandra-pr-gate/PR-1474 rejected by Butler1 new test failure(s) in 6 builds Found 1 new test failures
Found 232 known test failures |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
What is the issue
SAI Indexes consume too much storage space, sometimes.
What does this PR fix and why was it fixed
This PR allows to compress both the per-sstable and per-index components of SAI.
Use
index_compression
table param to control the per-sstable components compression.Use
compression
property on the index to control the per-index components compression.Checklist before you submit for review
NoSpamLogger
for log lines that may appear frequently in the logs