CASSANDRA-21000 optionally remove dropped columns from SSTable header when upgrading #4474

Open
smiklosovic wants to merge 1 commit into apache:trunk from smiklosovic:CASSANDRA-21000

Conversation

@smiklosovic
Contributor

Thanks for sending a pull request! Here are some tips if you're new here:

  • Ensure you have added or run the appropriate tests for your PR.
  • Be sure to keep the PR description updated to reflect all changes.
  • Write your PR title to summarize what this PR proposes.
  • If possible, provide a concise example to reproduce the issue for a faster review.
  • Read our contributor guidelines
  • If you're making a documentation change, see our guide to documentation contribution

Commit messages should follow the following format:

<One sentence description, usually Jira title or CHANGES.txt summary>

<Optional lengthier description (context on patch)>

patch by <Authors>; reviewed by <Reviewers> for CASSANDRA-#####

Co-authored-by: Name1 <email1>
Co-authored-by: Name2 <email2>

The Cassandra Jira

@smiklosovic force-pushed the CASSANDRA-21000 branch 2 times, most recently from 4686cba to f8f2051 (November 14, 2025 13:29)
@smiklosovic force-pushed the CASSANDRA-21000 branch 4 times, most recently from 6269a1c to cd5dfd0 (January 22, 2026 13:57)
patch by Stefan Miklosovic; reviewed by TBD for CASSANDRA-21000
@blambov self-requested a review (January 23, 2026 08:59)
Contributor

@blambov blambov left a comment

I would prefer a more general solution in which we store, somewhere, the list of columns that are actually present in the sstable. This cannot be the serialization header itself, because we can't write without it, but we could write a second set to use on the next compaction.

However, I'm happy to go with this version as it should work well enough, and the general solution is something we could do in the compaction part of CEP-57.

ReusableLivenessInfo cellLiveness = cellCursor.cellLiveness;
DataOutputBuffer tempCellBuffer = null;

if (!metadata().regularAndStaticColumns().contains(cellCursor.cellColumn))

Does this mean that the cursor path doesn't drop data that is older than the drop time of the column?

If so, I think this is something we need to fix (i.e., in addition to this, adjust the applicable deletion time if the column is dropped and recreated).
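
To make the concern concrete, here is a minimal sketch of the kind of check being asked about. It assumes that a dropped column carries a drop timestamp and that a cell written at or before that timestamp should be discarded. The class, the method names and the NO_DROP sentinel are hypothetical illustrations, not code from the patch or from Cassandra's API.

```java
final class DroppedColumnPurge
{
    // Hypothetical sentinel meaning "this column was never dropped".
    static final long NO_DROP = Long.MIN_VALUE;

    // Assumed semantics: a cell written at or before the column's
    // drop time belongs to the dropped incarnation and is discarded.
    static boolean isPurgedByDrop(long cellTimestamp, long droppedTime)
    {
        return droppedTime != NO_DROP && cellTimestamp <= droppedTime;
    }

    // For a column that was dropped and later recreated, one possible
    // reading of "adjust the applicable deletion time" is to take the
    // newer of the existing deletion time and the drop time.
    static long effectiveDeletionTime(long deletionTime, long droppedTime)
    {
        return Math.max(deletionTime, droppedTime);
    }
}
```

With this reading, a recreated column keeps only cells written after the drop, because the drop time acts as a floor on the deletion time applied to that column.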

We should pre-compute the intersection of the table's dropped columns with the ones in this file to avoid performing this contains lookup unnecessarily (most compactions should have an empty intersection).
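
A rough sketch of the suggested pre-computation follows. It uses plain column names in place of Cassandra's column metadata types. The class and both helper methods are hypothetical and only illustrate the idea of computing the intersection once per file and short-circuiting the per-cell check when it is empty.

```java
import java.util.HashSet;
import java.util.Set;

final class DroppedColumnIntersection
{
    // Computed once per input file: the table's dropped columns that
    // actually appear in this file's header (hypothetical helper).
    static Set<String> droppedInFile(Set<String> tableDroppedColumns, Set<String> fileColumns)
    {
        Set<String> result = new HashSet<>(tableDroppedColumns);
        result.retainAll(fileColumns);
        return result;
    }

    // Per-cell check: when the intersection is empty, which should be
    // the common case, no lookup is performed at all.
    static boolean shouldSkipCell(Set<String> droppedInFile, String cellColumn)
    {
        return !droppedInFile.isEmpty() && droppedInFile.contains(cellColumn);
    }
}
```

In a real compaction the emptiness test could be hoisted out of the cell loop entirely, so that files with no dropped columns take a path with no per-cell check.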
