Skip to content

CNDB-17010: Fix CC4→CC5 memtable configuration loss during upgrade#2261

Open
djatnieks wants to merge 4 commits intomain-5.0from
CNDB-17010
Open

CNDB-17010: Fix CC4→CC5 memtable configuration loss during upgrade#2261
djatnieks wants to merge 4 commits intomain-5.0from
CNDB-17010

Conversation

@djatnieks
Copy link
Member

What is the issue

CNDB-17010

What does this PR fix and why was it fixed

CC4 stored the memtable column in system_schema.tables as frozen<map<text, text>>, while CC5 uses text. During upgrades, binary-serialized map data is misinterpreted as UTF-8 text, causing memtable configurations to fall back to defaults.

Changes:

  • Add MemtableParams.getWithCC4Fallback() to detect and parse binary map data using null-byte heuristic
  • Add mapCC4ClassNameToCC5Key() to map CC4 class names (TrieMemtable, SkipListMemtable) to CC5 config keys
  • Update SchemaKeyspace.createTableParamsFromRow() to use new compatibility method

CC4 stored the memtable column in system_schema.tables as
frozen<map<text, text>>, while CC5 uses text. During upgrades,
binary-serialized map data is misinterpreted as UTF-8 text,
causing memtable configurations to fall back to defaults.

Changes:
- Add MemtableParams.getWithCC4Fallback() to detect and parse
  binary map data using null-byte heuristic
- Add mapCC4ClassNameToCC5Key() to map CC4 class names
  (TrieMemtable, SkipListMemtable) to CC5 config keys
- Update SchemaKeyspace.createTableParamsFromRow() to use
  new compatibility method
@github-actions
Copy link

github-actions bot commented Mar 7, 2026

Checklist before you submit for review

  • This PR adheres to the Definition of Done
  • Make sure there is a PR in the CNDB project updating the Converged Cassandra version
  • Use NoSpamLogger for log lines that may appear frequently in the logs
  • Verify test results on Butler
  • Test coverage for new/modified code is > 80%
  • Proper code formatting
  • Proper title for each commit staring with the project-issue number, like CNDB-1234
  • Each commit has a meaningful description
  • Each commit is not very long and contains related changes
  • Renames, moves and reformatting are in distinct commits
  • All new files should contain the DataStax copyright header instead of the Apache License one

@djatnieks djatnieks requested a review from driftx March 9, 2026 17:03
@djatnieks djatnieks marked this pull request as ready for review March 9, 2026 17:12
@djatnieks
Copy link
Member Author

I think that tables created with CC5 and the text memtable column could cause problems if that node is then downgraded back to CC4. I don't think the SCM setting handles this and not sure if it should.

@driftx
Copy link

driftx commented Mar 10, 2026

I think we could do this with SCM at the schema level rather than at read time, such that we maintain the CC4 map approach when SCM is < 5 and then upgrade it away from the map when SCM is changed. This would be similar (but more involved) to the way we delay other schema changes with SCM now, and allows us to avoid the byte sniffing to detect the correct format. WDYT?

@djatnieks
Copy link
Member Author

I think we could do this with SCM at the schema level rather than at read time, such that we maintain the CC4 map approach when SCM is < 5 and then upgrade it away from the map when SCM is changed. This would be similar (but more involved) to the way we delay other schema changes with SCM now, and allows us to avoid the byte sniffing to detect the correct format. WDYT?

Yes, that would be much better - and resolve the downgrade concern. I overlooked existing use of SCM to delay schema changes initially, but see them there now and give that at try.

CC5 changed the memtable column in system_schema.tables/views from
frozen<map<text, text>> (CC4) to text (CC5). This prevents safe
downgrades from CC5 to CC4 when storage_compatibility_mode is set.

This commit adds bidirectional compatibility:
- Reading: Previous commit (abdbcf1) handles CC4→CC5 upgrades
- Writing: This commit handles CC5→CC4 downgrades
@sonarqubecloud
Copy link

Quality Gate Failed Quality Gate failed

Failed conditions
73.4% Coverage on New Code (required ≥ 80%)

See analysis details on SonarQube Cloud

@cassci-bot
Copy link

❌ Build ds-cassandra-pr-gate/PR-2261 rejected by Butler


7 regressions found
See build details here


Found 7 new test failures

Test Explanation Runs Upstream
o.a.c.cql3.SimpleQueryTest.test2ndaryIndexBug (compression) REGRESSION 🔵🔴 0 / 21
o.a.c.distributed.test.hostreplacement.HostReplacementTest.replaceAliveHost REGRESSION 🔴🔵 0 / 21
o.a.c.distributed.test.hostreplacement.HostReplacementTest.replaceDownedHost REGRESSION 🔴🔵 0 / 21
o.a.c.distributed.test.hostreplacement.HostReplacementTest.retryingFailedReplaceWithNodeInHibernateState REGRESSION 🔴🔵 0 / 21
o.a.c.distributed.test.hostreplacement.HostReplacementTest.seedGoesDownBeforeDownHost REGRESSION 🔴🔵 0 / 21
o.a.c.distributed.test.repair.ForceRepairTest.terminated successfully () NEW 🔴 4 / 21
o.a.c.index.sai.cql.VectorCompaction100dTest.testZeroOrOneToManyCompaction[db true] () NEW 🔴 0 / 21

Found 3 known test failures

@djatnieks djatnieks requested a review from driftx March 12, 2026 01:10

// Validate and map the configuration key to CC4 class name
String className = mapCC5KeyToCC4ClassName(configurationKey);
return ImmutableMap.of("class", className);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aren't we losing the memtable parameters if we just return the class?

private static final List<TableMetadata> ALL_TABLE_METADATA = ImmutableList.of(
Keyspaces,
// Use legacy schema (frozen<map>) when in CC_4 compatibility mode to support downgrade
DatabaseDescriptor.getStorageCompatibilityMode().isBefore(CassandraVersion.CASSANDRA_5_0.major) ? TablesLegacy : Tables,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do this in static init I don't think we'll be able to change SCM later. Maybe this won't get accessed again but I'm not certain.

return "PersistentMemoryMemtable";
default:
// For unknown types, capitalize first letter
return configKey.substring(0, 1).toUpperCase() + configKey.substring(1) + "Memtable";
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Considering 'TrieMemtableStage1' doesn't end in 'Memtable', maybe we should pull the definition from CONFIGURATION_DEFINITIONS and get the class_name from that instead of synthesizing?

Replace hardcoded switch statement with dynamic lookup from
CONFIGURATION_DEFINITIONS. Get actual class name from
definition.class_name instead of synthesizing it.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants