@michaelsembwever
Member

https://github.com/riptano/cndb/issues/15582

Port into main-5.0 commit bc9628b

CNDB-15362: Replace SASI by a dummy implementation (https://github.com/datastax/cassandra/pull/2013)
Replace SASI by a no-op dummy implementation that
ignores writes and rejects reads. It's still allowed to create new SASI
indexes so old schemas keep working, for example when restoring a
snapshot with SASI. However, they will be no-ops with a clear client
warning about SASI being neither supported nor functional anymore.

SASI has never been supported, so it's mostly dead code. This would
allow us to get rid of code that we are still partially maintaining,
spending resources on CI, build, etc.

This has already been done in DSE 6.9:
https://github.com/riptano/bdp/commit/358767f8ae21c91b8241bc26e98cf7f922daff8b

@github-actions

Checklist before you submit for review

  • This PR adheres to the Definition of Done
  • Make sure there is a PR in the CNDB project updating the Converged Cassandra version
  • Use NoSpamLogger for log lines that may appear frequently in the logs
  • Verify test results on Butler
  • Test coverage for new/modified code is > 80%
  • Proper code formatting
  • Proper title for each commit starting with the project-issue number, like CNDB-1234
  • Each commit has a meaningful description
  • Each commit is not very long and contains related changes
  • Renames, moves and reformatting are in distinct commits
  • All new files should contain the DataStax copyright header instead of the Apache License one

@michaelsembwever michaelsembwever force-pushed the mck-cndb-15582-main-5.0 branch 2 times, most recently from e1b7575 to 88d284d Compare October 19, 2025 20:37
@michaelsembwever
Member Author

michaelsembwever commented Oct 20, 2025

opened a discussion about SASI removal in CC4/CC5 and how it impacts Cassandra upgrades to HCD.
ref: https://datastax.slack.com/archives/C025YAXGFPA/p1760940710163699

otherwise this PR is ready for review, and tests look good.

@michaelsembwever michaelsembwever force-pushed the mck-cndb-15582-main-5.0 branch 3 times, most recently from e945b88 to 218a291 Compare October 22, 2025 15:59
@michaelsembwever
Member Author

michaelsembwever commented Oct 22, 2025

Note the change from main as discussed here: https://datastax.slack.com/archives/C025YAXGFPA/p1761145700669459?thread_ts=1760940710.163699&cid=C025YAXGFPA

The SASIIndex code has been removed rather than replaced by a no-op dummy. If sasi_indexes_enabled is true, a StartupCheck prevents the node from starting (this does not apply within unit tests).
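
For illustration, a minimal sketch of what such a fail-fast startup check could look like. This is not the code from this PR: the class name, the exception type, the map-based view of the configuration, and the assumption that sasi_indexes_enabled defaults to false are all hypothetical stand-ins.

```java
import java.util.Map;

/** Hypothetical sketch of a fail-fast startup check; not the actual StartupChecks code. */
final class SasiStartupCheckSketch
{
    /** Thrown to abort startup; a stand-in for whatever exception the real startup checks use. */
    static final class StartupAbortedException extends RuntimeException
    {
        StartupAbortedException(String message)
        {
            super(message);
        }
    }

    /**
     * @param config       flattened view of the node configuration (assumed shape)
     * @param runningTests true inside unit tests, where the check is skipped
     */
    static void checkSasiDisabled(Map<String, String> config, boolean runningTests)
    {
        if (runningTests)
            return;

        // Assumption: the setting is treated as false when it is absent.
        boolean sasiEnabled = Boolean.parseBoolean(config.getOrDefault("sasi_indexes_enabled", "false"));
        if (sasiEnabled)
            throw new StartupAbortedException("sasi_indexes_enabled is true, but SASI has been removed; "
                                              + "set it to false before starting this node");
    }

    public static void main(String[] args)
    {
        checkSasiDisabled(Map.of("sasi_indexes_enabled", "false"), false); // passes silently
        try
        {
            checkSasiDisabled(Map.of("sasi_indexes_enabled", "true"), false);
        }
        catch (StartupAbortedException e)
        {
            System.out.println("startup aborted: " + e.getMessage());
        }
    }
}
```

The point of failing at startup is that the operator sees a single, explicit reason before the node joins the cluster.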

@michaelsembwever michaelsembwever force-pushed the mck-cndb-15582-main-5.0 branch 4 times, most recently from 236c15e to 2b18c72 Compare October 29, 2025 12:00
Member

@JeremiahDJordan JeremiahDJordan left a comment

This way is fine to me as well. A couple suggested updates to the log messages.

Can we add tests that run with the property set and unset, try to create SASI indexes, and check for the correct errors (or non-errors)?

Also, ideally we have an upgrade test that shows startup failing with the custom index in the schema, and not failing with the property set.

@michaelsembwever
Member Author

Can we add tests that run with the property set and unset, try to create SASI indexes, and check for the correct errors (or non-errors)?

Also, ideally we have an upgrade test that shows startup failing with the custom index in the schema, and not failing with the property set.

Absolutely (and that was my intent, sorry for not being clear about that :)

@michaelsembwever
Member Author

squash commit updated. just upgrade test remaining.

Member

@JeremiahDJordan JeremiahDJordan left a comment

LGTM. I would let @adelapena give it a look as well.

@michaelsembwever
Member Author

michaelsembwever commented Oct 31, 2025

upgrade test added in IndexUnknownIgnoreTest https://github.com/datastax/cassandra/pull/2074/files#diff-aaa7544ec9512536398b3aa7f39ea57170692f5b6d3f197d60f09e41b6100bde

There's a separate problem here that's impacting all main-5.0 jvm-dtest-upgrade, because it builds 5.0.4.0 while the apache/cassandra cassandra-5.0 branch builds 5.0.7. Many of the jvm-dtest upgrade tests only perform upgrades to the "latest", and that's the dtest-5.0.7.jar while the dtest-5.0.4.0.jar gets ignored.

I don't know where/how CC runs the jvm-dtest-upgrade testsuite, but when doing it like

.build/run-tests.sh -a jvm-dtest-upgrade -t IndexUnknownIgnoreTest

then .build/run-tests.sh:202 needs to change to (i.e. "cassandra-5.0" needs to be removed)

for branch in cassandra-4.0 cassandra-4.1 ; do

I'll put this in as a separate ticket and pr. (EDIT: https://github.com/riptano/cndb/issues/15885 )


@adelapena adelapena left a comment

The alternative way to deal with SASI removal works for me.

I would however add a more specific message than Unable to find custom indexer class 'org.apache.cassandra.index.sasi.SASIIndex', and would probably explicitly mention that SASI has been removed if the class is SASI, rather than acting as if we don't know what SASI is.

I think we also need to emit a client warning when a client tries to create a SASI index and it gets ignored, so users and client applications don't miss it. I would add the original SASIIndexTest from CNDB-15362 and modify it to test the behaviour of creating, querying and dropping a SASI index with the two possible values of the new INDEX_UNKNOWN_IGNORE flag.

Comment on lines 222 to 227
if (isUnknownCustomIndexCreateStatement(className) && INDEX_UNKNOWN_IGNORE.getBoolean())
{
    logger.error("Cannot find index type {}, but '{}' is true so ignoring index {} creation",
                 className, INDEX_UNKNOWN_IGNORE.getKey(), indexName);
    return schema;
}

A CQL client trying to create a SASI index will get a generic error. I think it would be better to check if the class name is SASI and throw a specific message about SASI removal.

More importantly, if the flag to ignore unknown index classes is set, a CQL client won't receive any warning about the statement being ignored. I think we should throw a client warning too so clients get notified.

We should add a test extending CQLTester to verify the behaviour of attempting to create SASI indexes in both cases (ignored and rejected).
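
To make the two suggestions concrete, here is a minimal, self-contained sketch of the decision logic being proposed: a SASI-specific message instead of the generic one, and a client warning when creation is ignored. It is not the CreateIndexStatement code. The class, the method signature, the set of known classes, the warnings list, and the wording of the SASI message are hypothetical; only the generic "Unable to find custom indexer class" text and the SASI class name come from this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of handling CREATE CUSTOM INDEX for an unknown class; not the PR's code. */
final class UnknownCustomIndexSketch
{
    static final String SASI_CLASS = "org.apache.cassandra.index.sasi.SASIIndex";

    /**
     * @param knownClasses   stand-in for "classes that can actually be loaded"
     * @param ignoreUnknown  stand-in for the INDEX_UNKNOWN_IGNORE flag
     * @param clientWarnings stand-in for the warnings returned to the CQL client
     * @return true if the index is created, false if its creation is ignored
     * @throws IllegalArgumentException if the class is unknown and the flag is not set
     */
    static boolean createCustomIndex(String className, String indexName, Set<String> knownClasses,
                                     boolean ignoreUnknown, List<String> clientWarnings)
    {
        if (knownClasses.contains(className))
            return true;

        // Say explicitly that SASI was removed instead of pretending not to know what it is.
        String reason = SASI_CLASS.equals(className)
                        ? "SASI indexes have been removed and are no longer supported"
                        : String.format("Unable to find custom indexer class '%s'", className);

        if (!ignoreUnknown)
            throw new IllegalArgumentException(reason);

        // Ignored, but the client is told about it.
        clientWarnings.add(String.format("%s; ignoring creation of index %s", reason, indexName));
        return false;
    }

    public static void main(String[] args)
    {
        List<String> warnings = new ArrayList<>();
        boolean created = createCustomIndex(SASI_CLASS, "sasi_idx", Set.of(), true, warnings);
        System.out.println("created=" + created + ", warnings=" + warnings);
    }
}
```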

Member Author

We should add a test extending CQLTester to verify the behaviour of attempting to create SASI indexes in both cases (ignored and rejected).

already exists in CreateIndexStatementTest

I think we should throw a client warning too so clients get notified.

done.

I don't think this conversation is resolved. My first suggestion about this was to throw a specific message about SASI removal.

Member Author

Is that really needed?

It should be obvious, given the operator has had to already set the flag (to load a schema that's otherwise causing errors). That SASI has been removed really is intuitive enough at that point IMO. This is not going to be an error message that the operator suddenly comes across.

Member Author

also note, this will be properly documented in the HCD docs.

Member Author

@michaelsembwever michaelsembwever Nov 10, 2025

I've added the proposed change, applying the ignore flag only when restoring CQL schemas, in commit 3e34934:

SQUASH – don't apply INDEX_UNKNOWN_IGNORE flag to upgrades

ignoring unknown custom indexes (e.g. upgrades) found in system_schema.indexes will not remove them from system_schema.indexes
being able to restore system_schema tables across major versions or products is by default not supported, and if we want to support this we should do it in a separate ticket and for all and any unknown custom indexes.

keep the ability to ignore unknown custom indexes when applying a schema (i.e. cql DDL statements).

Member

Blocking the node startup and requesting to downgrade and drop the index in the old version, without a workaround

My preference would be that we allow the upgrade to continue, and the user can drop the index after upgrading. Failing startup is often painful all around for everyone, including our support team, when the end user can't figure out in the tons of logs we spit out which one is telling them why startup failed.

I think just ignoring the index without loading it, and not having a way for the user to drop it, is no good.

@adelapena's suggestion to use a dummy implementation when the flag is set seems reasonable to me; it would mean the index could load such that it could then be later dropped.

and to repeat: we should align the code and the UX to how we have handled removing dse custom indexes.

I agree we should treat all the removed custom indexes the same. But I am perfectly happy to change how the dse ones are handled as well to make it match what we decide is the right thing for SASI removal.

Member Author

@michaelsembwever michaelsembwever Nov 10, 2025

I think just ignoring the index without loading it, and not having a way for the user to drop it, is no good.

In what situation would this be now ?

We don't support loading system_schema data across major versions or products. This is likely going to break anyway (for other system_schema incompatibilities/changes)?

I don't see a situation where unexpectedly bringing down a cluster during an upgrade is acceptable, because an index is unexpectedly now doing full table scans, so long as there are ways to get nodes started…

Member Author

@michaelsembwever michaelsembwever Nov 11, 2025

I'll add the dummy implementation for when the ignore flag is used and we hit an unknown index in system_schema.indexes.

But FTR, I'm not sold this is a valid operator scenario, and if it is, it's only because we've failed to document and guardrail how HCD/MC is used (and which cross-version and cross-product backups are supported).

Member Author

@michaelsembwever michaelsembwever Nov 11, 2025

the last commit is replaced accordingly.

SQUASH – INDEX_UNKNOWN_IGNORE on upgrades uses NoopIndex

ignoring unknown custom indexes (e.g. upgrades) found in system_schema.indexes will not remove them from system_schema.indexes
while being able to restore system_schema tables across major versions or products is by default not supported, we want a safety catch to get nodes up and running first, so as to be able to drop the unknown index afterwards.

the use of INDEX_UNKNOWN_IGNORE in SecondaryIndexManager (when reading system_schema.indexes) differs from its use in CreateIndexStatement (executing cql CREATE INDEX…) in that it creates the index, but of type NoopIndex, rather than silently ignoring its creation.

I've created the dummy impl separately, in NoopIndex. This makes future rebases simpler. It is used in SecondaryIndexManager for any unidentified custom index when -Dcassandra.index.unknown_custom_class.ignore=true (INDEX_UNKNOWN_IGNORE) has also been set.

I don't believe the custom index class not matching the instantiated class is an issue, letting tests run first.
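
As a rough illustration of the fallback described in this comment, the sketch below loads an index class by name and substitutes a no-op index when the class cannot be found and the ignore flag is set. It is not the SecondaryIndexManager or NoopIndex code from the PR: the Index interface, its methods, DemoIndex, and the load signature are hypothetical stand-ins chosen to keep the example self-contained.

```java
/** Hypothetical sketch of falling back to a no-op index; not the actual SecondaryIndexManager code. */
final class NoopIndexFallbackSketch
{
    /** Minimal stand-in for an index abstraction (assumed shape, for illustration only). */
    interface Index
    {
        String name();

        boolean isQueryable();
    }

    /** Loads like any other index, so it can be dropped later, but never serves queries. */
    static final class NoopIndex implements Index
    {
        private final String name;

        NoopIndex(String name)
        {
            this.name = name;
        }

        public String name()
        {
            return name;
        }

        public boolean isQueryable()
        {
            return false;
        }
    }

    /** A trivial "known" index class, used only to show the normal path. */
    static final class DemoIndex implements Index
    {
        private final String name;

        DemoIndex(String name)
        {
            this.name = name;
        }

        public String name()
        {
            return name;
        }

        public boolean isQueryable()
        {
            return true;
        }
    }

    /** Instantiates the named class, or a NoopIndex if it is missing and ignoreUnknown is true. */
    static Index load(String indexName, String className, boolean ignoreUnknown)
    {
        try
        {
            Class<?> clazz = Class.forName(className);
            return (Index) clazz.getDeclaredConstructor(String.class).newInstance(indexName);
        }
        catch (ClassNotFoundException e)
        {
            if (!ignoreUnknown)
                throw new IllegalStateException("Unable to find custom indexer class '" + className + "'", e);
            // The schema entry still names the missing class; only the loaded instance is the no-op.
            return new NoopIndex(indexName);
        }
        catch (ReflectiveOperationException e)
        {
            throw new IllegalStateException("Unable to instantiate index class '" + className + "'", e);
        }
    }

    public static void main(String[] args)
    {
        Index index = load("sasi_idx", "org.apache.cassandra.index.sasi.SASIIndex", true);
        System.out.println(index.name() + " loaded as " + index.getClass().getSimpleName());
    }
}
```

Because the index is loaded under its original name, it can be dropped with an ordinary DROP INDEX once the node is up, which is the scenario the fallback is meant to enable.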

adelapena and others added 4 commits November 4, 2025 15:21
SASI has never been supported, so it's mostly dead code. This would
allow us to get rid of code that we are still partially maintaining,
spending resources on CI, build, etc.

This has already been done in DSE 6.9:
riptano/bdp@358767f

Rebase notes:
- previously a dummy noop impl existed, now it's altogether removed. this means the operator must first adjust the schema (drop the index) before running a node on this version. this provides clearer UX about the subsequent lack of behaviour. a node won't even start (StartupChecks) if sasi_indexes_enabled is true.
…schema (or create index statement)

- fails-fast by default: tells the operator there's no index and things are not going to work, or work in unexpected ways (that could impact cluster availability),
- permits an override so the custom index in a schema can be silently ignored, using a sys property -Dcassandra.index.unknown_custom_class.ignore=true ,
- this is also a similar approach to how we handle the unknown dse index classes
@adelapena

A curious effect of ignoring old/unknown indexes is that I think there will still be an entry for them in system_schema.indexes even if they are not loaded. Not sure what effect that mismatch between the system table and what is actually loaded will have. Anyway, we will keep seeing the warning on every node startup until the index is removed, and any other side effect of keeping the index on the system table will remain until that entry is cleared and the index is actually removed.

Can we add a check on the upgrade test to see what the system schema table contains (SELECT * FROM system_schema.indexes WHERE keyspace_name = 'distributed_test_keyspace' AND table_name = 'tbl'), and verify that we can clear it with DROP INDEX after the upgrade?

@michaelsembwever
Member Author

michaelsembwever commented Nov 4, 2025

A curious effect of ignoring old/unknown indexes is that I think there will still be an entry for them in system_schema.indexes even if they are not loaded.

This problem exists for the dse custom indexes too… I think we can and should follow this up separately. (actually it doesn't…)

A dse custom index currently (in main-5.0) stops the first node from starting. And the operator is forced to drop it before trying to upgrade the first node again. This remains my recommended approach for both upgrades and backup restores for any index no longer supported.

This changed in this PR when we added the flag to ignore. (And yes, the potentially problematic row stays in system_schema.indexes.)

I think now the right thing to do is: the ignore flag should only apply to explicit DDL, not to upgrades (and what's in system_schema.indexes already).
This forces the recommended solution on the operator: the first node fails, so they drop the index and continue with the upgrade.

If we have users that want indexes to randomly work during an upgrade we can tackle that later IMHO.

@michaelsembwever michaelsembwever force-pushed the mck-cndb-15582-main-5.0 branch 3 times, most recently from ffc7aa5 to 3e61c19 Compare November 11, 2025 15:14

@adelapena adelapena left a comment

Overall looks good to me. I have left only a few minor suggestions. I think the only loose end would be deciding the default value of cassandra.index.unknown_custom_class.ignore.

INDEX_SUMMARY_EXPECTED_KEY_SIZE("cassandra.index_summary_expected_key_size", "64"),
/** Set to true for `create custom index` cql statements on unknown index classes to be ignored rather than error,
* and for existing entries in `system_schema.index` of unknown index classes to silently use the NoopIndex. */
INDEX_UNKNOWN_IGNORE("cassandra.index.unknown_custom_class.ignore"),

@JeremiahDJordan what do you think should be the default for this? Fail the node startup and reject index creation queries on SASI by default, or ignore them?
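
For reference, a small sketch of what a fail-fast default means for a boolean system property: the flag is false unless it is explicitly set to true. This uses plain java.lang.System rather than the CassandraRelevantProperties enum shown above, whose internals are not reproduced here; the class and method names are hypothetical.

```java
/** Hypothetical sketch of reading the ignore flag with a fail-fast default; not the PR's code. */
final class IgnoreFlagDefaultSketch
{
    static final String KEY = "cassandra.index.unknown_custom_class.ignore";

    /** Returns false (fail fast) unless the property is explicitly set to "true". */
    static boolean ignoreUnknownCustomIndexes()
    {
        return Boolean.parseBoolean(System.getProperty(KEY, "false"));
    }

    public static void main(String[] args)
    {
        // Run with -Dcassandra.index.unknown_custom_class.ignore=true to flip the result.
        System.out.println(KEY + " = " + ignoreUnknownCustomIndexes());
    }
}
```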

@sonarqubecloud

Quality Gate failed

Failed conditions
75.9% Coverage on New Code (required ≥ 80%)

See analysis details on SonarQube Cloud

@cassci-bot

❌ Build ds-cassandra-pr-gate/PR-2074 rejected by Butler


34 regressions found
See build details here


Found 34 new test failures

Showing only first 15 new test failures

| Test | Explanation | Runs | Upstream |
| --- | --- | --- | --- |
| o.a.c.cql3.validation.entities.SecondaryIndexTest.testAllowFilteringOnPartitionKeyWithSecondaryIndex (compression) | REGRESSION | 🔵🔴 | 0 / 17 |
| o.a.c.cql3.validation.operations.AggregationQueriesTest.testAggregationQueryShouldNotTimeoutWhenItExceedesReadTimeout (compression) | REGRESSION | 🔴🔴 | 2 / 17 |
| o.a.c.cql3.validation.operations.SelectMultiColumnRelationTest.testMultipleClusteringWithIndex (compression) | REGRESSION | 🔵🔴 | 0 / 17 |
| o.a.c.cql3.validation.operations.SelectMultiColumnRelationTest.testMultiplePartitionKeyAndMultiClusteringWithIndex (compression) | REGRESSION | 🔵🔴 | 0 / 17 |
| o.a.c.cql3.validation.operations.SelectSingleColumnRelationTest.testCompositeIndexWithPrimaryKey (compression) | REGRESSION | 🔵🔴 | 0 / 17 |
| o.a.c.cql3.validation.operations.SelectSingleColumnRelationTest.testMultiplePartitionKeyWithIndex (compression) | REGRESSION | 🔵🔴 | 0 / 17 |
| o.a.c.cql3.validation.operations.SelectSingleColumnRelationTest.testRangeQueryOnIndex (compression) | REGRESSION | 🔵🔴 | 0 / 17 |
| o.a.c.cql3.validation.operations.SelectTest.testListContainsWithIndex (compression) | REGRESSION | 🔵🔴 | 0 / 17 |
| o.a.c.cql3.validation.operations.SelectTest.testMapKeyContainsWithIndex (compression) | REGRESSION | 🔵🔴 | 0 / 17 |
| o.a.c.cql3.validation.operations.SelectTest.testMapValueContainsWithIndex (compression) | REGRESSION | 🔵🔴 | 0 / 17 |
| o.a.c.db.filter.IndexHintsTest.testDuplicatedHints (compression) | REGRESSION | 🔵🔴 | 0 / 17 |
| o.a.c.db.filter.IndexHintsTest.testLegacy (compression) | REGRESSION | 🔵🔴 | 0 / 17 |
| o.a.c.db.filter.IndexHintsTest.testLegacyIndexWithoutAllowFiltering (compression) | REGRESSION | 🔵🔴 | 0 / 17 |
| o.a.c.db.filter.IndexHintsTest.testSAIWithoutAllowFiltering (compression) | REGRESSION | 🔵🔴 | 0 / 17 |
| o.a.c.distributed.test.jmx.JMXFeatureTest.testOneNetworkInterfaceProvisioning | REGRESSION | 🔴🔵 | 0 / 17 |

Found 6 known test failures
