HCD-237: UCS settings json file cleanup fails #2145
UCS settings files are not deleted when their table is dropped; instead, they are supposed to be cleaned up on the next node restart. That cleanup is faulty, though, and prevents the node from starting. Root Cause: The cleanupControllerConfig() method in CompactionManager checks whether a table exists by calling getColumnFamilyStore(). When the table has been dropped, this method throws IllegalArgumentException, which was not being caught. The existing catch block only handled NullPointerException (for a missing keyspace). Fix: Extended the exception handler to catch both NullPointerException and IllegalArgumentException, allowing orphaned controller-config.JSON files to be properly identified and deleted during node restart.
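The shape of the fix can be illustrated with a minimal, self-contained sketch. It does not reproduce the actual CompactionManager code: `OrphanedConfigCleanup`, `SCHEMA`, `lookupTable` and `isOrphaned` are hypothetical names, and `lookupTable` only imitates the behaviour described above (NullPointerException for a missing keyspace, IllegalArgumentException for a dropped table).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

final class OrphanedConfigCleanup {
    // Hypothetical stand-in for the schema: keyspace name -> names of its live tables.
    static final Map<String, Set<String>> SCHEMA = new HashMap<>();

    // Imitates the lookup behaviour described in the PR (assumption, not the real API):
    // a missing keyspace surfaces as NullPointerException,
    // a dropped table surfaces as IllegalArgumentException.
    static String lookupTable(String keyspace, String table) {
        Set<String> tables = SCHEMA.get(keyspace);
        if (!tables.contains(table)) { // throws NullPointerException when the keyspace is unknown
            throw new IllegalArgumentException("Unknown table " + keyspace + "." + table);
        }
        return keyspace + "." + table;
    }

    // A settings file is treated as orphaned when its table can no longer be resolved.
    // Catching only NullPointerException here would let the dropped-table case
    // propagate and abort startup, which is the failure this PR describes.
    static boolean isOrphaned(String keyspace, String table) {
        try {
            lookupTable(keyspace, table);
            return false;
        } catch (NullPointerException | IllegalArgumentException e) {
            return true;
        }
    }
}
```

With `SCHEMA` containing a keyspace `ks` that holds a single table `live`, `isOrphaned("ks", "live")` returns false, while both `isOrphaned("ks", "dropped")` and `isOrphaned("missing_ks", "t")` return true, so either kind of leftover file would be selected for deletion.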
There's something I am missing here. Dropping a table makes a node never start again? That makes no sense. What am I missing? What is the actual bug we're trying to fix?
To be precise, that's dropping a table with UCS. The test showcases the steps required; the only unusual step it takes is saving the UCS settings on demand rather than waiting for them to be saved periodically by the background thread (https://github.com/datastax/cassandra/blob/main/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L180).
Did you notice there are missing JSON file exceptions in the CNDB CI run?
Those look like a different sort of file. To get more data, I have triggered CNDB CI again.
The clause ensures we get a meaningful error message; to maintain the existing behaviour, we rethrow the exception.
My CNDB CI has been cancelled 😭, I will try again.
e0d2bcc to bead4ff
❌ Build ds-cassandra-pr-gate/PR-2145 rejected by Butler
2 regressions found
Found 2 new test failures
Found 4 known test failures
To satisfy SonarQube, I had to add the following test: bead4ff.