HCD-237: UCS settings json file cleanup fails #2145
UCS settings files are not deleted when their table is dropped. Instead, they are supposed to be cleaned up on the next node restart. That cleanup is faulty, though, and it prevents the node from starting.
Root Cause: The cleanupControllerConfig() method in CompactionManager checks whether a table exists by calling getColumnFamilyStore(). When the table has been dropped, this call throws IllegalArgumentException, which was not being caught. The existing catch block only handled NullPointerException (for a missing keyspace).
Fix: Extended the exception handler to catch both NullPointerException and IllegalArgumentException, so that orphaned controller-config.JSON files are properly identified and deleted during node restart.
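The root cause and fix described above can be illustrated with a minimal, self-contained sketch. This is not the actual CompactionManager code: the class `OrphanedConfigCleanup`, the in-memory `schema` map, and the `<keyspace>.<table>-controller-config.JSON` file naming are all assumptions made for illustration. Only the two failure modes (NullPointerException for a missing keyspace, IllegalArgumentException for a dropped table) and the multi-catch come from the description.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

// Hypothetical stand-in for the cleanup logic; not the real CompactionManager.
class OrphanedConfigCleanup {
    // Assumed file suffix, used here only to parse the illustrative file names.
    static final String SUFFIX = "-controller-config.JSON";

    // Hypothetical in-memory schema: keyspace name -> table names.
    private final Map<String, Set<String>> schema = new HashMap<>();

    OrphanedConfigCleanup(Map<String, Set<String>> schema) {
        this.schema.putAll(schema);
    }

    // Mimics the two failure modes from the description:
    // NullPointerException for a missing keyspace,
    // IllegalArgumentException for a dropped (unknown) table.
    Object getColumnFamilyStore(String keyspace, String table) {
        Set<String> tables = Objects.requireNonNull(schema.get(keyspace), "Unknown keyspace " + keyspace);
        if (!tables.contains(table))
            throw new IllegalArgumentException("Unknown table " + keyspace + "." + table);
        return new Object();
    }

    // A config is orphaned when either the keyspace or the table no longer exists.
    boolean isOrphaned(String keyspace, String table) {
        try {
            getColumnFamilyStore(keyspace, table);
            return false;
        } catch (NullPointerException | IllegalArgumentException e) {
            // Before the fix, only NullPointerException was handled, so the
            // dropped-table case escaped and aborted startup.
            return true;
        }
    }

    // Returns the file names whose keyspace or table is gone.
    List<String> findOrphans(List<String> fileNames) {
        List<String> orphans = new ArrayList<>();
        for (String name : fileNames) {
            int dot = name.indexOf('.');
            if (dot < 0 || !name.endsWith(SUFFIX) || dot > name.length() - SUFFIX.length())
                continue; // not a file in the assumed format; leave it alone
            String keyspace = name.substring(0, dot);
            String table = name.substring(dot + 1, name.length() - SUFFIX.length());
            if (isOrphaned(keyspace, table))
                orphans.add(name);
        }
        return orphans;
    }
}
```

In this sketch a caller would delete every file returned by `findOrphans` and leave the rest untouched; files that do not match the assumed naming are skipped rather than treated as orphans.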
There's something I am missing here. Dropping a table makes a node never start again? That makes no sense. What am I missing? What is the actual bug we're trying to fix?
To be precise, that's dropping a table with UCS. The test showcases the steps required; the only unusual step it takes is saving the UCS settings on demand rather than waiting for them to be saved periodically by the background thread (https://github.com/datastax/cassandra/blob/main/src/java/org/apache/cassandra/db/compaction/CompactionManager.java#L180).
Did you notice there are missing JSON file exceptions in the CNDB CI run?
Looks like a different sort of files. To get more data, I have triggered CNDB CI again.
The clause ensures we get a meaningful error message; to maintain the behaviour, we rethrow the exception.
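The log-and-rethrow idea in this reply can be sketched generically as follows. The reviewed clause itself is not shown in this conversation, so this is only an illustration of the pattern, not the PR's code; the class `RethrowWithContext`, the method `callWithContext`, and the use of `java.util.logging` are assumptions.

```java
import java.util.function.Supplier;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper illustrating "log a meaningful message, then rethrow".
class RethrowWithContext {
    private static final Logger logger = Logger.getLogger(RethrowWithContext.class.getName());

    // Runs the action; on failure, logs what was being attempted and
    // rethrows the very same exception so callers see unchanged behaviour.
    static <T> T callWithContext(String description, Supplier<T> action) {
        try {
            return action.get();
        } catch (RuntimeException e) {
            logger.log(Level.SEVERE, "Failed to " + description, e);
            throw e;
        }
    }
}
```

Rethrowing the original exception, rather than wrapping it, keeps its type and stack trace intact, so existing handlers further up the call chain are unaffected.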
My CNDB CI has been cancelled 😭, I will try again.
e0d2bcc to bead4ff
❌ Build ds-cassandra-pr-gate/PR-2145 rejected by Butler
2 regressions found
Found 2 new test failures
Found 4 known test failures
To satisfy SonarQube, I had to add the following test: bead4ff.
5.0 counterpart of #2145.