Skip to content

Conversation

@cbornet
Copy link

@cbornet cbornet commented Dec 17, 2025

What is the issue

https://github.com/riptano/cndb/issues/16330

What does this PR fix and why was it fixed

This removes the lock in PartialLifecycleTransaction::trackNewWritten on the main transaction trackNewWritten as RemoteLogTransaction::trackNewWritten performs the upload of SSTables to remote storage which takes a long time and thus it leads to contention and prevents parallel upload to remote storage.

@github-actions
Copy link

Checklist before you submit for review

  • This PR adheres to the Definition of Done
  • Make sure there is a PR in the CNDB project updating the Converged Cassandra version
  • Use NoSpamLogger for log lines that may appear frequently in the logs
  • Verify test results on Butler
  • Test coverage for new/modified code is > 80%
  • Proper code formatting
  • Proper title for each commit staring with the project-issue number, like CNDB-1234
  • Each commit has a meaningful description
  • Each commit is not very long and contains related changes
  • Renames, moves and reformatting are in distinct commits
  • All new files should contain the DataStax copyright header instead of the Apache License one

@blambov
Copy link

blambov commented Dec 17, 2025

I'm not sure removing the synchronization here is okay. Yes, some methods of the main transaction (including this one) are normally thread-safe, but most can go very wrong if we allow them to be called concurrently. At the very least we need quite a few comments in various places to ensure that implementations of the transaction interface are thread-safe for a selection of methods, which is a departure from the way they have been initially designed.

On the other hand, does compaction need to wait for the upload to complete before continuing work? This seems like a big waste of time, trackNewWritten could schedule the work instead of performing it.

@cassci-bot
Copy link

❌ Build ds-cassandra-pr-gate/PR-2181 rejected by Butler


3 regressions found
See build details here


Found 3 new test failures

Test Explanation Runs Upstream
o.a.c.distributed.test.PreviewRepairCoordinatorTimeoutTest.prepareRPCTimeout[SEQUENTIAL/false] REGRESSION 🔴 0 / 19
o.a.c.index.sai.cql.VectorCompaction100dTest.testZeroOrOneToManyCompaction[dc false] NEW 🔴 0 / 19
o.a.c.index.sai.cql.VectorSiftSmallTest.testCompaction[ca false] (compression) REGRESSION 🔴 0 / 19

Found 1 known test failures

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants