@adelapena

What is the issue

Redaction of queried values in CQL queries printed to logs (added by CNDB-15280) benefits privacy and query aggregation,
but it might lose some useful information when analyzing logs. Probably the biggest miss is whether a queried value is large or small.

What does this PR fix and why was it fixed

Add hints about value size when redacting column values in CQL queries printed to logs. No size hints will be included for values smaller than 100 bytes, or for values of fixed-size data types (e.g., int, UUID, timestamp).

Size hints are provided in logarithmic buckets (e.g., ">100B", ">1KiB", ">10KiB", ">100KiB") to give a rough indication of data size while maintaining privacy.
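
The bucketing described above can be sketched roughly as follows. This is only an illustration, not the code merged in this PR: the class name `SizeHintRedactionSketch`, the threshold table, and the inclusive handling of the bucket boundaries are assumptions, and only the four buckets named in the description are modelled. The method signature and the "?" / "?[size_hint]" output shape follow the Javadoc quoted in the review thread.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of size-hinted redaction; not the actual implementation.
public final class SizeHintRedactionSketch
{
    // Bucket lower bounds in bytes, largest first, paired with their labels.
    // Only the buckets listed in the PR description are included (assumption).
    private static final long[] THRESHOLDS = { 100L * 1024, 10L * 1024, 1024L, 100L };
    private static final String[] LABELS = { ">100KiB", ">10KiB", ">1KiB", ">100B" };

    private SizeHintRedactionSketch() {}

    public static String redact(ByteBuffer bytes, boolean isValueLengthFixed)
    {
        // Null values and fixed-size types (int, UUID, timestamp...) get no hint.
        if (bytes == null || isValueLengthFixed)
            return "?";

        int size = bytes.remaining();
        for (int i = 0; i < THRESHOLDS.length; i++)
        {
            // Boundary handling (>= rather than >) is an assumption of this sketch.
            if (size >= THRESHOLDS[i])
                return "?[" + LABELS[i] + ']';
        }

        // Values smaller than 100 bytes are redacted without a hint.
        return "?";
    }

    public static void main(String[] args)
    {
        System.out.println(redact(ByteBuffer.allocate(10), false));         // ?
        System.out.println(redact(ByteBuffer.allocate(2048), false));       // ?[>1KiB]
        System.out.println(redact(ByteBuffer.allocate(2048), true));        // ?
        System.out.println(redact(ByteBuffer.allocate(200 * 1024), false)); // ?[>100KiB]
    }
}
```

Checking the largest threshold first keeps the lookup to a short linear scan and makes it easy to add further buckets by extending the two arrays.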

@adelapena adelapena self-assigned this Nov 10, 2025
@github-actions
github-actions bot commented Nov 10, 2025

Checklist before you submit for review

  • This PR adheres to the Definition of Done
  • Make sure there is a PR in the CNDB project updating the Converged Cassandra version
  • Use NoSpamLogger for log lines that may appear frequently in the logs
  • Verify test results on Butler
  • Test coverage for new/modified code is > 80%
  • Proper code formatting
  • Proper title for each commit starting with the project-issue number, like CNDB-1234
  • Each commit has a meaningful description
  • Each commit is not very long and contains related changes
  • Renames, moves and reformatting are in distinct commits
  • All new files should contain the DataStax copyright header instead of the Apache License one

@adelapena adelapena marked this pull request as draft November 10, 2025 13:23
@adelapena adelapena marked this pull request as ready for review November 10, 2025 14:21
* @return a redacted string representation, either "?" or "?[size_hint]"
*/
public static String redact(@Nullable ByteBuffer bytes, boolean isValueLengthFixed)
{

A negative value is highly unlikely here, but in case of some weird bug, I would say let's add a precondition check.

Author

I've added an assertion.

@adelapena
Author

The failure of PartitionIndexEncryptedTest.testDeepRecursion seems unrelated, and I cannot reproduce it locally. I think it's scary enough to open an issue after this new CI run.

@cassci-bot

❌ Build ds-cassandra-pr-gate/PR-2114 rejected by Butler


1 regressions found
See build details here


Found 1 new test failures

Test | Explanation | Runs | Upstream
o.a.c.io.sstable.format.trieindex.PartitionIndexEncryptedTest.testDeepRecursion[0] (compression) | REGRESSION | 🔵🔴 | 0 / 16

Found 1 known test failures

@ekaterinadimitrova2

The failure of PartitionIndexEncryptedTest.testDeepRecursion seems unrelated, and I cannot reproduce it locally. I think it's scary enough to open an issue after this new CI run.

Agreed, let's open a ticket and flag it for @pkolaczk in case it appears also on the release branch.

@ekaterinadimitrova2 ekaterinadimitrova2 left a comment

Looks good to me, thank you.

@adelapena adelapena merged commit 62a1ee1 into main Nov 12, 2025
496 of 497 checks passed
@adelapena adelapena deleted the CNDB-15807-main-redaction-clues branch November 12, 2025 11:15
@adelapena
Author

Thanks for the review :)

The failure of PartitionIndexEncryptedTest.testDeepRecursion seems unrelated, and I cannot reproduce it locally. I think it's scary enough to open an issue after this new CI run.

Agreed, let's open a ticket and flag it for @pkolaczk in case it appears also on the release branch.

I've created https://github.com/riptano/cndb/issues/15978.

michaelsembwever pushed a commit that referenced this pull request Dec 9, 2025
…lues (#2114)

Add hints about value size when redacting column values in CQL queries printed to logs.
No size hints will be included for values smaller than 100 bytes, or for values of
fixed-size data types (e.g., int, UUID, timestamp).

Size hints are provided in logarithmic buckets (e.g., ">100B", ">1KiB", ">10KiB", ">100KiB")
to give a rough indication of data size while maintaining privacy.