@lesnik2u lesnik2u commented Nov 24, 2025

What is the issue

https://github.com/riptano/cndb/issues/15360

What does this PR fix and why was it fixed

This PR adds the ability to inspect hot and cold entries in ChunkCache to better understand cache dynamics and identify which files and sections are most useful for caching.

Changes

New ChunkCache methods:

  • Added inspectEntries() method to iterate over cache entries ordered by access frequency
  • Added InspectEntriesOrder enum (HOTTEST/COLDEST) to specify ordering by access frequency
  • Added ChunkCacheInspectionEntry class containing file, position, and size metadata for each cached chunk
  • Leverages Caffeine's policy().eviction().hottest()/coldest() APIs to efficiently retrieve ordered entries without materializing the full cache in memory
  • Consumer pattern avoids creating large in-memory lists
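
A minimal sketch of the consumer-based inspection shape described in the list above, assuming a plain access-count map as a stand-in for Caffeine's eviction policy. The class name InspectionSketch, the recordAccess helper, and the field types are hypothetical and not taken from the PR. Unlike policy().eviction().hottest()/coldest(), this stand-in sorts all entries, so it only illustrates the single-method signature with an order parameter and the limit handling.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Consumer;

// Hypothetical stand-in for the inspection API; not the actual ChunkCache code.
final class InspectionSketch
{
    enum InspectEntriesOrder { HOTTEST, COLDEST }

    // Fields follow the PR description (file, position, size); the types are assumptions.
    // Instances are used as identity-based map keys in this sketch.
    static final class ChunkCacheInspectionEntry
    {
        final String file;
        final long position;
        final int size;

        ChunkCacheInspectionEntry(String file, long position, int size)
        {
            this.file = file;
            this.position = position;
            this.size = size;
        }
    }

    // Stand-in for the cache's frequency bookkeeping: entry -> access count.
    private final Map<ChunkCacheInspectionEntry, AtomicLong> accessCounts = new ConcurrentHashMap<>();

    void recordAccess(ChunkCacheInspectionEntry entry)
    {
        accessCounts.computeIfAbsent(entry, e -> new AtomicLong()).incrementAndGet();
    }

    // Passes at most `limit` entries to the consumer, in the requested order.
    void inspectEntries(int limit, InspectEntriesOrder order, Consumer<ChunkCacheInspectionEntry> consumer)
    {
        if (limit <= 0)
            return;

        Comparator<Map.Entry<ChunkCacheInspectionEntry, AtomicLong>> byCount =
            Comparator.comparingLong(e -> e.getValue().get());

        accessCounts.entrySet().stream()
                    .sorted(order == InspectEntriesOrder.HOTTEST ? byCount.reversed() : byCount)
                    .limit(limit)
                    .map(Map.Entry::getKey)
                    .forEach(consumer);
    }
}
```

A caller would pass a lambda such as `e -> logger.info("{} @ {}", e.file, e.position)` (logger being whatever the caller already has), so no intermediate list of entries is built on the caller's side.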

Use Cases

  • Identify which files/sections benefit most from caching vs. eviction candidates
  • Compare access patterns between different file types (partition index vs. data, SAI token files vs. kdtree files)
  • Debug cache efficiency and inform cache sizing decisions
  • Complement predictive caching metrics with lower-level cache behavior insights

Testing

Added comprehensive parameterized unit tests covering:

  • Multiple files with different limits
  • Both HOTTEST and COLDEST ordering
  • Edge cases (empty cache, zero limit, disabled cache)
  • Verification that all cached files appear when limit exceeds cache size

@lesnik2u lesnik2u changed the title [WIP] CNDB-15360 Chunk Cache inspection [WIP] CNDB-15360 Chunk Cache inspection POC Nov 24, 2025
@github-actions

Checklist before you submit for review

  • This PR adheres to the Definition of Done
  • Make sure there is a PR in the CNDB project updating the Converged Cassandra version
  • Use NoSpamLogger for log lines that may appear frequently in the logs
  • Verify test results on Butler
  • Test coverage for new/modified code is > 80%
  • Proper code formatting
  • Proper title for each commit starting with the project-issue number, like CNDB-1234
  • Each commit has a meaningful description
  • Each commit is not very long and contains related changes
  • Renames, moves and reformatting are in distinct commits
  • All new files should contain the DataStax copyright header instead of the Apache License one

@lesnik2u lesnik2u marked this pull request as ready for review November 26, 2025 11:42

@eolivelli eolivelli left a comment

I have left a couple of suggestions, you are on your way.

Let's add unit tests, there are some tests about ChunkCache, you can add them there.

We don't really care about the implementation of Caffeine, and you will hardly get deterministic ordering. For the unit tests it is enough to ensure that whatever is in the cache is returned correctly, in any order.
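
One way to follow this suggestion is to compare the inspected entries as a set rather than as a list. The sketch below assumes entries are identified by distinct file names; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;

// Hypothetical test helper: checks contents while ignoring iteration order.
final class OrderInsensitiveCheck
{
    // Returns true when `actual` holds exactly the expected (distinct) names, in any order.
    static boolean containsExactly(Collection<String> actual, String... expected)
    {
        return actual.size() == expected.length
               && new HashSet<>(actual).equals(new HashSet<>(Arrays.asList(expected)));
    }

    public static void main(String[] args)
    {
        // The order the cache reports is not under the test's control.
        System.out.println(containsExactly(Arrays.asList("b.db", "a.db"), "a.db", "b.db"));
    }
}
```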

@lesnik2u lesnik2u force-pushed the CNDB-15360-ChunkCache branch from 1fa7663 to 1c9a057 Compare November 28, 2025 13:05
@lesnik2u lesnik2u changed the title [WIP] CNDB-15360 Chunk Cache inspection POC CNDB-15360 Chunk Cache inspection POC Dec 1, 2025
private long assignFileId(File file)
{
return nextFileId.getAndIncrement();
long id = nextFileId.getAndIncrement();

nit: revert unneeded change

* @param limit maximum number of entries to inspect
* @param consumer consumer to process each entry
*/
public void inspectHotEntries(int limit, java.util.function.Consumer<ChunkCacheInspectionEntry> consumer)

why do we have 2 methods ?

can't we keep the boolean parameter ?
If you have two methods, I guess that at the call site you will have some "if (hottest) inspectHotEntries else inspectColdEntries", and you will have to unroll this.

// We need to shift right to extract just the File ID portion by discarding the lower bits
int shift = CHUNK_SIZE_LOG2_BITS + READER_TYPE_BITS;

synchronousCache.policy().eviction().ifPresent(policy -> {

if there is no eviction policy we should throw an exception, because there is no concept of "hot" or "cold"

This is not going to happen, so you can simply add a precondition
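
A rough sketch of both points in this thread: recovering the file ID with an unsigned right shift, and failing fast when no eviction policy is present. The bit widths, the field layout below the file ID, and all names other than CHUNK_SIZE_LOG2_BITS and READER_TYPE_BITS are assumptions made for illustration; the real constants in ChunkCache may differ.

```java
import java.util.Optional;

// Hypothetical illustration; widths and layout are placeholders, not ChunkCache's real values.
final class FileIdShiftSketch
{
    static final int CHUNK_SIZE_LOG2_BITS = 5; // assumed width
    static final int READER_TYPE_BITS = 2;     // assumed width
    static final int SHIFT = CHUNK_SIZE_LOG2_BITS + READER_TYPE_BITS;

    // Assumed layout: [ file ID | reader type | chunk size log2 ], low bits on the right.
    static long pack(long fileId, int readerType, int chunkSizeLog2)
    {
        return (fileId << SHIFT) | ((long) readerType << CHUNK_SIZE_LOG2_BITS) | chunkSizeLog2;
    }

    // Discards the lower bits so that only the file ID portion remains.
    static long extractFileId(long packed)
    {
        return packed >>> SHIFT;
    }

    // Precondition: without an eviction policy there is no notion of "hot" or "cold".
    static <T> T requireEvictionPolicy(Optional<T> policy)
    {
        return policy.orElseThrow(() -> new IllegalStateException("cache has no eviction policy"));
    }
}
```

With Caffeine, the Optional would come from `synchronousCache.policy().eviction()`, so the precondition replaces the `ifPresent` branch that silently does nothing.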

@lesnik2u lesnik2u force-pushed the CNDB-15360-ChunkCache branch from b236ad5 to abf97b6 Compare December 1, 2025 16:52
}

@Test
public void testInspectEntriesWithLimit() throws IOException

you can put this into the parameterized test

public static Stream<Arguments> testInspectEntriesValues() {
    return ..... Arguments.of(CacheOrder.HOTTEST, 10), Arguments.of(CacheOrder.HOTTEST, 2)....
}

@ParameterizedTest
void testInspectEntries(CacheOrder order, int limit)

Author

I had to put the tests in a separate class because in JUnit 4, for tests to be parameterized, the entire class has to run with Parameterized.class

implements RemovalListener<ChunkCache.Key, ChunkCache.Chunk>, CacheSize
import java.util.Map;

public class ChunkCache implements RemovalListener<ChunkCache.Key, ChunkCache.Chunk>, CacheSize

let's not reformat code we don't touch, if not really useful

/**
* Defines the ordering strategy for cache entries during inspection.
*/
public enum CacheOrder

InspectEntriesOrder ?

*/
public enum CacheOrder
{
/** Orders cache entries from most frequently accessed to least */

style ? missing new lines

public void testInspectEntriesWithEmptyCache()
{
logger.info("Testing inspect entries with empty cache for order={}", order);
ChunkCache.instance.clear();

is there any particular reason for not using a "new ChunkCache" in each test ?


try (FileHandle.Builder builder1 = new FileHandle.Builder(file).withChunkCache(ChunkCache.instance))
{
try (FileHandle handle1 = builder1.complete();

is this only removing blank lines?

Author

Yes, I've overlooked this

@lesnik2u lesnik2u changed the title CNDB-15360 Chunk Cache inspection POC CNDB-15360 Chunk Cache inspection Dec 4, 2025
@eolivelli eolivelli left a comment

Overall LGTM

I have left a minor comment about Files.write.

{
File file = FileUtils.createTempFile("test" + i, null);
file.deleteOnExit();
writeBytes(file, new byte[RandomAccessReader.DEFAULT_BUFFER_SIZE]);

nit: you can use Files.write(file.toPath() .... )
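
A small sketch of what this nit could look like, using java.io.File as a stand-in for the project's own File and FileUtils types (an assumption; the test in the PR uses FileUtils.createTempFile and a writeBytes helper). The class and method names are hypothetical.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;

// Hypothetical helper; java.io.File stands in for the project's File wrapper.
final class FilesWriteSketch
{
    // Creates a temp file and fills it with `size` zero bytes in a single call.
    static File createFilledTempFile(String prefix, int size)
    {
        try
        {
            File file = File.createTempFile(prefix, null);
            file.deleteOnExit();
            Files.write(file.toPath(), new byte[size]);
            return file;
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
    }
}
```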

sonarqubecloud bot commented Dec 5, 2025

@cassci-bot
❌ Build ds-cassandra-pr-gate/PR-2140 rejected by Butler


2 regressions found
See build details here


Found 2 new test failures

| Test | Explanation | Runs | Upstream |
|------|-------------|------|----------|
| o.a.c.index.sai.cql.VectorCompaction100dTest.testOneToManyCompaction[eb true] | NEW | 🔴 | 0 / 19 |
| o.a.c.index.sai.cql.VectorSiftSmallTest.testCompaction[ca false] | REGRESSION | 🔴 | 0 / 19 |

Found 3 known test failures


// Look up the File by searching through fileIdMap entries
File file = null;
for (Map.Entry<File, Long> entry : fileIdMap.entrySet())

@jasonstack jasonstack Dec 12, 2025

The overall time complexity would be O(N * F), where N is the number of cache entries and F is the number of files in the cache.

Should we build a temporary Map<Long, File> above to cache the reversed mappings once? This will save some CPU cycles. Time complexity would be O(N + F)
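
The suggestion could be sketched roughly as follows, with String standing in for File and a list of IDs standing in for the inspected cache entries (both assumptions made to keep the example self-contained). The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of inverting fileIdMap once before iterating the entries.
final class ReverseLookupSketch
{
    // O(F): build the id -> file mapping a single time.
    static Map<Long, String> invert(Map<String, Long> fileIdMap)
    {
        Map<Long, String> byId = new HashMap<>();
        for (Map.Entry<String, Long> entry : fileIdMap.entrySet())
            byId.put(entry.getValue(), entry.getKey());
        return byId;
    }

    // O(N): each entry is resolved with one hash lookup instead of a scan over all files.
    static List<String> resolve(List<Long> entryFileIds, Map<String, Long> fileIdMap)
    {
        Map<Long, String> byId = invert(fileIdMap);
        List<String> files = new ArrayList<>(entryFileIds.size());
        for (Long id : entryFileIds)
            files.add(byId.get(id)); // null when the ID is no longer registered
        return files;
    }
}
```

The temporary map costs O(F) extra memory for the duration of one inspection, which is the trade-off for dropping the nested loop.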
