
Conversation

@EungsopYoo (Contributor)

No description provided.

@EungsopYoo changed the title from "Add row-level cache for the get operation" to "HBASE-29585 Add row-level cache for the get operation" on Sep 10, 2025

@wchevreuil (Contributor) left a comment

This is a great idea, thanks for sharing it. I do have some comments, though:

  1. Can the RowCacheService be an implementation of BlockCache? Maybe a wrapper around the LRUBlockCache. I'm a bit worried about introducing a whole new layer that intercepts all read/write operations at the RPC service with cache-specific logic, even though this class is not the cache implementation itself. It seems a bit confusing to have a completely separate entry point to the cache.

  2. Are we accepting having the same row data in multiple caches? In the current code, I haven't seen any checks to avoid that. Maybe if we implement RowCacheService as a block cache implementation, so that the cache operations happen in the inner layers of the read/write operations, it would be easier to avoid duplication.

  3. Why not simply evict the row that got mutated? I guess we cannot simply override it in the cache because mutation can happen on individual cells.

  4. Are we accepting having data duplicated over separate caches? I don't see any logic to avoid caching in the L2 cache the whole block containing the row requested by a Get, while we'll still be caching the row in the row cache. Similarly, we might re-cache a row that's in the memstore in the row cache.

  5. One problem of adding such small units (a single row) to the cache is that we need to keep a map index entry for each one. So, the smaller the rows, the more rows would fit in the cache, but the more key objects would be retained in the map. In your tests, assuming the default block cache size of 40% of the heap, that would give 12.8 GB of block cache. Have you managed to measure the block cache usage by the row cache, in terms of the number of rows in the cache, the byte size of the L1 cache, and the total heap usage? Maybe worth collecting a heap dump to analyse the map index size in the heap.


RegionScannerImpl scanner = getScannerInternal(region, scan, results);

// The row cache is ineffective when the number of store files is small. If the number
Contributor:

Can you elaborate more on this? Is it really a matter of number of files or total store file size? For a single CF table, where a given region, after major compaction, has a 10GB store file, wouldn't this be more efficient?

Contributor Author:

Get performance is more affected by the number of StoreFiles than by their size. This is because a StoreFileScanner must be created for each StoreFile, and the process of aggregating their results into a single Get result becomes increasingly complex. However, in testing, I found that when there was only one StoreFile, the row cache provided almost no performance benefit. Therefore, I added this condition to prevent the row cache from unnecessarily occupying BlockCache space.
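
As an illustration of that merge cost (a toy sketch, not HBase code): a Get over N store files behaves roughly like a heap-merge of N sorted cell streams, on top of creating the N scanners, so the work per returned row grows with the number of files rather than their total size.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Toy heap-merge over N sorted inputs; illustrative only, not HBase's KeyValueHeap.
final class HeapMergeSketch {
  static <T> List<T> heapMerge(List<List<T>> sortedInputs, Comparator<T> cmp) {
    // Each heap entry is {inputIndex, positionWithinInput}, ordered by the element it points at.
    PriorityQueue<int[]> heap = new PriorityQueue<>(
        (a, b) -> cmp.compare(sortedInputs.get(a[0]).get(a[1]), sortedInputs.get(b[0]).get(b[1])));
    for (int i = 0; i < sortedInputs.size(); i++) {
      if (!sortedInputs.get(i).isEmpty()) {
        heap.add(new int[] { i, 0 });
      }
    }
    List<T> merged = new ArrayList<>();
    while (!heap.isEmpty()) {
      int[] top = heap.poll();                      // smallest current element across all inputs
      merged.add(sortedInputs.get(top[0]).get(top[1]));
      if (top[1] + 1 < sortedInputs.get(top[0]).size()) {
        heap.add(new int[] { top[0], top[1] + 1 }); // advance that input
      }
    }
    return merged;
  }
}
```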

Contributor:

Ahh, right, so the main gain here comes from avoiding the merge of results from different store file scanners. I guess there could still be benefits in doing this row caching for gets, even when there is only one store file. Say the L2 cache is already at capacity; long client scans could cause evictions of blocks needed by gets for repeating keys.

Contributor Author:

Yes, I agree. I’ll remove the condition to cache only when the number of StoreFiles is above a threshold, and always cache the row.

@EungsopYoo (Contributor Author), Sep 25, 2025:

Fixed in e33db29.


private boolean tryGetFromCache(HRegion region, RowCacheKey key, Get get, List<Cell> results) {
RowCells row =
(RowCells) region.getBlockCache().getBlock(key, get.getCacheBlocks(), false, true);
Contributor:

RowCacheKey uses the region encoded name for indexing, whilst BlockCacheKey uses (store file name + offset). If the given row is already cached in an L2 cache block, this call will fail to fetch it and we'll cache it in the L1 too.

@EungsopYoo (Contributor Author), Sep 12, 2025:

I initially intended for the row cache to reside only in L1 and not be cached in L2, but I haven’t actually implemented that yet. I’ll give further thought to adding this.

@EungsopYoo (Contributor Author)

> This is a great idea, thanks for sharing it. I do have some comments, though:

Thank you for starting the PR review.

> 1. Can the RowCacheService be an implementation of BlockCache? Maybe a wrapper around the LRUBlockCache. I'm a bit worried about introducing a whole new layer that intercepts all read/write operations at the RPC service with cache-specific logic, even though this class is not the cache implementation itself. It seems a bit confusing to have a completely separate entry point to the cache.

BlockCache operates at the HFile access layer, whereas the row cache needs to function at a higher layer that covers both MemStore and HFile. That’s why I implemented RowCacheService in the RPC service layer.

That said, the row cache does not actually cache HFileBlocks, yet it currently relies on the BlockCache interface. I realize this might not be appropriate. I reused the BlockCache interface to reduce the overhead of creating a separate cache implementation solely for the row cache, but in hindsight, this might not have been the best approach. It may be better to build a dedicated cache implementation specifically for the row cache.

What do you think?

> 2. Are we accepting having the same row data in multiple caches? In the current code, I haven't seen any checks to avoid that. Maybe if we implement RowCacheService as a block cache implementation, so that the cache operations happen in the inner layers of the read/write operations, it would be easier to avoid duplication.

What exactly does “multiple cache” refer to? Does it mean the L1 and L2 caches in the CombinedBlockCache? If so, I haven’t really considered that aspect yet, but I’ll start looking into it.

> 3. Why not simply evict the row that got mutated? I guess we cannot simply override it in the cache because mutation can happen on individual cells.

I didn’t fully understand the intention behind your question. Could you please explain it in more detail?

> 4. Are we accepting having data duplicated over separate caches? I don't see any logic to avoid caching in the L2 cache the whole block containing the row requested by a Get, while we'll still be caching the row in the row cache. Similarly, we might re-cache a row that's in the memstore in the row cache.

This is in the same L1/L2 context as your comment 2, correct? If so, I haven’t considered that aspect yet, but I’ll start thinking about how to handle it.

Since the row cache is only enabled when there are at least two HFiles, rows that exist only in the MemStore are not cached. However, when there are two or more HFiles, rows in MemStore are also added again to the row cache. This is an intentional design choice, aimed at bypassing the process of generating results via SegmentScanner and StoreFileScanner, and instead serving Get requests directly from the cache.

> 5. One problem of adding such small units (a single row) to the cache is that we need to keep a map index entry for each one. So, the smaller the rows, the more rows would fit in the cache, but the more key objects would be retained in the map. In your tests, assuming the default block cache size of 40% of the heap, that would give 12.8 GB of block cache. Have you managed to measure the block cache usage by the row cache, in terms of the number of rows in the cache, the byte size of the L1 cache, and the total heap usage? Maybe worth collecting a heap dump to analyse the map index size in the heap.

I slightly modified the LruBlockCache code to record the row cache size and entry count. The row cache occupies 268.67MB with 338,602 entries. The average size of a single row cache entry is 830 bytes. Within the overall BlockCache, the row cache accounts for 45% by entry count and 2% by size.

2025-09-12T09:08:44,112 INFO  [LruBlockCacheStatsExecutor {}] hfile.LruBlockCache: totalSize=12.80 GB, usedSize=12.48 GB, freeSize=329.41 MB, max=12.80 GB, blockCount=752084, accesses=35942999, hits=27403857, hitRatio=76.24%, , cachingAccesses=35942954, cachingHits=27403860, cachingHitsRatio=76.24%, evictions=170, evicted=5806436, evictedPerRun=34155.50588235294, rowBlockCount=338602, rowBlockSize=268.67 MB

@wchevreuil (Contributor)

> That said, the row cache does not actually cache HFileBlocks, yet it currently relies on the BlockCache interface. I realize this might not be appropriate. I reused the BlockCache interface to reduce the overhead of creating a separate cache implementation solely for the row cache, but in hindsight, this might not have been the best approach. It may be better to build a dedicated cache implementation specifically for the row cache.
>
> What do you think?

Yeah, I had the same thought while going through the comments. Having a separate cache structure seems the best way to implement this.

> 2. Are we accepting having the same row data in multiple caches? In the current code, I haven't seen any checks to avoid that. Maybe if we implement RowCacheService as a block cache implementation, so that the cache operations happen in the inner layers of the read/write operations, it would be easier to avoid duplication.
>
> What exactly does “multiple cache” refer to? Does it mean the L1 and L2 caches in the CombinedBlockCache? If so, I haven’t really considered that aspect yet, but I’ll start looking into it.

Nevermind my previous comment. We should focus on the separate cache for rows.

> 3. Why not simply evict the row that got mutated? I guess we cannot simply override it in the cache because mutation can happen on individual cells.
>
> I didn’t fully understand the intention behind your question. Could you please explain it in more detail?

Rather than blocking writes to the row cache during updates/bulkload, can we simply make the updates evict/override the row in the cache if it's already there? For puts, we shouldn't need to worry about barriers if we make sure we don't cache a row that is only in the memstore, but we should make sure to remove it from the row cache because the cached entry would now be stale. For bulkloads, I guess we only need to make sure to evict the rows for the affected regions after the bulkload has been committed.

> 4. Are we accepting having data duplicated over separate caches? I don't see any logic to avoid caching in the L2 cache the whole block containing the row requested by a Get, while we'll still be caching the row in the row cache. Similarly, we might re-cache a row that's in the memstore in the row cache.
>
> This is in the same L1/L2 context as your comment 2, correct? If so, I haven’t considered that aspect yet, but I’ll start thinking about how to handle it.
>
> Since the row cache is only enabled when there are at least two HFiles, rows that exist only in the MemStore are not cached. However, when there are two or more HFiles, rows in MemStore are also added again to the row cache. This is an intentional design choice, aimed at bypassing the process of generating results via SegmentScanner and StoreFileScanner, and instead serving Get requests directly from the cache.

Per other comments, I agree it's fine to have the row in the row cache and its block also in the block cache. We need to decide whether Gets should still add blocks to the block cache, or cache only in the row cache. Also, should we avoid caching if the row is in the memstore? That could be challenging in the current design of caching the whole row, because the memstore might have updates for only a few cells within a row.

> 5. One problem of adding such small units (a single row) to the cache is that we need to keep a map index entry for each one. So, the smaller the rows, the more rows would fit in the cache, but the more key objects would be retained in the map. In your tests, assuming the default block cache size of 40% of the heap, that would give 12.8 GB of block cache. Have you managed to measure the block cache usage by the row cache, in terms of the number of rows in the cache, the byte size of the L1 cache, and the total heap usage? Maybe worth collecting a heap dump to analyse the map index size in the heap.
>
> I slightly modified the LruBlockCache code to record the row cache size and entry count. The row cache occupies 268.67MB with 338,602 entries. The average size of a single row cache entry is 830 bytes. Within the overall BlockCache, the row cache accounts for 45% by entry count and 2% by size.
>
> 2025-09-12T09:08:44,112 INFO  [LruBlockCacheStatsExecutor {}] hfile.LruBlockCache: totalSize=12.80 GB, usedSize=12.48 GB, freeSize=329.41 MB, max=12.80 GB, blockCount=752084, accesses=35942999, hits=27403857, hitRatio=76.24%, , cachingAccesses=35942954, cachingHits=27403860, cachingHitsRatio=76.24%, evictions=170, evicted=5806436, evictedPerRun=34155.50588235294, rowBlockCount=338602, rowBlockSize=268.67 MB

What if more rows get cached over time, as more gets for different rows are executed? It could lead to many rows in the cache, and many more objects in the map to index it. In the recent past, we've seen some heap issues when having very large file-based bucket caches and small compressed blocks. I guess we could face similar problems here too.

@Apache9 (Contributor) commented Sep 12, 2025

The design doc looks good. I skimmed the code; it seems we put the row cache into the block cache? Would you mind explaining more about why we chose to use the block cache to implement the row cache? What is the benefit?

Thanks.

@EungsopYoo (Contributor Author)

> 3. Why not simply evict the row that got mutated? I guess we cannot simply override it in the cache because mutation can happen on individual cells.
>
> I didn’t fully understand the intention behind your question. Could you please explain it in more detail?
>
> Rather than blocking writes to the row cache during updates/bulkload, can we simply make the updates evict/override the row in the cache if it's already there? For puts, we shouldn't need to worry about barriers if we make sure we don't cache a row that is only in the memstore, but we should make sure to remove it from the row cache because the cached entry would now be stale. For bulkloads, I guess we only need to make sure to evict the rows for the affected regions after the bulkload has been committed.

When the data exists in both the MemStore and the StoreFiles, we need to store it in the row cache to avoid result merging. In that case, the barrier was introduced because of interleavings like the following:

| Thread | Time 1 | Time 2 | Time 3 | Time 4 |
| --- | --- | --- | --- | --- |
| th1 | delete row1 from RowCache | Put row1 to Region | write row1 to RowCache | |
| th2 | delete row1 from RowCache | Put row1 to Region | write row1 to RowCache | |
| th3 | Get for row1 not from RowCache. Good | Get for row1 not from RowCache. Good | Get for row1 from stale RowCache. Bad | Get for row1 not from RowCache. Good |
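
As a rough illustration of the barrier idea described above (hypothetical class and method names; RowCacheKey, RowCells, and RowCache stand in for the PR's types, and this is not the PR's actual code):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of a per-row write barrier; RowCacheKey, RowCells and RowCache
// are stand-ins for the PR's types.
class RowBarrierSketch {
  private final ConcurrentHashMap<RowCacheKey, AtomicInteger> barriers = new ConcurrentHashMap<>();
  private final RowCache rowCache;

  RowBarrierSketch(RowCache rowCache) {
    this.rowCache = rowCache;
  }

  void mutateWithBarrier(RowCacheKey key, Runnable applyMutation) {
    // Raise the barrier first so a concurrent Get cannot repopulate a stale row (Time 3 above).
    barriers.compute(key, (k, counter) -> {
      AtomicInteger c = (counter == null) ? new AtomicInteger() : counter;
      c.incrementAndGet();
      return c;
    });
    try {
      rowCache.evictRow(key);  // the cached row becomes invalid once the mutation starts
      applyMutation.run();     // apply the Put/Delete to the region
    } finally {
      // Lower the barrier; drop the entry once the last concurrent writer for this row finishes.
      barriers.compute(key, (k, c) -> (c == null || c.decrementAndGet() == 0) ? null : c);
    }
  }

  void cacheAfterGet(RowCacheKey key, RowCells row) {
    // Only populate the row cache when no mutation for this row is in flight; the real code
    // must also guard against a barrier appearing between this check and the put.
    if (!barriers.containsKey(key)) {
      rowCache.cacheRow(key, row);
    }
  }
}
```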

It would be more efficient to do as you mentioned when doing a bulkload.

> 4. Are we accepting having data duplicated over separate caches? I don't see any logic to avoid caching in the L2 cache the whole block containing the row requested by a Get, while we'll still be caching the row in the row cache. Similarly, we might re-cache a row that's in the memstore in the row cache.
>
> This is in the same L1/L2 context as your comment 2, correct? If so, I haven’t considered that aspect yet, but I’ll start thinking about how to handle it.
> Since the row cache is only enabled when there are at least two HFiles, rows that exist only in the MemStore are not cached. However, when there are two or more HFiles, rows in MemStore are also added again to the row cache. This is an intentional design choice, aimed at bypassing the process of generating results via SegmentScanner and StoreFileScanner, and instead serving Get requests directly from the cache.
>
> Per other comments, I agree it's fine to have the row in the row cache and its block also in the block cache. We need to decide whether Gets should still add blocks to the block cache, or cache only in the row cache. Also, should we avoid caching if the row is in the memstore? That could be challenging in the current design of caching the whole row, because the memstore might have updates for only a few cells within a row.

I already answered this in another comment, but I’ll respond here as well.

I think it’s better to put it into the BlockCache when doing a Get, according to the BlockCache setting.

It is more efficient not to create a row cache when the cells to be fetched exist only in the MemStore. However, if the cells to be fetched are in both the MemStore and the StoreFiles, then creating a row cache is efficient to avoid result merging.

I’ll give some more thought on how we can achieve this.

> 5. One problem of adding such small units (a single row) to the cache is that we need to keep a map index entry for each one. So, the smaller the rows, the more rows would fit in the cache, but the more key objects would be retained in the map. In your tests, assuming the default block cache size of 40% of the heap, that would give 12.8 GB of block cache. Have you managed to measure the block cache usage by the row cache, in terms of the number of rows in the cache, the byte size of the L1 cache, and the total heap usage? Maybe worth collecting a heap dump to analyse the map index size in the heap.
>
> I slightly modified the LruBlockCache code to record the row cache size and entry count. The row cache occupies 268.67MB with 338,602 entries. The average size of a single row cache entry is 830 bytes. Within the overall BlockCache, the row cache accounts for 45% by entry count and 2% by size.
>
> 2025-09-12T09:08:44,112 INFO  [LruBlockCacheStatsExecutor {}] hfile.LruBlockCache: totalSize=12.80 GB, usedSize=12.48 GB, freeSize=329.41 MB, max=12.80 GB, blockCount=752084, accesses=35942999, hits=27403857, hitRatio=76.24%, , cachingAccesses=35942954, cachingHits=27403860, cachingHitsRatio=76.24%, evictions=170, evicted=5806436, evictedPerRun=34155.50588235294, rowBlockCount=338602, rowBlockSize=268.67 MB
>
> What if more rows get cached over time, as more gets for different rows are executed? It could lead to many rows in the cache, and many more objects in the map to index it. In the recent past, we've seen some heap issues when having very large file-based bucket caches and small compressed blocks. I guess we could face similar problems here too.

Okay. Then I’ll take a heap dump and check the size of the map’s index.

@EungsopYoo (Contributor Author)

> The design doc looks good. I skimmed the code; it seems we put the row cache into the block cache? Would you mind explaining more about why we chose to use the block cache to implement the row cache? What is the benefit?
>
> Thanks.

I did it that way because the implementation was simpler. However, it causes confusion and makes it harder to have clear control over the row cache, so I’ve decided to create a separate RowCache implementation.

@EungsopYoo (Contributor Author) commented Sep 15, 2025

The TODOs are as follows, and I will proceed in order:

  • Separate the row cache implementation
  • Remove the condition that decides whether to put data into the row cache based on the number of StoreFiles
  • Do not use the row cache when the data exists only in the MemStore
  • Invalidate only the row cache of regions that were bulkloaded
  • Take a heap dump to check the index size of the map


- Implement RowCache
  - Initially considered modifying LruBlockCache, but the required changes were extensive.
    Instead, implemented RowCache using Caffeine cache.
- Add row.cache.size configuration
  - Default is 0.0 (disabled); RowCache is enabled only if explicitly set to a value > 0.
  - The combined size of BlockCache + MemStore + RowCache must not exceed 80% of the heap.
- Add Row Cache tab to RegionServer Block Cache UI
  - RowCache is not a BlockCache, but added here since there is no better place.
- Add RowCache metrics
  - Metrics for size, count, eviction, hit, and miss are now exposed.
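
For reference, a rough sketch of what a Caffeine-backed, size-bounded row cache along these lines could look like (hypothetical class and method names; it assumes RowCacheKey and RowCells expose a heapSize(), and it is not the PR's actual implementation):

```java
import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;

// Hypothetical sketch; RowCacheKey/RowCells are the PR's types and are assumed
// to expose heapSize(). Not the PR's actual RowCache implementation.
class CaffeineRowCacheSketch {
  private final Cache<RowCacheKey, RowCells> cache;

  CaffeineRowCacheSketch(long maxBytes) {   // e.g. heap size * row.cache.size
    this.cache = Caffeine.newBuilder()
        .maximumWeight(maxBytes)
        .weigher((RowCacheKey key, RowCells row) ->
            (int) Math.min(Integer.MAX_VALUE, key.heapSize() + row.heapSize()))
        .recordStats()                      // hit/miss/eviction counters for the new metrics
        .build();
  }

  void cacheRow(RowCacheKey key, RowCells row) {
    cache.put(key, row);
  }

  RowCells getRow(RowCacheKey key) {
    return cache.getIfPresent(key);
  }

  void evictRow(RowCacheKey key) {
    cache.invalidate(key);
  }
}
```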

@EungsopYoo (Contributor Author) commented Sep 25, 2025

> 5. One problem of adding such small units (a single row) to the cache is that we need to keep a map index entry for each one. So, the smaller the rows, the more rows would fit in the cache, but the more key objects would be retained in the map. In your tests, assuming the default block cache size of 40% of the heap, that would give 12.8 GB of block cache. Have you managed to measure the block cache usage by the row cache, in terms of the number of rows in the cache, the byte size of the L1 cache, and the total heap usage? Maybe worth collecting a heap dump to analyse the map index size in the heap.
>
> I slightly modified the LruBlockCache code to record the row cache size and entry count. The row cache occupies 268.67MB with 338,602 entries. The average size of a single row cache entry is 830 bytes. Within the overall BlockCache, the row cache accounts for 45% by entry count and 2% by size.
>
> 2025-09-12T09:08:44,112 INFO  [LruBlockCacheStatsExecutor {}] hfile.LruBlockCache: totalSize=12.80 GB, usedSize=12.48 GB, freeSize=329.41 MB, max=12.80 GB, blockCount=752084, accesses=35942999, hits=27403857, hitRatio=76.24%, , cachingAccesses=35942954, cachingHits=27403860, cachingHitsRatio=76.24%, evictions=170, evicted=5806436, evictedPerRun=34155.50588235294, rowBlockCount=338602, rowBlockSize=268.67 MB
>
> What if more rows get cached over time, as more gets for different rows are executed? It could lead to many rows in the cache, and many more objects in the map to index it. In the recent past, we've seen some heap issues when having very large file-based bucket caches and small compressed blocks. I guess we could face similar problems here too.
>
> Okay. Then I’ll take a heap dump and check the size of the map’s index.

I configured the RegionServer with a 4 GB heap, setting hfile.block.cache.size to 0.3 and row.cache.size to 0.1, then reran the same workload as before. Under these settings, the maximum RowCache capacity is approximately 400 MB. After the RowCache was fully populated, I generated and analyzed a heap dump.

  • RowCache Size: 409 MB
  • RowCache Count: 697,234 entries
  • Average RowCache Entry Size: 615 B
    • This is reduced from 830 B previously, mainly due to a simplified RowCacheKey.
  • Retained Heap Size: 622 MB
    • Because of the overhead associated with Caffeine’s key/value structures, the retained size on heap amounts to 52% more than the actual data size for this workload.
    • I believe this is acceptable if the RowCache size is configured relatively smaller than the BlockCache, for example, around 2% of the BlockCache size. The positive impact of RowCache is already noticeable even at this smaller capacity.
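
For reference, the capacity figure above follows directly from the settings; a minimal sketch of the arithmetic using the stock Hadoop/HBase Configuration API (only row.cache.size is new in this PR; the heap value and class name are hypothetical):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

// Hypothetical reproduction of the test setup described above.
public class RowCacheSizingExample {
  public static void main(String[] args) {
    Configuration conf = HBaseConfiguration.create();
    conf.setFloat("hfile.block.cache.size", 0.3f); // BlockCache: 30% of heap
    conf.setFloat("row.cache.size", 0.1f);         // RowCache: 10% of heap (0.0 = disabled)

    long heapBytes = 4L * 1024 * 1024 * 1024;      // 4 GB RegionServer heap
    long rowCacheBytes = (long) (heapBytes * conf.getFloat("row.cache.size", 0.0f));
    // 4 GB * 0.1 is roughly 409 MB, matching the ~400 MB RowCache capacity reported above.
    System.out.println("Max RowCache capacity: " + (rowCacheBytes / (1024 * 1024)) + " MB");
  }
}
```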

@EungsopYoo (Contributor Author)

@wchevreuil
I have completed all the tasks on the TODO list. Please review it again.

And remove the condition that decides whether to put data into the row cache based on the number of StoreFiles

@EungsopYoo (Contributor Author)

I’m currently trying to determine the appropriate size of the RowCache relative to the BlockCache.

</div>
<div class="tab-pane" id="tab_row_cache" role="tabpanel">
<& row_cache_stats; rowCache = rowCache &>
</div>
Contributor Author:

(screenshot attached)

Contributor:

Nit: We may need to rename the labels here. Where we currently say "Block Cache" should be only "Cache", and the L1/L2 tabs should be labeled "BlockCache L1"/"BlockCache L2".

.addCounter(Interns.info(ROW_CACHE_EVICTED_ROW_COUNT, ""),
rsWrap.getRowCacheEvictedRowCount())
.addGauge(Interns.info(ROW_CACHE_SIZE, ""), rsWrap.getRowCacheSize())
.addGauge(Interns.info(ROW_CACHE_COUNT, ""), rsWrap.getRowCacheCount())
Contributor Author:

(screenshot attached)

Contributor:

Nice!


@wchevreuil (Contributor)

> @wchevreuil I have completed all the tasks on the TODO list. Please review it again.

Thanks! Please allow me a few days to review it.

@wchevreuil (Contributor) left a comment

Sorry for lagging on this, @EungsopYoo. I'm still going through the core of your implementation, but here are some minor "cosmetic" changes I think we could do here.

I may give another review by tomorrow EOD.


RowCache rowCache;
</%args>
<%if rowCache == null %>
<p>RowCache is null</p>
Contributor:

Should we rather say: "RowCache disabled"?


@Apache-HBase

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 34s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 hbaseanti 0m 0s Patch does not have any anti-patterns.
_ master Compile Tests _
+0 🆗 mvndep 0m 42s Maven dependency ordering for branch
+1 💚 mvninstall 3m 36s master passed
+1 💚 compile 8m 16s master passed
+1 💚 checkstyle 1m 14s master passed
+1 💚 spotbugs 10m 31s master passed
+0 🆗 refguide 2m 27s branch has no errors when building the reference guide. See footer for rendered docs, which you should manually inspect.
+1 💚 spotless 0m 47s branch has no errors when running spotless:check.
-0 ⚠️ patch 1m 16s Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 16s Maven dependency ordering for patch
+1 💚 mvninstall 3m 5s the patch passed
+1 💚 compile 8m 18s the patch passed
+1 💚 javac 8m 18s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 32s /buildtool-patch-checkstyle-root.txt The patch fails to run checkstyle in root
+1 💚 xmllint 0m 0s No new issues.
+1 💚 spotbugs 11m 0s the patch passed
+0 🆗 refguide 2m 3s patch has no errors when building the reference guide. See footer for rendered docs, which you should manually inspect.
+1 💚 hadoopcheck 12m 7s Patch does not cause any errors with Hadoop 3.3.6 3.4.1.
+1 💚 spotless 0m 45s patch has no errors when running spotless:check.
_ Other Tests _
+1 💚 asflicense 0m 44s The patch does not generate ASF License warnings.
75m 36s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7291/10/artifact/yetus-general-check/output/Dockerfile
GITHUB PR #7291
Optional Tests dupname asflicense javac spotbugs checkstyle codespell detsecrets compile hadoopcheck hbaseanti spotless xmllint refguide
uname Linux 20f18c0a92f2 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / 99ce0f0
Default Java Eclipse Adoptium-17.0.11+9
refguide https://nightlies.apache.org/hbase/HBase-PreCommit-GitHub-PR/PR-7291/10/yetus-general-check/output/branch-site/book.html
refguide https://nightlies.apache.org/hbase/HBase-PreCommit-GitHub-PR/PR-7291/10/yetus-general-check/output/patch-site/book.html
Max. process+thread count 191 (vs. ulimit of 30000)
modules C: hbase-common hbase-hadoop-compat hbase-client hbase-server . U: .
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7291/10/console
versions git=2.34.1 maven=3.9.8 spotbugs=4.7.3 xmllint=20913
Powered by Apache Yetus 0.15.0 https://yetus.apache.org

This message was automatically generated.

@Apache-HBase

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 32s Docker mode activated.
-0 ⚠️ yetus 0m 3s Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --author-ignore-list --blanks-eol-ignore-file --blanks-tabs-ignore-file --quick-hadoopcheck
_ Prechecks _
_ master Compile Tests _
+0 🆗 mvndep 0m 23s Maven dependency ordering for branch
+1 💚 mvninstall 3m 32s master passed
+1 💚 compile 2m 11s master passed
+1 💚 javadoc 3m 7s master passed
+1 💚 shadedjars 6m 15s branch has no errors when building our shaded downstream artifacts.
-0 ⚠️ patch 6m 46s Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 16s Maven dependency ordering for patch
+1 💚 mvninstall 3m 9s the patch passed
+1 💚 compile 2m 16s the patch passed
+1 💚 javac 2m 16s the patch passed
+1 💚 javadoc 3m 8s the patch passed
+1 💚 shadedjars 6m 16s patch has no errors when building our shaded downstream artifacts.
_ Other Tests _
+1 💚 unit 291m 47s root in the patch passed.
330m 55s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7291/10/artifact/yetus-jdk17-hadoop3-check/output/Dockerfile
GITHUB PR #7291
Optional Tests javac javadoc unit compile shadedjars
uname Linux d59483981ca2 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / 99ce0f0
Default Java Eclipse Adoptium-17.0.11+9
Test Results https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7291/10/testReport/
Max. process+thread count 7645 (vs. ulimit of 30000)
modules C: hbase-common hbase-hadoop-compat hbase-client hbase-server . U: .
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7291/10/console
versions git=2.34.1 maven=3.9.8
Powered by Apache Yetus 0.15.0 https://yetus.apache.org

This message was automatically generated.

Comment on lines +46 to +49
private final LongAdder hitCount = new LongAdder();
private final LongAdder missCount = new LongAdder();
private final LongAdder evictedRowCount = new LongAdder();

Contributor:

Can't we use Cache.stats() for this?

Comment on lines +63 to +84
void cacheBlock(RowCacheKey key, RowCells value) {
  cache.put(key, value);
}

public RowCells getBlock(RowCacheKey key, boolean caching) {
  if (!caching) {
    missCount.increment();
    return null;
  }

  RowCells value = cache.getIfPresent(key);
  if (value == null) {
    missCount.increment();
  } else {
    hitCount.increment();
  }
  return value;
}

void evictBlock(RowCacheKey key) {
  cache.asMap().remove(key);
}
Contributor:

We should rename all these methods, as we are not caching blocks, but rows.

Comment on lines +278 to +286

// After creating the barrier, evict the existing row cache for this row,
// as it becomes invalid after the mutation
evictRowCache(key);

return execute(operation);
} finally {
// Remove the barrier after mutation to allow the row cache to be populated again
removeRowLevelBarrier(key);
Contributor:

Should (or could) we recache the mutated row?

return operation.execute();
}

void evictRowCache(RowCacheKey key) {
Contributor:

nit: should be "evictRow".

Comment on lines +33 to +35
// Row cache keys should not be evicted on close, since the cache may contain many entries and
// eviction would be slow. Instead, the region’s rowCacheSeqNum is used to generate new keys that
// ignore the existing cache when the region is reopened or bulk-loaded.
Contributor:

So when do stale rows in the row cache get evicted? And if we don't evict rows from the cache for a closed region, would these be wasting cache space until the cache is full and the LFU logic finally finds them for eviction?

Comment on lines +8710 to +8712
/**
* This is used to invalidate the entire row cache after bulk loading.
*/
Contributor:

Is this comment correct? I thought we would be invalidating only the rows for the given regions. Rows from regions not touched by bulkload would stay valid.

private final Map<RowCacheKey, AtomicInteger> rowLevelBarrierMap = new ConcurrentHashMap<>();

private final boolean enabledByConf;
private final RowCache rowCache;
Contributor:

Should we consider defining an interface for RowCache and referring to the interface from here, so that we can accommodate future RowCache implementations beyond the Caffeine one currently provided as the reference?

@Apache9 (Contributor) commented Oct 9, 2025

The change is pretty large, so I suggest we start a feature branch and land it step by step:

1. First, introduce the framework for integrating the row cache and add a flag to enable/disable it; the code path when the row cache is enabled can be empty.
2. Then, implement all the necessary fencing code and introduce a row cache interface plus a very simple row cache implementation, to verify correctness.
3. Last, introduce a more powerful row cache implementation with good performance.

And then we can run some integration tests, like YCSB to verify performance and ITBLL to verify correctness; if everything looks good, we can merge the feature branch back.

WDYT?

Thanks.

@EungsopYoo (Contributor Author) commented Oct 9, 2025

@Apache9
OK. I’ll create a feature branch and develop the work on sub-branches, merging them step by step.
It seems someone who has permission to create branches should make the feature branch, right?

@wchevreuil
I’ll address the review comments in a new branch.
