replace LRUCache with CaffeineCache #10225
zilm13
left a comment
Maybe:
- Keep both
- Replace it in only one or two places to begin with
- Add a JMH benchmark with random multithreaded read/write access to see whether it shows different numbers compared to LRUCache
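A benchmark along the lines of the last suggestion could start from the shape below. This is only a stdlib sketch, not a JMH benchmark: it has no warmup, forking, or statistics, so its timings are not trustworthy measurements. The class `RandomAccessHarness`, its parameters, and the two map-based stand-ins are hypothetical; neither stand-in is the project's `LRUCache` or Caffeine.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.LongAdder;

class RandomAccessHarness {

  // Stand-in for a single-lock LRU: an access-ordered LinkedHashMap behind one mutex.
  static Map<Integer, Integer> synchronizedLru(int capacity) {
    return Collections.synchronizedMap(
        new LinkedHashMap<Integer, Integer>(16, 0.75f, true) {
          @Override
          protected boolean removeEldestEntry(Map.Entry<Integer, Integer> eldest) {
            return size() > capacity;
          }
        });
  }

  // Stand-in for a concurrent cache (unbounded here, unlike a real cache).
  static Map<Integer, Integer> concurrentMap() {
    return new ConcurrentHashMap<>();
  }

  // Starts all workers at once; each performs readsPerWrite reads per write on random keys.
  // Returns the total number of completed operations.
  static long run(
      Map<Integer, Integer> cache, int threads, int opsPerThread, int keySpace, int readsPerWrite) {
    CountDownLatch start = new CountDownLatch(1);
    LongAdder completed = new LongAdder();
    Thread[] workers = new Thread[threads];
    for (int t = 0; t < threads; t++) {
      workers[t] = new Thread(() -> {
        try {
          start.await();
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          return;
        }
        ThreadLocalRandom random = ThreadLocalRandom.current();
        for (int i = 0; i < opsPerThread; i++) {
          int key = random.nextInt(keySpace);
          if (i % (readsPerWrite + 1) == readsPerWrite) {
            cache.put(key, i);
          } else {
            cache.get(key);
          }
          completed.increment();
        }
      });
      workers[t].start();
    }
    start.countDown();
    for (Thread worker : workers) {
      try {
        worker.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException(e);
      }
    }
    return completed.sum();
  }

  public static void main(String[] args) {
    Map<String, Map<Integer, Integer>> candidates = new LinkedHashMap<>();
    candidates.put("synchronized LRU stand-in", synchronizedLru(1024));
    candidates.put("concurrent map stand-in", concurrentMap());
    for (Map.Entry<String, Map<Integer, Integer>> candidate : candidates.entrySet()) {
      long startNanos = System.nanoTime();
      long ops = run(candidate.getValue(), 4, 200_000, 2048, 5);
      long millis = (System.nanoTime() - startNanos) / 1_000_000;
      System.out.println(candidate.getKey() + ": " + ops + " ops in " + millis + " ms");
    }
  }
}
```

In a real JMH benchmark the loop body would become a `@Benchmark` method and the thread count, key space, and cache type would become benchmark parameters.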
I ran some benchmarks and tried to make them realistic based on our current LRUCache usage (a 5:1 ratio of get_with_fallback to invalidate_with_new_value calls):
Cache Benchmark Results (Separate Columns)
TL;DR |
I updated and ran the benchmarks:

CacheConcurrencyBenchmark raw results
| Benchmark | Cache Size | Cache Type | Fallback Delay (ms) | Key Space | Mode | Cnt | Score | Error | Units |
|---|---|---|---|---|---|---|---|---|---|
| CacheConcurrencyBenchmark.concurrentReadsWithSlowFallbacks | 1024 | LEGACY_LRU | 5 | 2048 | thrpt | 15 | 673.378 | ±652.880 | ops/ms |
| CacheConcurrencyBenchmark.concurrentReadsWithSlowFallbacks:concurrentReads | 1024 | LEGACY_LRU | 5 | 2048 | thrpt | 15 | 672.203 | ±653.139 | ops/ms |
| CacheConcurrencyBenchmark.concurrentReadsWithSlowFallbacks:slowFallbacks | 1024 | LEGACY_LRU | 5 | 2048 | thrpt | 15 | 1.175 | ±0.742 | ops/ms |
| CacheConcurrencyBenchmark.concurrentReadsWithSlowFallbacks | 1024 | CAFFEINE | 5 | 2048 | thrpt | 15 | 117,818.069 | ±9,525.138 | ops/ms |
| CacheConcurrencyBenchmark.concurrentReadsWithSlowFallbacks:concurrentReads | 1024 | CAFFEINE | 5 | 2048 | thrpt | 15 | 117,817.824 | ±9,525.138 | ops/ms |
| CacheConcurrencyBenchmark.concurrentReadsWithSlowFallbacks:slowFallbacks | 1024 | CAFFEINE | 5 | 2048 | thrpt | 15 | 0.246 | ±0.005 | ops/ms |
| CacheConcurrencyBenchmark.mixedReadWriteScenario | 1024 | LEGACY_LRU | 5 | 2048 | thrpt | 15 | 6,063.919 | ±192.653 | ops/ms |
| CacheConcurrencyBenchmark.mixedReadWriteScenario | 1024 | CAFFEINE | 5 | 2048 | thrpt | 15 | 77,068.284 | ±1,298.065 | ops/ms |
| CacheConcurrencyBenchmark.pureReadPerformance | 1024 | LEGACY_LRU | 5 | 2048 | thrpt | 15 | 9,348.489 | ±216.479 | ops/ms |
| CacheConcurrencyBenchmark.pureReadPerformance | 1024 | CAFFEINE | 5 | 2048 | thrpt | 15 | 154,061.989 | ±11,562.494 | ops/ms |
| CacheConcurrencyBenchmark.realWorldScenario | 1024 | LEGACY_LRU | 5 | 2048 | thrpt | 15 | 8.412 | ±0.536 | ops/ms |
| CacheConcurrencyBenchmark.realWorldScenario | 1024 | CAFFEINE | 5 | 2048 | thrpt | 15 | 94.171 | ±2.572 | ops/ms |
CacheConcurrencyBenchmark processed results
| Benchmark Scenario | LegacyLRUCache | CaffeineCache | Performance Gain |
|---|---|---|---|
| Pure Read Hits | 9,348 ops/ms | 154,061 ops/ms | ~16.5x |
| Mixed Read/Write | 6,063 ops/ms | 77,068 ops/ms | ~12.7x |
| "Real World" (90% hits, some slow misses) | 8.4 ops/ms | 94.1 ops/ms | ~11.2x |
| High Contention with Slow Fallback | 673 ops/ms | 117,818 ops/ms | ~175x |
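The "Performance Gain" column appears to be the plain ratio of the CAFFEINE score to the LEGACY_LRU score from the raw results. A minimal sketch of that arithmetic, with a hypothetical `GainCalc` helper and the scores copied from the raw table:

```java
import java.util.Locale;

class GainCalc {

  /** Speedup of the candidate over the baseline, as a plain throughput ratio. */
  static double gain(double candidateScore, double baselineScore) {
    return candidateScore / baselineScore;
  }

  public static void main(String[] args) {
    String[] scenarios = {
      "Pure Read Hits", "Mixed Read/Write", "Real World", "High Contention with Slow Fallback"
    };
    // Scores in ops/ms from the raw results, as {CAFFEINE, LEGACY_LRU}.
    double[][] scores = {
      {154_061.989, 9_348.489},
      {77_068.284, 6_063.919},
      {94.171, 8.412},
      {117_818.069, 673.378},
    };
    for (int i = 0; i < scenarios.length; i++) {
      double ratio = gain(scores[i][0], scores[i][1]);
      System.out.println(String.format(Locale.ROOT, "%-36s ~%.1fx", scenarios[i], ratio));
    }
  }
}
```

The ratio ignores the error columns, so a gain computed from a score with a wide error margin is only a rough indication.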
TransitionCachesBenchmark raw results
| Benchmark | Cache Type | Fallback Delay (ms) | Mode | Cnt | Score | Error | Units |
|---|---|---|---|---|---|---|---|
| TransitionCachesBenchmark.contendedMissWithFallback | CAFFEINE | 0 | thrpt | 10 | 186,702,132.960 | ±9,114,314.307 | ops/s |
| TransitionCachesBenchmark.contendedMissWithFallback | CAFFEINE | 5 | thrpt | 10 | 200,019,960.009 | ±9,128,102.734 | ops/s |
| TransitionCachesBenchmark.contendedMissWithFallback | LEGACY_LRU | 0 | thrpt | 10 | 23,046,142.897 | ±809,427.663 | ops/s |
| TransitionCachesBenchmark.contendedMissWithFallback | LEGACY_LRU | 5 | thrpt | 10 | 27,536,678.636 | ±5,894,598.243 | ops/s |
| TransitionCachesBenchmark.copyCaches | CAFFEINE | 0 | thrpt | 10 | 8,954.644 | ±136.269 | ops/s |
| TransitionCachesBenchmark.copyCaches | CAFFEINE | 5 | thrpt | 10 | 8,533.452 | ±59.198 | ops/s |
| TransitionCachesBenchmark.copyCaches | LEGACY_LRU | 0 | thrpt | 10 | 12,645.276 | ±292.941 | ops/s |
| TransitionCachesBenchmark.copyCaches | LEGACY_LRU | 5 | thrpt | 10 | 12,630.207 | ±733.149 | ops/s |
| TransitionCachesBenchmark.realisticWorkload | CAFFEINE | 0 | thrpt | 10 | 23,293,859.582 | ±1,663,813.239 | ops/s |
| TransitionCachesBenchmark.realisticWorkload | CAFFEINE | 5 | thrpt | 10 | 13,550.253 | ±1,637.875 | ops/s |
| TransitionCachesBenchmark.realisticWorkload | LEGACY_LRU | 0 | thrpt | 10 | 6,385,206.352 | ±446,515.793 | ops/s |
| TransitionCachesBenchmark.realisticWorkload | LEGACY_LRU | 5 | thrpt | 10 | 426,367.649 | ±858,306.047 | ops/s |
TransitionCachesBenchmark processed results
| Benchmark Scenario | LegacyLRUCache | CaffeineCache | Performance Gain |
|---|---|---|---|
| Realistic Workload (no delay) | 6,385,206 ops/s | 23,293,859 ops/s | ~3.6x |
| Realistic Workload (5ms delay) | 426,367 ops/s * | 13,550 ops/s ** | See Note |
| Contended Miss (Thundering Herd, 5ms delay) | 27,536,678 ops/s | 200,019,960 ops/s | ~7.3x |
| Copy Caches | 12,630 ops/s | 8,533 ops/s | Legacy is ~1.5x faster |
*Note on "Realistic Workload (5ms delay)":
- The `LegacyLRUCache` score is statistically unreliable, with an error margin (±858k) far exceeding the score itself. This instability is caused by severe lock contention.
- The `CaffeineCache` score is lower but predictable. Its performance degrades gracefully because a slow fallback on one key does not block reads for other cached keys.
*Note on "Copy Caches":
The legacy cache is faster at the non-critical `copy()` operation due to its simpler internal structure.
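The reliability argument in the note on the 5 ms realistic workload can be checked by comparing each reported error to its score. The sketch below uses a hypothetical `ErrorMarginCheck` helper; the cut-off of "error at least as large as the score" is an assumption taken from the wording of the note, not a JMH convention.

```java
import java.util.Locale;

class ErrorMarginCheck {

  /** Reported error as a fraction of the reported score. */
  static double relativeError(double score, double error) {
    return error / score;
  }

  /** Assumed rule of thumb: a result is unreliable once the error reaches the score itself. */
  static boolean isUnreliable(double score, double error) {
    return relativeError(score, error) >= 1.0;
  }

  public static void main(String[] args) {
    // TransitionCachesBenchmark.realisticWorkload, 5 ms fallback delay, ops/s.
    double legacyScore = 426_367.649, legacyError = 858_306.047;
    double caffeineScore = 13_550.253, caffeineError = 1_637.875;

    System.out.println(String.format(Locale.ROOT,
        "LEGACY_LRU relative error: %.2f, unreliable: %b",
        relativeError(legacyScore, legacyError), isUnreliable(legacyScore, legacyError)));
    System.out.println(String.format(Locale.ROOT,
        "CAFFEINE   relative error: %.2f, unreliable: %b",
        relativeError(caffeineScore, caffeineError), isUnreliable(caffeineScore, caffeineError)));
  }
}
```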
PR Description
Replace our custom `synchronized` `LRUCache` implementation with `CaffeineCache`, a wrapper around the Caffeine cache library.

The `Cache` interface remains unchanged. The existing test suite has been updated to be compatible with Caffeine's eviction policy.

The existing `LRUCache` implementation presents a significant performance bottleneck under concurrent load because every operation is `synchronized`.

The primary issue is with the `get` method: if one thread experiences a cache miss, it holds the lock on the entire cache while the `fallback` function computes the new value. If the `fallback` is slow, all other threads are blocked, even those trying to access completely different, already-cached keys. This leads to poor scalability and thread contention.

How does this PR address the issue?

Caffeine uses fine-grained locking instead of a single cache-wide lock. A cache miss and the subsequent value computation on one key do not block other threads from reading or writing different keys. This eliminates the primary bottleneck of the old implementation.
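The difference in lock scope can be reproduced with the standard library alone. In the sketch below, `SynchronizedLru` imitates the single-lock behaviour described above, and `PerKeyCompute` uses `ConcurrentHashMap` as a stand-in for a cache that does not hold a global lock during a miss. Neither class is the project's `LRUCache` or Caffeine itself; all names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

class LockScopeDemo {

  interface SimpleCache<K, V> {
    V get(K key, Function<K, V> fallback);
  }

  // One monitor guards everything, so the fallback runs while the whole cache is locked.
  static final class SynchronizedLru<K, V> implements SimpleCache<K, V> {
    private final Map<K, V> map;

    SynchronizedLru(int capacity) {
      this.map = new LinkedHashMap<K, V>(16, 0.75f, true) {
        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
          return size() > capacity;
        }
      };
    }

    @Override
    public synchronized V get(K key, Function<K, V> fallback) {
      V value = map.get(key);
      if (value == null) {
        value = fallback.apply(key);
        map.put(key, value);
      }
      return value;
    }
  }

  // Hits use a lock-free read; only a miss enters the per-bin critical section.
  static final class PerKeyCompute<K, V> implements SimpleCache<K, V> {
    private final ConcurrentHashMap<K, V> map = new ConcurrentHashMap<>();

    @Override
    public V get(K key, Function<K, V> fallback) {
      V value = map.get(key);
      return value != null ? value : map.computeIfAbsent(key, fallback);
    }
  }

  // Returns true if a read of a cached key finishes while another thread is stuck in a slow miss.
  static boolean hotReadCompletesDuringSlowMiss(SimpleCache<String, String> cache) {
    cache.get("hot", key -> "cached"); // warm the key that will be read later
    CountDownLatch inFallback = new CountDownLatch(1);
    CountDownLatch release = new CountDownLatch(1);
    CountDownLatch hotReadDone = new CountDownLatch(1);

    Thread slowMiss = new Thread(() -> cache.get("slow", key -> {
      inFallback.countDown();
      await(release, 5_000); // simulate a slow fallback until the test releases it
      return "computed";
    }));
    Thread hotReader = new Thread(() -> {
      cache.get("hot", key -> "cached");
      hotReadDone.countDown();
    });

    slowMiss.start();
    await(inFallback, 5_000);
    hotReader.start();
    boolean completed = await(hotReadDone, 200);
    release.countDown();
    join(slowMiss);
    join(hotReader);
    return completed;
  }

  private static boolean await(CountDownLatch latch, long millis) {
    try {
      return latch.await(millis, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  private static void join(Thread thread) {
    try {
      thread.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  public static void main(String[] args) {
    System.out.println("synchronized LRU: hot read completes during slow miss = "
        + hotReadCompletesDuringSlowMiss(new SynchronizedLru<String, String>(16)));
    System.out.println("per-key compute:  hot read completes during slow miss = "
        + hotReadCompletesDuringSlowMiss(new PerKeyCompute<String, String>()));
  }
}
```

With the single monitor, the read of the already-cached key waits until the slow fallback returns. With the concurrent map, the cached read goes through a lock-free `get` and does not wait.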
Superior Eviction Policy (TinyLFU): While `LRUCache` uses a classic LRU policy, Caffeine employs a TinyLFU-based policy. It offers the same core benefit as LRU (evicting old, unused items) but typically achieves a better overall hit rate because it combines recency (LRU) with frequency (LFU): it tracks not just when an item was last used, but also how often it is used.

Caffeine is a mature library known for achieving near-optimal hit rates with minimal overhead.
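The idea of combining recency with frequency can be illustrated with a toy admission filter. This is a heavily simplified sketch, not Caffeine's implementation: it keeps exact counts in a `HashMap` where TinyLFU uses a compact approximate frequency sketch with aging, and it has no admission window. The class name `TinyLfuAdmissionSketch` and its methods are hypothetical.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

class TinyLfuAdmissionSketch<K, V> {
  private final int capacity;
  // Access counts for every key seen, including misses (exact here, approximate in TinyLFU).
  private final Map<K, Integer> frequency = new HashMap<>();
  // Access-ordered map: iteration starts at the least recently used entry.
  private final LinkedHashMap<K, V> entries = new LinkedHashMap<>(16, 0.75f, true);

  TinyLfuAdmissionSketch(int capacity) {
    this.capacity = capacity;
  }

  synchronized V get(K key) {
    frequency.merge(key, 1, Integer::sum);
    return entries.get(key);
  }

  // Returns true if the value was stored, false if the candidate was rejected.
  synchronized boolean put(K key, V value) {
    if (entries.containsKey(key) || entries.size() < capacity) {
      entries.put(key, value);
      return true;
    }
    K victim = entries.keySet().iterator().next(); // recency picks the eviction candidate
    int candidateFrequency = frequency.getOrDefault(key, 0);
    int victimFrequency = frequency.getOrDefault(victim, 0);
    if (candidateFrequency > victimFrequency) { // frequency decides whether to replace it
      entries.remove(victim);
      entries.put(key, value);
      return true;
    }
    return false;
  }

  synchronized boolean contains(K key) {
    return entries.containsKey(key);
  }

  public static void main(String[] args) {
    TinyLfuAdmissionSketch<String, Integer> cache = new TinyLfuAdmissionSketch<>(2);
    cache.get("a"); cache.put("a", 1);
    cache.get("b"); cache.put("b", 2);
    cache.get("a"); cache.get("a"); cache.get("b"); cache.get("b"); // "a" and "b" are hot

    cache.get("c"); // a one-off key, as in a scan
    System.out.println("one-off key admitted: " + cache.put("c", 3));
    System.out.println("hot key 'a' still cached: " + cache.contains("a"));
  }
}
```

A plain LRU would evict `a` to make room for the one-off key `c`. Here `c` is rejected until it has been requested more often than the entry it would replace.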
Fixed Issue(s)
Documentation
`doc-change-required` label to this PR if updates are required.
Changelog
Note
Replaces the custom synchronized LRU cache with a Caffeine-based implementation, wires it through state caches, adds JMH benchmarks, updates tests, and introduces a shared BeaconState cache container.
- `CaffeineCache`: Adds `infrastructure/collections/.../CaffeineCache` as the `Cache` impl and removes LRU-specific tests.
- `CaffeineCacheTest` and `CacheTestUtil` for deterministic testing.
- `SharedBeaconStateCaches` to hold global `validatorsPubKeys` and `ValidatorIndexCache`.
- `CaffeineCache`, introduces `CacheFactory`, uses shared caches, and preserves `copy()` behavior.
- `CaffeineCache` and adds `clear()`.
- `DefaultReputationManager`, `DataColumnSidecarSignatureValidator`, `SimpleSidecarRetriever`, and `ExecutionLayerChannelStub`.
- `TestSpecFactory`: clears shared caches on spec creation.
- `CacheConcurrencyBenchmark` and updates `TransitionCachesBenchmark` to compare Legacy LRU vs Caffeine.
- `benchmarks/gen/LegacyLRUCache` for comparisons.
- `com.github.ben-manes.caffeine:caffeine` dependency in Gradle.

Written by Cursor Bugbot for commit 33c3b7c.