Labels: phase/Beta (Beta scope)
Description
Architecture Review Finding: Cache Miss Latency Spikes
Issue: Current cache strategy causes latency spikes when metadata expires, impacting user experience.
Current Problem
// Current synchronous cache pattern in CachingLayerCatalog.cs
var cached = await _cache.GetAsync<LayerDefinition>(cacheKey);
if (cached == null)
{
    // BLOCKING: User waits for fresh data fetch
    var fresh = await _inner.GetLayerAsync(layerId, ct);
    await _cache.SetAsync(cacheKey, fresh, ttl, ct);
    return fresh;
}

Impact:
- P95 latency spikes when cache expires (cold cache penalty)
- Poor user experience during cache misses
- Database load spikes during cache refresh
- No differentiation between cache hit/miss in monitoring
Recommended Solution: Proactive Background Refresh
Implementation Pattern
public class BackgroundRefreshCacheDecorator<T> : ILayerCatalog where T : class
{
    private readonly ILayerCatalog _inner;
    private readonly ICacheService _cache;
    private readonly IBackgroundTaskQueue _taskQueue;
    private readonly TimeSpan _refreshThreshold; // e.g., 25% of TTL remaining
    // Used below but not declared in the original snippet; the type names
    // CacheOptions and ICacheMetrics are placeholders. Constructor omitted.
    private readonly CacheOptions _options;
    private readonly ICacheMetrics _metrics;
    private readonly ILogger<BackgroundRefreshCacheDecorator<T>> _logger;

    public async Task<LayerDefinition?> GetLayerAsync(int layerId, CancellationToken ct)
    {
        var cacheKey = $"layer:{layerId}";
        var cached = await _cache.GetWithMetadataAsync<LayerDefinition>(cacheKey);
        if (cached is { Value: not null })
        {
            // Background refresh if approaching expiration
            if (cached.RemainingTtl < _refreshThreshold)
            {
                // refreshCt is the queue's token, not the request's ct: the refresh
                // must be able to outlive the request that triggered it.
                _taskQueue.EnqueueBackgroundWorkItem(async refreshCt =>
                {
                    try
                    {
                        var fresh = await _inner.GetLayerAsync(layerId, refreshCt);
                        if (fresh is not null)
                        {
                            await _cache.SetAsync(cacheKey, fresh, _options.LayerTtl, refreshCt);
                            _metrics.RecordBackgroundRefresh("layer", "success");
                        }
                    }
                    catch (Exception ex)
                    {
                        _logger.LogWarning(ex, "Background refresh failed for layer {LayerId}", layerId);
                        _metrics.RecordBackgroundRefresh("layer", "failure");
                    }
                });
            }
            // Serve the still-valid cached value immediately
            return cached.Value;
        }
        // Cache miss - fetch synchronously (unavoidable first time)
        var layer = await _inner.GetLayerAsync(layerId, ct);
        if (layer is not null)
            await _cache.SetAsync(cacheKey, layer, _options.LayerTtl, ct);
        return layer;
    }
}

Background Task Queue Service
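public interface IBackgroundTaskQueue
{
    void EnqueueBackgroundWorkItem(Func<CancellationToken, Task> workItem);
}

public class BackgroundTaskQueue : BackgroundService, IBackgroundTaskQueue
{
    private readonly Channel<Func<CancellationToken, Task>> _queue;
    private readonly ILogger<BackgroundTaskQueue> _logger;

    public BackgroundTaskQueue(ILogger<BackgroundTaskQueue> logger)
    {
        _logger = logger;
        // Unbounded for brevity; a bounded channel would cap queue depth.
        _queue = Channel.CreateUnbounded<Func<CancellationToken, Task>>();
    }

    public void EnqueueBackgroundWorkItem(Func<CancellationToken, Task> workItem)
    {
        // TryWrite completes synchronously. Calling WriteAsync here without
        // awaiting it would discard the returned ValueTask and hide failures.
        if (!_queue.Writer.TryWrite(workItem))
            _logger.LogWarning("Background task queue rejected a work item");
    }

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        // Items run one at a time, in arrival order, until the host stops.
        await foreach (var workItem in _queue.Reader.ReadAllAsync(stoppingToken))
        {
            try
            {
                await workItem(stoppingToken);
            }
            catch (Exception ex)
            {
                _logger.LogError(ex, "Background cache refresh failed");
            }
        }
    }
}

Configuration Options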
{
  "Cache": {
    "BackgroundRefresh": {
      "Enabled": true,
      "RefreshThresholdPercent": 25, // Refresh when 25% TTL remaining
      "MaxConcurrentRefresh": 10, // Limit concurrent background tasks
      "RefreshTimeoutSeconds": 30 // Timeout for background refresh
    }
  }
}

Implementation Tasks
- Create IBackgroundTaskQueue service and implementation
- Implement BackgroundRefreshCacheDecorator for ILayerCatalog
- Add cache metadata support (GetWithMetadataAsync) to ICacheService
- Extend to IFeatureStore for query result caching
- Add configuration options for refresh threshold
- Add metrics for background refresh success/failure rates
- Add health check for background task queue depth
- Integration tests for background refresh behavior
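The configuration and cache-metadata tasks above could fit together roughly as follows. This is only a sketch: `BackgroundRefreshOptions`, `CacheEntry<T>`, and `RefreshPolicy` are hypothetical names that do not exist in the codebase, and the property names simply mirror the JSON keys under `Cache:BackgroundRefresh`.

```csharp
#nullable enable
using System;

// Hypothetical options class mirroring the "Cache:BackgroundRefresh" section.
public sealed class BackgroundRefreshOptions
{
    public bool Enabled { get; set; } = true;
    public int RefreshThresholdPercent { get; set; } = 25;
    public int MaxConcurrentRefresh { get; set; } = 10;
    public int RefreshTimeoutSeconds { get; set; } = 30;

    // Absolute threshold for a given TTL, e.g. 25% of 10 minutes = 2.5 minutes.
    public TimeSpan ThresholdFor(TimeSpan ttl) =>
        TimeSpan.FromTicks(ttl.Ticks * RefreshThresholdPercent / 100);
}

// Hypothetical shape of the value GetWithMetadataAsync might return.
public sealed record CacheEntry<T>(T? Value, TimeSpan RemainingTtl) where T : class;

public static class RefreshPolicy
{
    // True when the entry holds a value and its remaining TTL is below the threshold.
    public static bool ShouldRefresh<T>(
        CacheEntry<T> entry, TimeSpan ttl, BackgroundRefreshOptions options)
        where T : class =>
        options.Enabled
        && entry.Value is not null
        && entry.RemainingTtl < options.ThresholdFor(ttl);
}
```

With a 10-minute TTL and the default 25%, an entry with 2 minutes remaining would qualify for a background refresh, while one with 5 minutes remaining would not.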
Testing Scenarios
- Cache near expiration triggers background refresh
- Multiple requests don't trigger duplicate background refreshes
- Background refresh failures don't impact foreground requests
- Metrics accurately track refresh success/failure rates
- Load testing shows improved P95/P99 latencies
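The decorator as written enqueues a refresh on every request that sees a near-expired entry, so the "no duplicate background refreshes" scenario needs some form of in-flight tracking. One possible approach, sketched under the assumption that per-process deduplication is enough, is shown below. `RefreshDeduplicator` is a hypothetical helper, not part of the proposed design.

```csharp
#nullable enable
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: allows at most one pending refresh per cache key.
public sealed class RefreshDeduplicator
{
    private readonly ConcurrentDictionary<string, byte> _inFlight = new();

    // Returns false (and enqueues nothing) when a refresh for the key is already pending.
    public bool TryEnqueue(
        string cacheKey,
        Action<Func<CancellationToken, Task>> enqueue,
        Func<CancellationToken, Task> refresh)
    {
        if (!_inFlight.TryAdd(cacheKey, 0))
            return false;

        try
        {
            enqueue(async ct =>
            {
                // Release the key whether the refresh succeeds or throws.
                try { await refresh(ct); }
                finally { _inFlight.TryRemove(cacheKey, out _); }
            });
        }
        catch
        {
            // Enqueue itself failed, so nothing will ever release the key for us.
            _inFlight.TryRemove(cacheKey, out _);
            throw;
        }
        return true;
    }
}
```

In the decorator this might be called as `_dedup.TryEnqueue(cacheKey, _taskQueue.EnqueueBackgroundWorkItem, async refreshCt => { ... })`. Two caveats: the key is only released when the work item runs, so a queue that silently drops items would need to release it on that path too, and multiple application instances would still refresh independently unless a distributed lock is added.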
Expected Benefits
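- Reduce Cache Miss Latency: Once a key is warm, users get cached responses; only the first request for a key (or a key left idle past its TTL) still fetches synchronously
- Improve P95/P99 Latency: Cold cache penalties are limited to first access instead of recurring at every expiration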
- Reduce Database Load Spikes: Smooth, predictable refresh pattern
- Better Monitoring: Separate metrics for cache hits vs background refreshes
- Graceful Degradation: Background failures don't impact user requests
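The smoother database load listed above depends on the `MaxConcurrentRefresh` and `RefreshTimeoutSeconds` settings actually being enforced, which the queue implementation in this issue does not yet do. A minimal sketch of one way to apply them, using `SemaphoreSlim` and a linked `CancellationTokenSource`, follows. `RefreshThrottle` is a hypothetical name introduced for illustration.

```csharp
#nullable enable
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical wrapper that applies a concurrency cap and a per-item timeout.
public sealed class RefreshThrottle : IDisposable
{
    private readonly SemaphoreSlim _slots;
    private readonly TimeSpan _timeout;

    public RefreshThrottle(int maxConcurrentRefresh, TimeSpan timeout)
    {
        _slots = new SemaphoreSlim(maxConcurrentRefresh, maxConcurrentRefresh);
        _timeout = timeout;
    }

    public async Task RunAsync(
        Func<CancellationToken, Task> workItem, CancellationToken stoppingToken)
    {
        // Wait for a free slot; at most maxConcurrentRefresh items run at once.
        await _slots.WaitAsync(stoppingToken);
        try
        {
            // Cancel the item on host shutdown or after the timeout, whichever is first.
            using var cts = CancellationTokenSource.CreateLinkedTokenSource(stoppingToken);
            cts.CancelAfter(_timeout);
            await workItem(cts.Token);
        }
        finally
        {
            _slots.Release();
        }
    }

    public void Dispose() => _slots.Dispose();
}
```

Because `ExecuteAsync` awaits each work item before reading the next, the queue currently runs one refresh at a time. To reach the configured concurrency, the loop would have to start `RunAsync` without awaiting it and track the resulting tasks, which is a design decision this sketch leaves open.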
Priority: MEDIUM (performance optimization, not blocking production)
Effort: MEDIUM (2-3 days implementation + testing)
Timeline: Next 2 sprints
Dependencies: None
References: Architecture Review 2026-01-03, Section M2 "Performance - Background Cache Refresh"