feat: deviate Redis expiration time #2608
6060b6d to 68fa0e5
Why do we do this?
Do we have any evidence that Redis cannot handle multiple keys expiring at the same time?
Given the way Redis expires keys, this does not really make sense IMO:
A key is passively expired when a client tries to access it and the key has timed out.
So there is no internal Redis task that tries to run at the same time for all expired keys.
I left a few minor comments; overall, this looks good 💪
Similar to Manu's question: do we have clear evidence of Redis stampedes occurring in prod? Also, I’m curious if this could be mitigated directly at the Redis config level?
…mum TTL enforcement in RedisCacheService for clarity
@schmanu @gjeanmart The concern is not Redis itself: when all keys expire at the same time, the resulting cache misses put pressure on CGW. Since the service aggregates multiple endpoints, including expensive ones like transactions, this leads to additional load on the TX service as well. To reduce this pressure, I’d like to distribute the traffic by randomizing the key expiration times. I’ll update the PR description accordingly.
Summary
This PR introduces a random deviation to Redis key expiration times to prevent cache stampedes and reduce the likelihood of simultaneous expirations. By spreading out key expirations, we help distribute network load more evenly and avoid putting unnecessary pressure on other services like the TX service.
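The idea can be illustrated with a minimal sketch. The function name `deviateExpiration`, the 10% default, and the choice to only lengthen the TTL (never shorten it) are assumptions made for this example and are not taken from the PR's actual code; the `random` parameter exists only so the behavior can be tested deterministically.

```typescript
// Hypothetical helper: adds a random positive deviation to a base TTL so that
// keys written at the same moment do not all expire at the same moment.
// Names and defaults are illustrative assumptions, not the PR's implementation.
function deviateExpiration(
  baseTtlSeconds: number,
  maxDeviationPercent: number = 10,
  random: () => number = Math.random,
): number {
  if (!Number.isFinite(baseTtlSeconds) || baseTtlSeconds <= 0) {
    throw new RangeError('baseTtlSeconds must be a positive, finite number');
  }
  if (!Number.isFinite(maxDeviationPercent) || maxDeviationPercent < 0) {
    throw new RangeError('maxDeviationPercent must be a non-negative, finite number');
  }

  // Largest amount (in seconds) that may be added to the base TTL.
  const maxExtraSeconds = (baseTtlSeconds * maxDeviationPercent) / 100;

  // random() is expected to return a value in [0, 1), so the unrounded result
  // lies in [base, base + maxExtraSeconds). Flooring keeps it a whole second.
  return Math.floor(baseTtlSeconds + random() * maxExtraSeconds);
}
```

With a base TTL of 600 seconds and a 10% maximum deviation, each key would receive a TTL somewhere between 600 and 659 seconds, so keys cached together are refreshed over roughly a minute instead of all at once.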
Changes