feat: deviate Redis expiration time #2608

Merged: 7 commits, Jun 5, 2025

@PooyaRaki (Contributor) commented Jun 4, 2025

Summary

This PR introduces a random deviation to Redis key expiration times to prevent cache stampedes and reduce the likelihood of simultaneous expirations. By spreading out key expirations, we help distribute network load more evenly and avoid putting unnecessary pressure on other services like the TX service.

Changes

  • Added logic to apply a random deviation to Redis TTL values.
  • Ensured the deviation stays within a safe and configurable range.
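
The two changes above could be sketched roughly as follows. This is an illustration only, not the code merged in this PR: the helper name `deviateTtl`, the percentage-based range, the symmetric (plus or minus) offset, and the one-second floor are all assumptions made for the example.

```typescript
// Hypothetical helper: applies a random deviation to a TTL (in seconds).
// The name, the percent-based range and the symmetric offset are
// assumptions for illustration, not the PR's actual implementation.
function deviateTtl(
  ttlSeconds: number,
  maxDeviationPercent: number,
  random: () => number = Math.random, // injectable for deterministic tests
): number {
  // Keep the configured range sane: between 0% and 100% of the base TTL.
  const percent = Math.min(Math.max(maxDeviationPercent, 0), 100);
  const maxDeviation = (ttlSeconds * percent) / 100;

  // random() is in [0, 1), so the offset is in [-maxDeviation, +maxDeviation).
  const offset = (random() * 2 - 1) * maxDeviation;

  // Never return a TTL below one second.
  return Math.max(1, Math.round(ttlSeconds + offset));
}

// Example: a 600-second TTL with up to 10% deviation lands in [540, 660].
const exampleTtl = deviateTtl(600, 10);
```

Making the random source a parameter keeps the helper easy to test; a caller would pass the result wherever it currently passes the fixed expiration time.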

@PooyaRaki force-pushed the feature/redisExpirationDeviation branch from 6060b6d to 68fa0e5 on June 4, 2025 11:50
@PooyaRaki self-assigned this on Jun 4, 2025
@PooyaRaki marked this pull request as ready for review on June 4, 2025 12:00
@PooyaRaki requested a review from a team as a code owner on June 4, 2025 12:00

@schmanu (Member) left a comment

Why do we do this?
Do we have any evidence that Redis cannot handle multiple keys expiring at the same time?

Given the way Redis expires keys, this does not really make sense IMO:

A key is passively expired when a client tries to access it and the key is timed out.

So there is no internal Redis task that tries to run at the same time for all expired keys.

@gjeanmart left a comment

I left a few minor comments, overall, this looks good 💪

Similar to Manu's question: do we have clear evidence of Redis stampedes occurring in prod? Also, I’m curious if this could be mitigated directly at the Redis config level?

@PooyaRaki (Contributor Author) commented

@schmanu @gjeanmart This happens because when all keys expire at the same time, it puts pressure on CGW. Since the service aggregates multiple endpoints—including expensive ones like transactions—this leads to additional load on the TX service as well. To reduce this pressure, I’d like to distribute the traffic by randomizing the key expiration times. I’ll update the PR description accordingly.
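
To make the effect described above concrete, here is a small, self-contained simulation. It is a sketch under assumed numbers (1000 keys written at the same moment, a 600-second base TTL, up to 10% deviation) and uses a seeded pseudo-random generator so it is reproducible. None of the names or values come from the PR itself.

```typescript
// Deterministic pseudo-random generator (a simple LCG) so that the
// simulation gives the same result on every run. Illustration only.
function makeRandom(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
    return state / 4294967296; // scale the 32-bit state into [0, 1)
  };
}

// Returns the number of keys that expire in the busiest one-second bucket.
function peakExpirations(expirySeconds: number[]): number {
  const buckets = new Map<number, number>();
  for (const t of expirySeconds) {
    buckets.set(t, (buckets.get(t) || 0) + 1);
  }
  return Math.max(...Array.from(buckets.values()));
}

// Assumed scenario: all keys are written at t = 0 with the same base TTL.
const keyCount = 1000;
const baseTtl = 600; // seconds
const maxDeviationPercent = 10;
const random = makeRandom(42);

// Without deviation, every key expires in the same second.
const fixedExpiries = Array.from({ length: keyCount }, () => baseTtl);

// With deviation, expiries are spread over roughly [540, 660] seconds.
const jitteredExpiries = Array.from({ length: keyCount }, () => {
  const maxDeviation = (baseTtl * maxDeviationPercent) / 100;
  return Math.round(baseTtl + (random() * 2 - 1) * maxDeviation);
});

const peakFixed = peakExpirations(fixedExpiries);
const peakJittered = peakExpirations(jitteredExpiries);
console.log({ peakFixed, peakJittered });
```

With a fixed TTL, all 1000 keys fall into a single second, so the next requests for them would all miss the cache and reach the upstream services together. With the deviation applied, the same misses are spread across a window of about two minutes.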

@PooyaRaki requested review from schmanu and gjeanmart on June 5, 2025 13:02
@PooyaRaki merged commit 35505d0 into main on Jun 5, 2025
17 checks passed
@PooyaRaki deleted the feature/redisExpirationDeviation branch on June 5, 2025 15:07
4 participants