Adjust token rate limit reset to align with cluster-local midnight #657

@tanweersalah

Description

Currently, our token rate limit is enforced per cluster over a rolling 24-hour window. This means that if a user hits their limit late in the day, they must wait a full 24 hours for the limit to reset, which makes for a poor user experience.

Problem

  • The 24-hour rolling window is unintuitive and inconsistent.
  • Users in different time zones experience the reset at odd hours, especially if they hit their limit near the end of their day.
  • The current system may unfairly affect users depending on when they typically use the AI Assistant.

Proposal

Update the token reset logic to:

Reset the token limit once daily at midnight (00:00) in the local time zone of the cluster the user is served from.

This offers a better experience by aligning the reset with the local notion of a "new day," which:

  • Feels more natural and predictable to users.
  • Avoids long waiting periods for those who hit the limit late in their day.

Implementation Notes

  • Determine the time zone of the cluster serving the user.
  • Schedule the token limit to reset at 00:00 local time of that cluster.
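
The two notes above could be sketched roughly as follows. This is only an illustration, not the project's actual code: the `CLUSTER_TIMEZONES` mapping, the cluster names, and both function names are hypothetical, and it assumes each cluster can be mapped to an IANA time zone. It uses Python's standard `zoneinfo` module so that daylight saving time changes are handled by the time zone database rather than by a fixed offset.

```python
from datetime import datetime, time, timedelta, timezone
from zoneinfo import ZoneInfo

# Hypothetical mapping from cluster name to IANA time zone (assumption,
# not taken from the issue or the codebase).
CLUSTER_TIMEZONES = {
    "eu-central": "Europe/Berlin",
    "us-east": "America/New_York",
    "ap-south": "Asia/Kolkata",
}


def next_reset_utc(cluster: str, now_utc: datetime) -> datetime:
    """Return the next 00:00 cluster-local time, expressed in UTC."""
    tz = ZoneInfo(CLUSTER_TIMEZONES[cluster])
    # Convert the current instant to the cluster's wall-clock time.
    local_now = now_utc.astimezone(tz)
    # Midnight at the start of the following local calendar day.
    next_day = local_now.date() + timedelta(days=1)
    local_midnight = datetime.combine(next_day, time.min, tzinfo=tz)
    return local_midnight.astimezone(timezone.utc)


def usage_day_key(cluster: str, now_utc: datetime) -> str:
    """Return the cluster-local calendar date, usable as a counter bucket."""
    tz = ZoneInfo(CLUSTER_TIMEZONES[cluster])
    return now_utc.astimezone(tz).date().isoformat()


if __name__ == "__main__":
    now = datetime(2025, 6, 15, 22, 30, tzinfo=timezone.utc)
    for cluster in CLUSTER_TIMEZONES:
        print(cluster, usage_day_key(cluster, now), next_reset_utc(cluster, now))
```

One possible design choice: instead of running a scheduled job at midnight, token usage could be counted under a key such as `usage_day_key(...)`, so the counter rolls over implicitly when the cluster-local date changes. `next_reset_utc(...)` would then only be needed to tell the user when their limit resets. Time zones whose daylight saving transition falls exactly at midnight would need extra care.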
