Skip to content

Conversation

@blefo
Copy link
Member

@blefo blefo commented Nov 10, 2025

No description provided.

@blefo blefo requested a review from Copilot November 10, 2025 15:55
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR refactors web search rate limiting from a global Redis-based approach to a local per-process rate limiter using the aiolimiter library. The change moves rate limiting enforcement from the general rate limiting middleware into the web search handler itself, providing more granular control over Brave API request rates.

  • Introduces AsyncLimiter for client-side rate limiting of Brave Search API requests
  • Removes web search RPS checks from the global rate limiting middleware
  • Adds comprehensive test coverage for the new rate limiting behavior

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
nilai-api/src/nilai_api/handlers/web_search.py Implements local rate limiting with AsyncLimiter, adds _get_brave_rate_limiter() and _send_brave_api_request() functions, and handles provider 429 responses
nilai-api/src/nilai_api/rate_limiting.py Removes web search RPS bucket check and unused CONFIG import
tests/unit/nilai_api/test_web_search.py Adds new tests for rate limiting behavior including RPS configuration, rejection, disabled state, and provider 429 handling; updates existing test patches
tests/unit/nilai_api/test_rate_limiting.py Removes web search RPS test and CONFIG import (coverage moved to test_web_search.py)
nilai-api/pyproject.toml Adds aiolimiter>=1.0.0 dependency

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 94 to 99
@lru_cache(maxsize=1)
def _get_brave_rate_limiter() -> AsyncLimiter | None:
rps = CONFIG.web_search.rps
if rps is None or rps <= 0:
return None
return AsyncLimiter(max_rate=rps, time_period=1.0)
Copy link

Copilot AI Nov 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The @lru_cache decorator on _get_brave_rate_limiter() creates a concurrency issue. The cached AsyncLimiter instance will be shared across all concurrent requests, but lru_cache itself is not thread-safe and doesn't synchronize access. Additionally, if CONFIG.web_search.rps changes at runtime (e.g., via config reload), the cached limiter won't reflect the new value until the cache is manually cleared.

Consider using a module-level variable with proper synchronization or a thread-safe singleton pattern to ensure the rate limiter is safely shared across concurrent requests.

Copilot uses AI. Check for mistakes.
…and documentation

- Added detailed docstrings for rate limiter and API request functions.
- Removed immediate rejection of requests exceeding rate limit, allowing for queuing.
- Introduced unit tests to validate rate limiting behavior across multiple users and request scenarios.
@blefo blefo requested a review from Copilot November 12, 2025 09:51
Copilot finished reviewing on behalf of blefo November 12, 2025 09:52
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +239 to +240
assert sorted_times[2] - sorted_times[0] >= 0.4
assert sorted_times[3] - sorted_times[1] >= 0.4
Copy link

Copilot AI Nov 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The magic numbers 0.4 represent an expected delay based on the rate limit of 2 RPS (0.5 seconds between requests). Consider defining these as named constants (e.g., EXPECTED_MIN_DELAY = 0.4) to improve code clarity and make the relationship to the configured RPS of 2 more explicit.

Copilot uses AI. Check for mistakes.
latest_a = max(call_log[q] for q in user_a_queries)
earliest_b = min(call_log[q] for q in user_b_queries)
# Even though user B starts after A finished, they still wait for the global limiter slot.
assert earliest_b - latest_a >= 0.4
Copy link

Copilot AI Nov 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The magic number 0.4 represents an expected delay based on rate limiting. Consider defining this as a named constant to improve code clarity and maintainability, similar to the issue in lines 239-240.

Copilot uses AI. Check for mistakes.
@blefo blefo marked this pull request as ready for review November 12, 2025 10:00
@blefo blefo marked this pull request as draft November 13, 2025 08:47
@blefo blefo closed this Nov 13, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants