-
Notifications
You must be signed in to change notification settings - Fork 10
fix: Brave API web search RPS limits #165
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR refactors web search rate limiting from a global Redis-based approach to a local per-process rate limiter using the aiolimiter library. The change moves rate limiting enforcement from the general rate limiting middleware into the web search handler itself, providing more granular control over Brave API request rates.
- Introduces
AsyncLimiterfor client-side rate limiting of Brave Search API requests - Removes web search RPS checks from the global rate limiting middleware
- Adds comprehensive test coverage for the new rate limiting behavior
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| nilai-api/src/nilai_api/handlers/web_search.py | Implements local rate limiting with AsyncLimiter, adds _get_brave_rate_limiter() and _send_brave_api_request() functions, and handles provider 429 responses |
| nilai-api/src/nilai_api/rate_limiting.py | Removes web search RPS bucket check and unused CONFIG import |
| tests/unit/nilai_api/test_web_search.py | Adds new tests for rate limiting behavior including RPS configuration, rejection, disabled state, and provider 429 handling; updates existing test patches |
| tests/unit/nilai_api/test_rate_limiting.py | Removes web search RPS test and CONFIG import (coverage moved to test_web_search.py) |
| nilai-api/pyproject.toml | Adds aiolimiter>=1.0.0 dependency |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| @lru_cache(maxsize=1) | ||
| def _get_brave_rate_limiter() -> AsyncLimiter | None: | ||
| rps = CONFIG.web_search.rps | ||
| if rps is None or rps <= 0: | ||
| return None | ||
| return AsyncLimiter(max_rate=rps, time_period=1.0) |
Copilot
AI
Nov 10, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The @lru_cache decorator on _get_brave_rate_limiter() creates a concurrency issue. The cached AsyncLimiter instance will be shared across all concurrent requests, but lru_cache itself is not thread-safe and doesn't synchronize access. Additionally, if CONFIG.web_search.rps changes at runtime (e.g., via config reload), the cached limiter won't reflect the new value until the cache is manually cleared.
Consider using a module-level variable with proper synchronization or a thread-safe singleton pattern to ensure the rate limiter is safely shared across concurrent requests.
…and documentation - Added detailed docstrings for rate limiter and API request functions. - Removed immediate rejection of requests exceeding rate limit, allowing for queuing. - Introduced unit tests to validate rate limiting behavior across multiple users and request scenarios.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| assert sorted_times[2] - sorted_times[0] >= 0.4 | ||
| assert sorted_times[3] - sorted_times[1] >= 0.4 |
Copilot
AI
Nov 12, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] The magic numbers 0.4 represent an expected delay based on the rate limit of 2 RPS (0.5 seconds between requests). Consider defining these as named constants (e.g., EXPECTED_MIN_DELAY = 0.4) to improve code clarity and make the relationship to the configured RPS of 2 more explicit.
| latest_a = max(call_log[q] for q in user_a_queries) | ||
| earliest_b = min(call_log[q] for q in user_b_queries) | ||
| # Even though user B starts after A finished, they still wait for the global limiter slot. | ||
| assert earliest_b - latest_a >= 0.4 |
Copilot
AI
Nov 12, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] The magic number 0.4 represents an expected delay based on rate limiting. Consider defining this as a named constant to improve code clarity and maintainability, similar to the issue in lines 239-240.
No description provided.