fix: retry on anthropic overloaded #356
Conversation
👍 Looks good to me! Reviewed everything up to 915bd15 in 1 minute and 24 seconds

More details:
- Looked at 63 lines of code in 1 file
- Skipped 0 files when reviewing
- Skipped posting 1 drafted comment based on config settings
1. gptme/llm/llm_anthropic.py:32
- Draft comment: The `retry_generator_on_overloaded` decorator does not handle partial results from the generator function. Consider implementing logic to handle cases where the generator yields some results before an error occurs.
- Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
The comment raises a valid technical point - the retry mechanism does restart the generator from scratch. However, in this specific context, this is being used for the Anthropic API streaming which is idempotent. The stream() function yields chunks of a response, and restarting from the beginning on a 503 error is actually the desired behavior since we want the complete response. The comment is technically correct but not practically relevant for this use case.
I could be wrong about the idempotency assumptions. There might be edge cases where partial results need to be preserved that I haven't considered.
The code is specifically handling Anthropic API streaming where restarting the stream from the beginning on a 503 error is appropriate. The current implementation matches the needs of this specific use case.
While technically valid, the comment raises a concern that isn't relevant for this specific API streaming use case where restarting from the beginning is appropriate. The comment should be removed.
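For context, a minimal sketch of such a generator-retrying decorator is shown below. This is an illustrative assumption, not the PR's actual code: the names `OverloadedError`, `max_retries`, and `initial_delay` are placeholders for whatever the real implementation in `llm_anthropic.py` uses.

```python
import time
from functools import wraps


class OverloadedError(Exception):
    """Illustrative stand-in for an Anthropic 503 "overloaded" error."""


def retry_generator_on_overloaded(max_retries: int = 5, initial_delay: float = 1.0):
    """Retry a generator function on overload errors with exponential backoff."""

    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            delay = initial_delay
            for attempt in range(max_retries):
                try:
                    # Restart the generator from scratch on each attempt:
                    # chunks yielded before a failure are yielded again.
                    yield from func(*args, **kwargs)
                    return
                except OverloadedError:
                    if attempt == max_retries - 1:
                        raise
                    time.sleep(delay)
                    delay *= 2

        return wrapper

    return decorator
```

As the sketch makes visible, each retry restarts the generator from the beginning, so chunks yielded before the failure are produced again — consistent with the observation that restarting the stream is the desired behavior for this use case.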
Workflow ID: wflow_UlhJbe5RFIlwSlI6

You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.
❌ 1 test failed:
View the top failed test by shortest run time. To view more test analytics, go to the Test Analytics Dashboard.
Based on the code changes and the description provided, this pull request has a good chance of addressing the issue described in #345. Here's an analysis of the changes:

However, there are a few points to consider:

To fully confirm that this fixes #345, it would be helpful to:

Despite these considerations, the implemented solution is a step in the right direction and should help mitigate issues with Anthropic API overload errors. It's a good improvement that addresses the described problem with a well-thought-out retry strategy.
@gptme could you also add the decorator to
Thank you for the suggestion. I've added the
These changes should provide more robust handling of Anthropic API overload errors for both streaming and non-streaming API calls. This comprehensive approach should help address the issue described in #345 more effectively. To further improve the solution, consider:

These changes should significantly improve the reliability of interactions with the Anthropic API, especially during periods of high load or temporary service issues.
I'm sorry, I could not fulfill your request. Please check the log of this run for more information. Conversation log
👍 Looks good to me! Incremental review on 5b44061 in 53 seconds

More details:
- Looked at 52 lines of code in 1 file
- Skipped 0 files when reviewing
- Skipped posting 1 drafted comment based on config settings
1. gptme/llm/llm_anthropic.py:50
- Draft comment: The `retry_on_overloaded` decorator is missing a return statement after the retry loop. If the function does not raise an exception but also does not return a value within the loop, it will return `None`. Consider adding a return statement after the loop to handle this case.
- Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
The comment suggests adding a return statement after the retry loop, but this is incorrect. The function will either return the wrapped function's result via 'return func(*args, **kwargs)', raise an exception on the last retry, or continue retrying. There is no path where it silently returns None.
Could there be some edge case I'm missing? What if the function being wrapped returns None intentionally?
Even if the wrapped function returns None, that's handled by the 'return func(*args, **kwargs)' line - it would properly return that None value. The decorator is transparent to the return value.
The comment is incorrect - the decorator properly handles all return paths and cannot silently return None. The comment should be deleted.
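To illustrate the reasoning above, here is a hypothetical sketch of such a retry decorator (names and parameters are assumptions, not the PR's actual code). Each loop iteration either returns the wrapped function's result, re-raises on the final attempt, or sleeps and retries — so there is no path that silently returns `None`:

```python
import time
from functools import wraps


class OverloadedError(Exception):
    """Illustrative stand-in for an API "overloaded" error."""


def retry_on_overloaded(max_retries: int = 5, initial_delay: float = 1.0):
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            delay = initial_delay
            for attempt in range(max_retries):
                try:
                    # Path 1: success — the wrapped function's value
                    # (even an intentional None) is returned directly.
                    return func(*args, **kwargs)
                except OverloadedError:
                    if attempt == max_retries - 1:
                        # Path 2: retries exhausted — the exception propagates.
                        raise
                    # Path 3: sleep, back off, and try again.
                    time.sleep(delay)
                    delay *= 2

        return wrapper

    return decorator
```

A statement after the `for` loop would be unreachable, since the last iteration always returns or raises — matching the conclusion that the draft comment should be deleted.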
Workflow ID: wflow_OwAm6CzR60tFPIKe

You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.
A single failing test in
and for the openai case:
wtf? where is it getting "gpt-o4" from?!

@gptme why does it happen? i find no references to "o4" in the codebase or anything?

I'm sorry, I could not fulfill your request. Please check the log of this run for more information.

Fixed in dec7a60
Attempt at fixing #345
Untested since it's hard to reproduce.

Important

Adds retry mechanisms with exponential backoff for handling Anthropic API overload errors in `llm_anthropic.py`.

- Adds `retry_on_overloaded` and `retry_generator_on_overloaded` decorators in `llm_anthropic.py` for retrying on `APIStatusError` with status code 503 using exponential backoff.
- Applies `retry_on_overloaded` to `chat()` and `retry_generator_on_overloaded` to `stream()` in `llm_anthropic.py` to handle Anthropic API overloads.
- Adds `time` and `wraps` imports in `llm_anthropic.py` for implementing retry logic.

This description was created by for 5b44061. It will automatically update as commits are pushed.
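Based on the description above, here is a hedged sketch of how a 503-filtering retry decorator might be structured and applied. The `APIStatusError` stub, parameter names, and `chat()` body are illustrative assumptions; the real code in `gptme/llm/llm_anthropic.py` calls the Anthropic client and may differ in detail.

```python
import time
from functools import wraps


class APIStatusError(Exception):
    """Illustrative stand-in for anthropic.APIStatusError."""

    def __init__(self, status_code: int):
        super().__init__(f"status {status_code}")
        self.status_code = status_code


def retry_on_overloaded(max_retries: int = 5, initial_delay: float = 1.0):
    """Retry on 503 (overloaded) with exponential backoff; other errors propagate."""

    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            delay = initial_delay
            for attempt in range(max_retries):
                try:
                    return func(*args, **kwargs)
                except APIStatusError as e:
                    # Only retry overload errors; re-raise everything else,
                    # and re-raise 503 too once retries are exhausted.
                    if e.status_code != 503 or attempt == max_retries - 1:
                        raise
                    time.sleep(delay)
                    delay *= 2  # backoff schedule: 1s, 2s, 4s, ...

        return wrapper

    return decorator


@retry_on_overloaded()
def chat() -> str:
    # Hypothetical stand-in for the real chat() that calls the Anthropic API.
    return "response"
```

Filtering on the status code keeps genuine client errors (e.g. a 400) fast-failing, while transient 503 overloads are absorbed by the backoff loop.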