-
Notifications
You must be signed in to change notification settings - Fork 22
Add metric tag family for too many requests http errors #402
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Generate changelog in
|
okushchenko
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To further disambiguate the 4xx group, it might be useful to bundle client-side and server-side timeouts together by treating response code 408 as timeout instead of 4xx.
| return metrics.Tags{metricTagFamily3xx} | ||
| case resp.StatusCode == 408: | ||
| return metrics.Tags{metricTagFamilyTimeout} | ||
| case resp.StatusCode == 429: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change makes sense, but I am a little worried about changing the semantic meaning of 4xx to except 408 and 429. Do we think this could be confusing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already treat client-side timeouts differently (see https://github.com/palantir/conjure-go-runtime/blob/develop/conjure-go-client/httpclient/metrics.go#L148), but these are not tied to any specific response status code, so not that applicable. I think the value of distinguishing 408s and 429s across other 4xx status codes is significant enough to justify this change. One option to keep both would be to keep 4xx tag, and add Timeout and TooManyRequests as additional tags, this way we don't break anything.
|
This PR has been automatically marked as stale because it has not been touched in the last 14 days. If you'd like to keep it open, please leave a comment or add the 'long-lived' label, otherwise it'll be closed in 7 days. |
Before this PR
We need a way to disambiguate throttling limits errors from other 4xx errors (e.g. for Azure APIs).
After this PR
==COMMIT_MSG==
Add metric tag family for too many requests http errors
==COMMIT_MSG==
Possible downsides?
This change is