-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
KEP-647: Update apiserver tracing KEP to beta for 1.24 #3161
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
for Instrumentation, leaving PRR to @wojtek-t
/hold to prevent premature merge from PRR |
|
||
* **What specific metrics should inform a rollback?** | ||
|
||
* API Server [SLOs](https://github.com/kubernetes/community/tree/master/sig-scalability/slos) are the signals that should guide a rollback. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you maybe explicitly link one of those two metrics:
https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/apiserver/pkg/endpoints/metrics/metrics.go#L99-L124
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also - how about increased number of errors maybe (timeouts come to my mind first)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a link to those metrics. For errors, were you thinking of pointing to apiserver_request_post_timeout_total? Or something else?
- Components exposing the metric: | ||
- [ ] Other (treat as last resort) | ||
- Details: | ||
- OpenTelemetry does not currently expose metrics about the number of traces sent |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds like a useful optional feature. Can you maybe open a feature request on the OT for doing this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also - can we somehow mitigate it on our side?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
open-telemetry/opentelemetry-go#2547. I don't know of a way to work around it right now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you link this issue in the KEP too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just one minor nit - I will approve once fixed.
- Components exposing the metric: | ||
- [ ] Other (treat as last resort) | ||
- Details: | ||
- OpenTelemetry does not currently expose metrics about the number of traces sent |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you link this issue in the KEP too?
Thanks for pushing this @dashpole ! I'm so happy to see this effort progressing :) /lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: dashpole, wojtek-t The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Cancelling hold given Elana`s lgtm from sig -instrumentation. /hold cancel |
One-line PR description: Update apiserver tracing to beta for 1.24.
Issue link: API Server tracing #647
Other comments:
/assign @ehashman