Adding http metrics to calculate upstream response times #342

Merged
merged 12 commits into from
Mar 24, 2025

Conversation

davidcollom
Collaborator

Performance issues can often be traced to an upstream service taking much longer than expected to respond,
for example when a large number of container tags are published. The hope is that these metrics let us monitor and observe the many API calls made to upstream services.

@@ -51,18 +51,23 @@ func NewCommand(ctx context.Context) *cobra.Command {
 return fmt.Errorf("failed to build kubernetes client: %s", err)
 }

-metrics := metrics.New(log)
-if err := metrics.Run(opts.MetricsServingAddress); err != nil {
+metricsServer := metrics.NewServer(log)
Collaborator Author

Renamed the variable to prevent a name collision with the metrics package.
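
For readers unfamiliar with this kind of collision, here is a minimal sketch using the standard library's `strings` package rather than this project's `metrics` package. The `label` function and `replacer` variable are hypothetical names for illustration only. In Go, a local variable named after an imported package shadows that package for the rest of the scope, so any later reference to the package fails to compile.

```go
package main

import (
	"fmt"
	"strings"
)

// label is a hypothetical helper that needs both a value created by the
// strings package and, afterwards, the strings package itself.
func label(s string) string {
	// Naming the variable after the package would shadow the package:
	//
	//     strings := strings.NewReplacer("-", "_")
	//     strings.ToUpper(s) // does not compile: strings is now a *strings.Replacer
	//
	// A distinct variable name keeps both the value and the package usable.
	replacer := strings.NewReplacer("-", "_")
	return strings.ToUpper(replacer.Replace(s))
}

func main() {
	fmt.Println(label("http-client")) // prints "HTTP_CLIENT"
}
```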

@hawksight
Member

There's a bit too much going on for me to fully review. @davidcollom could you maybe add some screenshots of the metric added so I can see the benefit from a user perspective?

@davidcollom
Collaborator Author

@hawksight I appreciate that there's a lot here! 🙈 The metrics output looks similar to the following:

# HELP http_client_in_flight_requests A gauge of in-flight requests for the wrapped client.
# TYPE http_client_in_flight_requests gauge
http_client_in_flight_requests 0
# HELP http_client_request_duration_seconds A histogram of request durations.
# TYPE http_client_request_duration_seconds gauge
http_client_request_duration_seconds{domain="nvcr.io",method="GET"} 0.540476459
http_client_request_duration_seconds{domain="registry.redhat.io",method="GET"} 0.100040708
# HELP http_client_requests_total A counter for requests from the wrapped client.
# TYPE http_client_requests_total counter
http_client_requests_total{code="OK",domain="nvcr.io",method="GET"} 2
http_client_requests_total{code="Unauthorized",domain="nvcr.io",method="GET"} 2
http_client_requests_total{code="Unauthorized",domain="registry.redhat.io",method="GET"} 3
# HELP http_dns_duration_seconds Trace DNS latency histogram.
# TYPE http_dns_duration_seconds gauge
http_dns_duration_seconds{domain="nvcr.io",event="dns_done"} 0.001039958
http_dns_duration_seconds{domain="registry.redhat.io",event="dns_done"} 0.002216666
# HELP http_tls_duration_seconds Trace TLS latency histogram.
# TYPE http_tls_duration_seconds gauge
http_tls_duration_seconds{domain="nvcr.io",event="tls_done"} 0.155622833
http_tls_duration_seconds{domain="registry.redhat.io",event="tls_done"} 0.049225542
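
As a rough illustration of how per-domain request, DNS, and TLS timings like these can be collected, here is a minimal sketch that uses only the Go standard library (`net/http/httptrace`). It is not the code from this PR: the `recorder` and `tracingTransport` types and the `demo` function are hypothetical, the in-memory maps stand in for a real metrics registry, and using `http.StatusText` for the `code` label is an assumption based on the sample output above.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/http/httptrace"
	"sync"
	"time"
)

// recorder is a hypothetical in-memory stand-in for a metrics registry.
type recorder struct {
	mu       sync.Mutex
	requests map[string]int           // "domain method code" -> count
	duration map[string]time.Duration // "domain method" -> last request duration
	dnsDur   map[string]time.Duration // domain -> last DNS lookup duration
	tlsDur   map[string]time.Duration // domain -> last TLS handshake duration
}

func newRecorder() *recorder {
	return &recorder{
		requests: map[string]int{},
		duration: map[string]time.Duration{},
		dnsDur:   map[string]time.Duration{},
		tlsDur:   map[string]time.Duration{},
	}
}

// set stores d under key in m while holding the lock.
func (r *recorder) set(m map[string]time.Duration, key string, d time.Duration) {
	r.mu.Lock()
	defer r.mu.Unlock()
	m[key] = d
}

// tracingTransport wraps another RoundTripper and records timings per domain.
type tracingTransport struct {
	next http.RoundTripper
	rec  *recorder
}

func (t *tracingTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	domain := req.URL.Hostname()
	var dnsStart, tlsStart time.Time
	trace := &httptrace.ClientTrace{
		// DNS hooks only fire when a hostname actually has to be resolved.
		DNSStart: func(httptrace.DNSStartInfo) { dnsStart = time.Now() },
		DNSDone: func(httptrace.DNSDoneInfo) {
			t.rec.set(t.rec.dnsDur, domain, time.Since(dnsStart))
		},
		// TLS hooks only fire when a new TLS connection is established.
		TLSHandshakeStart: func() { tlsStart = time.Now() },
		TLSHandshakeDone: func(tls.ConnectionState, error) {
			t.rec.set(t.rec.tlsDur, domain, time.Since(tlsStart))
		},
	}
	req = req.WithContext(httptrace.WithClientTrace(req.Context(), trace))

	start := time.Now()
	resp, err := t.next.RoundTrip(req)
	if err != nil {
		return nil, err
	}
	t.rec.set(t.rec.duration, domain+" "+req.Method, time.Since(start))

	t.rec.mu.Lock()
	t.rec.requests[domain+" "+req.Method+" "+http.StatusText(resp.StatusCode)]++
	t.rec.mu.Unlock()
	return resp, nil
}

// demo issues two GETs against a local TLS test server and returns the recorder.
func demo() (*recorder, error) {
	srv := httptest.NewTLSServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusUnauthorized)
	}))
	defer srv.Close()

	rec := newRecorder()
	client := srv.Client() // already trusts the test server's certificate
	client.Transport = &tracingTransport{next: client.Transport, rec: rec}

	for i := 0; i < 2; i++ {
		resp, err := client.Get(srv.URL)
		if err != nil {
			return nil, err
		}
		resp.Body.Close()
	}
	return rec, nil
}

func main() {
	rec, err := demo()
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	for key, n := range rec.requests {
		fmt.Printf("requests{%s} %d\n", key, n)
	}
}
```

Because the test server is addressed by IP, the DNS hooks do not fire in this demo; they would against a real registry hostname. Wrapping the transport rather than each call site means every request made through the client is measured without changing the calling code.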

@davidcollom davidcollom enabled auto-merge (squash) March 24, 2025 12:55
@davidcollom davidcollom merged commit a6fd2a1 into main Mar 24, 2025
5 checks passed
@davidcollom davidcollom deleted the metrics/http branch March 24, 2025 13:00
3 participants