
Conversation

guscarreon
Contributor

Adds a metric to keep track of the size in bytes of requests that hit the /auction, /amp and /video endpoints. Should we add the cookie_sync endpoint?

Prometheus

To anyone reviewing: please let me know whether you agree with the buckets (1000, 5000, 10000, 15000, 20000, 40000, 75000, and 100000 bytes).

  pbs_sub_request_queue_time_bucket{request_status="requestRejectedLabel",request_type="video",le="240"} 0
  pbs_sub_request_queue_time_bucket{request_status="requestRejectedLabel",request_type="video",le="300"} 0
  pbs_sub_request_queue_time_bucket{request_status="requestRejectedLabel",request_type="video",le="+Inf"} 0
  pbs_sub_request_queue_time_sum{request_status="requestRejectedLabel",request_type="video"} 0
  pbs_sub_request_queue_time_count{request_status="requestRejectedLabel",request_type="video"} 0
+ # HELP pbs_sub_request_size_bytes Count that keeps track of incoming request size in bytes labeled by endpoint.
+ # TYPE pbs_sub_request_size_bytes histogram
+ pbs_sub_request_size_bytes_bucket{request_endpoint="auction",le="1000"} 1
+ pbs_sub_request_size_bytes_bucket{request_endpoint="auction",le="5000"} 1
+ pbs_sub_request_size_bytes_bucket{request_endpoint="auction",le="10000"} 1
+ pbs_sub_request_size_bytes_bucket{request_endpoint="auction",le="15000"} 1
+ pbs_sub_request_size_bytes_bucket{request_endpoint="auction",le="20000"} 1
+ pbs_sub_request_size_bytes_bucket{request_endpoint="auction",le="40000"} 1
+ pbs_sub_request_size_bytes_bucket{request_endpoint="auction",le="75000"} 1
+ pbs_sub_request_size_bytes_bucket{request_endpoint="auction",le="100000"} 1
+ pbs_sub_request_size_bytes_bucket{request_endpoint="auction",le="+Inf"} 1
+ pbs_sub_request_size_bytes_sum{request_endpoint="auction"} 809
+ pbs_sub_request_size_bytes_count{request_endpoint="auction"} 1
  # HELP pbs_sub_request_time_seconds Seconds to resolve successful Prebid Server requests labeled by type.
  # TYPE pbs_sub_request_time_seconds histogram
  pbs_sub_request_time_seconds_bucket{request_type="amp",le="0.05"} 0
  pbs_sub_request_time_seconds_bucket{request_type="amp",le="0.1"} 0
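
For reference, this is roughly how such a histogram would be declared with client_golang. It is a minimal sketch: the pbs/sub namespace/subsystem split and the newRequestSizeHistogram helper are assumptions inferred from the metric name, not the exact code in this PR.

package metricsexample

import "github.com/prometheus/client_golang/prometheus"

// newRequestSizeHistogram is a hypothetical helper that builds the
// pbs_sub_request_size_bytes histogram with the proposed buckets.
func newRequestSizeHistogram(registry *prometheus.Registry) *prometheus.HistogramVec {
	requestsSize := prometheus.NewHistogramVec(prometheus.HistogramOpts{
		Namespace: "pbs",
		Subsystem: "sub",
		Name:      "request_size_bytes",
		Help:      "Count that keeps track of incoming request size in bytes labeled by endpoint.",
		Buckets:   []float64{1000, 5000, 10000, 15000, 20000, 40000, 75000, 100000},
	}, []string{"request_endpoint"})
	registry.MustRegister(requestsSize)
	return requestsSize
}

Recording a sample is then just requestsSize.With(prometheus.Labels{"request_endpoint": "auction"}).Observe(float64(len(body))).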

@guscarreon changed the title from "Metric to account for request size" to "Metric for incoming request sizes" on Aug 20, 2025
Collaborator

@bsardo left a comment


I don't think we need the metric for any other endpoints. Auction, amp and video endpoints are sufficient.

@@ -168,6 +170,7 @@ func NewMetrics(cfg config.PrometheusMetrics, disabledMetrics config.DisabledMet
priceBuckets := []float64{250, 500, 750, 1000, 1500, 2000, 2500, 3000, 3500, 4000}
queuedRequestTimeBuckets := []float64{0, 1, 5, 30, 60, 120, 180, 240, 300}
overheadTimeBuckets := []float64{0.05, 0.06, 0.07, 0.08, 0.09, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1}
requestSizeBuckets := []float64{1000, 5000, 10000, 15000, 20000, 40000, 75000, 100000}
Collaborator


How were these buckets chosen? Are they based off of any real requests? Have we seen at least one request that falls into each of these buckets or at least most of them?

Contributor


This is difficult to say... My test request, which exercises a lot of request-related features (stored bid responses, FPD, multi-type imps, category mapping, etc.), is ~4500 bytes.
Should we first see what production data looks like and adjust the buckets accordingly?

Contributor Author


How were these buckets chosen?

Arbitrarily. These have to be discussed and carefully chosen first.

Comment on lines 678 to 682
endpoint := metrics.GetEndpointFromRequestType(labels.RType)
m.requestsSize.With(prometheus.Labels{
requestEndpointLabel: string(endpoint),
}).Observe(float64(labels.RequestSize))

Contributor


I tested this PR with the AMP endpoint and found that if the stored request is not found and the request length is never measured, we still write 0 bytes to the metric.
It falls into the very first bucket and looks like this in the result:

# TYPE pbs_sub_request_size_bytes histogram
pbs_sub_request_size_bytes_bucket{request_size="amp",le="1000"} 1
pbs_sub_request_size_bytes_bucket{request_size="amp",le="5000"} 1
pbs_sub_request_size_bytes_bucket{request_size="amp",le="10000"} 1
pbs_sub_request_size_bytes_bucket{request_size="amp",le="15000"} 1
pbs_sub_request_size_bytes_bucket{request_size="amp",le="20000"} 1
pbs_sub_request_size_bytes_bucket{request_size="amp",le="40000"} 1
pbs_sub_request_size_bytes_bucket{request_size="amp",le="75000"} 1
pbs_sub_request_size_bytes_bucket{request_size="amp",le="100000"} 1
pbs_sub_request_size_bytes_bucket{request_size="amp",le="+Inf"} 1

This can be misinterpreted as the request being smaller than 1000 bytes, when in fact it was an error.
Should we add a check on labels.RequestStatus and only write the metric if labels.RequestStatus == metrics.RequestStatusOK?
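
For illustration, the suggested guard would look roughly like this; it is a sketch reusing the identifiers from the snippet above, not necessarily how the fix will be structured:

// Only record the size for requests that parsed successfully.
if labels.RequestStatus == metrics.RequestStatusOK {
	endpoint := metrics.GetEndpointFromRequestType(labels.RType)
	m.requestsSize.With(prometheus.Labels{
		requestEndpointLabel: string(endpoint),
	}).Observe(float64(labels.RequestSize))
}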

Contributor Author


I tested this PR with the AMP endpoint and found that if the stored request is not found and the request length is never measured, we still write 0 bytes to the metric.

Really? Let me go in there with the debugger. Do you mind sharing the amp request you used for this test?

Collaborator


Doesn't the AMP endpoint need to be updated to set this label in loadRequestJSONForAmp? I don't think we are recording the size on this endpoint at the moment.

Contributor Author


Doesn't the AMP endpoint need to be updated

Yes, corrected.

@@ -168,6 +170,7 @@ func NewMetrics(cfg config.PrometheusMetrics, disabledMetrics config.DisabledMet
priceBuckets := []float64{250, 500, 750, 1000, 1500, 2000, 2500, 3000, 3500, 4000}
queuedRequestTimeBuckets := []float64{0, 1, 5, 30, 60, 120, 180, 240, 300}
overheadTimeBuckets := []float64{0.05, 0.06, 0.07, 0.08, 0.09, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1}
requestSizeBuckets := []float64{600, 700, 800, 900, 1000, 1500, 3000, 5000, 6000, 7500, 10000, 15000}
Contributor


This bucket range allows for a max request size of ~15kb. I think that's too small. I've seen very large requests in the wild, especially when privacy signals are involved. Everyone's mileage will vary, but how about we start with 100kb as a max size?

Something like...
requestSizeBuckets := []float64{100, 500, 1000, 2000, 5000, 10000, 20000, 50000, 75000, 100000}

Collaborator

@bsardo Sep 4, 2025


Discussed offline.
requestSizeBuckets := []float64{100, 500, 750, 1000, 2000, 4000, 7000, 10000, 15000, 20000, 50000, 75000}

Contributor Author


Corrected. New buckets:

   pbs_sub_request_queue_time_sum{request_status="requestRejectedLabel",request_type="video"} 0
   pbs_sub_request_queue_time_count{request_status="requestRejectedLabel",request_type="video"} 0
 + # HELP pbs_sub_request_size_bytes Count that keeps track of incoming request size in bytes labeled by endpoint.
 + # TYPE pbs_sub_request_size_bytes histogram
 + pbs_sub_request_size_bytes_bucket{request_size="auction",le="100"} 0
 + pbs_sub_request_size_bytes_bucket{request_size="auction",le="500"} 1
 + pbs_sub_request_size_bytes_bucket{request_size="auction",le="750"} 1
 + pbs_sub_request_size_bytes_bucket{request_size="auction",le="1000"} 1
 + pbs_sub_request_size_bytes_bucket{request_size="auction",le="2000"} 1
 + pbs_sub_request_size_bytes_bucket{request_size="auction",le="4000"} 1
 + pbs_sub_request_size_bytes_bucket{request_size="auction",le="7000"} 1
 + pbs_sub_request_size_bytes_bucket{request_size="auction",le="10000"} 1
 + pbs_sub_request_size_bytes_bucket{request_size="auction",le="15000"} 1
 + pbs_sub_request_size_bytes_bucket{request_size="auction",le="20000"} 1
 + pbs_sub_request_size_bytes_bucket{request_size="auction",le="50000"} 1
 + pbs_sub_request_size_bytes_bucket{request_size="auction",le="75000"} 1
 + pbs_sub_request_size_bytes_bucket{request_size="auction",le="+Inf"} 1
 + pbs_sub_request_size_bytes_sum{request_size="auction"} 217
 + pbs_sub_request_size_bytes_count{request_size="auction"} 1
 + pbs_sub_request_size_bytes_bucket{request_size="video",le="100"} 0
 + pbs_sub_request_size_bytes_bucket{request_size="video",le="500"} 0
 + pbs_sub_request_size_bytes_bucket{request_size="video",le="750"} 0
 + pbs_sub_request_size_bytes_bucket{request_size="video",le="1000"} 0
 + pbs_sub_request_size_bytes_bucket{request_size="video",le="2000"} 1
 + pbs_sub_request_size_bytes_bucket{request_size="video",le="4000"} 1
 + pbs_sub_request_size_bytes_bucket{request_size="video",le="7000"} 1
 + pbs_sub_request_size_bytes_bucket{request_size="video",le="10000"} 1
 + pbs_sub_request_size_bytes_bucket{request_size="video",le="15000"} 1
 + pbs_sub_request_size_bytes_bucket{request_size="video",le="20000"} 1
 + pbs_sub_request_size_bytes_bucket{request_size="video",le="50000"} 1
 + pbs_sub_request_size_bytes_bucket{request_size="video",le="75000"} 1
 + pbs_sub_request_size_bytes_bucket{request_size="video",le="+Inf"} 1
 + pbs_sub_request_size_bytes_sum{request_size="video"} 1790
 + pbs_sub_request_size_bytes_count{request_size="video"} 1
   # HELP pbs_sub_request_time_seconds Seconds to resolve successful Prebid Server requests labeled by type.
   # TYPE pbs_sub_request_time_seconds histogram
   pbs_sub_request_time_seconds_bucket{request_type="amp",le="0.05"} 0

@@ -442,6 +442,7 @@ func (deps *endpointDeps) parseRequest(httpRequest *http.Request, labels *metric
errs = []error{err}
return
}
labels.RequestSize = len(requestJson)
Collaborator


It looks like we are capturing the size of the request that comes in over the wire, prior to merging in stored requests, for both the auction and video endpoints, while the amp endpoint captures the request size after stored requests are merged. IMO this is a bit confusing. Do we even need the amp request size, since it is just based on stored requests?

Contributor Author


Modified to not log this metric on the amp endpoint.
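
For context, one possible shape of that change, assuming a ReqTypeAMP constant in the metrics package; the actual fix may instead simply leave labels.RequestSize unset on the amp endpoint:

// Hypothetical sketch: skip the size observation entirely for AMP traffic.
if labels.RType != metrics.ReqTypeAMP {
	m.requestsSize.With(prometheus.Labels{
		requestEndpointLabel: string(endpoint),
	}).Observe(float64(labels.RequestSize))
}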

@bsardo changed the title from "Metric for incoming request sizes" to "Metrics: Record incoming request sizes" on Sep 4, 2025
@bsardo merged commit f63f77a into master on Sep 4, 2025
4 checks passed
@@ -169,6 +171,7 @@ func NewMetrics(cfg config.PrometheusMetrics, disabledMetrics config.DisabledMet
priceBuckets := []float64{250, 500, 750, 1000, 1500, 2000, 2500, 3000, 3500, 4000}
queuedRequestTimeBuckets := []float64{0, 1, 5, 30, 60, 120, 180, 240, 300}
overheadTimeBuckets := []float64{0.05, 0.06, 0.07, 0.08, 0.09, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1}
requestSizeBuckets := []float64{100, 500, 750, 1000, 2000, 4000, 7000, 10000, 15000, 20000, 50000, 75000}
Contributor


These look like pretty hardcoded sizes. I would derive the buckets from max_request_size instead, e.g. 10 linear buckets:
https://pkg.go.dev/github.com/prometheus/client_golang/prometheus#LinearBuckets
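
A rough sketch of that idea; maxRequestSize stands in for whatever the configured max_request_size resolves to, and the bucket count of 10 is illustrative:

// Derive the histogram buckets from the configured max request size
// instead of hardcoding them: 10 equal-width buckets up to the max.
maxRequestSize := float64(cfg.MaxRequestSize) // assumed config field
requestSizeBuckets := prometheus.LinearBuckets(
	maxRequestSize/10, // upper bound of the first bucket
	maxRequestSize/10, // width of each subsequent bucket
	10,                // number of buckets
)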
