-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Performance vs. Prometheus SDK #5542
Comments
For me it looks like a correct usage of OTel Metrics SDK. |
To summarize my findings in #5544.
|
This appears to be because of the time.Now() call for each measurement. We should at least consider moving the time.Now call into the exemplar reservoir so that it is only invoked when we are actually recording an exemplar. |
I also found that the benchmark did not change if I swapped out the OTel prometheus exporter with a manual reader (which is expected). I'm removing the prometheus exporter label. |
#5545 is a ~45% performance improvement for the zero-attributes case, and a ~20% performance improvement for the single-attribute case. |
) Part of addressing #5542. ### Motivation This removes the `time.Now()` call from filtered-out Exemplars by only invoking `time.Now()` after the filtering decision is made. This improvement is especially noticeable for measurements without any attributes. ``` goos: linux goarch: amd64 pkg: go.opentelemetry.io/otel/sdk/metric cpu: AMD EPYC 7B12 │ old.txt │ new.txt │ │ sec/op │ sec/op vs base │ SyncMeasure/NoView/Int64Counter/Attributes/0-24 158.20n ± 4% 99.83n ± 1% -36.90% (p=0.000 n=10) SyncMeasure/NoView/Int64Counter/Attributes/1-24 333.3n ± 4% 274.8n ± 1% -17.55% (p=0.000 n=10) SyncMeasure/NoView/Int64Counter/Attributes/10-24 1.640µ ± 1% 1.600µ ± 1% -2.41% (p=0.000 n=10) SyncMeasure/NoView/Float64Counter/Attributes/0-24 159.0n ± 3% 101.3n ± 0% -36.27% (p=0.000 n=10) SyncMeasure/NoView/Float64Counter/Attributes/1-24 340.0n ± 2% 272.0n ± 1% -20.00% (p=0.000 n=10) SyncMeasure/NoView/Float64Counter/Attributes/10-24 1.661µ ± 1% 1.597µ ± 0% -3.85% (p=0.000 n=10) SyncMeasure/NoView/Int64UpDownCounter/Attributes/0-24 159.8n ± 1% 103.1n ± 0% -35.50% (p=0.000 n=10) SyncMeasure/NoView/Int64UpDownCounter/Attributes/1-24 339.5n ± 1% 273.1n ± 0% -19.57% (p=0.000 n=10) SyncMeasure/NoView/Int64UpDownCounter/Attributes/10-24 1.656µ ± 0% 1.589µ ± 0% -4.05% (p=0.000 n=10) SyncMeasure/NoView/Float64UpDownCounter/Attributes/0-24 159.3n ± 2% 100.8n ± 0% -36.74% (p=0.000 n=10) SyncMeasure/NoView/Float64UpDownCounter/Attributes/1-24 337.9n ± 2% 271.8n ± 1% -19.55% (p=0.000 n=10) SyncMeasure/NoView/Float64UpDownCounter/Attributes/10-24 1.657µ ± 0% 1.593µ ± 1% -3.83% (p=0.000 n=10) SyncMeasure/NoView/Int64Histogram/Attributes/0-24 144.65n ± 4% 89.38n ± 0% -38.21% (p=0.000 n=10) SyncMeasure/NoView/Int64Histogram/Attributes/1-24 235.7n ± 2% 183.5n ± 0% -22.15% (p=0.000 n=10) SyncMeasure/NoView/Int64Histogram/Attributes/10-24 900.8n ± 1% 836.8n ± 0% -7.10% (p=0.000 n=10) SyncMeasure/NoView/Float64Histogram/Attributes/0-24 145.60n ± 5% 93.48n ± 1% -35.80% (p=0.000 n=10) SyncMeasure/NoView/Float64Histogram/Attributes/1-24 240.9n ± 1% 183.0n ± 0% -24.06% (p=0.000 n=10) SyncMeasure/NoView/Float64Histogram/Attributes/10-24 905.6n ± 1% 826.3n ± 0% -8.76% (p=0.000 n=10) SyncMeasure/DropView/Int64Counter/Attributes/0-24 20.33n ± 0% 20.35n ± 0% ~ (p=0.302 n=10) SyncMeasure/DropView/Int64Counter/Attributes/1-24 26.46n ± 0% 26.45n ± 1% ~ (p=0.868 n=10) SyncMeasure/DropView/Int64Counter/Attributes/10-24 26.50n ± 0% 26.47n ± 0% ~ (p=0.208 n=10) SyncMeasure/DropView/Float64Counter/Attributes/0-24 20.34n ± 1% 20.27n ± 0% -0.34% (p=0.009 n=10) SyncMeasure/DropView/Float64Counter/Attributes/1-24 26.55n ± 0% 26.60n ± 1% ~ (p=0.109 n=10) SyncMeasure/DropView/Float64Counter/Attributes/10-24 26.59n ± 1% 26.57n ± 1% ~ (p=0.926 n=10) SyncMeasure/DropView/Int64UpDownCounter/Attributes/0-24 20.38n ± 1% 20.38n ± 0% ~ (p=0.725 n=10) SyncMeasure/DropView/Int64UpDownCounter/Attributes/1-24 26.39n ± 0% 26.44n ± 0% ~ (p=0.238 n=10) SyncMeasure/DropView/Int64UpDownCounter/Attributes/10-24 26.52n ± 0% 26.42n ± 0% -0.36% (p=0.049 n=10) SyncMeasure/DropView/Float64UpDownCounter/Attributes/0-24 20.30n ± 0% 20.25n ± 0% ~ (p=0.196 n=10) SyncMeasure/DropView/Float64UpDownCounter/Attributes/1-24 26.57n ± 0% 26.54n ± 1% ~ (p=0.540 n=10) SyncMeasure/DropView/Float64UpDownCounter/Attributes/10-24 26.57n ± 0% 26.51n ± 1% ~ (p=0.643 n=10) SyncMeasure/DropView/Int64Histogram/Attributes/0-24 20.37n ± 0% 20.36n ± 1% ~ (p=1.000 n=10) SyncMeasure/DropView/Int64Histogram/Attributes/1-24 26.41n ± 0% 26.50n ± 0% +0.32% (p=0.007 n=10) SyncMeasure/DropView/Int64Histogram/Attributes/10-24 26.44n ± 0% 26.55n ± 1% +0.42% (p=0.012 n=10) SyncMeasure/DropView/Float64Histogram/Attributes/0-24 20.30n ± 0% 20.45n ± 0% +0.74% (p=0.000 n=10) SyncMeasure/DropView/Float64Histogram/Attributes/1-24 26.52n ± 0% 26.48n ± 0% ~ (p=0.127 n=10) SyncMeasure/DropView/Float64Histogram/Attributes/10-24 26.55n ± 0% 26.48n ± 0% -0.26% (p=0.002 n=10) SyncMeasure/AttrFilterView/Int64Counter/Attributes/0-24 170.5n ± 2% 110.8n ± 0% -35.03% (p=0.000 n=10) SyncMeasure/AttrFilterView/Int64Counter/Attributes/1-24 402.5n ± 1% 331.5n ± 1% -17.64% (p=0.000 n=10) SyncMeasure/AttrFilterView/Int64Counter/Attributes/10-24 1.363µ ± 1% 1.281µ ± 1% -6.02% (p=0.000 n=10) SyncMeasure/AttrFilterView/Float64Counter/Attributes/0-24 170.6n ± 1% 111.5n ± 1% -34.64% (p=0.000 n=10) SyncMeasure/AttrFilterView/Float64Counter/Attributes/1-24 397.1n ± 1% 335.9n ± 0% -15.41% (p=0.000 n=10) SyncMeasure/AttrFilterView/Float64Counter/Attributes/10-24 1.371µ ± 1% 1.279µ ± 1% -6.71% (p=0.000 n=10) SyncMeasure/AttrFilterView/Int64UpDownCounter/Attributes/0-24 170.1n ± 1% 112.2n ± 0% -34.09% (p=0.000 n=10) SyncMeasure/AttrFilterView/Int64UpDownCounter/Attributes/1-24 397.5n ± 1% 330.2n ± 0% -16.93% (p=0.000 n=10) SyncMeasure/AttrFilterView/Int64UpDownCounter/Attributes/10-24 1.371µ ± 1% 1.289µ ± 1% -5.95% (p=0.000 n=10) SyncMeasure/AttrFilterView/Float64UpDownCounter/Attributes/0-24 171.4n ± 2% 112.9n ± 0% -34.13% (p=0.000 n=10) SyncMeasure/AttrFilterView/Float64UpDownCounter/Attributes/1-24 397.0n ± 3% 336.4n ± 0% -15.24% (p=0.000 n=10) SyncMeasure/AttrFilterView/Float64UpDownCounter/Attributes/10-24 1.383µ ± 1% 1.305µ ± 1% -5.61% (p=0.000 n=10) SyncMeasure/AttrFilterView/Int64Histogram/Attributes/0-24 157.30n ± 2% 98.58n ± 1% -37.33% (p=0.000 n=6+10) ``` ### Changes * Introduce `exemplar.Filter`, which is a filter function based on the context. It will not be user-facing, so we can always add other parameters later if needed. * Introduce `exemplar.FilteredReservoir`, which is similar to a reservoir, except it does not receive a timestamp. It gets the current time after the filter decision has been made. It uses generics to avoid the call to exemplar.NewValue(), since it is internal-only. * The `exemplar.Reservoir` is left as-is, so that it can be made public when exemplars are stable. It still includes a timestamp argument. * Unit tests are updated to expect a much lower number of calls to time.Now * `exemplar.Drop` is now an `exemplar.FilteredReservoir` instead of a `Reservoir`, since we don't need a Reservoir to store things in if the measurement is always dropped. Co-authored-by: Sam Xie <[email protected]> Co-authored-by: Tyler Yahn <[email protected]>
Description
Jaeger is in the process of migrating away from Prometheus SDK towards OTEL SDK. We're currently blocked by a massive performance degradation, as illustrated by this benchmark jaegertracing/jaeger#5676. Are we not using OTEL SDK correctly? We're seeing 10-25x slowdown compared to Prometheus SDK.
Environment
Steps To Reproduce
jaegertracing/jaeger#5676
Expected behavior
Expecting to see counter bumps to be in the ballpark with Prometheus counters.
The text was updated successfully, but these errors were encountered: