-
Notifications
You must be signed in to change notification settings - Fork 329
Add consistent sampling via knuth's method #5386
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Overall package sizeSelf size: 8.96 MB Dependency sizes| name | version | self size | total size | |------|---------|-----------|------------| | @datadog/libdatadog | 0.5.0 | 29.83 MB | 29.83 MB | | @datadog/native-appsec | 8.5.0 | 19.26 MB | 19.26 MB | | @datadog/native-iast-taint-tracking | 3.3.0 | 13.77 MB | 13.78 MB | | @datadog/pprof | 5.5.1 | 9.79 MB | 10.17 MB | | @opentelemetry/core | 1.30.1 | 908.66 kB | 7.16 MB | | protobufjs | 7.4.0 | 2.77 MB | 5.42 MB | | @datadog/native-iast-rewriter | 2.8.0 | 2.6 MB | 2.74 MB | | @datadog/native-metrics | 3.1.0 | 1.06 MB | 1.46 MB | | @opentelemetry/api | 1.8.0 | 1.21 MB | 1.21 MB | | import-in-the-middle | 1.13.1 | 117.64 kB | 839.26 kB | | source-map | 0.7.4 | 226 kB | 226 kB | | opentracing | 0.14.7 | 194.81 kB | 194.81 kB | | lru-cache | 7.18.3 | 133.92 kB | 133.92 kB | | pprof-format | 2.1.0 | 111.69 kB | 111.69 kB | | @datadog/sketches-js | 2.1.1 | 109.9 kB | 109.9 kB | | lodash.sortby | 4.7.0 | 75.76 kB | 75.76 kB | | ignore | 5.3.2 | 53.63 kB | 53.63 kB | | istanbul-lib-coverage | 3.2.0 | 29.34 kB | 29.34 kB | | rfdc | 1.4.1 | 27.15 kB | 27.15 kB | | @isaacs/ttlcache | 1.4.1 | 25.2 kB | 25.2 kB | | tlhunter-sorted-set | 0.1.0 | 24.94 kB | 24.94 kB | | dc-polyfill | 0.1.6 | 24.56 kB | 24.56 kB | | shell-quote | 1.8.2 | 23.54 kB | 23.54 kB | | limiter | 1.1.5 | 23.17 kB | 23.17 kB | | retry | 0.13.1 | 18.85 kB | 18.85 kB | | semifies | 1.0.0 | 15.84 kB | 15.84 kB | | jest-docblock | 29.7.0 | 8.99 kB | 12.76 kB | | crypto-randomuuid | 1.0.0 | 11.18 kB | 11.18 kB | | ttl-set | 1.0.0 | 4.61 kB | 9.69 kB | | path-to-regexp | 0.1.12 | 6.6 kB | 6.6 kB | | koalas | 1.0.2 | 6.47 kB | 6.47 kB | | module-details-from-path | 1.0.3 | 4.47 kB | 4.47 kB |🤖 This report was automatically generated by heaviest-objects-in-the-universe |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #5386 +/- ##
==========================================
- Coverage 80.59% 78.89% -1.70%
==========================================
Files 494 329 -165
Lines 22082 13324 -8758
==========================================
- Hits 17797 10512 -7285
+ Misses 4285 2812 -1473 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Datadog ReportBranch report: ✅ 0 Failed, 800 Passed, 0 Skipped, 10m 58.93s Total Time |
BenchmarksBenchmark execution time: 2025-03-14 17:53:39 Comparing candidate commit 89b99e6 in PR branch Found 0 performance improvements and 0 performance regressions! Performance is the same for 916 metrics, 17 unstable metrics. |
061fb6b
to
e559aab
Compare
e559aab
to
89b99e6
Compare
toTraceIdNumber (get128BitId = false) { | ||
return parseInt(this.toTraceId(get128BitId), 16) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is actually going to be much slower than directly getting the number. This does number -> string -> number
instead of just read number
.
return this._rate === 1 || Math.random() < this._rate | ||
isSampled (context) { | ||
return this._rate === 1 || | ||
((context.toTraceIdNumber(false) * KNUTH_FACTOR) % MAX_TRACE_ID) <= this._sampling_id_threshold |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I fail to understand how this shall work.
As far as I understand the Knuth sampling, it has a reservoir of samples and it replaces those during a processing time window depending on the input. This is done randomly with a decreasing chance of replacing items already in the reservoir. Here, the chance seems constant, if I am not mistaken. I read it as if it never takes into account how many we have already processed in a specific time window.
I might also misunderstand how it should work.
What does this PR do?
Motivation
Plugin Checklist
Additional Notes