Add consistent sampling via knuth's method by bm1549 · Pull Request #5386 · DataDog/dd-trace-js

bm1549 · 2025-03-10T19:56:27Z

What does this PR do?

Motivation

Plugin Checklist

Additional Notes

github-actions · 2025-03-10T19:57:19Z

Overall package size

Self size: 8.96 MB
Deduped: 101.5 MB
No deduping: 102.01 MB

Dependency sizes

| name | version | self size | total size |
|------|---------|-----------|------------|
| @datadog/libdatadog | 0.5.0 | 29.83 MB | 29.83 MB |
| @datadog/native-appsec | 8.5.0 | 19.26 MB | 19.26 MB |
| @datadog/native-iast-taint-tracking | 3.3.0 | 13.77 MB | 13.78 MB |
| @datadog/pprof | 5.5.1 | 9.79 MB | 10.17 MB |
| @opentelemetry/core | 1.30.1 | 908.66 kB | 7.16 MB |
| protobufjs | 7.4.0 | 2.77 MB | 5.42 MB |
| @datadog/native-iast-rewriter | 2.8.0 | 2.6 MB | 2.74 MB |
| @datadog/native-metrics | 3.1.0 | 1.06 MB | 1.46 MB |
| @opentelemetry/api | 1.8.0 | 1.21 MB | 1.21 MB |
| import-in-the-middle | 1.13.1 | 117.64 kB | 839.26 kB |
| source-map | 0.7.4 | 226 kB | 226 kB |
| opentracing | 0.14.7 | 194.81 kB | 194.81 kB |
| lru-cache | 7.18.3 | 133.92 kB | 133.92 kB |
| pprof-format | 2.1.0 | 111.69 kB | 111.69 kB |
| @datadog/sketches-js | 2.1.1 | 109.9 kB | 109.9 kB |
| lodash.sortby | 4.7.0 | 75.76 kB | 75.76 kB |
| ignore | 5.3.2 | 53.63 kB | 53.63 kB |
| istanbul-lib-coverage | 3.2.0 | 29.34 kB | 29.34 kB |
| rfdc | 1.4.1 | 27.15 kB | 27.15 kB |
| @isaacs/ttlcache | 1.4.1 | 25.2 kB | 25.2 kB |
| tlhunter-sorted-set | 0.1.0 | 24.94 kB | 24.94 kB |
| dc-polyfill | 0.1.6 | 24.56 kB | 24.56 kB |
| shell-quote | 1.8.2 | 23.54 kB | 23.54 kB |
| limiter | 1.1.5 | 23.17 kB | 23.17 kB |
| retry | 0.13.1 | 18.85 kB | 18.85 kB |
| semifies | 1.0.0 | 15.84 kB | 15.84 kB |
| jest-docblock | 29.7.0 | 8.99 kB | 12.76 kB |
| crypto-randomuuid | 1.0.0 | 11.18 kB | 11.18 kB |
| ttl-set | 1.0.0 | 4.61 kB | 9.69 kB |
| path-to-regexp | 0.1.12 | 6.6 kB | 6.6 kB |
| koalas | 1.0.2 | 6.47 kB | 6.47 kB |
| module-details-from-path | 1.0.3 | 4.47 kB | 4.47 kB |

🤖 This report was automatically generated by heaviest-objects-in-the-universe

codecov · 2025-03-10T19:57:47Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 78.89%. Comparing base (4cfe991) to head (89b99e6).
Report is 125 commits behind head on master.

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #5386      +/-   ##
==========================================
- Coverage   80.59%   78.89%   -1.70%     
==========================================
  Files         494      329     -165     
  Lines       22082    13324    -8758     
==========================================
- Hits        17797    10512    -7285     
+ Misses       4285     2812    -1473

datadog-datadog-prod-us1 · 2025-03-10T20:03:59Z

Datadog Report

Branch report: brian.marks/knuth-sampling
Commit report: da6105d
Test service: dd-trace-js-integration-tests

✅ 0 Failed, 800 Passed, 0 Skipped, 10m 58.93s Total Time

pr-commenter · 2025-03-10T20:04:21Z

Benchmarks

Benchmark execution time: 2025-03-14 17:53:39

Comparing candidate commit 89b99e6 in PR branch brian.marks/knuth-sampling with baseline commit 4cfe991 in branch master.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 916 metrics, 17 unstable metrics.

BridgeAR · 2025-03-17T12:53:27Z

packages/dd-trace/src/opentracing/span_context.js

+  toTraceIdNumber (get128BitId = false) {
+    return parseInt(this.toTraceId(get128BitId), 16)
+  }


This is actually going to be much slower than getting the number directly. It converts number -> string -> number instead of just reading the number.
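The round trip described in this comment can be sketched as follows. This is a minimal illustration, not the dd-trace-js implementation: the identifier class is not shown in this excerpt, so the example assumes a hypothetical trace ID stored as eight big-endian bytes in a `Buffer`, and the helper names `viaString` and `direct` are made up for the comparison. It also shows a side effect that the comment does not mention: a 64-bit ID does not always fit in a JavaScript `Number`.

```javascript
'use strict'

// Hypothetical stand-in for a trace ID kept as raw big-endian bytes.
// The real storage used by dd-trace-js is an assumption here.
const traceIdBytes = Buffer.from('f123456789abcdef', 'hex')

// Round trip in the style of the diff: bytes -> hex string -> Number.
function viaString (bytes) {
  return parseInt(bytes.toString('hex'), 16)
}

// Direct read: bytes -> BigInt, with no intermediate string.
function direct (bytes) {
  return bytes.readBigUInt64BE(0)
}

const roundTripped = viaString(traceIdBytes)
const exact = direct(traceIdBytes)

// A Number carries 53 bits of integer precision, so the round trip
// cannot represent every 64-bit ID exactly.
console.log(Number.isSafeInteger(roundTripped)) // false
console.log(BigInt(roundTripped) === exact) // false
console.log(exact.toString(16)) // f123456789abcdef
```

Whether a `BigInt` read or a pair of 32-bit reads is the better fit depends on how the sampler consumes the value; the sketch only shows that the string step is avoidable.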

BridgeAR · 2025-03-17T13:28:56Z

packages/dd-trace/src/sampler.js

-    return this._rate === 1 || Math.random() < this._rate
+  isSampled (context) {
+      return this._rate === 1 ||
+          ((context.toTraceIdNumber(false) * KNUTH_FACTOR) % MAX_TRACE_ID) <= this._sampling_id_threshold


I fail to understand how this is supposed to work.

As far as I understand Knuth sampling, it keeps a reservoir of samples and replaces them during a processing time window depending on the input. The replacement is random, with a decreasing chance of replacing items already in the reservoir. Here, the chance seems constant, if I am not mistaken. I read it as if it never takes into account how many items we have already processed in a specific time window.

I might also be misunderstanding how it should work.
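One possible reading of the diff above is that it is not reservoir sampling at all, but a multiplicative hash of the trace ID compared against a fixed threshold, so the decision depends only on the ID and the rate. The sketch below illustrates that reading under stated assumptions: the values of `KNUTH_FACTOR` and `MAX_TRACE_ID` are not shown in this excerpt and are chosen here for illustration, `thresholdFor` is a hypothetical helper, and `BigInt` is used to keep the 64-bit arithmetic exact.

```javascript
'use strict'

// Assumed values for illustration; the excerpt names these constants
// but does not show their definitions.
const KNUTH_FACTOR = 1111111111111111111n
const MAX_TRACE_ID = 2n ** 64n - 1n

// Hypothetical helper: map a rate in [0, 1] to a threshold on the hash.
function thresholdFor (rate) {
  // Scale through an integer so BigInt() never sees a fractional value.
  const scaled = BigInt(Math.round(rate * 1e6))
  return (MAX_TRACE_ID * scaled) / 1000000n
}

// The decision is a pure function of the trace ID and the rate: no
// random number, no reservoir, no count of previously seen traces.
function isSampled (traceId, rate) {
  if (rate === 1) return true
  return (traceId * KNUTH_FACTOR) % MAX_TRACE_ID <= thresholdFor(rate)
}

// The same ID always produces the same decision.
const id = 0x1234567890abcdefn
console.log(isSampled(id, 0.5) === isSampled(id, 0.5)) // true

// Share of sequential IDs kept at a 25% rate.
let kept = 0
const total = 10000
for (let i = 1n; i <= BigInt(total); i++) {
  if (isSampled(i, 0.25)) kept++
}
console.log(kept / total)
```

Under this reading the constant chance is intentional: every service that sees the same trace ID and the same rate reaches the same keep or drop decision, which is what the "consistent" in the PR title would refer to.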

bm1549 force-pushed the brian.marks/knuth-sampling branch from 061fb6b to e559aab Compare March 10, 2025 20:13

bm1549 mentioned this pull request Mar 11, 2025

Adds JSDoc types to much of the sampling code #5392

Merged

6 tasks

rebase

89b99e6

bm1549 force-pushed the brian.marks/knuth-sampling branch from e559aab to 89b99e6 Compare March 14, 2025 17:45

BridgeAR reviewed Mar 17, 2025

bm1549 closed this Apr 25, 2025

bm1549 deleted the brian.marks/knuth-sampling branch April 25, 2025 14:47

bm1549 wants to merge 1 commit into master from brian.marks/knuth-sampling

2 participants
