Commit c971199
Create `System.Diagnostics.Metrics`-based runtime metrics listener (#8027)
## Summary of changes
Creates a .NET 6+ only implementation of `IRuntimeMetricsListener` that
uses the `System.Diagnostics.Metrics` (and other) APIs.
## Reason for change
.NET Core (probably all versions, but at least .NET 6+) has a memory
leak with the event pipes, which means if we enable runtime metrics, we
likely have a slow memory leak 😬 [This was raised ~1 year ago with the .NET
team](dotnet/runtime#111368), specifically
citing dd-trace-dotnet, but it doesn't have a fix yet. A PR with a tentative
fix has also been open on the .NET repo for ~2 months, so _at best_
this _might_ be fixed in .NET 11.
Separately, the `System.Diagnostics.Metrics` APIs were introduced in
.NET 6, with support for aspnetcore-based metrics added in .NET 8, and
support for "runtime" metrics in .NET 9.
This PR introduces a new (experimental for now)
`IRuntimeMetricsListener` implementation that doesn't use
`EventListener`, and instead uses the `System.Diagnostics.Metrics` APIs,
aiming to provide essentially the same runtime metrics we currently do,
just using a different source.
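As a rough illustration of the `System.Diagnostics.Metrics` side of this, here is a minimal sketch of subscribing to a meter with `MeterListener`. It is not the code in this PR: `MeterListenerSketch` and `Listen` are hypothetical names, and the meter name is passed in by the caller (as far as I know the built-in runtime meter in .NET 9 is named `System.Runtime`, but treat that as an assumption).

```csharp
using System;
using System.Diagnostics.Metrics;

// Hypothetical helper, for illustration only (not the listener added in this PR).
public static class MeterListenerSketch
{
    // Subscribes to every instrument published by the meter called `meterName`
    // and forwards long/double measurements to `onMeasurement`.
    // The caller owns the returned listener and should dispose it.
    public static MeterListener Listen(string meterName, Action<string, double> onMeasurement)
    {
        var listener = new MeterListener();

        // Called for each instrument that exists now or is created later.
        listener.InstrumentPublished = (instrument, l) =>
        {
            if (instrument.Meter.Name == meterName)
            {
                l.EnableMeasurementEvents(instrument);
            }
        };

        // Callbacks are registered per measurement type.
        listener.SetMeasurementEventCallback<long>(
            (instrument, value, tags, state) => onMeasurement(instrument.Name, value));
        listener.SetMeasurementEventCallback<double>(
            (instrument, value, tags, state) => onMeasurement(instrument.Name, value));

        listener.Start();
        return listener;
    }
}
```

Note that observable instruments (e.g. observable gauges) only report values when `listener.RecordObservableInstruments()` is called, so a real listener would call that on its own reporting interval.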
## Implementation details
- Created a new `IRuntimeMetricsListener` implementation,
`DiagnosticsMetricsRuntimeMetricsListener`
- Added a config to enable it in .NET 6+ only,
`DD_RUNTIME_METRICS_DIAGNOSTICS_METRICS_API_ENABLED`
  - Open to suggestions here. Other options include having an "enum" type
  for listener instead of just this one. That's harder to consume for
  customers, but more extensible theoretically.
- Added tests
To give as wide compatibility as possible, and to avoid any additional
overhead, whenever the built-in runtime metrics use existing APIs (e.g.
via `GC` calls), we use those instead of the metrics.
In summary:
Thread metrics:
- `runtime.dotnet.threads.workers_count`: via `ThreadPool.ThreadCount`
(same as `RuntimeEventListener`)
- `runtime.dotnet.threads.contention_count`: via
`Monitor.LockContentionCount`
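For reference, both thread values above come from plain static properties, so no metrics subscription is needed. A minimal sketch follows; `ThreadMetricsSketch` is a hypothetical name, and reporting the contention count as a per-interval delta is my assumption here, not something this description specifies.

```csharp
using System;
using System.Threading;

// Hypothetical helper, for illustration only.
public sealed class ThreadMetricsSketch
{
    // Monitor.LockContentionCount is cumulative since process start,
    // so remember the last value to be able to compute a delta.
    private long _previousContentionCount = Monitor.LockContentionCount;

    // Number of thread pool threads that currently exist.
    public int WorkerCount() => ThreadPool.ThreadCount;

    // Contentions observed since the previous call (assumed reporting shape).
    public long ContentionCountDelta()
    {
        var current = Monitor.LockContentionCount;
        var delta = current - _previousContentionCount;
        _previousContentionCount = current;
        return delta;
    }
}
```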
GC metrics:
- `runtime.dotnet.gc.size.gen#` from info in `GC.GetGCMemoryInfo()`,
which mirrors [the built-in
approach](https://github.com/dotnet/runtime/blob/v10.0.1/src/libraries/System.Diagnostics.DiagnosticSource/src/System/Diagnostics/Metrics/RuntimeMetrics.cs#L185).
- `runtime.dotnet.gc.memory_load` was a tricky one as the built-in uses
a new API, but I think the info we get in `GC.GetGCMemoryInfo()` is
broadly good enough
- `runtime.dotnet.gc.count.gen#` uses `GC.CollectionCount()`, same as
[built-in
approach](https://github.com/dotnet/runtime/blob/v10.0.1/src/libraries/System.Diagnostics.DiagnosticSource/src/System/Diagnostics/Metrics/RuntimeMetrics.cs#L159)
- `runtime.dotnet.gc.pause_time` this was also a tricky one, more on it
below...
`runtime.dotnet.gc.pause_time` is a runtime metric that's available in
.NET 9, so when we're running in .NET 9 we just use that value. There's
actually also a public API introduced in .NET 6,
`GetTotalPauseDuration()`, but [it's _only_ available from
6.0.21](dotnet/runtime#87143), so we can't
directly reference it. I resorted to using a simple `CreateDelegate` call
to invoke it in these cases. We could use duck typing, but it didn't seem
worth it. If we're running on < 6.0.21, there's no feasible way to get the
value, so we just don't emit it.
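To make the GC part more concrete, here is a minimal sketch of the three lookups described above: generation sizes from `GC.GetGCMemoryInfo()`, collection counts from `GC.CollectionCount()`, and an optional `GetTotalPauseDuration()` bound via `CreateDelegate`. `GcMetricsSketch` and its member names are hypothetical, and the actual listener may structure this differently.

```csharp
using System;
using System.Reflection;

// Hypothetical helper, for illustration only.
public static class GcMetricsSketch
{
    // Null when the runtime does not expose GC.GetTotalPauseDuration().
    private static readonly Func<TimeSpan> PauseDurationGetter = CreatePauseDurationGetter();

    private static Func<TimeSpan> CreatePauseDurationGetter()
    {
        // Look the method up by name so we never reference it at compile time.
        var method = typeof(GC).GetMethod(
            "GetTotalPauseDuration",
            BindingFlags.Public | BindingFlags.Static,
            binder: null,
            types: Type.EmptyTypes,
            modifiers: null);

        return method is null
            ? null
            : (Func<TimeSpan>)method.CreateDelegate(typeof(Func<TimeSpan>));
    }

    // Size after the last GC for each entry the runtime reports
    // (gen0, gen1, gen2, then the large and pinned object heaps).
    public static long[] GenerationSizes()
    {
        var generations = GC.GetGCMemoryInfo().GenerationInfo;
        var sizes = new long[generations.Length];
        for (var i = 0; i < generations.Length; i++)
        {
            sizes[i] = generations[i].SizeAfterBytes;
        }

        return sizes;
    }

    // Cumulative collection count for each generation 0..GC.MaxGeneration.
    public static int[] CollectionCounts()
    {
        var counts = new int[GC.MaxGeneration + 1];
        for (var gen = 0; gen <= GC.MaxGeneration; gen++)
        {
            counts[gen] = GC.CollectionCount(gen);
        }

        return counts;
    }

    // Total GC pause time, or null when the API is unavailable.
    public static TimeSpan? TotalPauseDuration() => PauseDurationGetter?.Invoke();
}
```

Binding the delegate once in a static initializer keeps the reflection cost out of the per-interval path; callers simply skip the metric when `TotalPauseDuration()` returns `null`.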
ASP.NET Core metrics:
- `runtime.dotnet.aspnetcore.requests.current`
- `runtime.dotnet.aspnetcore.requests.failed`
- `runtime.dotnet.aspnetcore.requests.total`
- `runtime.dotnet.aspnetcore.requests.queue_length`
- `runtime.dotnet.aspnetcore.connections.current`
- `runtime.dotnet.aspnetcore.connections.queue_length`
- `runtime.dotnet.aspnetcore.connections.total`
Note that the `.total` and `.failed` requests are recorded as _gauges_
(which monotonically increase), which doesn't feel right to me (they
should be counters, surely), but that's what `RuntimeEventListener` is
using, so we have to stick to the same thing (metric types are global by
metric, so we can't change it). It means there's a risk of overflow
there, but that's already the case for `RuntimeEventListener` so I guess
we just ignore it 🤷‍♂️
I couldn't find a way to get the following metric at all without using
`EventListener`:
- `runtime.dotnet.threads.contention_time`
## Test coverage
Added unit and integration tests for the listener behavior.
I also manually ran an aspnetcore app in a loop with both the
`RuntimeEventListener` and the new listener producing metrics (hacked
in, we won't ever do this in "normal" execution), and did a manual
comparison of the metrics. Overall, the values were in broad agreement
(slightly off, due to skew in sampling time), and the comparison helped identify some
cases where I'd made incorrect assumptions (e.g. aspnetcore `.total`
metrics are never "reset" to 0).
## Other details
Relates to:
- #5862 (comment)
- dotnet/runtime#111368
- dotnet/runtime#118415
- https://datadoghq.atlassian.net/browse/LANGPLAT-916
---------
Co-authored-by: Steven Bouwkamp <steven.bouwkamp@datadoghq.com>
1 parent 243f2c8 commit c971199
File tree
17 files changed, +673 -15 lines changed
- tracer
  - src
    - Datadog.Trace.Trimming/build
    - Datadog.Trace
      - Configuration
      - Generated
        - net461/Datadog.Trace.SourceGenerators/ConfigurationKeysGenerator
        - net6.0/Datadog.Trace.SourceGenerators/ConfigurationKeysGenerator
        - netcoreapp3.1/Datadog.Trace.SourceGenerators/ConfigurationKeysGenerator
        - netstandard2.0/Datadog.Trace.SourceGenerators/ConfigurationKeysGenerator
      - RuntimeMetrics
  - test
    - Datadog.Trace.ClrProfiler.IntegrationTests
    - Datadog.Trace.Tests
      - RuntimeMetrics
      - Telemetry