docs: Add metric doc #2946

cijothomas · 2025-04-29T19:27:15Z

This is placed in a new docs location. I am open to suggestions on the best place to host it.

codecov · 2025-04-29T19:30:29Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 81.3%. Comparing base (7f8adcc) to head (5146aa6).

Additional details and impacted files

@@          Coverage Diff          @@
##            main   #2946   +/-   ##
=====================================
  Coverage   81.3%   81.3%           
=====================================
  Files        126     126           
  Lines      24251   24251           
=====================================
  Hits       19736   19736           
  Misses      4515    4515

utpilla · 2025-04-29T21:01:16Z

docs/metrics.md

+frequently. Instruments are fairly expensive and meant to be reused. For most
+applications, instruments can be created once and re-used. Instruments can also
+be cloned to create multiple handles to the same instrument, but the cloning
+should not be on hot path, but instead the cloned instance should be stored and


It's not that bad to clone the instrument as it should only be a matter of incrementing the Arc atomic value, right?

It's not as bad as creating a new one repeatedly, but the best option is to create/clone and re-use.

I agree that the best thing to do is create/clone and re-use. That said, I find it a bit odd to exaggerate the dangers of a not-so-harmful operation. Ideally, we mention only the things users really need to look out for and avoid overstating risks, which lessens the burden on the user.
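
To make the trade-off in this thread concrete, here is a minimal sketch. It assumes, as the question above does, that an instrument handle wraps shared state behind an `Arc`. `SketchCounter` and `RequestHandler` are hypothetical names for illustration and are not the SDK's types.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for an SDK instrument; not the real `Counter` type.
#[derive(Clone)]
struct SketchCounter {
    // Shared aggregation state; cloning the handle only bumps the Arc refcount.
    total: Arc<AtomicU64>,
}

impl SketchCounter {
    /// Stands in for the comparatively expensive "create instrument" step.
    fn new() -> Self {
        SketchCounter { total: Arc::new(AtomicU64::new(0)) }
    }

    fn add(&self, value: u64) {
        self.total.fetch_add(value, Ordering::Relaxed);
    }

    fn value(&self) -> u64 {
        self.total.load(Ordering::Relaxed)
    }

    /// Number of live handles that share the same underlying state.
    fn handle_count(&self) -> usize {
        Arc::strong_count(&self.total)
    }
}

/// A component that stores its own handle instead of cloning per request.
struct RequestHandler {
    requests: SketchCounter,
}

impl RequestHandler {
    fn handle(&self) {
        // Hot path: reuse the stored handle; no create and no clone here.
        self.requests.add(1);
    }
}

fn main() {
    let counter = SketchCounter::new(); // created once
    let handler = RequestHandler { requests: counter.clone() }; // cloned once, then stored
    for _ in 0..3 {
        handler.handle();
    }
    println!("total = {}, handles = {}", counter.value(), counter.handle_count());
}
```

Under that assumption, a clone is cheap compared with creating a new instrument, but storing the clone still keeps even the refcount update off the hot path.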

utpilla · 2025-04-29T22:32:09Z

docs/metrics.md

+
+#### Cardinality Limit - Implications
+
+Cardinality limits are enforced during each export interval, meaning the metrics


Remove this sentence from this section since this is only true for Delta?

utpilla · 2025-04-29T23:11:18Z

docs/metrics.md

+
+  * **Delta Temporality**: The SDK "forgets" the state after each
+    collection/export cycle. This means in each new interval, the SDK can track
+    up to the cardinality limit of completely different attribute combinations.


Rephrasing suggestion:

"the SDK can track as many unique attribute combinations as the metric's cardinality limit."

I see that you have used "distinct" in the text below. It'd be nice to stay consistent with the choice of word. "distinct" vs "unique"

"the SDK can track as many distinct attribute combinations as the metric's cardinality limit."
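
The behaviour described in the quoted bullet can be sketched as follows. This is a simplified model, not the SDK's aggregation code: `DeltaSum`, its fields, and the string-keyed attribute sets are assumptions made for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical delta-temporality aggregator; names and structure are
/// illustrative and not taken from the SDK.
struct DeltaSum {
    limit: usize,
    // Key: a pre-joined attribute set such as "name=apple,color=red".
    points: HashMap<String, u64>,
    // Measurements that arrive after the limit is reached are folded here.
    overflow: u64,
}

impl DeltaSum {
    fn new(limit: usize) -> Self {
        DeltaSum { limit, points: HashMap::new(), overflow: 0 }
    }

    fn record(&mut self, attrs: &str, value: u64) {
        if let Some(sum) = self.points.get_mut(attrs) {
            // Already tracked in this interval.
            *sum += value;
        } else if self.points.len() < self.limit {
            // Still room for a new distinct attribute combination.
            self.points.insert(attrs.to_string(), value);
        } else {
            // Limit reached: fold into the overflow bucket.
            self.overflow += value;
        }
    }

    /// Export the current interval and "forget" all state.
    fn collect(&mut self) -> (HashMap<String, u64>, u64) {
        let points = std::mem::take(&mut self.points);
        let overflow = std::mem::take(&mut self.overflow);
        (points, overflow)
    }
}

fn main() {
    let mut sum = DeltaSum::new(2);
    sum.record("color=red", 1);
    sum.record("color=green", 1);
    sum.record("color=yellow", 1); // third distinct set in this interval
    let (points, overflow) = sum.collect();
    println!("interval 1: {} tracked, overflow = {}", points.len(), overflow);

    // After collect, two completely different sets fit again.
    sum.record("color=yellow", 1);
    sum.record("color=purple", 1);
    let (points, overflow) = sum.collect();
    println!("interval 2: {} tracked, overflow = {}", points.len(), overflow);
}
```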

utpilla · 2025-04-29T23:15:08Z

docs/metrics.md

+    even when overflow occurs.
+
+  * **Attribute-Based Query Limitations**: Any metric query based on specific
+    attributes could be misleading, as it's possible those dimensions were


Let's be more specific here. This is related to the query example of "How many red apples were sold?", right?

Suggested change

-attributes could be misleading, as it's possible those dimensions were
+attributes could be misleading, as it's possible that measurements recorded with a superset of those dimensions were

utpilla · 2025-04-29T23:25:24Z

docs/metrics.md

+    folded into the overflow bucket due to cardinality capping.
+
+  * **All Attributes Affected**: When overflow occurs, it's not just
+    high-cardinality attributes that are affected. The entire attribute


This description is not very different from the one above. I think we can rearrange these sections to make them easier to understand:

This unpredictability creates several important considerations when querying metrics in any backend system:

  * Total Accuracy: ...
  * Attribute-based querying
    * Only partial information retained: this would be the "How many red apples were sold?" example. Measurements with a superset of the queried dimensions could be folded into overflow. We only retained information for Downtown-based measurements here. The value returned by the query tells us that we sold at least that many red apples; it could have been more.
    * No information retained: this would be the "How many items were sold in Midtown?" example. All measurements related to Midtown were folded into overflow. The value returned by the query is zero, which doesn't help, as we may or may not have sold items in Midtown.

Does that make sense?
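
The two query cases can be sketched over a hypothetical export. The `Point` shape, the `query` helper, and all counts below are made up for illustration; they are not taken from the document under review or from any backend.

```rust
/// One exported data point; hypothetical shape for illustration only.
struct Point {
    attrs: Vec<(&'static str, &'static str)>,
    count: u64,
    is_overflow: bool,
}

/// Sum of all non-overflow points whose attributes include every filter pair.
fn query(points: &[Point], filter: &[(&str, &str)]) -> u64 {
    points
        .iter()
        .filter(|p| !p.is_overflow)
        .filter(|p| {
            filter
                .iter()
                .all(|f| p.attrs.iter().any(|(k, v)| *k == f.0 && *v == f.1))
        })
        .map(|p| p.count)
        .sum()
}

/// Total folded into the overflow bucket; attribute detail is lost here.
fn overflow_total(points: &[Point]) -> u64 {
    points.iter().filter(|p| p.is_overflow).map(|p| p.count).sum()
}

fn main() {
    // Illustrative values: only the Downtown series was retained.
    let points = vec![
        Point {
            attrs: vec![("name", "apple"), ("color", "red"), ("store_location", "Downtown")],
            count: 5,
            is_overflow: false,
        },
        Point { attrs: vec![], count: 7, is_overflow: true },
    ];

    // Partial information: the result is only a lower bound.
    let red_apples = query(&points, &[("name", "apple"), ("color", "red")]);
    // No information: zero is returned even though Midtown sales may exist.
    let midtown = query(&points, &[("store_location", "Midtown")]);
    let hidden = overflow_total(&points);

    println!("red apples: at least {}, at most {}", red_apples, red_apples + hidden);
    println!("Midtown items: reported {}, at most {}", midtown, midtown + hidden);
}
```

In this sketch the overflow total is the only hint of how far the true answer may sit above the reported value.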

utpilla · 2025-04-29T23:34:54Z

docs/metrics.md

+  appropriate, see [modelling attributes](#modelling-metric-attributes) for
+  details. In the above example, if a process only sells fruits from a
+  particular store, then store_location attribute should be modelled as a
+  Resource - this is not only efficient, but reduced the cardinality capping


Attributes that can be modeled as Meter attributes or Resource attributes are static and do not affect cardinality. This should be done primarily for performance (reducing lookup costs). We should keep this point, but it should be placed outside of the cardinality capping section.

utpilla · 2025-04-29T23:37:30Z

docs/metrics.md

+quickly lead to cardinality issues, resulting in metrics being capped.
+
+A better alternative is to use a concept in OpenTelemetry called
+[exemplars](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/sdk.md#exemplar).


Suggested change

-[exemplars](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/sdk.md#exemplar).
+[Exemplars](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/sdk.md#exemplar).

scottgerring

This is cool! I think user-focussed documentation is great :)
Couple of minor comments inline.

scottgerring · 2025-04-30T06:30:08Z

docs/metrics.md

@@ -0,0 +1,587 @@
+# OpenTelemetry Rust Metrics
+


Should we introduce the purpose of the doc?

scottgerring · 2025-04-30T06:30:49Z

docs/metrics.md

+## Metrics API
+
+### Meter
+


Suggested change

A [meter](otel arch linky) is the mechanism used to emit metrics in OTel.

or something introductory perhaps

scottgerring · 2025-04-30T06:33:28Z

docs/metrics.md

+
+:heavy_check_mark: You should understand and pick the right instrument type.
+
+> [!NOTE] Picking the right instrument type for your use case is crucial to


This doesn't render correctly, but I'm not sure whether it's (1) a syntax issue or (2) something to do with the context in which it is rendered. I believe [!NOTE] and friends are part of GitHub-flavoured Markdown.

scottgerring · 2025-04-30T06:35:18Z

docs/metrics.md

+should NOT create multiple instances of MeterProvider unless you have some
+unusual requirement of having different export strategies within the same
+application. Using multiple instances of MeterProvider requires users to
+exercise caution..


Suggested change

-exercise caution..
+exercise caution.

scottgerring · 2025-04-30T06:36:51Z

docs/metrics.md

+2. [**Cardinality Limits**](#cardinality-limits): the aggregation logic respects
+   [cardinality
+   limits](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/sdk.md#cardinality-limits),
+   so the SDK does not use indefinite amount of memory when there is cardinality


Suggested change

-   so the SDK does not use indefinite amount of memory when there is cardinality
+   so the SDK does not use an indefinite amount of memory in the event of a cardinality explosion.

scottgerring · 2025-04-30T06:38:38Z

docs/metrics.md

+  * attributes: {name = `lemon`, color = `yellow`}, count: `10`
+
+Note that the start time is advanced after each export, and only the delta since
+last export is exported, allowing SDK to "forget" previous state.


Suggested change

-last export is exported, allowing SDK to "forget" previous state.
+last export is exported, allowing the SDK to "forget" previous state.
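
The note about the start time advancing can be sketched by contrasting the two temporalities. `Temporality`, `Sum`, and `Export` below are hypothetical names, and timestamps are plain integers (seconds) for simplicity.

```rust
/// Hypothetical export record: a time window plus the value reported for it.
#[derive(Debug, PartialEq)]
struct Export {
    start: u64,
    end: u64,
    value: u64,
}

enum Temporality {
    Delta,
    Cumulative,
}

/// Simplified single-series sum; not the SDK's aggregator.
struct Sum {
    temporality: Temporality,
    start: u64,
    value: u64,
}

impl Sum {
    fn new(temporality: Temporality, start: u64) -> Self {
        Sum { temporality, start, value: 0 }
    }

    fn add(&mut self, v: u64) {
        self.value += v;
    }

    fn collect(&mut self, now: u64) -> Export {
        let export = Export { start: self.start, end: now, value: self.value };
        if let Temporality::Delta = self.temporality {
            // Delta: advance the start time and forget the previous state.
            self.start = now;
            self.value = 0;
        }
        // Cumulative: keep both the original start time and the running total.
        export
    }
}

fn main() {
    let mut delta = Sum::new(Temporality::Delta, 0);
    let mut cumulative = Sum::new(Temporality::Cumulative, 0);

    for sum in [&mut delta, &mut cumulative] {
        sum.add(10);
        println!("{:?}", sum.collect(60));
        sum.add(5);
        println!("{:?}", sum.collect(120));
    }
}
```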

scottgerring · 2025-04-30T06:47:30Z

docs/metrics.md

+last export is exported, allowing SDK to "forget" previous state.
+
+### Pre-Aggregation
+


This might depend a bit on the intended audience (I think it is end users, right?), but can we be more prescriptive here?

You should generally do X unless you need to do Y, then do Z. Here are some more details on that in depth: [....]

So you don't have to read through and fully grok everything to get there

scottgerring · 2025-04-30T06:48:39Z

docs/metrics.md

+quickly lead to cardinality issues, resulting in metrics being capped.
+
+A better alternative is to use a concept in OpenTelemetry called
+[exemplars](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/sdk.md#exemplar).


cijothomas added 2 commits April 29, 2025 12:17

docs: Add metrics document

5896c13

nit

089ce37

cijothomas requested a review from a team as a code owner April 29, 2025 19:27

Merge branch 'main' into cijothomas/metric-doc1

ff9756a

utpilla reviewed Apr 29, 2025
