docs: Add metric doc #2946

Open
wants to merge 6 commits into
base: main

Conversation

cijothomas
Member

@cijothomas cijothomas commented Apr 29, 2025

This adds the doc in a new docs location. I am open to suggestions on the best place to host it.

Fixes #2902 #1060

@cijothomas cijothomas requested a review from a team as a code owner April 29, 2025 19:27

codecov bot commented Apr 29, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 81.3%. Comparing base (7f8adcc) to head (5146aa6).

Additional details and impacted files
@@          Coverage Diff          @@
##            main   #2946   +/-   ##
=====================================
  Coverage   81.3%   81.3%           
=====================================
  Files        126     126           
  Lines      24251   24251           
=====================================
  Hits       19736   19736           
  Misses      4515    4515           


docs/metrics.md Outdated
frequently. Instruments are fairly expensive and meant to be reused. For most
applications, instruments can be created once and re-used. Instruments can also
be cloned to create multiple handles to the same instrument, but the cloning
should not be on hot path, but instead the cloned instance should be stored and
Contributor

It's not that bad to clone the instrument as it should only be a matter of incrementing the Arc atomic value, right?

Member Author

It's not as bad as creating a new one repeatedly, but the best option is to create/clone and re-use.

Contributor

I agree that the best thing to do is create/clone and re-use. Still, exaggerating the dangers of a not-so-harmful operation seems a bit odd to me. Ideally we mention only the things users need to look out for, and avoid overstating risks, so that we lessen the burden on the user.
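
To make the trade-off in this thread concrete, here is a minimal std-only sketch. It assumes, as the reviewer suggests, that an instrument handle wraps shared state behind an `Arc`, so a clone is an atomic reference-count increment. `ToyCounter` and `RequestHandler` are hypothetical stand-ins, not the SDK's types.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for a counter instrument; not the real SDK type.
#[derive(Clone)]
struct ToyCounter {
    // Shared aggregation state. Cloning the handle only bumps the Arc refcount.
    total: Arc<AtomicU64>,
}

impl ToyCounter {
    fn new() -> Self {
        ToyCounter { total: Arc::new(AtomicU64::new(0)) }
    }

    fn add(&self, value: u64) {
        self.total.fetch_add(value, Ordering::Relaxed);
    }

    /// Number of live handles that point at the same state.
    fn handles(&self) -> usize {
        Arc::strong_count(&self.total)
    }

    fn total(&self) -> u64 {
        self.total.load(Ordering::Relaxed)
    }
}

/// A component that stores its clone once, at construction time.
struct RequestHandler {
    requests: ToyCounter,
}

impl RequestHandler {
    /// Hot path: re-uses the stored handle, no create or clone here.
    fn handle(&self) {
        self.requests.add(1);
    }
}

fn run_demo() -> (usize, u64) {
    let counter = ToyCounter::new();
    // Clone once, outside the hot path, and keep the clone.
    let handler = RequestHandler { requests: counter.clone() };
    for _ in 0..1000 {
        handler.handle();
    }
    (counter.handles(), counter.total())
}

fn main() {
    let (handles, total) = run_demo();
    println!("handles = {handles}, total = {total}");
}
```

Both handles update the same total, which is the "create/clone and re-use" pattern both commenters agree on. The sketch says nothing about how expensive creating a real instrument is.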

docs/metrics.md Outdated

#### Cardinality Limit - Implications

Cardinality limits are enforced during each export interval, meaning the metrics
Contributor

Remove this sentence from this section since this is only true for Delta?

@cijothomas cijothomas changed the title doc: Add metric doc docs: Add metric doc Apr 29, 2025

* **Delta Temporality**: The SDK "forgets" the state after each
collection/export cycle. This means in each new interval, the SDK can track
up to the cardinality limit of completely different attribute combinations.
Contributor

Rephrasing suggestion:

"the SDK can track as many unique attribute combinations as the metric's cardinality limit."

Contributor

I see that you have used "distinct" in the text below. It'd be nice to stay consistent with the choice of word. "distinct" vs "unique"

"the SDK can track as many distinct attribute combinations as the metric's cardinality limit."
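
The Delta behaviour under discussion can be illustrated with a toy model. This is a sketch under stated assumptions, not the SDK's aggregation code: `ToySum`, the string-encoded attribute sets, and the limit of 2 are all hypothetical, and the model does not decide whether the overflow series itself counts toward the limit. The only name taken from the OpenTelemetry specification is the `otel.metric.overflow` attribute.

```rust
use std::collections::BTreeMap;

/// Attribute name the OpenTelemetry specification uses for the overflow series.
const OVERFLOW_KEY: &str = "otel.metric.overflow";

#[derive(Clone, Copy, PartialEq)]
enum Temporality {
    Delta,
    Cumulative,
}

/// Toy sum aggregator; a hypothetical model, not the SDK's implementation.
struct ToySum {
    limit: usize,
    temporality: Temporality,
    series: BTreeMap<String, u64>,
}

impl ToySum {
    fn new(limit: usize, temporality: Temporality) -> Self {
        ToySum { limit, temporality, series: BTreeMap::new() }
    }

    /// Record a measurement; new attribute sets beyond the limit fold into overflow.
    fn add(&mut self, attrs: &str, value: u64) {
        let tracked = self.series.contains_key(attrs);
        let key = if tracked || self.series.len() < self.limit {
            attrs
        } else {
            OVERFLOW_KEY
        };
        *self.series.entry(key.to_string()).or_insert(0) += value;
    }

    /// Export a snapshot. Delta "forgets" its state; Cumulative keeps it.
    fn collect(&mut self) -> BTreeMap<String, u64> {
        let snapshot = self.series.clone();
        if self.temporality == Temporality::Delta {
            self.series.clear();
        }
        snapshot
    }
}

/// Runs two intervals with a limit of 2 and returns the second export.
fn second_interval(temporality: Temporality) -> BTreeMap<String, u64> {
    let mut sum = ToySum::new(2, temporality);
    // Interval 1: three distinct attribute sets, so the third overflows.
    sum.add("color=red", 1);
    sum.add("color=green", 1);
    sum.add("color=blue", 1);
    sum.collect();
    // Interval 2: two attribute sets that were not tracked in interval 1.
    sum.add("color=blue", 5);
    sum.add("color=yellow", 1);
    sum.collect()
}

fn main() {
    println!("delta:      {:?}", second_interval(Temporality::Delta));
    println!("cumulative: {:?}", second_interval(Temporality::Cumulative));
}
```

In this model the Delta aggregator tracks `color=blue` and `color=yellow` by name in the second interval because the first interval's state was dropped, while the Cumulative aggregator still holds the first two series and folds both new ones into overflow.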

even when overflow occurs.

* **Attribute-Based Query Limitations**: Any metric query based on specific
attributes could be misleading, as it's possible those dimensions were
Contributor

Let's be more specific here. This is related to the query example of "How many red apples were sold?", right?

Suggested change
attributes could be misleading, as it's possible those dimensions were
attributes could be misleading, as it's possible that measurements recorded with a superset of those dimensions were

folded into the overflow bucket due to cardinality capping.

* **All Attributes Affected**: When overflow occurs, it's not just
high-cardinality attributes that are affected. The entire attribute
Contributor

@utpilla utpilla Apr 29, 2025

This description is not so different from the above one. I think we can rearrange these sections for simpler understanding:

This unpredictability creates several important considerations when querying metrics in any backend system:

  • Total Accuracy: ...
  • Attributes-based querying
    • Only partial information retained: This would be the "How many red apples were sold?" example. Measurements with a superset of the queried dimensions could have been folded into overflow. We only retained information for the Downtown-based measurements here. The value returned by the query means we sold at least that many red apples; the true number could be higher.
    • No information retained: This would be the "How many items were sold in Midtown?" example. All measurements related to Midtown were folded into overflow. The value returned by the query is zero, which doesn't help, as we may or may not have sold items in Midtown.

Does that make sense?
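
The three cases in this proposal can be sketched as queries over a hypothetical export. The data points and counts below are invented for illustration (only the attribute names and the store examples come from the discussion), and `query` is a toy filter, not a real backend query language.

```rust
/// One exported data point in this sketch: attribute pairs plus a count.
struct Point {
    attrs: Vec<(&'static str, &'static str)>,
    count: u64,
}

/// Hypothetical export after capping: one series survived, the rest
/// (including every Midtown sale) was folded into the overflow point.
fn exported() -> Vec<Point> {
    vec![
        Point {
            attrs: vec![
                ("name", "apple"),
                ("color", "red"),
                ("store_location", "Downtown"),
            ],
            count: 6,
        },
        Point {
            attrs: vec![("otel.metric.overflow", "true")],
            count: 9,
        },
    ]
}

/// Sums the counts of all points that carry every attribute in `filter`.
fn query(points: &[Point], filter: &[(&'static str, &'static str)]) -> u64 {
    points
        .iter()
        .filter(|p| filter.iter().all(|f| p.attrs.contains(f)))
        .map(|p| p.count)
        .sum()
}

fn main() {
    let points = exported();
    // Total accuracy: an unfiltered sum still includes the overflow point.
    println!("total items sold: {}", query(&points, &[]));
    // Partial information: a lower bound, since overflow may hide more red apples.
    println!(
        "red apples sold: at least {}",
        query(&points, &[("name", "apple"), ("color", "red")])
    );
    // No information: zero here does not mean nothing was sold in Midtown.
    println!(
        "items sold in Midtown: {} (inconclusive)",
        query(&points, &[("store_location", "Midtown")])
    );
}
```

The unfiltered total stays accurate, the red-apple query returns only a lower bound, and the Midtown query returns zero even though Midtown sales may be sitting in the overflow point.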

appropriate, see [modelling attributes](#modelling-metric-attributes) for
details. In the above example, if a process only sells fruits from a
particular store, then store_location attribute should be modelled as a
Resource - this is not only efficient, but reduces the cardinality capping
Contributor

Attributes which can be modeled as Meter attributes or Resource attributes are static and do not affect cardinality. This should be done primarily for performance (reducing lookup costs). We should have this point, but it should be placed outside of cardinality capping section.

quickly lead to cardinality issues, resulting in metrics being capped.

A better alternative is to use a concept in OpenTelemetry called
[exemplars](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/sdk.md#exemplar).
Contributor

Suggested change
[exemplars](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/sdk.md#exemplar).
[Exemplars](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/sdk.md#exemplar).

Contributor

@scottgerring scottgerring left a comment

This is cool! I think user-focussed documentation is great :)
Couple of minor comments inline.

@@ -0,0 +1,587 @@
# OpenTelemetry Rust Metrics

Contributor

Should we introduce the purpose of the doc?

## Metrics API

### Meter

Contributor

Suggested change
A [meter](otel arch linky) is the mechanism used to emit metrics in OTel.

or something introductory perhaps


:heavy_check_mark: You should understand and pick the right instrument type.

> [!NOTE] Picking the right instrument type for your use case is crucial to
Contributor

This doesn't render right, but I'm not sure if it's (1) a syntax issue or (2) something to do with the context in which it is rendered. I believe [!NOTE] and friends are part of GitHub-flavoured Markdown.

should NOT create multiple instances of MeterProvider unless you have some
unusual requirement of having different export strategies within the same
application. Using multiple instances of MeterProvider requires users to
exercise caution..
Contributor

Suggested change
exercise caution..
exercise caution.

2. [**Cardinality Limits**](#cardinality-limits): the aggregation logic respects
[cardinality
limits](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/sdk.md#cardinality-limits),
so the SDK does not use indefinite amount of memory when there is cardinality
Contributor

Suggested change
so the SDK does not use indefinite amount of memory when there is cardinality
so the SDK does not use an indefinite amount of memory in the event of a cardinality explosion.

* attributes: {name = `lemon`, color = `yellow`}, count: `10`

Note that the start time is advanced after each export, and only the delta since
last export is exported, allowing SDK to "forget" previous state.
Contributor

Suggested change
last export is exported, allowing SDK to "forget" previous state.
last export is exported, allowing the SDK to "forget" previous state.

last export is exported, allowing SDK to "forget" previous state.

### Pre-Aggregation

Contributor

This may come down to the intended audience (I think it is end users, right?), but can we be more prescriptive here?

You should generally do X unless you need to do Y, then do Z. Here's some more detail on that: [....]

That way readers don't have to read through and fully grok everything to get there.

quickly lead to cardinality issues, resulting in metrics being capped.

A better alternative is to use a concept in OpenTelemetry called
[exemplars](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/sdk.md#exemplar).
Contributor

TIL!

Development

Successfully merging this pull request may close these issues.

Doc: Tutorial showing cardinality capping and its impact
4 participants