docs: Add metric doc #2946
base: main
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## main #2946 +/- ##
=====================================
Coverage 81.3% 81.3%
=====================================
Files 126 126
Lines 24251 24251
=====================================
Hits 19736 19736
Misses 4515 4515
docs/metrics.md
Outdated
frequently. Instruments are fairly expensive and meant to be reused. For most | ||
applications, instruments can be created once and re-used. Instruments can also | ||
be cloned to create multiple handles to the same instrument, but the cloning | ||
should not be on hot path, but instead the cloned instance should be stored and |
Cloning the instrument isn't that bad, since it should only be a matter of incrementing the Arc's atomic reference count, right?
It's not as bad as repeatedly creating a new one, but the best option is to create/clone once and reuse.
I agree that the best thing to do is create/clone and reuse. That said, I find exaggerating the dangers of a not-so-harmful operation a bit odd. Ideally we mention only the things users need to look out for and avoid overstating them, to lessen the burden on the user.
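To make the point in this thread concrete, here is a minimal std-only sketch of why cloning an `Arc`-backed handle is cheap. `CounterHandle` and `record_many` are hypothetical names used for illustration; this is not the `opentelemetry` crate's instrument type, and it assumes (as the comment above does) that the real instrument shares its state behind an `Arc`.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for an instrument handle; NOT the opentelemetry API.
#[derive(Clone)]
pub struct CounterHandle {
    // Shared state lives behind an Arc, so `clone` copies a pointer and
    // increments an atomic reference count instead of rebuilding anything.
    total: Arc<AtomicU64>,
}

impl CounterHandle {
    pub fn new() -> Self {
        CounterHandle { total: Arc::new(AtomicU64::new(0)) }
    }

    /// Record a measurement on the shared state.
    pub fn add(&self, value: u64) {
        self.total.fetch_add(value, Ordering::Relaxed);
    }

    /// Current sum across all handles.
    pub fn total(&self) -> u64 {
        self.total.load(Ordering::Relaxed)
    }

    /// Number of live handles pointing at the same state.
    pub fn handle_count(&self) -> usize {
        Arc::strong_count(&self.total)
    }
}

/// Clone once outside the loop, then reuse the stored handle on the hot path.
pub fn record_many(handle: &CounterHandle, n: u64) -> u64 {
    let stored = handle.clone(); // one clone, kept for reuse
    for _ in 0..n {
        stored.add(1);
    }
    stored.total()
}
```

Every clone observes the same total, and the only cost of `clone` in this model is the reference-count increment. Cloning once and storing the handle still avoids even that cost inside the loop.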
docs/metrics.md
Outdated
|
||
#### Cardinality Limit - Implications | ||
|
||
Cardinality limits are enforced during each export interval, meaning the metrics |
Remove this sentence from this section since this is only true for Delta?
|
||
* **Delta Temporality**: The SDK "forgets" the state after each | ||
collection/export cycle. This means in each new interval, the SDK can track | ||
up to the cardinality limit of completely different attribute combinations. |
Rephrasing suggestion:
"the SDK can track as many unique attribute combinations as the metric's cardinality limit."
I see that you have used "distinct" in the text below. It'd be nice to stay consistent with the choice of word. "distinct" vs "unique"
"the SDK can track as many distinct attribute combinations as the metric's cardinality limit."
even when overflow occurs. | ||
|
||
* **Attribute-Based Query Limitations**: Any metric query based on specific | ||
attributes could be misleading, as it's possible those dimensions were |
Let's be more specific here. This is related to the query example of "How many red apples were sold?", right?
attributes could be misleading, as it's possible those dimensions were | |
attributes could be misleading, as it's possible that measurements recorded with a superset of those dimensions were |
folded into the overflow bucket due to cardinality capping. | ||
|
||
* **All Attributes Affected**: When overflow occurs, it's not just | ||
high-cardinality attributes that are affected. The entire attribute |
This description is not so different from the above one. I think we can rearrange these sections for simpler understanding:
This unpredictability creates several important considerations when querying metrics in any backend system:
- Total Accuracy: ...
- Attributes-based querying
- Only partial information retained: (this would be the "How many red apples were sold?" example). Measurements with a superset of dimensions could be folded into overflow. We only retained information for Downtown-based measurements here. The value returned by the query suggests that we sold at least that many red apples. It could have been more.
- No information retained: This would be the "How many items were sold in Midtown?" example. All measurements related to Midtown were folded into overflow. The value returned by the query is zero, which doesn't help, as we may or may not have sold items in Midtown.
Does that make sense?
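The two query cases above can be reproduced with a small std-only model of cardinality capping. This is a sketch under stated assumptions: `CappedSum`, its `limit` semantics (the limit counts only non-overflow attribute sets), and the sample sales figures are all hypothetical, and the code does not reflect the SDK's internal data structures.

```rust
use std::collections::HashMap;

/// Attribute set as a sorted list of key/value pairs (hypothetical model).
type Attrs = Vec<(&'static str, &'static str)>;

/// Toy sum aggregation with a cardinality cap; NOT the SDK implementation.
pub struct CappedSum {
    limit: usize,                // max distinct attribute sets tracked
    points: HashMap<Attrs, u64>, // one data point per attribute set
    overflow: u64,               // everything beyond the cap is folded in here
}

impl CappedSum {
    pub fn new(limit: usize) -> Self {
        CappedSum { limit, points: HashMap::new(), overflow: 0 }
    }

    pub fn record(&mut self, mut attrs: Attrs, value: u64) {
        attrs.sort(); // make identity independent of attribute order
        if self.points.contains_key(&attrs) || self.points.len() < self.limit {
            *self.points.entry(attrs).or_insert(0) += value;
        } else {
            // New attribute set but the cap is reached: attributes are lost.
            self.overflow += value;
        }
    }

    /// Sum of all data points carrying every wanted key/value pair.
    /// The overflow bucket has none of the original attributes, so it never matches.
    pub fn query(&self, wanted: &[(&str, &str)]) -> u64 {
        self.points
            .iter()
            .filter(|(attrs, _)| {
                wanted
                    .iter()
                    .all(|w| attrs.iter().any(|a| a.0 == w.0 && a.1 == w.1))
            })
            .map(|(_, v)| *v)
            .sum()
    }

    /// The grand total stays accurate even when overflow occurs.
    pub fn total(&self) -> u64 {
        self.points.values().sum::<u64>() + self.overflow
    }
}

/// Returns (red apples reported, Midtown items reported, total reported).
pub fn demo() -> (u64, u64, u64) {
    let mut sales = CappedSum::new(2);
    sales.record(vec![("name", "apple"), ("color", "red"), ("store_location", "Downtown")], 5);
    sales.record(vec![("name", "lemon"), ("color", "yellow"), ("store_location", "Downtown")], 10);
    // Third distinct attribute set exceeds the cap of 2 and goes to overflow.
    sales.record(vec![("name", "apple"), ("color", "red"), ("store_location", "Midtown")], 3);
    sales.record(vec![("name", "apple"), ("color", "red"), ("store_location", "Downtown")], 2);
    (
        sales.query(&[("name", "apple"), ("color", "red")]),
        sales.query(&[("store_location", "Midtown")]),
        sales.total(),
    )
}
```

In this model 10 red apples were actually sold, but the query reports 7 (partial information retained); the Midtown query reports 0 even though 3 items were sold there (no information retained); and the total of 20 remains correct.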
appropriate, see [modelling attributes](#modelling-metric-attributes) for | ||
details. In the above example, if a process only sells fruits from a | ||
particular store, then store_location attribute should be modelled as a | ||
Resource - this is not only efficient, but reduced the cardinality capping |
Attributes which can be modeled as Meter attributes or Resource attributes are static and do not affect cardinality. This should be done primarily for performance (reducing lookup costs). We should have this point, but it should be placed outside of the cardinality capping section.
quickly lead to cardinality issues, resulting in metrics being capped. | ||
|
||
A better alternative is to use a concept in OpenTelemetry called | ||
[exemplars](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/sdk.md#exemplar). |
[exemplars](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/sdk.md#exemplar). | |
[Exemplars](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/sdk.md#exemplar). |
This is cool! I think user-focussed documentation is great :)
A couple of minor comments inline.
@@ -0,0 +1,587 @@ | |||
# OpenTelemetry Rust Metrics | |||
|
Should we introduce the purpose of the doc?
## Metrics API | ||
|
||
### Meter | ||
|
A [meter](otel arch linky) is the mechanism used to emit metrics in OTel. | |
or something introductory perhaps
|
||
:heavy_check_mark: You should understand and pick the right instrument type. | ||
|
||
> [!NOTE] Picking the right instrument type for your use case is crucial to |
This doesn't render right, but I'm not sure if it's 1/ a syntax issue or 2/ something to do with the context in which it is rendered. I believe [!NOTE] and friends are part of GitHub-flavoured Markdown.
should NOT create multiple instances of MeterProvider unless you have some | ||
unusual requirement of having different export strategies within the same | ||
application. Using multiple instances of MeterProvider requires users to | ||
exercise caution.. |
exercise caution.. | |
exercise caution. |
2. [**Cardinality Limits**](#cardinality-limits): the aggregation logic respects | ||
[cardinality | ||
limits](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/sdk.md#cardinality-limits), | ||
so the SDK does not use indefinite amount of memory when there is cardinality |
so the SDK does not use indefinite amount of memory when there is cardinality | |
so the SDK does not use an indefinite amount of memory in the event of a cardinality explosion. |
* attributes: {name = `lemon`, color = `yellow`}, count: `10` | ||
|
||
Note that the start time is advanced after each export, and only the delta since | ||
last export is exported, allowing SDK to "forget" previous state. |
last export is exported, allowing SDK to "forget" previous state. | |
last export is exported, allowing the SDK to "forget" previous state. |
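The "forget previous state" behaviour described in this hunk can be sketched with a toy aggregator. `SumAggregator` and `Temporality` below are hypothetical names defined only for this example, keyed by a single attribute value for brevity; the sketch assumes delta collection drains state while cumulative collection keeps it, and it is not the SDK's code.

```rust
use std::collections::HashMap;

/// Toy temporality switch for this sketch only.
#[derive(Clone, Copy, PartialEq)]
pub enum Temporality {
    Delta,
    Cumulative,
}

/// Toy sum aggregator keyed by a single attribute value; NOT the SDK implementation.
pub struct SumAggregator {
    temporality: Temporality,
    points: HashMap<&'static str, u64>,
}

impl SumAggregator {
    pub fn new(temporality: Temporality) -> Self {
        SumAggregator { temporality, points: HashMap::new() }
    }

    pub fn add(&mut self, name: &'static str, value: u64) {
        *self.points.entry(name).or_insert(0) += value;
    }

    /// One collection/export cycle; output is sorted for stable comparison.
    pub fn collect(&mut self) -> Vec<(&'static str, u64)> {
        let mut out: Vec<(&'static str, u64)> = match self.temporality {
            // Delta: hand over the data and forget it.
            Temporality::Delta => self.points.drain().collect(),
            // Cumulative: report running totals and keep all state.
            Temporality::Cumulative => self.points.iter().map(|(k, v)| (*k, *v)).collect(),
        };
        out.sort();
        out
    }

    /// Number of attribute values still held in memory.
    pub fn tracked(&self) -> usize {
        self.points.len()
    }
}
```

After a delta collection the aggregator tracks nothing, so the next interval starts with the full cardinality budget; the cumulative aggregator keeps every attribute value it has ever seen.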
last export is exported, allowing SDK to "forget" previous state. | ||
|
||
### Pre-Aggregation | ||
|
This might depend a bit on the intended audience (I think it is end users, right?), but can we be more prescriptive here?
You should generally do X unless you need to do Y, then do Z. Here's some more detail on that in depth: [....]
That way you don't have to read through and fully grok everything to get there.
quickly lead to cardinality issues, resulting in metrics being capped. | ||
|
||
A better alternative is to use a concept in OpenTelemetry called | ||
[exemplars](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/sdk.md#exemplar). |
TIL!
This is put into a new docs location. I am open to suggestions on the best place to host this.
Fixes #2902 #1060