Own-telemetry: Ability to enable/disable individual metrics #10769
Comments
Thanks for opening the issue. One question I have is how this works w/ views, because configuration of views would allow this same functionality. I'd rather not have multiple ways of doing the same thing if possible |
I'm not familiar enough with views to give a good answer here. Are they exposed to our users via the collector config? Or are they an implementation detail? For that matter, is there a reason we don't use views instead of in-code checks against metric level? |
@djaglowski views will be configurable in the same way as they are for the SDK via config, see https://github.com/open-telemetry/opentelemetry-configuration/blob/cc7cd375d893e440afdc294a00f98038ebaa4eab/examples/kitchen-sink.yaml#L221 as an example. The main reason I can see for keeping a telemetry level configuration option is convenience for end users. It would be easier to configure one setting than to have to configure views for all metrics. But then, if we have multiple ways to configure things, we get into the issue of precedence. |
I agree but couldn't these be implemented as views? (Then for example component developers don't have to check If we did that, does that normalize configuration in a way where users could add or subtract from our predefined views by adding their own? |
There's no concept of levels built into views. I considered adding a An alternative proposed was to have a different MeterProvider for every level, which would again remove the need to check a level before recording telemetry. This would mean that the component developer would have to make a decision on what they consider appropriate for the different levels.
We could definitely publish a configuration for each level that users could modify as they see fit. This would be somewhat easy to generate via mdatagen in the documentation as well |
I hope I'm not getting sidetracked from the original purpose of this issue, but is this line in the Please let me know if I'm missing something. |
Yes, that's basically what I suggested @crobert-1. I don't care if it looks just like that or not, but it should be possible to enable or disable individual metrics produced by the collector. |
Hello! 👋 I could take a look into implementing this, but I'm not yet sure what a good solution would look like. One solution is to use views to disable (but not enable) metrics. The user-defined views would override the configured settings:

service:
  telemetry:
    metrics:
      level: basic
      views:
        - selector:
            instrument_name: otelcol_exporter_send_failed_spans
          stream:
            aggregation:
              drop:

Another solution could be to make the
However, I'm not sure how ergonomic either of these approaches is. I suppose most users want to specify the metrics they are interested in, not the ones they are not interested in. That's because if you have a dashboard which uses a particular metric, you want to make sure the metric is always present. |
Re-reading this, I am not sure what the decision, if any, was. Do we want to have a separate mechanism for configuring individual metrics different from views or |
I feel like the This is not consistent with the rest of our configuration so I filed open-telemetry/opentelemetry-specification/issues/4344 |
If I'm understanding correctly, the
I don't think we have consensus. Admittedly, I am not familiar enough with views to understand what has been suggested. Perhaps we could continue this conversation starting from what the actual configuration would look like: Let's just say we have two metrics

service:
  telemetry:
    metrics:
      otelcol.foo.enabledbydefault: false
      otelcol.foo.disabledbydefault: true

Can someone please show me how the collector configuration would look if we used views to change only these two metrics to non-default settings? |
@djaglowski So here is my proposal with some examples. The issue I linked above about

My pitch is that we should conceptualize this as having two incompatible modes for configuring telemetry: a basic, level-based configuration and an advanced, views/meter config based configuration for advanced users that really want to configure things with a lot of detail. The basic mode would be equivalent to the advanced mode with a concrete set of views. Users would be able to get this concrete list of views for a particular level through some sort of mechanism (initially docs, in the longer term it could be a subcommand, this doesn't need to be ready for 1.0), so they can have a starting template.

To make this concrete, let's say we have

Examples and table
service:
  telemetry:
    metrics:
      readers:
        - periodic:
            exporter:
              otlp_http:
                endpoint: http://localhost:4318/v1/metrics

service:
  telemetry:
    metrics:
      level: detailed

service:
  telemetry:
    metrics:
      views:
        - selector:
            instrument_name: otelcol.foo.basic
          stream:
            aggregation:
              drop:

service:
  telemetry:
    metrics:
      level: detailed
      views:
        - selector:
            instrument_name: otelcol.foo.basic
          stream:
            aggregation:
              drop:

Summary table:
|
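The "two incompatible modes" idea above can be sketched in Go as a single function that picks the effective list of views. This is only an illustration, not the collector's implementation: the `View` and `MetricsConfig` types, the `metricMinLevel` table, and the choice of `normal` as the fallback level are all assumptions made for the example, and the sketch assumes that user-supplied views replace the level-derived list entirely.

```go
package main

import (
	"fmt"
	"sort"
)

// View is a hypothetical, simplified view: it names one instrument and
// says whether that instrument should be dropped.
type View struct {
	InstrumentName string
	Drop           bool
}

// MetricsConfig is a hypothetical, simplified form of service::telemetry::metrics.
type MetricsConfig struct {
	Level string // "basic", "normal" or "detailed"; "" means unset
	Views []View // nil means the user configured no views
}

var levelRank = map[string]int{"basic": 1, "normal": 2, "detailed": 3}

// metricMinLevel maps each example metric to the lowest level that enables it.
// The metric names come from the thread; the table itself is illustrative.
var metricMinLevel = map[string]string{
	"otelcol.foo.basic":    "basic",
	"otelcol.foo.detailed": "detailed",
}

// viewsForLevel builds the concrete list of drop views that a level stands for.
func viewsForLevel(level string) []View {
	names := make([]string, 0, len(metricMinLevel))
	for name := range metricMinLevel {
		names = append(names, name)
	}
	sort.Strings(names) // deterministic order

	var views []View
	for _, name := range names {
		if levelRank[metricMinLevel[name]] > levelRank[level] {
			views = append(views, View{InstrumentName: name, Drop: true})
		}
	}
	return views
}

// effectiveViews implements the two modes: explicit views are used as-is
// (advanced mode); otherwise the level is expanded into views (basic mode).
func effectiveViews(cfg MetricsConfig) []View {
	if cfg.Views != nil {
		return cfg.Views
	}
	level := cfg.Level
	if level == "" {
		level = "normal" // assumed default for this sketch
	}
	return viewsForLevel(level)
}

func main() {
	fmt.Printf("%+v\n", effectiveViews(MetricsConfig{Level: "basic"}))
	fmt.Printf("%+v\n", effectiveViews(MetricsConfig{
		Views: []View{{InstrumentName: "otelcol.foo.basic", Drop: true}},
	}))
}
```

Under these assumptions, a user who adds a single drop view loses every level-derived drop view at the same time, which is the behavior the following comment is concerned about.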
@mx-psi This proposal is probably the simplest to implement (either use the specified views or the list corresponding to a level), but I fear this may be a big pain point for users. Users who just want to drop one metric from the defaults will probably try to simply add a drop view for it. If they didn't have a level configured, they'll suddenly see the amount of exported telemetry explode (everything except the one they dropped), and will need to notice the warning to understand why. Then, they'll have to write a gigantic config file just to get back to the previous behavior. And once views are set up, they would no longer be able to quickly enable more telemetry for debugging purposes, and would have to remove views for specifically the metrics they want. I think the ideal solution from a user's perspective would be:
Even with This would certainly be more complicated to implement, but I think the only feasibility concerns are:
|
IMO any system that has multiple ways of configuring the same thing runs into this risk of confusion. If needed we could have a (name to be bikeshedded)
Our config system is flexible enough to make this a nonissue, if we get merge rules right they could just include a file and merge it with their config. They would just generate the views config into a
Following the above, they would just have to change the file they include
I agree that views can be complex. This however feels like a problem we should solve at the specification level, if it's complex for our users it would be complex for users of any other application.
It would be interesting to understand the exact semantics of having overlapping views. I am not categorically opposed to this option, but my initial gut feeling is that this could be as confusing as the solution I am proposing. |
I half-agree; I think a system with a "broad" configuration and a "detailed" configuration that customizes it is relatively intuitive in comparison with a system where setting the latter essentially changes the former to "everything on". If we go with your proposal, I'm in support of something like
Concatenating the default list of views with a custom one by applying two configuration files will have the same confusing semantics as doing it automatically in the code, it just requires more work on the user's part.
Definitely agree with that part. I think the way that conflicting Views interact is surprising and confusing in the spec. I also think the inclusion of MetricConfig is a sign that others agree that using Views to enable/disable metrics is too complicated, and that another way to do it would be useful. I'm not sure if the interaction between the two is specified? If the spec goes with a "Views take priority over MetricConfig.enabled" solution and adds a YAML-based way to specify MetricConfig, my proposal could be implemented by simply setting a default value for the MetricConfigs based on the level. |
Hm, I think the important difference between the concatenate-views solution and the basic-and-advanced-modes solution is that users are not required to use the default config in the basic-and-advanced-modes solution; they can just write their own thing if they want to. |
@jade-guiton-dd This is how the |
Summary of the new `MeterConfig` and `MeterConfigurator` API

The SDK spec has an experimental spec for a
and a

The PR linked above proposes the following example YAML config to customize the meter_provider:

# This corresponds to `service::telemetry::metrics` in the OTelCol config
meter_configurator:
  default_config: # Configure the default meter config used when there is no matching entry in .meter_configurator.meters.
    disabled: true
  meters:
    - name: io.opentelemetry.contrib.* # Meter names to match, which can include '?' and '*' placeholders.
      config:
        disabled: false

This hasn't been implemented in the SDK yet, but presumably this implies that when the

With this in mind, my updated proposal would be to allow specifying both

However, there is a big issue with this proposal in the form of

I think the best way to do that would be to require setting

Example configs

Let's assume we have the same example

Example 1: Oops, All Defaults

service:
  telemetry:
    metrics:
      readers: [...]

→

Example 2: Level up

service:
  telemetry:
    metrics:
      readers: [...]
      level: detailed

→

Example 3: Customized defaults

service:
  telemetry:
    metrics:
      readers: [...]
      meter_configurator:
        meters:
          - name: otelcol.foo.basic
            config:
              disabled: true

→

Example 4: Customized defaults 2: Electric Boogaloo

service:
  telemetry:
    metrics:
      readers: [...]
      meter_configurator:
        meters:
          - name: otelcol.foo.detailed
            config:
              disabled: false

→ Same as above.

Example 5: I am altering the defaults, pray I do not alter them any further

service:
  telemetry:
    metrics:
      readers: [...]
      level: custom
      meter_configurator:
        default_config:
          disabled: false

→ Because

Example 6: Conflicting directives

service:
  telemetry:
    metrics:
      readers: [...]
      level: normal
      meter_configurator:
        default_config:
          disabled: false

→

Important note: If the SDK team decides to make the |
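To illustrate how a `meter_configurator` block like the one above might be evaluated, here is a rough Go sketch. It is not the SDK's API, which the comment notes is not implemented yet. The type names, the "first matching entry wins" rule, and the meter name `io.opentelemetry.contrib.foo` are assumptions for the example. Only the `'?'` and `'*'` placeholders and the fallback to `default_config` come from the quoted YAML comments.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// MeterConfig mirrors the `config` block in the YAML example.
type MeterConfig struct {
	Disabled bool
}

// MeterRule mirrors one entry of `meters`; Name may contain '?' and '*'.
type MeterRule struct {
	Name   string
	Config MeterConfig
}

// MeterConfigurator mirrors the `meter_configurator` block (hypothetical type).
type MeterConfigurator struct {
	DefaultConfig MeterConfig
	Meters        []MeterRule
}

// wildcardMatch reports whether name matches pattern, where '*' matches any
// run of characters (including none) and '?' matches exactly one character.
// Every other character is matched literally.
func wildcardMatch(pattern, name string) bool {
	var b strings.Builder
	b.WriteString("(?s)^")
	for _, r := range pattern {
		switch r {
		case '*':
			b.WriteString(".*")
		case '?':
			b.WriteString(".")
		default:
			b.WriteString(regexp.QuoteMeta(string(r)))
		}
	}
	b.WriteString("$")
	return regexp.MustCompile(b.String()).MatchString(name)
}

// ConfigFor returns the config of the first matching entry (an assumption of
// this sketch), or the default config when no entry matches.
func (mc MeterConfigurator) ConfigFor(meterName string) MeterConfig {
	for _, rule := range mc.Meters {
		if wildcardMatch(rule.Name, meterName) {
			return rule.Config
		}
	}
	return mc.DefaultConfig
}

func main() {
	mc := MeterConfigurator{
		DefaultConfig: MeterConfig{Disabled: true},
		Meters: []MeterRule{
			{Name: "io.opentelemetry.contrib.*", Config: MeterConfig{Disabled: false}},
		},
	}
	for _, name := range []string{"io.opentelemetry.contrib.foo", "otelcol.foo.basic"} {
		fmt.Printf("%s disabled=%t\n", name, mc.ConfigFor(name).Disabled)
	}
}
```

Note that the patterns are matched against meter names, not instrument names, so a rule of this kind switches a whole meter on or off rather than a single metric.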
After discussing it with @mx-psi, I think we could start with a simple "no views or (once implemented) meter_configurator unless level is set to custom/detailed" approach, and come back to this discussion once meter_configurator is actually implemented in the SDK; it should be easier to decide when the API is more concrete. |
I think there's been a misunderstanding on my part here: So if we want a more ergonomic approach to toggling metrics than "enable everything then filter with views" in the future, we will have to go back to my original "merge the views" idea, and/or add our own |
I got that wrong too, sorry :/ |
I don't think there was ever a direct answer to this question and I want to highlight this because IMO it means we haven't identified a solution that's going to be easy enough for the average user. I think anything which requires users to reason about views, readers, meters, or anything other than "this one metric that I care about" is going to be useful only to a tiny fraction of users.
In my opinion, this is the only sensible user facing way to enable and disable metrics, whatever the implementation may be.
Apologies for opining without much consideration for the implementation here, but the discussion seems to be headed towards exposing the user to internal concerns more than providing them with a simple solution. |
To answer your question directly: Let's assume that Under Pablo's proposal, using
Coming up with this requires knowledge of views and config merging, as well as undocumented and unstandardized knowledge about how conflicting views interact with each other in the SDK. (If I remember correctly from reading the code last time: If there is a matching non-drop view, apply the first one in the list. Otherwise, if there is a matching drop view, apply it. Otherwise, apply the default aggregation.) The simpler alternative users are likely to take if they get past step 2 is to avoid merging configs and directly modify the default list of views. But this means you now have a very large, custom config file. |
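The precedence rule recalled in the parenthetical above can be written down as a small Go sketch. It encodes only what the comment describes from memory (first matching non-drop view, else a matching drop view, else the default aggregation). It is not taken from the SDK source, and the `View` type and exact-name matching are simplifications assumed for the example.

```go
package main

import "fmt"

// View is a hypothetical, simplified view that selects an instrument by
// exact name and either drops it or keeps it with some stream settings.
type View struct {
	InstrumentName string
	Drop           bool
}

// resolve applies the recalled rule to one instrument. It returns the kind of
// outcome ("view", "drop" or "default") and the index of the view that
// decided it, or -1 when the default aggregation is used.
func resolve(views []View, instrument string) (kind string, index int) {
	dropIndex := -1
	for i, v := range views {
		if v.InstrumentName != instrument {
			continue
		}
		if !v.Drop {
			// The first matching non-drop view wins, even if a drop
			// view for the same instrument appears earlier in the list.
			return "view", i
		}
		if dropIndex < 0 {
			dropIndex = i
		}
	}
	if dropIndex >= 0 {
		return "drop", dropIndex
	}
	return "default", -1
}

func main() {
	views := []View{
		{InstrumentName: "otelcol.foo.basic", Drop: true},
		{InstrumentName: "otelcol.foo.basic", Drop: false},
		{InstrumentName: "otelcol.foo.detailed", Drop: true},
	}
	for _, name := range []string{"otelcol.foo.basic", "otelcol.foo.detailed", "otelcol.foo.other"} {
		kind, idx := resolve(views, name)
		fmt.Printf("%s: %s (view index %d)\n", name, kind, idx)
	}
}
```

If the rule is as recalled, list order alone does not let a user's drop view override a default non-drop view for the same instrument, which is part of why merging view lists is hard to reason about.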
I've opened a draft PR (#12433) implementing the part that the various proposals seem to agree on, i.e. allowing users to set |
Is your feature request related to a problem? Please describe.
Our notion of basic/normal/detailed telemetry levels is a quick and convenient way for users to tune the volume of the collector's own telemetry, but it is quite crude compared to what we provide to control telemetry not generated by the collector. For example, our mdatagen library provides the ability to enable or disable any individual metric. Not having this ability for the collector's own telemetry means that users may not be able to get the metrics they need without enabling all metrics.
Describe the solution you'd like
In the telemetry configuration, we should offer the ability to enable or disable individual metrics, regardless of telemetry level.
The telemetry level should be used as the starting point for which metrics are enabled or disabled, but then individual metric settings can override.
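As a minimal sketch of the requested behavior, assuming each metric has a minimum level and the user supplies a map of per-metric overrides, the decision could look like the Go below. The `Level` type, the `metricMinLevel` table, and the rule that unknown metrics are disabled are illustrative assumptions, not the collector's actual code or configuration schema.

```go
package main

import "fmt"

// Level is a hypothetical ordering of the collector's telemetry levels.
type Level int

const (
	LevelNone Level = iota
	LevelBasic
	LevelNormal
	LevelDetailed
)

// metricMinLevel maps each example metric to the lowest level that enables
// it by default. The table is illustrative only.
var metricMinLevel = map[string]Level{
	"otelcol.foo.basic":    LevelBasic,
	"otelcol.foo.detailed": LevelDetailed,
}

// enabled uses the level as the starting point and lets an explicit
// per-metric setting override it in either direction.
func enabled(metric string, level Level, overrides map[string]bool) bool {
	if v, ok := overrides[metric]; ok {
		return v
	}
	minLevel, known := metricMinLevel[metric]
	return known && level >= minLevel // unknown metrics stay off in this sketch
}

func main() {
	overrides := map[string]bool{
		"otelcol.foo.basic":    false, // turn off a metric the level would enable
		"otelcol.foo.detailed": true,  // turn on a metric the level would skip
	}
	for _, m := range []string{"otelcol.foo.basic", "otelcol.foo.detailed"} {
		fmt.Printf("%s: level only=%t, with overrides=%t\n",
			m, enabled(m, LevelNormal, nil), enabled(m, LevelNormal, overrides))
	}
}
```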