Skip to content

Implicit behavior for Counter metric #15

@connorjacobsen

Description

@connorjacobsen

I recently noticed that the reporter throws away the measurement field of Counter metrics. This can lead to some surprising outcomes.

As an illustration, consider the following metric definitions:

# Metrics to be passed to the TelemetryMetricsStatsd reporter.
metrics = [
  Telemetry.Metrics.counter([:foo, :bar, :baz], tags: [:a, :b]),
  Telemetry.Metrics.counter([:foo, :bar, :bat], tags: [:b, :c]),
]

Based on my understanding of the current implementation of the handle_event/4 function, the following code would trigger a metric to be published for both of the above metrics (one of which would have the incorrect tags and, I think, may actually cause the handler to detach):

:telemetry.execute([:foo, :bar, :baz], %{}, tags: %{a: 1, b: 2})

This can be avoid if you explicitly pass the event_name option, or if you ensure that the default event_name of your metric is unique (ie using the count suffix on everything would do the trick).

I am struggling to quickly reproduce this in the tests for this project (without explicitly specifying event name), but in that case it's simply not publishing anything so it may be a separate problem. I can revisit and try to break things properly over the weekend.

But the offending code seems to be: https://github.com/arkgil/telemetry_metrics_statsd/blob/0f79e2af612caa834ac522da2f6209cd930f5967/lib/telemetry_metrics_statsd/event_handler.ex#L88-L91

The counter here will loop through every metric with the shared prefix and publish a payload without ensuring that the metric in question matches the measurement of the fired event. Contrast this with the Telemetry.Metrics.ConsoleReporter module which will return nil in its fetch_measurement/2 equivalent and not publish if the event_name is the same but the measurement is different: https://github.com/beam-telemetry/telemetry_metrics/blob/v0.3.0/lib/telemetry_metrics/console_reporter.ex#L98

This behavior can be mitigated by post-fixing all counter metrics with.count, but that is not explicitly mentioned in the documentation (that I have been able to find).

If there's something going on that I'm just completely missing, I would be very happy to be wrong here 😄

PS. If you know how to get tests to work without explicitly specifying the event_name option to given_counter, it would help me reproduce this with a test faster. My initial discovery was from a test reporter module I wrote by copy-pasting the bones of this reporter. So while I am not certain this bug exists, I'm very suspicious.

Metadata

Metadata

Assignees

Labels

No labels
No labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions