4.x Add warning message if more than one MeterRegistry is created #10858

@tjquinno


Helidon Version: 4.x

Task Description

In Helidon MP, the last part of the MetricsCdiExtension start-up uses configuration to set up the behavior of the metrics subsystem and then creates the global MeterRegistry.

In two cases we know about, MP user apps and/or libraries either have their own CDI extension which registers meters or trigger async work that registers meters. Either way, a race condition results: meters can be registered before the Helidon MP metrics extension has initialized metrics according to the configuration. This causes two Helidon meter registries to be created, only one of which respects the user-specified configuration.

The outwardly observable symptom in these two cases has been a deadlock, partly in Micrometer code. The race condition creates two global Helidon MMeterRegistry instances which share the same single underlying Micrometer meter registry instance. Helidon does its own locking to prevent harmful concurrent access to its meter registry, but in this scenario there are two instances and therefore two locks. The two Helidon meter registry instances are unaware of each other's use of the same underlying Micrometer meter registry, which does its own locking, hence the deadlock.

Using multiple meter registries is a legitimate use case but should be very rare. Some of our own unit tests use this feature precisely to isolate meter registrations in different tests from each other, but the overwhelming majority of production apps should use a single meter registry.

This issue proposes that Helidon log a warning message when any additional MMeterRegistry is instantiated.

The first warning would include two stack traces, one for the first instantiation and one for the second. Any subsequent instantiation would trigger a warning including a stack trace for only the new instantiation. This would allow users to see exactly what code is triggering the multiple instantiations. That is in contrast to the deadlock situations we have seen so far: the deadlock is partly caused by async code, so the thread dump's stack trace for the asynchronous work does not directly show the code that caused the instantiation of the second meter registry.

A new configuration setting metrics.warn-on-multiple-registries will default to true but could be set to false to suppress the warnings.
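To make the proposed behavior concrete, here is a minimal sketch of the bookkeeping it would need. It is only an illustration under assumptions: `RegistryInstantiationTracker`, `recordInstantiation`, and the `warnOnMultiple` constructor argument are hypothetical names, not existing Helidon code, and the flag is assumed to be populated from `metrics.warn-on-multiple-registries` by whatever code reads the metrics configuration.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Hypothetical helper, not a Helidon class: counts registry instantiations
// and warns, with stack traces, once more than one has been created.
final class RegistryInstantiationTracker {

    private static final System.Logger LOGGER =
            System.getLogger(RegistryInstantiationTracker.class.getName());

    // Assumed to reflect metrics.warn-on-multiple-registries (default true).
    private final boolean warnOnMultiple;
    // Call site of the first instantiation, kept for the first warning.
    private Throwable firstInstantiation;
    private int instantiations;

    RegistryInstantiationTracker(boolean warnOnMultiple) {
        this.warnOnMultiple = warnOnMultiple;
    }

    /**
     * Records one registry instantiation.
     *
     * @return the warning text that was logged, or null if no warning was issued
     */
    synchronized String recordInstantiation() {
        // Capture the caller's stack without throwing anything.
        Throwable current = new Throwable("Meter registry instantiated here");
        instantiations++;
        if (instantiations == 1) {
            firstInstantiation = current;
            return null;
        }
        if (!warnOnMultiple) {
            return null;
        }
        StringBuilder message = new StringBuilder()
                .append("Meter registry instance #").append(instantiations)
                .append(" created; most applications should use only one.")
                .append(System.lineSeparator());
        if (instantiations == 2) {
            // Only the first warning also shows where the first registry came from.
            message.append("First instantiation:").append(System.lineSeparator())
                    .append(stackTraceOf(firstInstantiation));
        }
        message.append("This instantiation:").append(System.lineSeparator())
                .append(stackTraceOf(current));
        String text = message.toString();
        LOGGER.log(System.Logger.Level.WARNING, text);
        return text;
    }

    private static String stackTraceOf(Throwable t) {
        StringWriter buffer = new StringWriter();
        PrintWriter writer = new PrintWriter(buffer);
        t.printStackTrace(writer);
        writer.flush();
        return buffer.toString();
    }
}
```

In this sketch a single shared tracker would be invoked from the registry constructor, so each warning's stack trace points at the code that triggered the instantiation even when that code runs asynchronously. Returning the warning text is purely a convenience for testing; a real implementation would presumably only log.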

This problem would at least partly, perhaps wholly, be resolved once Helidon metrics migrates to using the service registry to store the "global" meter registry rather than the current implementation, which uses statics. But this migration would affect some delicate areas of the code beyond the MMeterRegistry constructor, hence this proposal for a less intrusive approach that could be easily removed once the migration to the service registry actually occurs.
