Introduce a way to tag clusters (and, optionally, nodes) #12552

Open
michaelklishin opened this issue Oct 19, 2024 · 3 comments

michaelklishin commented Oct 19, 2024

Per discussion with @SimonUnge but also @stefanmoser @mkuratczyk.

Problem Definition

Larger users can have thousands of clusters used for all kinds of purposes. Sometimes it may be
necessary to perform certain operations (e.g. upgrades) on a subset of them, e.g. those that are
used by development environments only.

Right now there are only so many ways of doing it:

  1. Keep this cluster metadata in an external store. This is not always an option
  2. Prefix the cluster name with production-* or something similar. This is fragile and allows for only a couple of tags

Cluster Tags

One solution would be to configure cluster metadata as a map (a set of key-value pairs) in rabbitmq.conf:

cluster_tags.environment = production

cluster_tags.region = us-east
cluster_tags.az = us-east-3

This map can then be added to GET /api/overview HTTP API responses. In Prometheus output, this could look something like this (please read the comments below; with Prometheus it is much more nuanced than it sounds):

rabbitmq_identity_cluster_tags{environment = "production", region = "us-east", az = "us-east-3"}
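
For the HTTP API side, the same map might appear in the GET /api/overview payload roughly as follows (cluster_name is an existing field; the cluster_tags key and its exact shape are speculative, other keys elided):

{
  "cluster_name": "us-east-prod-1",
  "cluster_tags": {
    "environment": "production",
    "region": "us-east",
    "az": "us-east-3"
  }
}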

This data can be stored in the node's application environment or in a global runtime parameter
(for "last write wins" consistency).
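
To illustrate the upgrade scenario from the problem definition, here is a minimal sketch of fleet tooling that selects clusters by tag through the HTTP API. The cluster_tags response field is the proposal above rather than an existing field, and the host names are made up; port 15672 and guest credentials are the management plugin defaults.

# Sketch only: assumes the proposed "cluster_tags" key is added to GET /api/overview.
import requests

def cluster_tags(base_url, user="guest", password="guest"):
    overview = requests.get(f"{base_url}/api/overview", auth=(user, password), timeout=10)
    overview.raise_for_status()
    # "cluster_tags" is the field proposed in this issue, not one that exists today.
    return overview.json().get("cluster_tags", {})

# Pick only development clusters out of a larger fleet, e.g. for an upgrade run.
fleet = ["http://rabbit-dev-1:15672", "http://rabbit-prod-1:15672"]
to_upgrade = [url for url in fleet if cluster_tags(url).get("environment") == "development"]
print(to_upgrade)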

Node Tags

The same idea can be extended to node tags, which would only be stored in the application environment, since this data is local to the node.

node_tags.environment = production

node_tags.region = us-east
node_tags.az = us-east-3

In Prometheus output, this could look approximately like this (again, please read the comments below; with Prometheus it is much more nuanced than it sounds):

rabbitmq_identity_node_tags{environment = "production", region = "us-east", az = "us-east-3"}

The usefulness of this is less clear.
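
That said, if such a metric existed, it could be joined onto other per-node metrics in the usual info-metric fashion, assuming the tags metric is a constant gauge with value 1. A sketch in PromQL, where the left-hand metric is just an example of a per-node gauge:

rabbitmq_process_resident_memory_bytes
  * on (instance) group_left (environment)
  rabbitmq_identity_node_tags

This copies the environment label onto the memory metric so that dashboards and alerts can filter on it.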

ansd commented Oct 20, 2024

From a pure Prometheus point of view, I see two potential anti-patterns here:

The suggested new Prometheus metrics rabbitmq_identity_cluster_tags and rabbitmq_identity_node_tags are info metrics. According to Prometheus best practices:

The gauge should have the suffix _info

Hence, instead of introducing new Prometheus info metrics, I think it's better to re-use the existing Prometheus metric rabbitmq_identity_info.
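
A sketch of what that might look like, with the proposed tags appended as extra labels on the existing metric (the existing label set shown here is abbreviated and approximate):

rabbitmq_identity_info{rabbitmq_node="rabbit@host-1",rabbitmq_cluster="my-cluster",environment="production",region="us-east",az="us-east-3"} 1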

These labels are meant to be target labels rather than instrumentation labels. In other words, RabbitMQ itself should probably not emit labels such as {environment = "production", region = "us-east", az = "us-east-3"}.
This is explained in "Prometheus: Up & Running":

Labels come from two sources, instrumentation labels and target labels. When you are working in PromQL there is no difference between the two, but it’s important to distinguish between them in order to get the most benefits from labels.

Instrumentation labels, as the name indicates, come from your instrumentation. They are about things that are known inside your application or library, such as the type of HTTP requests it receives, which databases it talks to, and other internal specifics.

Target labels identify a specific monitoring target; that is, a target that Prometheus scrapes. A target label relates more to your architecture and may include which application it is, what datacenter it lives in, if it is in a development or production environment, which team owns it, and of course, which exact instance of the application it is. Target labels are attached by Prometheus as part of the process of scraping metrics. Different Prometheus servers run by different teams may have different views of what a “team,” “region,” or “service” is, so an instrumented application should not try to expose such labels itself. Accordingly, you will not find any features in client libraries to add labels across all metrics of a target. Target labels come from service discovery and relabelling and are discussed further in Chapter 8.
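
For reference, this is roughly how such target labels are usually attached on the Prometheus side rather than by the application, via scrape configuration or relabelling (the target address is made up; 15692 is the default rabbitmq_prometheus port):

scrape_configs:
  - job_name: rabbitmq
    static_configs:
      - targets: ["rabbit-1.example.com:15692"]
        labels:
          environment: production
          region: us-east
          az: us-east-3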

@michaelklishin

@ansd this feature is not 100% Prometheus-specific, so even if the Prometheus view of the world assumes that these are "deployment labels", RabbitMQ will have to log and expose them e.g. in the HTTP API and CLI tools in order to make them useful.

@SimonUnge would it be possible to configure your monitoring and upgrade tooling to comply with the monitoring target concept quoted above?

ansd commented Oct 21, 2024

@michaelklishin yes, my comment above was purely about Prometheus' point of view. Tagging/labelling RabbitMQ nodes with key/value metadata and exposing this metadata via the HTTP API, CLI tools and logs makes sense to me.
