Fetching external metrics from Prometheus #1711

Merged
merged 27 commits into from
Sep 19, 2023

Conversation

faderskd
Contributor

No description provided.

@faderskd faderskd marked this pull request as ready for review August 29, 2023 16:24
@JsonProperty("data") Data data) {

boolean isSuccess() {
return status.equals("success") && data.isVector();
Contributor

question: It seems that we're only interested in vector metrics, right?

}

@JsonIgnoreProperties(ignoreUnknown = true)
record Result(
Contributor

follow-up suggestion: If we are interested only in vector metrics, then maybe we can change this record name to VectorResult? Accordingly, values (L26) could be renamed to vector.

Contributor Author

@faderskd faderskd Sep 6, 2023

Yes, we are only interested in vectors. Good idea to rename it.
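
For context, a minimal sketch of what a vector-only response model could look like after the rename. The record and field names (`PrometheusResponse`, `Data`, `VectorResult`, `resultType`) are assumptions for illustration, not the code from this PR, and the Jackson annotations visible in the diff are left out so the snippet needs nothing beyond the JDK. It relies only on the documented shape of a Prometheus instant query response, where `status` is `"success"` and `data.resultType` is `"vector"`.

```java
import java.util.List;
import java.util.Map;

// Hypothetical sketch: names are illustrative and do not come from the PR.
public class VectorResponseSketch {

    // One sample of an instant vector: the label set plus its timestamp and value.
    // Prometheus serializes the sample value as a string, so it is kept as a String here.
    public record VectorResult(Map<String, String> metric, double timestamp, String value) {
    }

    public record Data(String resultType, List<VectorResult> results) {
        public boolean isVector() {
            return "vector".equals(resultType);
        }
    }

    public record PrometheusResponse(String status, Data data) {
        // Constant-first equals avoids a NullPointerException when status is missing.
        public boolean isSuccess() {
            return "success".equals(status) && data != null && data.isVector();
        }
    }
}
```

Any non-vector result type (for example `matrix`) is treated as a failure here, which matches the intent confirmed in the reply above.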

this.prometheusMetricsCache = CacheBuilder.newBuilder()
.ticker(ticker)
.expireAfterWrite(cacheTtlInSeconds, SECONDS)
.maximumSize(cacheSize)
Contributor

potential issue: The default cache size is 100 000. Isn't that too much (for Prometheus metrics)?

Prometheus cached metrics take more memory than Graphite cached metrics.

Graphite cached metric:

public class MetricDecimalValue {
    private final boolean available;
    private final String value;
    ...
}

Prometheus cached metric:

public class MonitoringMetricsContainer {
    private final Map<String, MetricDecimalValue> metrics; // <- A Map - it can store multiple elements per one cache entry
    private final boolean isAvailable;
    ...
}

What do you think about it?

Contributor Author

The difference should not be huge, because:

  1. CachingGraphiteClient fetches multiple metrics with readMetrics(String... multipleMetricPaths) and dispatches them into multiple entries:
cache[singleMetricPath1] = MetricDecimalValue(...)
cache[singleMetricPath2] = MetricDecimalValue(...)
  2. CachingPrometheusClient fetches multiple metrics with one query, readMetrics(String query), and puts them into a single entry:
cache[query] = Map<String, MetricDecimalValue>(...)

The cache hit rate will be the same in both cases, even though CachingGraphiteClient stores metrics more granularly: we always query in the context of a single subscription/topic, so no metric paths are shared between different queries.
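
The two layouts described above can be sketched with plain maps to compare entry counts. This is an illustration only: `CacheLayoutSketch`, its method names, the key formats, and the placeholder value `"0.0"` are all hypothetical, and plain `HashMap`s stand in for the real caches.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: plain maps stand in for the real caches.
public class CacheLayoutSketch {

    // Graphite-style layout: one cache entry per metric path.
    public static Map<String, String> perPathCache(List<String> owners, List<String> metricNames) {
        Map<String, String> cache = new HashMap<>();
        for (String owner : owners) {
            for (String metric : metricNames) {
                cache.put(owner + "." + metric, "0.0");
            }
        }
        return cache;
    }

    // Prometheus-style layout: one cache entry per query, holding every metric of that query.
    public static Map<String, Map<String, String>> perQueryCache(List<String> owners, List<String> metricNames) {
        Map<String, Map<String, String>> cache = new HashMap<>();
        for (String owner : owners) {
            Map<String, String> values = new HashMap<>();
            for (String metric : metricNames) {
                values.put(metric, "0.0");
            }
            cache.put("query{owner=\"" + owner + "\"}", values);
        }
        return cache;
    }

    // Total number of metric values held across all entries of the per-query layout.
    public static int storedValues(Map<String, Map<String, String>> cache) {
        return cache.values().stream().mapToInt(Map::size).sum();
    }
}
```

With three subscriptions and two metrics each, the per-path layout holds six entries and the per-query layout holds three, while both keep six metric values in total. The same `maximumSize` therefore admits more values in the per-query layout, which is the memory concern raised in the review.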

Contributor Author

Right now we don't have any cache metrics, but we should probably add a few in the future.

@pitagoras3
Contributor

question: One more thing - what about the documentation? :) Would you like to handle it in a separate PR, or can you add docs about the Prometheus metrics in this PR?

pitagoras3
pitagoras3 previously approved these changes Sep 8, 2023
moscicky
moscicky previously approved these changes Sep 19, 2023
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class PrometheusMetricsProvider implements MonitoringSubscriptionMetricsProvider, MonitoringTopicMetricsProvider {
Collaborator

suggestion: we should probably name this one VictoriaMetricsMetricsProvider


import java.util.function.Supplier;

import static org.apache.commons.lang.exception.ExceptionUtils.getRootCauseMessage;
import static pl.allegro.tech.hermes.common.metric.HermesMetrics.escapeDots;

@Component
Collaborator

@moscicky moscicky Sep 19, 2023

It would be nice to be able to disable monitoring entirely, but we can do that in the future.

@faderskd faderskd dismissed stale reviews from moscicky and pitagoras3 via c29fe79 September 19, 2023 10:34
@faderskd faderskd merged commit 079f0a1 into master Sep 19, 2023
24 checks passed
3 participants