feat(helm v5): Remove prometheus and telegraf receiver#4081
'{{ include "metric.endpoints" . }}'
*/}}
{{- define "metric.endpoints" -}}
`metric.endpoints` is used by the Telegraf receiver in the metadata metrics layer. Since the Telegraf receiver has been removed, this helper is removed as well.
| `metadata.metrics.waitForMetadataTimeout` | Wait for Metadata timeout | `10s` |
| `metadata.metrics.config.merge` | Configuration for metrics metadata otelcol, merged with defaults. See also https://github.com/SumoLogic/sumologic-otel-collector/blob/main/docs/configuration.md. | {} |
| `metadata.metrics.config.override` | Configuration for metrics metadata otelcol, replaces defaults.See also https://github.com/SumoLogic/sumologic-otel-collector/blob/main/docs/configuration.md. | {} |
| `metadata.metrics.config.additionalEndpoints` | List of additional endpoints for Open Telemetry Metadata Pod. | `[]` |
This key was used to define `metric.endpoints` for the Telegraf receiver, which has now been removed.
override: {}

## List of additional endpoints to be handled by Metrics Metadata Pods
additionalEndpoints: []
This field was used to define `metric.endpoints` for the Telegraf receiver, which has now been removed.
…logic-kubernetes-collection into prometheus-remove
Pull request overview
This PR finalizes Helm chart v5’s metrics pipeline by removing Prometheus Operator / Prometheus remote-write and the Telegraf receiver from the metadata-metrics components, leaving OpenTelemetry as the sole supported path for metrics ingestion.
Changes:
- Removed Prometheus remote-write/receiver plumbing (9888 ports, nginx proxy route, Telegraf receiver config) from Helm templates and rendered golden files.
- Deleted Prometheus-focused examples and integration/unit tests (remote-write validation, Prometheus metrics integration tests).
- Updated docs and chart metadata to reflect Prometheus deprecation/removal in v5 (plus changelog + config-key checker adjustments).
Reviewed changes
Copilot reviewed 38 out of 39 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| vagrant/scripts/yq/avalanche-remote-write.yaml | Removes Prometheus remote-write override example. |
| tests/integration/values/values_helm_default_ot_namespaceoverride.yaml | Drops integration values file that relied on Prometheus for namespaceOverride. |
| tests/integration/helm_prometheus_metrics_test.go | Removes Prometheus-metrics integration test. |
| tests/integration/helm_ot_default_namespaceoverride_test.go | Removes OT default namespaceOverride integration test that validated Prometheus metrics path. |
| tests/helm/testdata/remotewrite/remotewrites-defined-prometheus-enabled.yaml | Removes Prometheus remote-write test values. |
| tests/helm/testdata/remotewrite/remotewrites-defined-prometheus-disabled.yaml | Removes Prometheus remote-write test values. |
| tests/helm/testdata/remotewrite/metrics-disabled.yaml | Removes remote-write-related test values for “metrics disabled”. |
| tests/helm/testdata/goldenfile/remote_write_proxy/full_configmap.output.yaml | Updates expected nginx configmap output to remove Prometheus upstream/server. |
| tests/helm/testdata/goldenfile/metadata_metrics_otc_statefulset/pvcpolicyenabled.output.yaml | Updates expected ports by removing prom-write (9888). |
| tests/helm/testdata/goldenfile/metadata_metrics_otc_statefulset/custom.output.yaml | Updates expected ports by removing prom-write (9888). |
| tests/helm/testdata/goldenfile/metadata_metrics_otc_statefulset/basic.output.yaml | Updates expected ports by removing prom-write (9888). |
| tests/helm/testdata/goldenfile/metadata_metrics_otc/filtered_app_metrics.output.yaml | Removes Telegraf receiver from expected OTC config output. |
| tests/helm/testdata/goldenfile/metadata_metrics_otc/debug_with_sumologic_mock_http_routing_connector.output.yaml | Removes Telegraf receiver from expected OTC config output. |
| tests/helm/testdata/goldenfile/metadata_metrics_otc/debug_with_sumologic_mock_http.output.yaml | Removes Telegraf receiver from expected OTC config output. |
| tests/helm/testdata/goldenfile/metadata_metrics_otc/debug_with_sumologic_mock.output.yaml | Removes Telegraf receiver from expected OTC config output. |
| tests/helm/testdata/goldenfile/metadata_metrics_otc/debug.output.yaml | Removes Telegraf receiver from expected OTC config output. |
| tests/helm/testdata/goldenfile/metadata_metrics_otc/custom_routing_connector.output.yaml | Removes Telegraf receiver from expected OTC config output. |
| tests/helm/testdata/goldenfile/metadata_metrics_otc/basic.output.yaml | Removes Telegraf receiver from expected OTC config output. |
| tests/helm/testdata/goldenfile/metadata_metrics_otc/allow_histograms.output.yaml | Removes Telegraf receiver from expected OTC config output. |
| tests/helm/testdata/goldenfile/metadata_metrics_otc/additional_endpoints.output.yaml | Removes Telegraf receiver + additionalEndpoints handling from expected OTC output. |
| tests/helm/remotewrite_validation_test.go | Removes Helm test validating remote-write behavior. |
| tests/helm/prometheus_test.go | Removes ServiceMonitor rendering tests that depended on Prometheus/kube-prometheus-stack templates. |
| examples/kube_prometheus_stack/values.yaml | Removes kube-prometheus-stack example values file. |
| examples/kube_prometheus_stack/values-prometheus.yaml | Removes large Prometheus-focused example values file. |
| docs/prometheus.md | Adds v5 deprecation notice and retitles Prometheus docs as “till v4”. |
| docs/kube-prometheus.md | Adds v5 deprecation notice and retitles kube-prometheus mixin docs as “till v4”. |
| deploy/helm/sumologic/values.yaml | Removes Prometheus operator/prometheusSpec config blocks; updates comments; removes additionalEndpoints field. |
| deploy/helm/sumologic/templates/metrics/remote-write-proxy/service.yaml | Removes Prometheus service port (9888); keeps OTLP port (4318). |
| deploy/helm/sumologic/templates/metrics/otelcol/statefulset.yaml | Removes prom-write container port (9888) from metadata-metrics otelcol. |
| deploy/helm/sumologic/templates/metrics/common/service.yaml | Removes prom-write service port (9888). |
| deploy/helm/sumologic/templates/metrics/common/service-headless.yaml | Removes prom-write service port (9888). |
| deploy/helm/sumologic/templates/_helpers/_metrics.tpl | Removes helper that generated Telegraf endpoint list (remote-write endpoints + additionalEndpoints). |
| deploy/helm/sumologic/conf/metrics/remote-write-proxy/remote-write-proxy.conf | Removes Prometheus listener/upstream and keeps only OTLP proxying. |
| deploy/helm/sumologic/conf/metrics/otelcol/pipeline.yaml | Removes Telegraf receiver from metrics pipeline. |
| deploy/helm/sumologic/conf/metrics/otelcol/config.yaml | Removes Telegraf receiver configuration entirely. |
| deploy/helm/sumologic/README.md | Removes Prometheus/prometheusOperator and additionalEndpoints keys from the documented values list. |
| ci/check_configuration_keys.py | Skips documentation-key validation for prometheus/prometheusOperator enabled flags. |
| .changelog/4081.changed.text | Adds changelog entry for v5 removal of Prometheus operator + Telegraf receiver. |
Comments suppressed due to low confidence (1)
deploy/helm/sumologic/conf/metrics/remote-write-proxy/remote-write-proxy.conf:13
- remote-write-proxy.conf now only listens on 4318, but the remote-write-proxy Deployment still exposes and probes `.Values.sumologic.metrics.remoteWriteProxy.config.port` (default 8080). With this change, liveness/readiness probes will fail and the extra port is unused. Please reconcile the listen/probe/service ports (e.g., probe 4318 and remove the configurable port, or make nginx listen on the configured port and expose it accordingly).
server {
    listen 4318 default_server;
    {{- if not .Values.sumologic.metrics.remoteWriteProxy.config.enableAccessLogs }}
    access_log off;
    {{- end }}
    location / {
        client_body_buffer_size {{ .Values.sumologic.metrics.remoteWriteProxy.config.clientBodyBufferSize }};
        proxy_pass http://remote_otel;
    }
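One way to reconcile the ports, following the first option suggested above, is to point the probes at 4318. The fragment below is only a sketch: the actual Deployment template is not shown in this conversation, so the container name, port name, and timing values are assumptions. It uses a plain TCP probe because nothing here confirms that the proxied OTLP endpoint answers an HTTP GET with a success code.

```yaml
# Hypothetical excerpt of the remote-write-proxy Deployment container spec.
# Only port 4318 comes from the nginx config above; the container name,
# port name, and probe timings are illustrative assumptions.
containers:
  - name: nginx                 # assumed container name
    ports:
      - name: otlp-http         # assumed port name
        containerPort: 4318
    livenessProbe:
      tcpSocket:
        port: 4318              # probe the port nginx actually listens on
      initialDelaySeconds: 5
      periodSeconds: 10
    readinessProbe:
      tcpSocket:
        port: 4318
      periodSeconds: 10
```

If the configurable port is kept instead, the nginx `listen` directive would need to use that value so that probes and listener stay in sync.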
Co-authored-by: Copilot <[email protected]>
As per Claude:
File: deploy/helm/sumologic/conf/metrics/remote-write-proxy/remote-write-proxy.conf:13
Issue: Copilot flagged this. The nginx config now listens only on port 4318, but the remote-write-proxy Deployment still has liveness/readiness probes configured for port 8080 (`.Values.sumologic.metrics.remoteWriteProxy.config.port`).
Impact: Health checks will fail and pods won't become ready.
Comment to add:
Otherwise, the remote-write-proxy pods won't pass health checks.
Removes the Prometheus operator (from kube-prometheus-stack) and the Telegraf receiver in the metrics metadata pods.
This is a large PR. In short, the changes are:
In the Helm v4 release, we already made the OpenTelemetry operator the default for metrics and disabled the Prometheus operator.
In the v5 release, we are removing Prometheus operator support and using the OpenTelemetry operator as the single source for metrics collection.
Customer impact:
Customers using the Helm v4 default settings (OpenTelemetry operator) for metrics collection: no impact.
Customers using the Prometheus operator for metrics by overriding `kube-prometheus-stack.prometheusOperator.enabled`: follow https://www.sumologic.com/help/docs/send-data/kubernetes/v4/how-to-upgrade/#metrics-migration
The only functionality we are removing is standalone Prometheus sending data to Sumo Logic via remote write, which is no longer supported. We need to ask those customers to let the OpenTelemetry operator collect metrics using scrape annotations or ServiceMonitors.
The OpenTelemetry operator collects metrics the same way the Prometheus operator does, i.e. by using scrape annotations on pods or ServiceMonitors to identify the endpoints to scrape.
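For customers moving off remote write, the two discovery mechanisms mentioned above could look roughly like this. It is a sketch under assumptions: the application name, image, labels, and port are hypothetical, and the `prometheus.io/*` annotation names follow the common convention rather than anything confirmed in this PR. The ServiceMonitor uses the standard `monitoring.coreos.com/v1` API.

```yaml
# Option 1: scrape annotations on the pod template.
# Annotation names follow the common prometheus.io convention (assumed).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                  # hypothetical application
spec:
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8080"
        prometheus.io/path: "/metrics"
    spec:
      containers:
        - name: my-app
          image: example/my-app:latest   # hypothetical image
          ports:
            - name: metrics
              containerPort: 8080
---
# Option 2: a ServiceMonitor selecting a Service labelled app=my-app.
# `port` refers to a named port on that Service, not on the pod.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-app
spec:
  selector:
    matchLabels:
      app: my-app
  endpoints:
    - port: metrics
      path: /metrics
      interval: 30s
```

Either option replaces the standalone Prometheus remote-write path; which one fits depends on whether the customer already manages ServiceMonitor resources.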
When to merge: Get this reviewed and merge it once the other Helm v5 changes are done as well. Once this is merged into main, we can't do a v4 release, so it is better to keep all v5 changes ready and then merge them together.
Checklist