feat(prometheus): expose controlplane connectivity state as a gauge #14020

aryan9600 · 2024-12-13T11:14:32Z

Summary

Add a new Prometheus gauge metric `control_plane_connected`. Similar to the `datastore_reachable` gauge, 0 means the connection is not healthy and 1 means it is healthy. We mark the connection as unhealthy under the following circumstances:

* Failure while establishing a websocket connection
* Failure while sending basic information to the controlplane
* Failure while sending a ping to the controlplane
* Failure while receiving a packet from the websocket connection

This helps users running a significant number of gateways get alerted when any gateway has trouble talking to the controlplane.
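The gauge semantics described above can be sketched as follows. This is only an illustration in Python, not the Lua code from this PR: the `Gauge`, `Failure`, and `ConnectionTracker` names are hypothetical, and both the initial value of 0 and the point at which the gauge flips back to 1 are assumptions that the summary does not spell out.

```python
from enum import Enum


class Failure(Enum):
    """The four failure cases listed in the summary."""
    CONNECT = "establishing the websocket connection"
    SEND_BASIC_INFO = "sending basic information to the controlplane"
    SEND_PING = "sending a ping to the controlplane"
    RECEIVE = "receiving a packet from the websocket connection"


class Gauge:
    """Tiny stand-in for a Prometheus gauge (hypothetical, not Kong's API)."""

    def __init__(self, name):
        self.name = name
        self.value = 0  # assumption: report "not connected" until a connection succeeds

    def set(self, value):
        self.value = value

    def render(self):
        # Simplified text exposition line: "<name> <value>"
        return f"{self.name} {self.value}"


class ConnectionTracker:
    """Maps connection events onto the 0/1 gauge."""

    def __init__(self, gauge):
        self.gauge = gauge
        self.last_failure = None

    def mark_healthy(self):
        # Assumption: called once the connection is established and working.
        self.last_failure = None
        self.gauge.set(1)

    def mark_unhealthy(self, failure):
        # Any of the four failure cases drives the gauge to 0.
        self.last_failure = failure
        self.gauge.set(0)


gauge = Gauge("control_plane_connected")
tracker = ConnectionTracker(gauge)
tracker.mark_healthy()
tracker.mark_unhealthy(Failure.SEND_PING)
print(gauge.render())
```

Because the metric is a plain 0/1 gauge, an alert can fire whenever a gateway reports 0 for longer than some tolerance window, which matches how `datastore_reachable` is described.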

Checklist

The Pull Request has tests
A changelog file has been created under changelog/unreleased/kong or skip-changelog label added on PR if changelog is unnecessary. README.md
There is a user-facing docs PR against https://github.com/Kong/docs.konghq.com - PUT DOCS PR HERE

Issue reference

Fix #[issue number]

kong/plugins/prometheus/exporter.lua

kong/clustering/data_plane.lua

spec/03-plugins/26-prometheus/04-status_api_spec.lua

Add a new Prometheus gauge metric `control_plane_connected`. Similar to the `datastore_reachable` gauge, 0 means the connection is not healthy and 1 means it is healthy. We mark the connection as unhealthy under the following circumstances:

* Failure while establishing a websocket connection
* Failure while sending basic information to the controlplane
* Failure while sending a ping to the controlplane
* Failure while receiving a packet from the websocket connection

This helps users running a significant number of gateways get alerted when any gateway has trouble talking to the controlplane.

Signed-off-by: Sanskar Jaiswal <[email protected]>

flrgh

Nicely done!

flrgh · 2025-02-11T18:55:16Z

/cherry-pick
(does this work?) edit: no it doesn't

pull-request-size bot added the size/M label Dec 13, 2024

github-actions bot assigned aryan9600 Dec 13, 2024

github-actions bot added the core/clustering, plugins/prometheus, and cherry-pick kong-ee (schedule this PR for cherry-picking to kong/kong-ee) labels Dec 13, 2024

aryan9600 force-pushed the cp-conn-prom-metric branch 3 times, most recently from c2d6278 to 697c1e6 Compare December 23, 2024 13:32

pull-request-size bot added size/L and removed size/M labels Dec 23, 2024

aryan9600 requested review from chronolaw and bungle December 23, 2024 13:32

aryan9600 marked this pull request as ready for review December 23, 2024 16:21

aryan9600 force-pushed the cp-conn-prom-metric branch from 697c1e6 to de4a868 Compare January 2, 2025 11:08

RobSerafini requested review from gszr and flrgh January 7, 2025 19:22

hbagdi reviewed Jan 7, 2025

kong/plugins/prometheus/exporter.lua (outdated, resolved)

aryan9600 force-pushed the cp-conn-prom-metric branch from de4a868 to ee922f2 Compare January 10, 2025 10:24