Conversation

@xeniape xeniape commented Oct 15, 2025

Description

Part of stackabletech/issues#747
This PR adds a metrics service with the additional Prometheus annotations. It also adds documentation on monitoring for the TLS case because, similar to NiFi, HBase exposes its metrics on a port that is secured by TLS.

Follow-up PR for the monitoring stack, needed because the port name changed with the move to the metrics service: stackabletech/demos#316

Definition of Done Checklist

  • Not all of these items are applicable to all PRs, the author should update this template to only leave the boxes in that are relevant
  • Please make sure all these things are done and tick the boxes

Author

  • Changes are OpenShift compatible
  • CRD changes approved
  • CRD documentation for all fields, following the style guide.
  • Helm chart can be installed and deployed operator works
  • Integration tests passed (for non trivial changes)
  • Changes need to be "offline" compatible
  • Links to generated (nightly) docs added
  • Release note snippet added

Reviewer

  • Code contains useful comments
  • Code contains useful logging statements
  • (Integration-)Test cases added
  • Documentation added or updated. Follows the style guide.
  • Changelog updated
  • Cargo.toml only contains references to git tags (not specific commits or branches)

Acceptance

  • Feature Tracker has been updated
  • Proper release label has been added
  • Links to generated (nightly) docs added
  • Release note snippet added
  • Add type/deprecation label & add to the deprecation schedule
  • Add type/experimental label & add to the experimental features tracker

@xeniape xeniape moved this to Development: Waiting for Review in Stackable Engineering Oct 15, 2025
@Techassi Techassi left a comment

Looks mostly good to me, just some minor suggestions and a few questions.

I only looked at the Rust code - let me know if I should look at the docs as well.

Comment on lines 551 to 565
    pub fn metrics_ports(&self, role: &HbaseRole) -> Vec<(String, u16)> {
        match role {
            HbaseRole::Master => vec![(
                HBASE_METRICS_PORT_NAME.to_string(),
                HBASE_MASTER_METRICS_PORT,
            )],
            HbaseRole::RegionServer => vec![(
                HBASE_METRICS_PORT_NAME.to_string(),
                HBASE_REGIONSERVER_METRICS_PORT,
            )],
            HbaseRole::RestServer => {
                vec![(HBASE_METRICS_PORT_NAME.to_string(), HBASE_REST_METRICS_PORT)]
            }
        }
    }

note: The mapping of role to metrics port exists in two places, here and below at line 575, which carries the risk that the two drift apart. I think it makes sense to keep this mapping in a single place instead.

One possible solution is to drop the associated metrics_port function and use .map() to extract only the port numbers when needed.
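
The suggestion above could look roughly like the following sketch. It is not the operator's actual code: the free functions (instead of methods taking &self), the helper name metrics_port_numbers, the port name "metrics", and all port values are placeholders chosen for illustration.

```rust
// Placeholder values for illustration only; the real constants live in the
// operator crate and may differ.
const HBASE_METRICS_PORT_NAME: &str = "metrics";
const HBASE_MASTER_METRICS_PORT: u16 = 10001;
const HBASE_REGIONSERVER_METRICS_PORT: u16 = 10002;
const HBASE_REST_METRICS_PORT: u16 = 10003;

// Simplified stand-in for the operator's role enum.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum HbaseRole {
    Master,
    RegionServer,
    RestServer,
}

/// The only place where a role is mapped to its metrics port.
pub fn metrics_ports(role: HbaseRole) -> Vec<(String, u16)> {
    let port = match role {
        HbaseRole::Master => HBASE_MASTER_METRICS_PORT,
        HbaseRole::RegionServer => HBASE_REGIONSERVER_METRICS_PORT,
        HbaseRole::RestServer => HBASE_REST_METRICS_PORT,
    };
    vec![(HBASE_METRICS_PORT_NAME.to_string(), port)]
}

/// Hypothetical helper: callers that only need the numbers derive them from
/// `metrics_ports` with `.map()` instead of keeping a second `match`.
pub fn metrics_port_numbers(role: HbaseRole) -> Vec<u16> {
    metrics_ports(role)
        .into_iter()
        .map(|(_name, port)| port)
        .collect()
}
```

With this shape, adding a role or changing a port touches a single match, and the port-only view cannot drift from the named view.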

Yeah this is a little redundant, will check how to clean it up.

            &rolegroup.role_group,
        ))
        .context(ObjectMetaSnafu)?
        .with_label(Label::try_from(("prometheus.io/scrape", "true")).context(LabelBuildSnafu)?)

question: Does it make sense to pull this key out into a constant? Or make it an associated function on Label, like Label::prometheus_scrape()?
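
For illustration, the associated-function variant might be shaped like the sketch below. The Label struct here is a hypothetical local stand-in, not the operator framework's type, and prometheus_scrape() does not exist there at the time of this review; only the key "prometheus.io/scrape" and the value "true" come from the code under discussion.

```rust
/// Key taken from the snippet under review, pulled out into a constant.
pub const PROMETHEUS_SCRAPE_KEY: &str = "prometheus.io/scrape";

/// Hypothetical stand-in for the framework's label type; it skips the
/// validation a real label type would perform.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Label {
    key: String,
    value: String,
}

impl Label {
    pub fn new(key: &str, value: &str) -> Self {
        Self {
            key: key.to_owned(),
            value: value.to_owned(),
        }
    }

    /// Hypothetical associated function: builds the well-known scrape label
    /// so call sites no longer repeat the string literals.
    pub fn prometheus_scrape() -> Self {
        Self::new(PROMETHEUS_SCRAPE_KEY, "true")
    }

    pub fn key(&self) -> &str {
        &self.key
    }

    pub fn value(&self) -> &str {
        &self.value
    }
}
```

Because the key and value are fixed and known to be valid, such a constructor could be infallible, which would also remove the error-context call at each call site.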

I like that and did not think about that. This would mean touching every operator again... I could prepare it in operator-rs and maybe put it on a pre-release bump list or something?

Comment on lines 847 to 865
    fn prometheus_annotations(hbase: &v1alpha1::HbaseCluster, hbase_role: &HbaseRole) -> Annotations {
        Annotations::try_from([
            ("prometheus.io/path".to_owned(), "/prometheus".to_owned()),
            (
                "prometheus.io/port".to_owned(),
                hbase.metrics_port(hbase_role).to_string(),
            ),
            (
                "prometheus.io/scheme".to_owned(),
                if hbase.has_https_enabled() {
                    "https".to_owned()
                } else {
                    "http".to_owned()
                },
            ),
            ("prometheus.io/scrape".to_owned(), "true".to_owned()),
        ])
        .expect("should be valid annotations")
    }

question: Similar question to above - does it make sense to pull these out into constants or an associated function on Annotation?
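
A rough sketch of the constants variant, assuming a plain BTreeMap as a stand-in for the framework's Annotations type. The constant names and the function signature are made up for illustration; the keys and the http/https switch are taken from the function under review.

```rust
use std::collections::BTreeMap;

// Annotation keys from the function under review, named once.
pub const PROMETHEUS_PATH_KEY: &str = "prometheus.io/path";
pub const PROMETHEUS_PORT_KEY: &str = "prometheus.io/port";
pub const PROMETHEUS_SCHEME_KEY: &str = "prometheus.io/scheme";
pub const PROMETHEUS_SCRAPE_KEY: &str = "prometheus.io/scrape";

/// Hypothetical shared helper: builds the four scrape annotations from the
/// values that differ per product (path, port, TLS on or off).
pub fn prometheus_annotations(path: &str, port: u16, https: bool) -> BTreeMap<String, String> {
    let scheme = if https { "https" } else { "http" };
    BTreeMap::from([
        (PROMETHEUS_PATH_KEY.to_owned(), path.to_owned()),
        (PROMETHEUS_PORT_KEY.to_owned(), port.to_string()),
        (PROMETHEUS_SCHEME_KEY.to_owned(), scheme.to_owned()),
        (PROMETHEUS_SCRAPE_KEY.to_owned(), "true".to_owned()),
    ])
}
```

A call such as prometheus_annotations("/prometheus", port, tls_enabled) would then replace the per-operator string literals, with port and tls_enabled supplied by the caller.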

Same as above, I like the idea.

@Techassi Techassi moved this from Development: Waiting for Review to Development: In Review in Stackable Engineering Oct 17, 2025
@sbernauer sbernauer assigned maltesander and unassigned xeniape Oct 20, 2025
@maltesander maltesander requested a review from Techassi October 27, 2025 13:22
Projects

Status: Development: In Review

4 participants