Fix 'ceph_disk_occupation' query expressions #2812
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: aruniiird. The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@weirdwiz, @jmolmo, @umangachapagain please take a look.
0b43c92 to 7f6fa87
I think the change is OK. As you can see, this metric never had the label "exported_instance", so the change in the label name probably comes from the ODF side. You will probably need to check and understand when and why this label changed, and after that verify that it does not affect other metrics.
Need to address changes in 'ceph_disk_occupation' metric labels.
What is the change in 'ceph_disk_occupation' metric?
'ceph_disk_occupation' result no longer has 'exported_instance' label, instead it has 'instance' label.
What is the issue we are facing because of it?
We are hitting 'PrometheusRuleFailures' due to this label change in our alerts / rules, where this metric is used.
The second issue is that some of the query expressions return no results.
What is the solution?
Update the query expressions, change 'exported_instance' to 'instance'. Any 'label_replace' action which changes 'exported_instance' label to 'instance' label is no longer required (as the 'instance' label is directly available now).
Signed-off-by: Arun Kumar Mohan <[email protected]>
7f6fa87 to 2d544db
@aruniiird: The following test failed, say
@aruniiird Would this be a blocker in 4.17?
Correct, @jmolmo. I checked on the ODF / OCS side and couldn't find much. There is a chance that these records/alerts were not working for a long time. The current changes (with this PR) work, which enables those named records and alerts from now onwards.
@malayparida2000, this won't be a blocker (as the query may not have worked for some time), but it is a good candidate for a 4.17 z-stream release and for newer (4.18) releases.
Need to address changes in 'ceph_disk_occupation' metric labels.
What is the change in 'ceph_disk_occupation' metric?
'ceph_disk_occupation' result no longer has 'exported_instance' label, instead it has 'instance' label.
What is the issue we are facing because of it?
We are hitting 'PrometheusRuleFailures' due to this label change in our alerts / rules.
The second issue is that some of the query expressions return no results.
What is the solution?
Update the query expressions, changing 'exported_instance' to 'instance'. Any 'label_replace' action which changes the 'exported_instance' label to the 'instance' label is no longer required (as the 'instance' label is directly available now).
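
For illustration, here is a minimal sketch of the kind of rewrite described above. These expressions are hypothetical and are not copied from this PR's diff; the only names taken from the description are 'ceph_disk_occupation', 'exported_instance', 'instance' and 'label_replace'. The 'count by' aggregation is just an assumed example of an expression that groups on the label.

```promql
# Before (hypothetical): copy 'exported_instance' into 'instance'
# so the result can be matched against metrics keyed by 'instance'.
label_replace(ceph_disk_occupation, "instance", "$1", "exported_instance", "(.*)")

# After: the metric already carries 'instance', so the wrapper is dropped.
ceph_disk_occupation

# Before (hypothetical): grouping on the old label name.
count by (exported_instance) (ceph_disk_occupation)

# After: group on 'instance' directly.
count by (instance) (ceph_disk_occupation)
```

The actual rules and alerts touched by this PR may combine the metric with other series, so each expression should be checked individually rather than rewritten mechanically.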