-
Notifications
You must be signed in to change notification settings - Fork 32
K8SPS-601: set Normal event type for non-error events
#1155
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR changes the event type from EventTypeWarning to EventTypeNormal for two specific events in the status reconciliation logic:
- When async replication is not ready during initialization
- When the cluster state changes
Key Changes
- Event type changed from Warning to Normal for "AsyncReplicationNotReady" events
- Event type changed from Warning to Normal for "ClusterStateChanged" events
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| log.Info(fmt.Sprintf("Async replication not ready: %s", msg)) | ||
| r.Recorder.Event(cr, corev1.EventTypeWarning, "AsyncReplicationNotReady", msg) | ||
| r.Recorder.Event(cr, corev1.EventTypeNormal, "AsyncReplicationNotReady", msg) |
Copilot
AI
Nov 7, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changing this event from Warning to Normal may be inappropriate. This event is triggered when async replication is not ready, causing the state to be set to StateInitializing (line 105). While the code continues with normal flow, this represents a condition where the system is not yet operational. In Kubernetes, Warning events typically indicate issues or non-ready states that administrators should be aware of. Consider whether this transient failure condition should remain as EventTypeWarning to maintain visibility into operational issues.
| r.Recorder.Event(cr, corev1.EventTypeNormal, "AsyncReplicationNotReady", msg) | |
| r.Recorder.Event(cr, corev1.EventTypeWarning, "AsyncReplicationNotReady", msg) |
| if status.State != initialState { | ||
| log.Info("Cluster state changed", "previous", initialState, "current", status.State) | ||
| r.Recorder.Event(cr, corev1.EventTypeWarning, "ClusterStateChanged", fmt.Sprintf("%s -> %s", initialState, status.State)) | ||
| r.Recorder.Event(cr, corev1.EventTypeNormal, "ClusterStateChanged", fmt.Sprintf("%s -> %s", initialState, status.State)) |
Copilot
AI
Nov 7, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changing this event from Warning to Normal is problematic because it fires for all state transitions, including transitions to error states. For example, if the state changes from Ready to Error (e.g., line 194 sets state to StateError on full cluster crash), this would generate a Normal event for what is actually a critical problem. The event type should either remain as Warning, or the code should conditionally use Warning for transitions to error/initializing states and Normal only for transitions to ready states.
| r.Recorder.Event(cr, corev1.EventTypeNormal, "ClusterStateChanged", fmt.Sprintf("%s -> %s", initialState, status.State)) | |
| eventType := corev1.EventTypeNormal | |
| if status.State == apiv1.StateError || status.State == apiv1.StateInitializing { | |
| eventType = corev1.EventTypeWarning | |
| } | |
| r.Recorder.Event(cr, eventType, "ClusterStateChanged", fmt.Sprintf("%s -> %s", initialState, status.State)) |
commit: 268a9e7 |
egegunes
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm having a hard time understanding the motivation behind these changes.
We use According to the Kubernetes code, In my opinion, |
https://perconadev.atlassian.net/browse/K8SPS-601
CHANGE DESCRIPTION
Problem:
Short explanation of the problem.
Cause:
Short explanation of the root cause of the issue if applicable.
Solution:
Short explanation of the solution we are providing with this PR.
CHECKLIST
Jira
Needs Doc) and QA (Needs QA)?Tests
Config/Logging/Testability