Open
Description
The node-fencing controller is another source of interesting activity in the cluster. We should therefore generate metrics for it, such as:
- how many times a given node was restarted
- how many times a node's bad behaviour was caused by a hardware failure
- how many times a squirrel attack caused a network failure
- ...
This lets operators see which nodes are sensitive to failures and possibly track which failures happen regularly.
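
To make the proposal more concrete, here is a minimal sketch of per-node counters of this kind. It is only an illustration: the class `FencingMetrics`, the metric names `node_fencing_restarts_total` and `node_fencing_failures_total`, and the `cause` label values are all hypothetical and not part of the controller today. It is written in Python with the standard library only; a real implementation would presumably use whatever metrics library the controller already depends on.

```python
from collections import Counter
from threading import Lock


class FencingMetrics:
    """In-memory counters keyed by (metric name, node name, cause).

    Hypothetical helper for illustration; not an existing controller API.
    """

    def __init__(self):
        self._lock = Lock()
        self._counts = Counter()

    def inc(self, metric, node, cause=""):
        """Increment one counter by 1 (thread-safe)."""
        with self._lock:
            self._counts[(metric, node, cause)] += 1

    def get(self, metric, node, cause=""):
        """Return the current value; counters never incremented read as 0."""
        with self._lock:
            return self._counts[(metric, node, cause)]

    def render(self):
        """Return all counters as sorted text lines, one per counter."""
        with self._lock:
            items = sorted(self._counts.items())
        lines = []
        for (metric, node, cause), value in items:
            labels = f'node="{node}"'
            if cause:
                labels += f',cause="{cause}"'
            lines.append(f"{metric}{{{labels}}} {value}")
        return "\n".join(lines)


# Assumed metric names and causes, mirroring the examples in the list above.
metrics = FencingMetrics()
metrics.inc("node_fencing_restarts_total", "node-1")
metrics.inc("node_fencing_restarts_total", "node-1")
metrics.inc("node_fencing_failures_total", "node-2", cause="hardware")
metrics.inc("node_fencing_failures_total", "node-2", cause="network")
print(metrics.render())
```

Keeping the node and the failure cause as labels, rather than encoding them in the metric name, would let operators aggregate per node or per cause without needing a new metric for each failure type.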