
feat: enhance theliv investigators for kubernetes to analyze kubernetes events #100

@rajarajanpsj


Why do you want this feature:
Theliv investigator functions are meant to analyze alerts in depth and give users actionable insights and next steps. To do that, they should analyze Kubernetes events together with the alert information and pass the combined findings on to the user.

Describe the solution you'd like:
Theliv provides an investigation framework on top of Prometheus alerts: it analyzes alerts from Prometheus and digs deeper to give the user actionable insights. For example, when a CrashLoopBackOff alert fires, an SRE or DevOps engineer would typically dig deeper to find the root cause. That often involves analyzing the Kubernetes events.
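To make the idea of combining an alert with events concrete, here is a minimal Python sketch. It is not theliv's actual code: the function name `events_for_alert` and the sample data are hypothetical, and it assumes the alert carries `namespace` and `pod` labels. The event fields (`involvedObject`, `reason`, `message`, `lastTimestamp`) follow the Kubernetes core/v1 Event object.

```python
from typing import Any, Dict, List


def events_for_alert(
    alert_labels: Dict[str, str], events: List[Dict[str, Any]]
) -> List[Dict[str, Any]]:
    """Return the events for the pod named in the alert labels, newest first."""
    namespace = alert_labels.get("namespace")
    pod = alert_labels.get("pod")
    if not namespace or not pod:
        # Assumption: without both labels the alert cannot be tied to a pod.
        return []

    def matches(event: Dict[str, Any]) -> bool:
        obj = event.get("involvedObject", {})
        return (
            obj.get("kind") == "Pod"
            and obj.get("name") == pod
            and obj.get("namespace") == namespace
        )

    matched = [e for e in events if matches(e)]
    # RFC 3339 UTC timestamps in the same format sort correctly as plain strings.
    return sorted(matched, key=lambda e: e.get("lastTimestamp") or "", reverse=True)


# Hypothetical sample data, for illustration only.
alert = {"alertname": "KubePodCrashLooping", "namespace": "shop", "pod": "cart-7d9f"}
events = [
    {"involvedObject": {"kind": "Pod", "name": "cart-7d9f", "namespace": "shop"},
     "reason": "Pulled", "message": "Container image already present on machine",
     "lastTimestamp": "2024-01-01T10:00:00Z"},
    {"involvedObject": {"kind": "Pod", "name": "cart-7d9f", "namespace": "shop"},
     "reason": "BackOff", "message": "Back-off restarting failed container",
     "lastTimestamp": "2024-01-01T10:05:00Z"},
    {"involvedObject": {"kind": "Pod", "name": "other-1a2b", "namespace": "shop"},
     "reason": "Scheduled", "message": "Successfully assigned pod",
     "lastTimestamp": "2024-01-01T10:06:00Z"},
    {"involvedObject": {"kind": "Pod", "name": "cart-7d9f", "namespace": "staging"},
     "reason": "Killing", "message": "Stopping container",
     "lastTimestamp": "2024-01-01T10:07:00Z"},
]

for event in events_for_alert(alert, events):
    print(event["reason"], "-", event["message"])
```

Only the two events for pod `cart-7d9f` in namespace `shop` are selected; events for other pods or namespaces are ignored.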

  1. Theliv has an investigator for CrashLoopBackOff. It needs to be enhanced to analyze the Kubernetes events and use them to give the user more information, for example based on the container exit code.
  2. The same goes for the other investigators.
  3. Events are usually retained in etcd for only about an hour (the default kube-apiserver `--event-ttl`). The investigator function will therefore work on a best-effort basis: if the user debugs with theliv within that hour, they get the additional information. If they use the app after an hour, the investigator cannot analyze the events and will do its best to add information on top of what the alert already provides.
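The best-effort behavior described in the list could look roughly like the Python sketch below. It is an illustration under stated assumptions, not theliv's implementation: `investigate_crashloop` and `EXIT_CODE_HINTS` are hypothetical names, and the hint texts are general conventions (exit codes above 128 usually mean 128 plus a signal number), not output from any real tool.

```python
from typing import Dict, Iterable, List, Mapping, Optional

# Hypothetical hint table based on common shell and signal conventions.
EXIT_CODE_HINTS: Dict[int, str] = {
    1: "application exited with a generic error; check the container logs",
    126: "command was found but could not be executed; check file permissions",
    127: "command not found; check the image entrypoint and command",
    137: "killed by SIGKILL (128 + 9); often an out-of-memory kill",
    139: "segmentation fault, SIGSEGV (128 + 11)",
    143: "terminated by SIGTERM (128 + 15)",
}


def investigate_crashloop(
    alert_description: str,
    exit_code: Optional[int] = None,
    events: Iterable[Mapping[str, str]] = (),
) -> List[str]:
    """Build a list of findings, starting from what the alert already says."""
    details = [alert_description]

    if exit_code is not None:
        hint = EXIT_CODE_HINTS.get(exit_code, "no specific hint for this exit code")
        details.append(f"exit code {exit_code}: {hint}")

    event_list = list(events)
    if event_list:
        for event in event_list:
            reason = event.get("reason", "Unknown")
            message = event.get("message", "")
            details.append(f"event {reason}: {message}")
    else:
        # Best effort: events may already have expired, so fall back to the alert.
        details.append(
            "no Kubernetes events found (they may have expired); "
            "showing alert information only"
        )
    return details


# Within the retention window: events are available.
fresh = investigate_crashloop(
    "Pod shop/cart-7d9f is crash looping",
    exit_code=137,
    events=[{"reason": "BackOff", "message": "Back-off restarting failed container"}],
)
# After the retention window: no events, only alert-level information.
stale = investigate_crashloop("Pod shop/cart-7d9f is crash looping", exit_code=137)

print("\n".join(fresh))
print("\n".join(stale))
```

The function always returns the alert description first, so the user never gets less than the alert alone provides; events and exit-code hints are added only when they are available.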


Labels: enhancement (New feature or request)
