Closed
Description
It takes a while to work out what went wrong in GoCD. Following up on an alert requires connecting to the VPN, finding the failure log in GoCD, and then knowing how to search for the actual failure, which depends on where it failed. It would be great if the alerts had more context.
AC:
Timeboxed effort -- 1 day.
- Some useful extract of the logs shows up in the Opsgenie alert (so that we can tell if it's a known/unknown issue, etc.)
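A rough sketch of what "some useful extract" could look like, assuming the job's console log is already available as text. The error pattern, the function names, and the pipeline/stage names are placeholders (the real search patterns would come from the Runbook notes), and nothing here talks to GoCD or Opsgenie; it only builds a payload using the `message`, `description`, and `details` fields of an Opsgenie alert. The 130-character cap on `message` is my understanding of the Opsgenie limit and should be checked against the API docs.

```python
import re

# Placeholder pattern: the real per-stage patterns would come from the Runbook.
ERROR_PATTERN = re.compile(r"error|exception|failed|fatal", re.IGNORECASE)


def extract_failure_context(log_text, before=2, after=2, max_chars=2000):
    """Return matching log lines plus a few lines of context around each."""
    lines = log_text.splitlines()
    keep = set()
    for i, line in enumerate(lines):
        if ERROR_PATTERN.search(line):
            start = max(0, i - before)
            end = min(len(lines), i + after + 1)
            keep.update(range(start, end))
    if keep:
        excerpt = [lines[i] for i in sorted(keep)]
    else:
        # No match: fall back to the tail of the log, where failures usually are.
        excerpt = lines[-(before + after + 1):]
    # Keep the end of the excerpt if it is too long for the alert.
    return "\n".join(excerpt)[-max_chars:]


def build_alert_payload(pipeline, stage, log_text):
    """Build a dict shaped like an Opsgenie alert body (not sent anywhere)."""
    return {
        "message": f"GoCD: {pipeline}/{stage} failed"[:130],  # assumed 130-char limit
        "description": extract_failure_context(log_text),
        "details": {"pipeline": pipeline, "stage": stage},
    }


if __name__ == "__main__":
    sample = "\n".join([
        "step 1 ok",
        "step 2 ok",
        "ERROR: connection refused",
        "step 4",
        "step 5",
        "step 6",
    ])
    print(build_alert_payload("deploy-api", "smoke-test", sample)["description"])
```

Keeping the extraction as a pure function means the same logic could be reused after a move to ArgoCD, since only the part that fetches the log would change.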
Questions/Notes:
- This work would only help in situations where you don't have to go into GoCD to re-run a stage anyway (e.g. self-closing alerts).
- We want to switch to ArgoCD & Kubernetes relatively soon; are there quick improvements to get more context in alerts, or are there improvements that would carry over?
- Could we get the error details into the alert so that the VPN and a GoCD login aren't required?
- The Runbook has some notes on searching for errors in the logs of various stages; these can be referenced (or added to).
Metadata
Projects
Status
Done - Long Term Storage