@PythonGermany
Contributor

@PythonGermany PythonGermany commented Dec 14, 2025

Summary

Implement the requested severity feature by introducing custom states.

If the implementation moves forward as suggested:
Closes #227, closes #1274.

If we structure it correctly, we could start working on #1444 as a follow-up once this PR is merged. That would make it possible to configure multiple themes for each state, for example colorblind-friendly themes, which can then be selected via the UI.

Checklist

Implementation tasks:

The following logic needs to be implemented (backend):

  • New config section states defining the available states
    • No custom config is given -> Set default states healthy, unhealthy and maintenance
    • Merge custom config with default config
    • Validate config
      • Check name conflicts
      • Check priority conflicts
  • New UI section configuring colors/themes for states
    • No custom config is given -> Set default color config for default states
    • Merge custom config with default config
    • Validate config
      • Verify that names in state and color config match
      • Validate color code format
  • Extend alerting config
    • Add state priority threshold below which no alerting should take place (set state in config and resolve to priority at runtime)
  • Logic to evaluate state of result/endpoint
    • In maintenance state
    • Default healthy/unhealthy states -> Once properly implemented, also covered by severity logic
    • Custom states -> Input will be condition evaluation results
      • Select correct state based on highest priority
  • New API functionality
    • Send color and UI relevant information for states
    • Send new state information of result
    • Include state in events to display useful info and correct color in UI
  • DB Migration adding new result state column if not present and set appropriate state based on Success value
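
As a rough illustration of the `states` config handling listed above (defaults, merge, validation, and resolving an alerting threshold to a priority at runtime), a minimal Go sketch could look like the following. The `State` type, the function names, the priority values and the merge rule (a custom state replaces a default with the same name) are all assumptions made for illustration, not the PR's actual implementation.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// State is a hypothetical entry of the proposed `states` config section.
type State struct {
	Name     string
	Priority int
}

// defaultStates returns the three default states from the checklist.
// The priority values are illustrative assumptions.
func defaultStates() []State {
	return []State{
		{Name: "healthy", Priority: 0},
		{Name: "maintenance", Priority: 10},
		{Name: "unhealthy", Priority: 100},
	}
}

// validateStates rejects empty names and states that share a priority.
func validateStates(states []State) error {
	byPriority := make(map[int]string)
	for _, s := range states {
		if s.Name == "" {
			return errors.New("state name must not be empty")
		}
		if other, ok := byPriority[s.Priority]; ok {
			return fmt.Errorf("states %q and %q share priority %d", other, s.Name, s.Priority)
		}
		byPriority[s.Priority] = s.Name
	}
	return nil
}

// mergeStates overlays custom states on the defaults, validates the result
// and returns it sorted by ascending priority. Duplicate names within the
// custom list are treated as a name conflict (an assumed rule).
func mergeStates(custom []State) ([]State, error) {
	byName := make(map[string]State)
	for _, s := range defaultStates() {
		byName[s.Name] = s
	}
	seen := make(map[string]bool)
	for _, s := range custom {
		if seen[s.Name] {
			return nil, fmt.Errorf("duplicate custom state name %q", s.Name)
		}
		seen[s.Name] = true
		byName[s.Name] = s
	}
	merged := make([]State, 0, len(byName))
	for _, s := range byName {
		merged = append(merged, s)
	}
	if err := validateStates(merged); err != nil {
		return nil, err
	}
	sort.Slice(merged, func(i, j int) bool { return merged[i].Priority < merged[j].Priority })
	return merged, nil
}

// shouldAlert resolves both state names to priorities at runtime and reports
// whether the current state reaches the configured threshold state.
func shouldAlert(states []State, current, threshold string) bool {
	prio := make(map[string]int, len(states))
	for _, s := range states {
		prio[s.Name] = s.Priority
	}
	c, okCurrent := prio[current]
	t, okThreshold := prio[threshold]
	return okCurrent && okThreshold && c >= t
}

func main() {
	states, err := mergeStates([]State{{Name: "degraded", Priority: 50}})
	if err != nil {
		panic(err)
	}
	fmt.Println(states)
	fmt.Println(shouldAlert(states, "degraded", "unhealthy"))
}
```

Storing the threshold as a state name and resolving it to a priority on each check keeps the config readable while still allowing priorities to change.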

The following logic has to be implemented (frontend):

  • Add new placeholders to embed color config when starting gatus
  • Handle current status UI text value in endpoint details view
  • Use new API state fields
    • to display correct color and text in status badges
    • to display correct color and text in endpoint events
    • to display correct color and text in endpoint result history bar
    • to display correct color and text for events in response time trend chart
  • Handle absence of custom color config (to prevent issues in development environment or state config changes)
    • Handled by using a hardcoded, frontend-local color that signals a missing custom state color config
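
The color config checks from the backend list (state and color names match, color code format is valid) and the hardcoded fallback for a missing color could be sketched roughly as below. This is only a sketch under assumptions: the accepted color formats, the function names and the fallback hex value are hypothetical, and the lookup would live in the frontend rather than in Go.

```go
package main

import (
	"fmt"
	"regexp"
)

// hexColor matches #RGB and #RRGGBB. Which formats the PR accepts is not
// stated in the thread, so this pattern is an assumption.
var hexColor = regexp.MustCompile(`^#(?:[0-9a-fA-F]{3}|[0-9a-fA-F]{6})$`)

// fallbackColor signals a missing color config. The hex value is a placeholder.
const fallbackColor = "#8b5cf6"

// validateColors checks that every state has a color, that no color refers
// to an unknown state, and that every color is a valid hex code.
func validateColors(stateNames []string, colors map[string]string) error {
	known := make(map[string]bool, len(stateNames))
	for _, name := range stateNames {
		known[name] = true
		if _, ok := colors[name]; !ok {
			return fmt.Errorf("state %q has no color configured", name)
		}
	}
	for name, c := range colors {
		if !known[name] {
			return fmt.Errorf("color configured for unknown state %q", name)
		}
		if !hexColor.MatchString(c) {
			return fmt.Errorf("state %q has invalid color %q", name, c)
		}
	}
	return nil
}

// colorFor returns the configured color or the hardcoded fallback.
func colorFor(state string, colors map[string]string) string {
	if c, ok := colors[state]; ok {
		return c
	}
	return fallbackColor
}

func main() {
	colors := map[string]string{"healthy": "#22c55e", "unhealthy": "#ef4444"}
	fmt.Println(validateColors([]string{"healthy", "unhealthy"}, colors))
	fmt.Println(colorFor("degraded", colors))
}
```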

Outstanding issues/tasks/questions

  • Colors are broken when in frontend development mode -> npm run serve
  • Add new event on state change and not just on success change?
    • Make configurable?
  • Test impacts of color stuff on remote instance experimental feature
  • Reserve unknown state and allow state color config?
  • Show state for result in tooltip
  • Set success field based on new config option configurable for each state
  • Apply custom color for default states to everything relevant regarding suites
  • How is config file merging handled? First merged, then validated? (relevant because of the new config sections)
    • readme says first merged then validated
  • How to handle external endpoints? I haven't looked into this so far
    • Maybe just also add the state field and let the pushing code decide the state?
    • and/or if no state is given set default state based on existing success field
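
The idea for external endpoints (let the pusher set a state, otherwise derive a default from the existing success field) might look roughly like this. `ExternalResult` and `resolveState` are hypothetical names used only for illustration.

```go
package main

import "fmt"

// ExternalResult is a hypothetical shape for a result pushed by an external
// endpoint. State is optional and empty when the pusher does not set it.
type ExternalResult struct {
	Success bool
	State   string
}

// resolveState prefers the pushed state and otherwise falls back to a
// default derived from the existing success field.
func resolveState(r ExternalResult) string {
	if r.State != "" {
		return r.State
	}
	if r.Success {
		return "healthy"
	}
	return "unhealthy"
}

func main() {
	fmt.Println(resolveState(ExternalResult{Success: true}))
	fmt.Println(resolveState(ExternalResult{Success: false, State: "degraded"}))
}
```

A real implementation would presumably also check that a pushed state exists in the configured states.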

Final tasks

  • Tested and/or added tests to validate that the changes work as intended, if applicable.
  • Updated documentation in README.md, if applicable.

@github-actions github-actions bot added the feature New feature or request label Dec 14, 2025
@TwiN
Owner

TwiN commented Dec 14, 2025

Just FYI before you commit time to implementing this - there have been a few attempts at implementing severity, and the issue is usually that it's a breaking change and I don't like the implementation. I understand some people would love a way to add some level of severity, but I'm not even convinced I would want this feature in Gatus in the first place because of all the confusion it adds around health % calculation and uptime calculation, the increased complexity of specifying which severity a specific condition impacts, etc.

I've seen the comment you posted on the issue, and it's a good idea to avoid breaking changes, but I think it would be too difficult to understand for most people (and also states is perhaps not the right word here for this configuration, as it's not really self-explanatory like severities or something similar would be), and again, it would add a lot of complexity for people to understand in the configuration.

Long story short, it adds a whole lot of complexity for what I think is very little advantage, and for that reason, at this stage, I'm not sure I'm interested in adding this to Gatus.

@PythonGermany
Contributor Author

Thanks for the heads up! I understand that working on this might lead nowhere and that it's a complicated thing to add, since it impacts a lot of core components. Currently I'm just exploring and trying to figure out what the biggest roadblocks are on the way to a fully working implementation/prototype.

From my perspective it's a good topic, because it requires me to look at many different core components and features of Gatus. So even if it leads nowhere, it will not be wasted time for me, I'm sure.

> The issue is usually that it's a breaking change

I imagine a breaking change in configuration compatibility is the most critical kind. How problematic are breaking changes in the API and database structure (e.g. an added state member, in addition to the currently existing success field)?

> I'm not even convinced I would want this feature in Gatus in the first place because of all the confusion it adds around health % calculation, uptime calculation

My goal would be to not change the default behavior in any way. Only people who choose to customize it would have to think about the implications.

Regarding the added complexity, I definitely see your point. However, I think that if the configuration structure is well designed, one detailed entry in the FAQ section of the README would be enough for most people to understand it.

I'll keep this PR in draft as long as I'm experimenting, and once the base of the implementation is solid and thorough enough, I'll open it. Maybe you'll see the potential once I have hopefully solved some of the bigger issues, and if not, so be it. I know what I signed up for!

@TwiN
Owner

TwiN commented Dec 14, 2025

A breaking change to the database schema would only mean removing or modifying existing fields. It wouldn't be preferable, but it's not a big deal if we have to do that. There is no public database API contract for Gatus, meaning there is no guarantee that the database schema will not have breaking changes.

The REST API, however, is a different story.
In your case, you'd likely be adding to the API, not removing or changing existing fields in a way that would actually constitute a breaking change.

Changes are only breaking if users currently relying on a specific field are no longer able to rely on that field due to the change. Such a change would be breaking and would require a major version bump, which can be done if absolutely necessary.
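
To illustrate why adding a field is not breaking in this sense, here is a small Go sketch in which a consumer that only knows the existing `success` field still decodes a response that also carries a new `state` field. The struct and field names are assumptions for illustration, not the actual Gatus API types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// legacyResult models what an existing API consumer might decode.
// It only knows about the success field (hypothetical shape).
type legacyResult struct {
	Success bool `json:"success"`
}

// result adds a state field without touching the existing success field.
type result struct {
	Success bool   `json:"success"`
	State   string `json:"state,omitempty"`
}

// encodeResult serializes the extended result as the API would send it.
func encodeResult(r result) []byte {
	b, err := json.Marshal(r)
	if err != nil {
		panic(err)
	}
	return b
}

// decodeLegacy decodes a payload the way an older consumer would.
// encoding/json ignores unknown fields by default, so the added state
// field does not cause an error.
func decodeLegacy(payload []byte) (legacyResult, error) {
	var old legacyResult
	err := json.Unmarshal(payload, &old)
	return old, err
}

func main() {
	payload := encodeResult(result{Success: false, State: "degraded"})
	old, err := decodeLegacy(payload)
	fmt.Println(string(payload), old.Success, err)
}
```

Removing `success` or changing its type, by contrast, would break this consumer.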

If you're not sure whether something is a breaking change, feel free to ping me and ask.

@PythonGermany
Contributor Author

PythonGermany commented Dec 15, 2025

> states is perhaps not the right word here for this configuration, as it's not really self-explanatory like severities or something similar would be

I've thought about this and I'm not sure severities is more self-explanatory. I thought about calling it endpoint-states, but that doesn't really make sense either. The best suggestion I currently have would be to call the option result-states. EDIT: New idea, maybe health-states?

@PythonGermany
Contributor Author

[Image]

I've gotten the prototype far enough that it already provides the following functionality:

  • Default states (healthy, unhealthy and maintenance)
  • States can be linked to conditions; if a condition is not successful, its linked state is selected
  • Configurable colors for each state in the UI
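
The selection logic described here (each failed condition contributes its linked state, and the state with the highest priority wins) could be sketched in Go as follows. The types, the priority map and the rule that maintenance overrides everything else are assumptions, not taken from the PR's code.

```go
package main

import "fmt"

// ConditionResult is a hypothetical evaluation result for one condition.
// State is the state linked to the condition. An empty value means the
// condition uses the default "unhealthy" state.
type ConditionResult struct {
	Success bool
	State   string
}

// evaluateState picks the state of a result. Maintenance is assumed to
// override everything else. Otherwise every failed condition contributes
// its linked state and the one with the highest priority wins. States
// missing from the priority map have priority 0 and never win here.
func evaluateState(results []ConditionResult, priorities map[string]int, inMaintenance bool) string {
	if inMaintenance {
		return "maintenance"
	}
	current := "healthy"
	for _, r := range results {
		if r.Success {
			continue
		}
		candidate := r.State
		if candidate == "" {
			candidate = "unhealthy"
		}
		if priorities[candidate] > priorities[current] {
			current = candidate
		}
	}
	return current
}

func main() {
	priorities := map[string]int{"healthy": 0, "degraded": 50, "unhealthy": 100}
	results := []ConditionResult{
		{Success: false, State: "degraded"},
		{Success: true},
	}
	fmt.Println(evaluateState(results, priorities, false))
}
```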

@PythonGermany PythonGermany force-pushed the add-custom-state-support branch from 2d32cd9 to eb1f2a9 Compare December 16, 2025 08:54
@PythonGermany PythonGermany force-pushed the add-custom-state-support branch from b5ad733 to 45b22d9 Compare December 17, 2025 12:37
@PythonGermany PythonGermany force-pushed the add-custom-state-support branch from 45b22d9 to 57a7668 Compare December 17, 2025 12:38
@PythonGermany
Contributor Author

PythonGermany commented Dec 17, 2025

Events are now displayed with a custom text and color:

[Image: showcase screenshot]

In development mode the template placeholder for the window.config is not replaced and no custom colors are available. The current implementation uses violet if the color for the displayed state is unknown.

This could also be changed to fall back on the Success field of the endpoint results. Provided the API and web server are bundled as one instance, this situation should only arise during development, since the link between state and color config is validated at server startup:

[Image: showcase development screenshot]
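
The alternative mentioned above, falling back on the Success field when no color is known for a state, might look roughly like this. It is written in Go purely as an illustration, although the real logic would live in the frontend. The function name and the two default hex values are assumptions.

```go
package main

import "fmt"

// Hypothetical default colors used only when no state color is available.
const (
	defaultHealthyColor   = "#22c55e"
	defaultUnhealthyColor = "#ef4444"
)

// colorWithSuccessFallback returns the configured color for the state.
// If none is configured, it derives a color from the Success field instead
// of showing a dedicated "unknown" color.
func colorWithSuccessFallback(state string, success bool, colors map[string]string) string {
	if c, ok := colors[state]; ok {
		return c
	}
	if success {
		return defaultHealthyColor
	}
	return defaultUnhealthyColor
}

func main() {
	// An empty map mimics development mode, where the color config is missing.
	fmt.Println(colorWithSuccessFallback("degraded", false, map[string]string{}))
}
```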

@PythonGermany
Contributor Author

PythonGermany commented Dec 17, 2025

@TwiN I'm getting ready for a feedback phase.

There are still some parts (and many small details) to be implemented/discussed, and most of the tests are still missing, but enough is already working to showcase it in a usable way.

> Long story short, it adds a whole lot of complexity for what I think is very little advantage, and for that reason, at this stage, I'm not sure I'm interested in adding this to Gatus

I'd move forward with implementing/discussing the details and cleaning up the rough edges if you are more interested after looking at the rough implementation for this feature. After working on it for a little while I see the added complexity but also the potential.

I've added new columns to the corresponding tables in the DB schemas, and I am not sure if it will break things for existing DBs. I just did a short test locally, and when switching to the feature branch with an old DB, Gatus complains about missing columns, which makes sense. I'm just not sure how to move forward to mitigate that. Figured it out!
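
The thread does not say how the missing-column problem was solved. One common approach, matching the migration item in the checklist (add the column only if it is absent, then backfill it from the Success value), is sketched below. The table name, column name, default value and SQL dialect details are assumptions, and the function only builds the statements rather than executing them.

```go
package main

import "fmt"

// migrationStatements returns the SQL needed to add a state column to the
// given table, or nil if the column already exists. New rows default to
// 'unhealthy' and existing successful rows are backfilled as 'healthy'.
// Table and column names are hypothetical.
func migrationStatements(table string, existingColumns []string) []string {
	for _, c := range existingColumns {
		if c == "state" {
			return nil // already migrated, nothing to do
		}
	}
	return []string{
		fmt.Sprintf("ALTER TABLE %s ADD COLUMN state TEXT NOT NULL DEFAULT 'unhealthy'", table),
		fmt.Sprintf("UPDATE %s SET state = 'healthy' WHERE success = TRUE", table),
	}
}

func main() {
	for _, stmt := range migrationStatements("endpoint_results", []string{"success"}) {
		fmt.Println(stmt)
	}
}
```

Checking for the column first keeps the migration safe to run on every startup.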

Development

Successfully merging this pull request may close these issues.

  • Issues with maintenance-windows
  • feat: Add support for severity

2 participants