Application Health Monitoring
We collect metrics about the usage of OBS, such as user logins, the creation of packages and projects, and the like.
The monitoring dashboards are hosted at https://obs-measure.opensuse.org/. You can log in with your GitHub account and should get the Editor role.
The openSUSE RabbitMQ is running at https://rabbit.opensuse.org/.
Two dashboards are particularly important: the Application Health Overview and the Detailed Errors Dashboard.
This dashboard gives a general overview of the health status of the application. You can tell whether the application is up by looking at the following panels:
This panel tracks the application traffic from the application's point of view.

If the number of successful requests gets too low, we may have a problem that prevents users from working. In that case the panel sends an alert.
This panel tracks requests with an HTTP error status code.

If the number of errors gets too high, something is going wrong in the application. Our exception tracker collects some of these errors.
This panel monitors bursts of authentication failures within 10 minutes.
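
A burst check like this one can be thought of as counting events inside a sliding time window. The sketch below is only an illustration of that idea, not how the Grafana panel is implemented: the `BurstDetector` class and its `threshold` value are hypothetical, and only the 10-minute window comes from the description above.

```python
from collections import deque


class BurstDetector:
    """Counts events inside a sliding time window (hypothetical helper)."""

    def __init__(self, window_seconds=600, threshold=50):
        # 600 s matches the 10-minute window; the threshold is an assumed value.
        self.window_seconds = window_seconds
        self.threshold = threshold
        self._timestamps = deque()

    def record(self, timestamp):
        """Register one failure at `timestamp` (in seconds) and report a burst."""
        self._timestamps.append(timestamp)
        cutoff = timestamp - self.window_seconds
        # Drop failures that have fallen out of the window.
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps) >= self.threshold
```

Calling `record` once per authentication failure returns `True` as soon as the number of failures inside the window reaches the threshold.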

This panel tracks request creation and request state changes.

This panel tracks projects destroyed and created within a minute.

This panel tracks packages destroyed and created.

This panel tracks the total number of projects that were created and destroyed.

This panel tracks the total number of packages that were created and destroyed.

This panel tracks the total number of users that were created within an hour.

This panel tracks the total number of users who joined and left the beta program.

This dashboard gives a detailed picture of the errors happening in the application. Each type of error has its own panel:
- 500 (Internal Server Error)
  ⚠️ It will send an alert when there are more than 10 errors per minute for 2 minutes.
- 400 (Bad Request)
- 401 (Unauthorized)
- 403 (Forbidden)
- 404 (Not found) / min
- 408 (Request Timeout)
- 422 (Unprocessable Entity)
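
The alert condition for 500 errors can be read as "every one of the last two per-minute samples is above 10". The following is a minimal sketch under that interpretation; the function name and the list-of-counts input are assumptions for illustration, and the real rule is configured in Grafana rather than in code like this.

```python
def should_alert(errors_per_minute, threshold=10, duration_minutes=2):
    """Return True if the last `duration_minutes` samples all exceed `threshold`.

    `errors_per_minute` is a list of per-minute error counts, oldest first.
    """
    if len(errors_per_minute) < duration_minutes:
        # Not enough history yet to say the condition held for the full duration.
        return False
    recent = errors_per_minute[-duration_minutes:]
    return all(count > threshold for count in recent)
```

With this reading, a single spike does not trigger the alert; the error rate has to stay above the threshold for the whole duration.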
Our AHM stack consists of:
- RabbitMQ: the metrics we collect are sent to the metrics queue.
- Telegraf: fetches these metrics and reports them to InfluxDB.
- InfluxDB: stores the time series data we collect (database telegraf).
- Grafana: used to create graphs that visualize the collected data.
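
To make the data flow more concrete, the sketch below formats a single metric as an InfluxDB line protocol string, a text format that Telegraf can parse. That the metrics queue carries line protocol is an assumption here, and the helper names and the measurement names in the usage note are hypothetical.

```python
def escape_tag(value):
    # Line protocol requires escaping commas, equals signs and spaces in keys and tags.
    return str(value).replace(",", r"\,").replace("=", r"\=").replace(" ", r"\ ")


def format_field(value):
    # bool must be checked before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"  # integers carry an "i" suffix
    if isinstance(value, float):
        return repr(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'  # strings are double-quoted


def to_line_protocol(measurement, fields, tags=None, timestamp_ns=None):
    """Format one point as `measurement[,tags] fields [timestamp]`."""
    if not fields:
        raise ValueError("a point needs at least one field")
    name = measurement.replace(",", r"\,").replace(" ", r"\ ")
    tag_part = "".join(
        f",{escape_tag(k)}={escape_tag(v)}" for k, v in sorted((tags or {}).items())
    )
    field_part = ",".join(
        f"{escape_tag(k)}={format_field(v)}" for k, v in fields.items()
    )
    line = f"{name}{tag_part} {field_part}"
    if timestamp_ns is not None:
        line += f" {timestamp_ns}"
    return line
```

For example, `to_line_protocol("request", {"count": 2}, tags={"status": "500"})` yields `request,status=500 count=2i`, which is the kind of string that would be published to the queue under this assumption.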
Instructions for setting up the development environment for AHM can be found in our docker documentation.
Go to the Grafana frontend, http://localhost:8000, and log in (admin/admin).
Add a new data source with the following settings:
- Type: InfluxDB
- URL: http://influx:8086
- Database: telegraf
- User: grafana
- Password: grafana
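
The same settings could also be expressed as a JSON body for Grafana's `POST /api/datasources` HTTP endpoint instead of being entered in the UI. The sketch below only builds that body and does not send it. The data source name and the `access` mode are assumptions, and the exact field layout (for example where the password goes) can differ between Grafana versions.

```python
import json


def influx_datasource_payload(name="telegraf"):
    """Build a JSON-serializable body describing the InfluxDB data source."""
    return {
        "name": name,  # hypothetical display name, not taken from the setup steps
        "type": "influxdb",
        "access": "proxy",  # assumed: Grafana's backend, not the browser, reaches InfluxDB
        "url": "http://influx:8086",
        "database": "telegraf",
        "user": "grafana",
        "secureJsonData": {"password": "grafana"},
    }


def influx_datasource_json(name="telegraf"):
    # Pretty-printed so the result can be inspected or saved to a file.
    return json.dumps(influx_datasource_payload(name), indent=2)
```

Printing `influx_datasource_json()` shows the body that would be posted with the development credentials listed above.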