A lightweight system monitoring tool for macOS that collects system metrics, exposes them on a Prometheus metrics endpoint, and visualizes them with Grafana.
Current Version: 0.0.2
- Real-time monitoring of system resources:
  - CPU usage
  - Load average
  - Memory usage and availability
  - Disk utilization
  - Network bandwidth (RX/TX)
  - Network errors and dropped packets
  - Network latency and packet loss
  - Context switches
  - Page faults
- Selective metrics collection (enable/disable specific metrics)
- Threshold-based alerting (configurable thresholds)
- Prometheus metrics endpoint
- Pre-configured Grafana dashboard
- Memory stress testing tool
- Go 1.18+
- Prometheus
- Grafana
1. Clone this repository:

       git clone https://github.com/yourusername/observT.git
       cd observT

2. Build the monitoring tool:

       go build -o observT main.go config.go

Configuration is managed in config.go. The default configuration includes:
Setting | Default Value | Description |
---|---|---|
ThresholdCpuUsage | 90.0% | CPU usage threshold |
ThresholdLoadMultiplier | 2.0 | Load average threshold = CPU_CORES * multiplier |
ThresholdMemUsage | 95.0% | Memory usage threshold |
ThresholdMemAvailablePercent | 5.0% | Available memory threshold (critical if below) |
ThresholdDiskUtil | 90.0% | Disk utilization threshold |
ThresholdNetUtil | 70.0% | Network utilization threshold |
ThresholdContextSwitchPerCore | 10000 | Context switches per second per core |
ThresholdPageFaultRate | 500 | Page faults per second |
SampleInterval | 1 | Sampling interval in seconds |
SampleCount | 3 | Number of samples to collect |
You can modify these settings directly in config.go before building the application.
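
To illustrate how the defaults in the table could be applied, here is a minimal sketch. The `Thresholds` struct, its field names, and the helper functions are hypothetical and are not taken from config.go; only the default values and the rule "load average threshold = CPU_CORES * multiplier" come from the table. The strict comparisons (`>` and `<`) are also an assumption.

```go
package main

import "fmt"

// Thresholds holds a subset of the defaults from the table above.
// Hypothetical type: the real layout in config.go may differ.
type Thresholds struct {
	CPUUsage            float64 // percent
	LoadMultiplier      float64 // multiplied by the number of CPU cores
	MemUsage            float64 // percent
	MemAvailablePercent float64 // percent; critical when available memory is below this
}

// DefaultThresholds returns the default values listed in the table.
func DefaultThresholds() Thresholds {
	return Thresholds{
		CPUUsage:            90.0,
		LoadMultiplier:      2.0,
		MemUsage:            95.0,
		MemAvailablePercent: 5.0,
	}
}

// LoadThreshold computes CPU_CORES * multiplier.
func (t Thresholds) LoadThreshold(cores int) float64 {
	return float64(cores) * t.LoadMultiplier
}

// aboveLimit reports a breach for "higher is worse" metrics (assumed strict >).
func aboveLimit(value, limit float64) bool { return value > limit }

// belowLimit reports a breach for "lower is worse" metrics such as
// available memory (assumed strict <).
func belowLimit(value, limit float64) bool { return value < limit }

func main() {
	th := DefaultThresholds()
	cores := 8 // example core count, not read from the machine
	fmt.Printf("load threshold on %d cores: %.1f\n", cores, th.LoadThreshold(cores))
	fmt.Println("cpu 92.5% breached:", aboveLimit(92.5, th.CPUUsage))
	fmt.Println("3.0% memory available breached:", belowLimit(3.0, th.MemAvailablePercent))
}
```

With the default multiplier of 2.0, an 8-core machine would get a load average threshold of 16.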
./observT
By default, ObservT exposes metrics on http://localhost:9095/metrics.
ObservT supports the following command-line flags to customize its behavior:
Flag | Default | Description |
---|---|---|
-config | "config.yaml" | Path to YAML configuration file |
-cpu | true | Enable/disable CPU metrics collection |
-load | true | Enable/disable load average metrics collection |
-memory | true | Enable/disable memory metrics collection |
-disk | true | Enable/disable disk utilization metrics collection |
-network | true | Enable/disable network bandwidth metrics collection |
-ctx-switch | true | Enable/disable context switch metrics collection |
-page-fault | true | Enable/disable page fault metrics collection |
-all | true | Master switch to enable/disable all metrics |
Examples:
# Collect only CPU and memory metrics
./observT -all=false -cpu=true -memory=true
# Collect all metrics except context switches
./observT -ctx-switch=false
# Run with all default metrics (everything enabled)
./observT
# Use a different configuration file
./observT -config=/path/to/custom-config.yaml
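
The first example above only works if an explicitly passed per-metric flag takes precedence over `-all`. One way to get that behavior with Go's standard `flag` package is sketched below. This is an assumption about the semantics, not ObservT's actual implementation; `resolveMetrics` and `metricNames` are hypothetical names. The flag names and defaults are taken from the table.

```go
package main

import (
	"flag"
	"fmt"
	"io"
	"sort"
)

// metricNames lists the per-metric flags from the table above.
var metricNames = []string{"cpu", "load", "memory", "disk", "network", "ctx-switch", "page-fault"}

// resolveMetrics parses args and returns which metrics end up enabled.
// Assumed rule: a flag given explicitly on the command line wins;
// every other metric follows the value of -all.
func resolveMetrics(args []string) (map[string]bool, error) {
	fs := flag.NewFlagSet("observT", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep the sketch quiet on parse errors

	all := fs.Bool("all", true, "master switch to enable/disable all metrics")
	fs.String("config", "config.yaml", "path to YAML configuration file")

	values := make(map[string]*bool, len(metricNames))
	for _, name := range metricNames {
		values[name] = fs.Bool(name, true, "enable/disable "+name+" metrics collection")
	}
	if err := fs.Parse(args); err != nil {
		return nil, err
	}

	// Visit only calls back for flags that were actually set by the user.
	explicit := make(map[string]bool)
	fs.Visit(func(f *flag.Flag) { explicit[f.Name] = true })

	enabled := make(map[string]bool, len(metricNames))
	for _, name := range metricNames {
		if explicit[name] {
			enabled[name] = *values[name]
		} else {
			enabled[name] = *all
		}
	}
	return enabled, nil
}

func main() {
	enabled, err := resolveMetrics([]string{"-all=false", "-cpu=true", "-memory=true"})
	if err != nil {
		fmt.Println("flag error:", err)
		return
	}
	var names []string
	for name, on := range enabled {
		if on {
			names = append(names, name)
		}
	}
	sort.Strings(names)
	fmt.Println("enabled:", names) // prints "enabled: [cpu memory]"
}
```

Under this rule, `-ctx-switch=false` alone disables only context switch collection, because `-all` keeps its default of true for everything else.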
1. Navigate to the prometheus directory:

       cd prometheus

2. Update the prometheus.yml file if needed to point to your metrics endpoint (default is configured).

3. Start Prometheus with the configuration:

       prometheus --config.file=prometheus.yml

1. Start Grafana:

       brew services start grafana

2. Access Grafana at http://localhost:3000 (default credentials: admin/admin).

3. Add Prometheus as a data source:
   - URL: http://localhost:9090
   - Access: Browser

4. Import the dashboard:
   - Navigate to Dashboards > Import
   - Upload the JSON file from grafana/system_monitor_dashboard.json

The system exposes the following Prometheus metrics:
- system_cpu_usage_percent - Current CPU usage in percent
- system_load_average_15m - 15-minute load average
- system_memory_used_percent - Memory usage in percent
- system_memory_available_percent - Available memory in percent
- system_disk_utilization_percent{disk="..."} - Disk utilization in percent per disk
- system_network_rx_mbps{interface="..."} - Network receive bandwidth in Mbps per interface
- system_network_tx_mbps{interface="..."} - Network transmit bandwidth in Mbps per interface
- system_context_switches_per_second - Context switches per second
- system_page_faults_per_second - Page faults per second
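
For readers unfamiliar with what a scrape of the metrics endpoint returns, the sketch below renders a few of these gauges in the Prometheus text exposition format using only the Go standard library. It is not how ObservT itself serves metrics (a Prometheus client library may well be used there); the `sample` type, the helper names, the label values `disk0` and `en0`, and all numeric values are made up for illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// sample is one gauge reading with at most one label (hypothetical type).
type sample struct {
	name                 string
	labelKey, labelValue string // leave labelKey empty for an unlabeled series
	value                float64
}

// formatSample renders one line of the text exposition format.
// %q is sufficient for simple ASCII label values like the ones used here.
func formatSample(s sample) string {
	if s.labelKey == "" {
		return fmt.Sprintf("%s %g", s.name, s.value)
	}
	return fmt.Sprintf("%s{%s=%q} %g", s.name, s.labelKey, s.labelValue, s.value)
}

// render joins all samples, one per line.
func render(samples []sample) string {
	var b strings.Builder
	for _, s := range samples {
		b.WriteString(formatSample(s))
		b.WriteByte('\n')
	}
	return b.String()
}

// metricsHandler serves the rendered samples as plain text.
func metricsHandler(samples []sample) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "text/plain; version=0.0.4")
		fmt.Fprint(w, render(samples))
	}
}

func main() {
	// Made-up readings for illustration only.
	samples := []sample{
		{name: "system_cpu_usage_percent", value: 12.5},
		{name: "system_disk_utilization_percent", labelKey: "disk", labelValue: "disk0", value: 42},
		{name: "system_network_rx_mbps", labelKey: "interface", labelValue: "en0", value: 3.2},
	}

	// Exercise the handler in-process instead of binding a port.
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/metrics", nil)
	metricsHandler(samples)(rec, req)
	fmt.Print(rec.Body.String())
}
```

Each labeled series, such as one disk or one network interface, appears on its own line, which is why the per-disk and per-interface metrics above carry a label.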
These metrics emit 1 when a threshold is breached, 0 otherwise:
- system_cpu_threshold_breached - CPU usage threshold breach
- system_load_threshold_breached - Load average threshold breach
- system_memory_usage_threshold_breached - Memory usage threshold breach
- system_memory_available_threshold_breached - Available memory threshold breach
- system_disk_threshold_breached{disk="..."} - Disk utilization threshold breach
- system_network_rx_threshold_breached{interface="..."} - Network receive threshold breach
- system_network_tx_threshold_breached{interface="..."} - Network transmit threshold breach
- system_context_switches_threshold_breached - Context switches threshold breach
- system_page_faults_threshold_breached - Page faults threshold breach
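
Because these series are plain 0/1 gauges, they are easy to check from a script. The sketch below scans metrics text and lists every `*_threshold_breached` series whose value is 1. The function name `breachedSeries` and the sample input are hypothetical, and the parser assumes one `name{labels} value` pair per line with no trailing timestamp.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// breachedSeries returns each series (name plus labels) whose metric name
// ends in "_threshold_breached" and whose value equals 1.
// Assumes the value is the last space-separated field on the line.
func breachedSeries(exposition string) []string {
	var out []string
	sc := bufio.NewScanner(strings.NewReader(exposition))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue // skip blank lines and HELP/TYPE comments
		}
		idx := strings.LastIndex(line, " ")
		if idx < 0 {
			continue // no value field
		}
		series := strings.TrimSpace(line[:idx])
		raw := line[idx+1:]

		// Strip the label set to inspect the bare metric name.
		name := series
		if brace := strings.Index(series, "{"); brace >= 0 {
			name = series[:brace]
		}
		if !strings.HasSuffix(name, "_threshold_breached") {
			continue
		}
		if v, err := strconv.ParseFloat(raw, 64); err == nil && v == 1 {
			out = append(out, series)
		}
	}
	return out
}

func main() {
	// Made-up scrape output for illustration only.
	text := `# TYPE system_cpu_threshold_breached gauge
system_cpu_threshold_breached 0
system_memory_available_threshold_breached 1
system_disk_threshold_breached{disk="disk0"} 1
system_cpu_usage_percent 37.5
`
	for _, s := range breachedSeries(text) {
		fmt.Println("breached:", s)
	}
}
```

In practice the input would come from an HTTP GET against the metrics endpoint, for example http://localhost:9095/metrics with the default settings.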
This project is licensed under a custom license that allows free use with attribution for individuals and small organizations. Enterprise users must obtain explicit permission for commercial use. See the LICENSE file for details.
Contributions are welcome! Please feel free to submit a Pull Request.