A comprehensive Kubernetes monitoring and observability stack featuring Prometheus, Grafana, Loki, Alloy, and integrated log aggregation solutions.
This repository contains production-ready Kubernetes manifests and Helm configurations for deploying a complete observability solution with metrics collection, log aggregation, visualization, and alerting capabilities.
- Prometheus Operator: Automated Prometheus deployment and configuration via CRDs
- Prometheus: Time-series metrics collection and storage
- AlertManager: Alert routing and management
- Node Exporter: Hardware and kernel metrics
- Installation via Helm with customizable values
- Custom alert rules and ServiceMonitor configurations
Files:
- install.sh - Installation script for prometheus-operator and AlertManager
- kube-prometheus-stack.yaml - Kubernetes manifests
- values.yaml - Helm chart values configuration
- Grafana: Multi-source data visualization dashboard
- Loki: Lightweight log aggregation system
- Alloy: Observability data collector (formerly Grafana Agent)
- Custom datasource configurations
- Pre-built dashboards and alerting rules
- Integration with Prometheus and Loki data sources
Key Files:
- grafana-loki/grafana-loki.sh - Installation script for Loki, Alloy, and Grafana
- grafana-loki/alloy.yaml - Alloy configuration for log collection
- grafana-loki/loki-single.yaml - Single-instance Loki configuration
- grafana-loki/grafana-custom.yaml - Grafana custom configuration
- Fluent Bit: Lightweight log processor and shipper
- MinIO: S3-compatible object storage for logs
- HTTP input for receiving logs (port 9880)
- Record filtering to reduce log volume
- Integration with Loki for log shipping
- Configurable health checks and resource limits
Features:
- Removes unnecessary fields from logs
- Kubernetes label auto-discovery
- Service monitoring and health checks
- Configurable resource limits and requests
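As a rough illustration of the HTTP input on port 9880 listed above, the sketch below builds a JSON POST for a single log record. It assumes the Fluent Bit HTTP input derives the record tag from the URL path; the host name, tag, and record fields are placeholders and do not come from this repository.

```python
import json
import urllib.request


def build_log_request(host, port, tag, record):
    """Build a POST request carrying one JSON log record.

    Assumption: the URL path is used as the record tag by the HTTP input.
    """
    url = f"http://{host}:{port}/{tag}"
    body = json.dumps(record).encode("utf-8")
    req = urllib.request.Request(url, data=body, method="POST")
    req.add_header("Content-Type", "application/json")
    return req


def send_log(host, port, tag, record, timeout=5.0):
    """Send the record and return the HTTP status code of the response."""
    req = build_log_request(host, port, tag, record)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status
```

Calling `send_log("localhost", 9880, "app.log", {"level": "info", "msg": "hello"})` would post one record, provided the input is reachable at that address (for example through a port-forward).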
- Elasticsearch: Distributed search and analytics engine
- Kibana: Log analysis and visualization
- Fluent Bit: Log collection and forwarding
- LoadBalancer services for external access
- Credentials management via Kubernetes secrets
- Parseable: Log storage and query system
- MinIO: S3 backend storage for logs
- API-driven log stream creation
- Webhook integration with MinIO audit logs
- RESTful API for log ingestion and retrieval
- Basic authentication for security
Features:
- HTTP PUT/GET APIs for log management
- Support for MinIO audit webhook events
- Queue directory for failed deliveries
- Optional TLS configuration
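The PUT/GET log stream calls with basic authentication mentioned above could look roughly like the following sketch. The `/api/v1/logstream/<name>` path is an assumption about the Parseable API and should be checked against the deployed version; the base URL, stream name, and credentials are placeholders.

```python
import base64
import urllib.request


def basic_auth(user, password):
    """Return the value of an HTTP Basic Authorization header."""
    raw = f"{user}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def stream_request(base_url, stream, user, password, method="PUT"):
    """Build an authenticated request against a log stream.

    Assumption: streams live under /api/v1/logstream/<name>, with PUT
    creating the stream and GET reading information about it.
    """
    url = f"{base_url.rstrip('/')}/api/v1/logstream/{stream}"
    req = urllib.request.Request(url, method=method)
    req.add_header("Authorization", basic_auth(user, password))
    return req
```

The returned request can be passed to `urllib.request.urlopen`. Keeping request construction separate from sending makes the URL and headers easy to inspect before anything goes over the network.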
- Community-maintained Prometheus configurations
- Alternative deployment values
- Reference implementations
- Sample Go microservice with built-in observability
- Multiple endpoints for testing: /healthz, /readyz, /status/*, /delay/*
- Prometheus metrics exposure
- Includes CLI tool (podcli)
- Kubernetes deployment manifests (Kustomize)
- Docker image build configuration
- OpenTelemetry support via Docker Compose
Components:
- cmd/podinfo/ - Main application
- cmd/podcli/ - CLI utility
- deploy/ - Kustomize and Helm deployment overlays
- otel/ - OpenTelemetry integration
- charts/podinfo/ - Helm chart
- Load simulation script for testing monitoring stack
- Tests multiple endpoints on target service
- Real-time statistics collection
- Simulates various HTTP response codes and latencies
- Useful for validating metrics collection and alerting
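The load simulation described above can be approximated in a few lines. This is a minimal sketch, not the repository's generate-traffic.sh: the endpoint paths follow the Podinfo list in this README, while the specific status codes, the delay value, and the function names are illustrative assumptions.

```python
import time
import urllib.error
import urllib.request
from collections import Counter
from itertools import cycle, islice

# Paths follow the Podinfo endpoint list; the concrete status codes and
# the one-second delay are illustrative choices.
PATHS = ["/healthz", "/readyz", "/status/200", "/status/404", "/status/500", "/delay/1"]


def http_status(url, timeout=5.0):
    """Return the status code of a GET request, including 4xx/5xx replies."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code


def generate_traffic(base_url, total, fetch=http_status, paths=PATHS):
    """Send `total` requests round-robin over `paths` and tally the results."""
    codes = Counter()
    latencies = []
    for path in islice(cycle(paths), total):
        start = time.perf_counter()
        code = fetch(base_url.rstrip("/") + path)
        latencies.append(time.perf_counter() - start)
        codes[code] += 1
    avg = sum(latencies) / len(latencies) if latencies else 0.0
    return {"requests": total, "codes": dict(codes), "avg_latency_s": avg}
```

For example, `generate_traffic("http://podinfo-service:8080", 60)` would issue 60 requests and return the per-status counts. Connection failures are not handled here and would raise `urllib.error.URLError`.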
- Grafana Meta Monitoring: Monitor the monitoring stack itself
- Observability of Prometheus, Grafana, and Loki
- MinIO integration for storage metrics
- Self-monitoring capabilities
Installation: meta-install.sh
- Combined deployment of Prometheus, Grafana, and Loki
- Alloy configuration for unified data collection
- Prometheus scrape configurations with MinIO integration
- Prometheus adapter for custom metrics
- Loki gateway with authentication
- Fluent Bit integration for log forwarding
Key Files:
- alloy.yaml - Alloy data collection configuration
- loki-single.yaml - Single-instance Loki setup
- prometheus-adapter.yaml - Prometheus adapter rules
- prometheus-minio-scraps.yaml - MinIO scrape configuration
- fluent-bit-loki-values.yaml - Fluent Bit values
- Multi-Tenant Observability: Prometheus metrics, Loki logs, and Grafana dashboards
- Cloud-Native: Fully containerized with Kubernetes integration
- Scalable Storage: MinIO provides S3-compatible object storage
- Log Processing: Fluent Bit for lightweight log collection and filtering
- Alerting: AlertManager with multi-channel routing
- Visualization: Grafana dashboards with multiple data source support
- Monitoring the Monitor: Meta-monitoring for stack health
- Traffic Generation: Built-in load testing capabilities
- Sample Application: Podinfo for testing and validation
./grafana-prom-loki.sh # Deploys complete stack

cd minio-fluent-bit
# Configure and deploy Fluent Bit with MinIO backend

cd minio-parseable
# Deploy Parseable with MinIO S3 storage

cd meta
./meta-install.sh # Monitor the monitoring stack

./generate-traffic.sh http://podinfo-service:8080

- Metrics Path: Application → Prometheus → AlertManager → Grafana
- Logs Path: Application → Fluent Bit → Loki/Elasticsearch/Parseable → Grafana
- Storage: MinIO provides S3-compatible backend for multiple systems
- Collection: Alloy unifies metrics and logs collection
- Visualization: Grafana as central dashboard platform
- values-kube-prometheus-stack.yaml - Kube-prometheus-stack Helm values
- custom_kube_prometheus_stack.yml - Custom Prometheus configuration
- debug-pod.yaml - Debugging pod for troubleshooting
- loki-microservice.yaml - Loki microservice deployment
- podinfo-servicemonitor.yaml - ServiceMonitor for Podinfo application
- Kubernetes cluster (1.20+)
- Helm 3.0+
- kubectl configured to access cluster
- Sufficient storage for log and metric retention
- Review and customize Helm values for your environment
- Deploy core monitoring stack: ./grafana-prom-loki.sh
- Deploy log aggregation solution of choice (Fluent Bit, Elastic, or Parseable)
- Deploy sample application (Podinfo) for testing
- Access Grafana and configure dashboards
- Generate traffic for testing: ./generate-traffic.sh
- Configure alerts and notification channels in AlertManager
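Once the steps above are done, one way to confirm that metrics are actually being scraped is to ask Prometheus for the `up` series through its HTTP query API. The sketch below assumes Prometheus is reachable at a base URL of your choosing (for example via a port-forward to localhost:9090); the function names are illustrative and not part of this repository.

```python
import json
import urllib.parse
import urllib.request


def query_url(base_url, promql):
    """Build an instant-query URL for the Prometheus HTTP API."""
    params = urllib.parse.urlencode({"query": promql})
    return f"{base_url.rstrip('/')}/api/v1/query?{params}"


def down_targets(payload):
    """Return (job, instance) pairs whose `up` sample is 0."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    down = []
    for series in payload["data"]["result"]:
        _timestamp, value = series["value"]
        if float(value) == 0.0:
            labels = series["metric"]
            down.append((labels.get("job", ""), labels.get("instance", "")))
    return down


def fetch_down_targets(base_url, timeout=5.0):
    """Query `up` and return the targets Prometheus currently reports as down."""
    with urllib.request.urlopen(query_url(base_url, "up"), timeout=timeout) as resp:
        return down_targets(json.load(resp))
```

An empty list from `fetch_down_targets("http://localhost:9090")` suggests every discovered target is being scraped successfully. It does not show that the expected targets were discovered in the first place.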
- MinIO: S3-compatible object storage (default for Fluent Bit, Parseable)
- Elasticsearch: Full-text search and analytics (EFK stack)
- Loki: Lightweight, horizontally scalable log aggregation
- Prometheus: Time-series database for metrics (TSDB)
- ✅ Metrics collection (Prometheus)
- ✅ Log aggregation (Loki/Elasticsearch/Parseable)
- ✅ Visualization (Grafana)
- ✅ Alerting (AlertManager)
- ✅ Log processing (Fluent Bit)
- ✅ Data collection (Alloy)
- ✅ Stack monitoring (Meta Monitoring)
- ✅ Health checks & readiness probes
- ✅ Traffic simulation & testing
Repository Structure: Modular organization with separate namespaces and deployment directories for easy customization and isolation.