# Monitoring
This page describes the monitoring and observability setup for the Cloud-Native E-commerce Platform.
## Monitoring Stack

Our platform uses a comprehensive monitoring stack to ensure visibility into system performance, errors, and behavior:
- Elasticsearch: Log storage and indexing
- Kibana: Log visualization and analysis
- Prometheus: Metrics collection
- Grafana: Metrics visualization and dashboards
- Jaeger: Distributed tracing
- Kiali: Service mesh visualization (when using Istio)
## Accessing Monitoring Tools

After deployment, you can access the monitoring tools at:
| Tool | URL | Credentials |
|---|---|---|
| Prometheus | http://localhost:9090 | - |
| Grafana | http://localhost:3000 | admin/prom-operator |
| Kibana | http://localhost:5601 | - |
| Jaeger | http://localhost:16686 | - |
| Kiali | http://localhost:20001 | - |
## Logging

We use the Elastic Stack (ELK) for centralized logging:
- Application logs: Generated by services using Serilog
- Log shipping: Logs are sent to Elasticsearch
- Log storage: Elasticsearch stores and indexes logs
- Log visualization: Kibana provides dashboards and search
Each microservice is configured to use Serilog with structured logging:
```csharp
// Program.cs
public static IHostBuilder CreateHostBuilder(string[] args) =>
    Host.CreateDefaultBuilder(args)
        .UseSerilog((context, config) =>
        {
            config
                .ReadFrom.Configuration(context.Configuration)
                .Enrich.FromLogContext()
                .Enrich.WithMachineName()
                .WriteTo.Console()
                .WriteTo.Elasticsearch(new ElasticsearchSinkOptions(new Uri(context.Configuration["ElasticConfiguration:Uri"]))
                {
                    AutoRegisterTemplate = true,
                    IndexFormat = $"{context.Configuration["ApplicationName"]}-logs-{context.HostingEnvironment.EnvironmentName?.ToLower().Replace(".", "-")}-{DateTime.UtcNow:yyyy-MM}"
                });
        });
```
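The configuration keys referenced above (`ElasticConfiguration:Uri` and `ApplicationName`) would typically come from `appsettings.json`. A minimal sketch, where the service name and Elasticsearch host are illustrative assumptions:

```json
{
  "ApplicationName": "ordering-service",
  "ElasticConfiguration": {
    "Uri": "http://elasticsearch:9200"
  },
  "Serilog": {
    "MinimumLevel": {
      "Default": "Information"
    }
  }
}
```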
We use the following log levels:
- Verbose: Detailed debugging information
- Debug: Debugging information
- Information: General information
- Warning: Non-critical issues
- Error: Errors that need attention
- Fatal: Critical errors that cause application failure
To view logs:

1. Open Kibana at http://localhost:5601
2. Navigate to "Discover" in the left sidebar
3. Create an index pattern matching your service logs
4. Use the search bar to filter logs by service, level, or content
## Metrics

We use Prometheus for metrics collection:
- Application metrics: Exposed by services using Prometheus .NET Client
- Infrastructure metrics: Collected by Prometheus Node Exporter
- Kubernetes metrics: Collected by kube-state-metrics
We monitor the following key metrics:
- Request Rate: Requests per second by service
- Error Rate: Errors per second by service
- Latency: Response time percentiles (p50, p90, p99)
- CPU Usage: CPU usage by service and node
- Memory Usage: Memory usage by service and node
- Disk Usage: Disk usage by node
- Network Traffic: Network I/O by service and node
Prometheus is configured to scrape metrics from:
- Kubernetes API server
- Kubernetes nodes
- Microservices (via annotations)
- Service mesh (when using Istio)
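Annotation-based scraping of a microservice might look like the following sketch on a pod template (the deployment name, port, and path are assumptions; adjust to your services):

```yaml
# Deployment pod template (fragment) -- opts this pod in to Prometheus scraping
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ordering-service
spec:
  template:
    metadata:
      annotations:
        prometheus.io/scrape: "true"   # enable scraping for this pod
        prometheus.io/port: "80"       # port serving the metrics endpoint
        prometheus.io/path: "/metrics" # metrics endpoint path
```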
To query metrics:

1. Open Prometheus at http://localhost:9090
2. Use the "Expression" field to query metrics
3. View graphs or tables of results
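As an illustration, queries along these lines cover the key metrics listed above. The metric and label names are assumptions based on typical .NET HTTP instrumentation; your names depend on how services are instrumented and relabeled:

```promql
# Request rate per service over the last 5 minutes
sum(rate(http_requests_received_total[5m])) by (service)

# Error rate: 5xx responses per second by service
sum(rate(http_requests_received_total{code=~"5.."}[5m])) by (service)

# p99 latency from a request-duration histogram
histogram_quantile(0.99,
  sum(rate(http_request_duration_seconds_bucket[5m])) by (le, service))
```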
## Grafana Dashboards

We provide pre-configured Grafana dashboards:
- Platform Overview: High-level system health
- Microservices: Detailed service metrics
- Node Resources: Infrastructure metrics
- API Gateway: Gateway-specific metrics
- Database Performance: Database metrics
To access dashboards:

1. Open Grafana at http://localhost:3000
2. Log in with admin/prom-operator
3. Navigate to "Dashboards" in the left sidebar
## Distributed Tracing

We use Jaeger for distributed tracing:
- Trace generation: Services use OpenTelemetry to generate traces
- Trace collection: Jaeger Collector receives traces
- Trace storage: Jaeger stores traces
- Trace visualization: Jaeger UI provides trace analysis
Each microservice is configured to send traces to Jaeger:
```csharp
// Program.cs (ConfigureServices)
services.AddOpenTelemetryTracing(builder =>
{
    builder
        .SetResourceBuilder(ResourceBuilder.CreateDefault().AddService(Configuration["ApplicationName"]))
        .AddAspNetCoreInstrumentation()
        .AddHttpClientInstrumentation()
        .AddSource("MediatR")
        .AddJaegerExporter(options =>
        {
            options.AgentHost = Configuration["Jaeger:AgentHost"];
            options.AgentPort = int.Parse(Configuration["Jaeger:AgentPort"]);
        });
});
```
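The `Jaeger:AgentHost` and `Jaeger:AgentPort` settings above might be supplied via configuration like this sketch. The host name is an assumption for an in-cluster Jaeger agent; 6831 is the standard Jaeger agent UDP port for compact-thrift spans:

```json
{
  "Jaeger": {
    "AgentHost": "jaeger-agent",
    "AgentPort": "6831"
  }
}
```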
To view traces:

1. Open Jaeger at http://localhost:16686
2. Select a service from the dropdown
3. Configure search parameters
4. View and analyze traces
## Service Mesh Observability

When using Istio service mesh:
- Open Kiali at http://localhost:20001
- View service mesh topology
- Analyze traffic flow between services
- Monitor service health
Istio provides additional metrics:
- Request Volume: Requests per second by service
- Success Rate: Percentage of successful requests
- Latency: Response time by service
- TCP Traffic: TCP metrics by service
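These mesh-level metrics can also be queried in Prometheus. For instance, using Istio's standard `istio_requests_total` metric (label names follow Istio's conventions; verify against your mesh version):

```promql
# Request volume by destination service
sum(rate(istio_requests_total[5m])) by (destination_service_name)

# Success rate: share of non-5xx requests per destination service
sum(rate(istio_requests_total{response_code!~"5.."}[5m])) by (destination_service_name)
  / sum(rate(istio_requests_total[5m])) by (destination_service_name)
```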
## Alerting

We use Prometheus Alertmanager for alerts:
- Alert rules: Defined in Prometheus
- Alert processing: Handled by Alertmanager
- Notifications: Sent via configured channels (email, Slack, etc.)
We have pre-configured alerts for:
- Service Down: When a service is not responding
- High Error Rate: When error rate exceeds threshold
- High Latency: When response time exceeds threshold
- Resource Saturation: When CPU/memory usage is high
- Disk Space Low: When disk space is running out
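As a sketch, the High Error Rate alert could be defined as a Prometheus rule like this (the metric name, 5% threshold, and labels are assumptions to adapt to your setup):

```yaml
# Prometheus rule file (fragment)
groups:
  - name: platform-alerts
    rules:
      - alert: HighErrorRate
        expr: |
          sum(rate(http_requests_received_total{code=~"5.."}[5m])) by (service)
            / sum(rate(http_requests_received_total[5m])) by (service) > 0.05
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Error rate above 5% for {{ $labels.service }}"
```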
## Health Checks

Each microservice implements health checks:
- Liveness: Verifies service is running
- Readiness: Verifies service can handle requests
- Startup: Verifies service has started correctly
Kubernetes uses these health checks to manage container lifecycle.
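The corresponding probe configuration on a container might look like this sketch. The endpoint paths and timings are assumptions; ASP.NET Core health checks are commonly mapped to paths such as `/health/live` and `/health/ready`:

```yaml
# Container spec (fragment)
livenessProbe:            # restart the container if this fails
  httpGet:
    path: /health/live
    port: 80
  periodSeconds: 10
readinessProbe:           # remove the pod from service endpoints if this fails
  httpGet:
    path: /health/ready
    port: 80
  periodSeconds: 5
startupProbe:             # give the service time to start before liveness kicks in
  httpGet:
    path: /health/live
    port: 80
  failureThreshold: 30
  periodSeconds: 10
```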
## Custom Metrics

To add custom metrics:

1. Add Prometheus metrics to your service:

   ```csharp
   // Define metrics
   private static readonly Counter OrdersProcessed = Metrics.CreateCounter(
       "orders_processed_total",
       "Number of processed orders");

   // Use metrics
   OrdersProcessed.Inc();
   ```

2. Ensure your service exposes a metrics endpoint at `/metrics`
3. Add scrape configuration to Prometheus
4. Create dashboards in Grafana for your metrics
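If you are not relying on annotation-based discovery, the scrape configuration could be a static job like this sketch (the job name and target address are assumptions):

```yaml
# prometheus.yml (fragment)
scrape_configs:
  - job_name: ordering-service
    metrics_path: /metrics
    static_configs:
      - targets: ["ordering-service:80"]
```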