
Monitoring and Observability

This page describes the monitoring and observability setup for the Cloud-Native E-commerce Platform.

Monitoring Stack

Our platform uses a comprehensive monitoring stack to ensure visibility into system performance, errors, and behavior:

  • Elasticsearch: Log storage and indexing
  • Kibana: Log visualization and analysis
  • Prometheus: Metrics collection
  • Grafana: Metrics visualization and dashboards
  • Jaeger: Distributed tracing
  • Kiali: Service mesh visualization (when using Istio)

Accessing Monitoring Tools

After deployment, you can access the monitoring tools at:

Tool         URL                      Credentials
Prometheus   http://localhost:9090    -
Grafana      http://localhost:3000    admin/prom-operator
Kibana       http://localhost:5601    -
Jaeger       http://localhost:16686   -
Kiali        http://localhost:20001   -

Logging

Logging Infrastructure

We use the Elastic Stack (ELK) for centralized logging:

  1. Application logs: Generated by services using Serilog
  2. Log shipping: Logs are sent to Elasticsearch
  3. Log storage: Elasticsearch stores and indexes logs
  4. Log visualization: Kibana provides dashboards and search

Log Configuration

Each microservice is configured to use Serilog with structured logging:

// Program.cs
public static IHostBuilder CreateHostBuilder(string[] args) =>
    Host.CreateDefaultBuilder(args)
        .UseSerilog((context, config) =>
        {
            config
                // Read log levels and overrides from appsettings
                .ReadFrom.Configuration(context.Configuration)
                .Enrich.FromLogContext()
                .Enrich.WithMachineName()
                .WriteTo.Console()
                // Ship structured logs to Elasticsearch; one index per service, environment, and month
                .WriteTo.Elasticsearch(new ElasticsearchSinkOptions(new Uri(context.Configuration["ElasticConfiguration:Uri"]))
                {
                    AutoRegisterTemplate = true,
                    IndexFormat = $"{context.Configuration["ApplicationName"]}-logs-{context.HostingEnvironment.EnvironmentName?.ToLower().Replace(".", "-")}-{DateTime.UtcNow:yyyy-MM}"
                });
        });

Log Levels

We use the following log levels:

  • Verbose: Highly detailed tracing output, normally enabled only while diagnosing a specific problem
  • Debug: Internal state and control flow useful during development
  • Information: Normal application events, such as a request handled or an order created
  • Warning: Unexpected but non-critical conditions
  • Error: Failures that need attention
  • Fatal: Critical errors that cause application failure
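For illustration, here is a minimal sketch of how a service might write structured events at these levels with Serilog (the class and property names are hypothetical):

// Hypothetical handler writing structured events at different levels
using System;
using Serilog;

public class OrderHandler
{
    public void Handle(string orderId, int itemCount)
    {
        Log.Debug("Validating order {OrderId}", orderId);
        Log.Information("Order {OrderId} accepted with {ItemCount} items", orderId, itemCount);

        if (itemCount == 0)
        {
            Log.Warning("Order {OrderId} contains no items", orderId);
        }

        try
        {
            // ... business logic ...
        }
        catch (Exception ex)
        {
            Log.Error(ex, "Failed to process order {OrderId}", orderId);
            throw;
        }
    }
}

Named properties such as {OrderId} are indexed as fields by the Elasticsearch sink, so they can be filtered on directly in Kibana.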

Viewing Logs in Kibana

  1. Open Kibana at http://localhost:5601
  2. Navigate to "Discover" in the left sidebar
  3. Create an index pattern matching your service logs (for example, *-logs-*, which matches the index format configured in Serilog above)
  4. Use the search bar to filter logs by service, level, or content

Metrics

Metrics Collection

We use Prometheus for metrics collection:

  1. Application metrics: Exposed by services using Prometheus .NET Client (a minimal sketch follows this list)
  2. Infrastructure metrics: Collected by Prometheus Node Exporter
  3. Kubernetes metrics: Collected by kube-state-metrics
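A minimal sketch, assuming the prometheus-net.AspNetCore package (the usual .NET Prometheus client), of how a service can expose its metrics:

// Program.cs (minimal hosting model) - sketch only
using Prometheus;

var builder = WebApplication.CreateBuilder(args);
var app = builder.Build();

app.UseHttpMetrics(); // record count and duration for every HTTP request
app.MapMetrics();     // expose the /metrics endpoint that Prometheus scrapes

app.MapGet("/", () => "OK"); // placeholder for the service's own endpoints
app.Run();

Prometheus then scrapes the /metrics endpoint using the scrape configuration described below.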

Key Metrics

We monitor the following key metrics:

  • Request Rate: Requests per second by service
  • Error Rate: Errors per second by service
  • Latency: Response time percentiles (p50, p90, p99)
  • CPU Usage: CPU usage by service and node
  • Memory Usage: Memory usage by service and node
  • Disk Usage: Disk usage by node
  • Network Traffic: Network I/O by service and node

Prometheus Configuration

Prometheus is configured to scrape metrics from:

  • Kubernetes API server
  • Kubernetes nodes
  • Microservices (via annotations)
  • Service mesh (when using Istio)

Viewing Metrics in Prometheus

  1. Open Prometheus at http://localhost:9090
  2. Use the "Expression" field to query metrics
  3. View graphs or tables of results

Grafana Dashboards

We provide pre-configured Grafana dashboards:

  1. Platform Overview: High-level system health
  2. Microservices: Detailed service metrics
  3. Node Resources: Infrastructure metrics
  4. API Gateway: Gateway-specific metrics
  5. Database Performance: Database metrics

To access dashboards:

  1. Open Grafana at http://localhost:3000
  2. Log in with admin/prom-operator
  3. Navigate to "Dashboards" in the left sidebar

Distributed Tracing

Tracing Infrastructure

We use Jaeger for distributed tracing:

  1. Trace generation: Services use OpenTelemetry to generate traces
  2. Trace collection: Jaeger Collector receives traces
  3. Trace storage: Jaeger stores traces
  4. Trace visualization: Jaeger UI provides trace analysis

Trace Configuration

Each microservice is configured to send traces to Jaeger:

// Program.cs
services.AddOpenTelemetryTracing(builder =>
{
    builder
        // Tag every span with the service name so traces can be filtered per service in Jaeger
        .SetResourceBuilder(ResourceBuilder.CreateDefault().AddService(Configuration["ApplicationName"]))
        .AddAspNetCoreInstrumentation()  // spans for incoming HTTP requests
        .AddHttpClientInstrumentation()  // spans for outgoing HTTP calls
        .AddSource("MediatR")            // also subscribe to the "MediatR" activity source
        .AddJaegerExporter(options =>
        {
            // Jaeger agent address comes from configuration
            options.AgentHost = Configuration["Jaeger:AgentHost"];
            options.AgentPort = int.Parse(Configuration["Jaeger:AgentPort"]);
        });
});
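The AddSource call subscribes the tracer to a named ActivitySource. A service can emit its own spans the same way; in this hypothetical sketch the source name OrderingService is illustrative and would also need to be registered via .AddSource("OrderingService") in the tracing configuration above:

// Hypothetical custom span via System.Diagnostics.ActivitySource
using System.Diagnostics;

public class OrderFulfillmentService
{
    private static readonly ActivitySource Source = new("OrderingService");

    public void Fulfill(string orderId)
    {
        // StartActivity returns null when no listener (such as OpenTelemetry) is attached
        using var activity = Source.StartActivity("FulfillOrder");
        activity?.SetTag("order.id", orderId);

        // ... fulfillment logic; spans from instrumented HttpClient calls nest under this one ...
    }
}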

Viewing Traces in Jaeger

  1. Open Jaeger at http://localhost:16686
  2. Select a service from the dropdown
  3. Configure search parameters
  4. View and analyze traces

Service Mesh Monitoring

The following applies when the platform is deployed with the Istio service mesh:

Kiali for Service Mesh Visualization

  1. Open Kiali at http://localhost:20001
  2. View service mesh topology
  3. Analyze traffic flow between services
  4. Monitor service health

Istio Metrics

Istio provides additional metrics:

  • Request Volume: Requests per second by service
  • Success Rate: Percentage of successful requests
  • Latency: Response time by service
  • TCP Traffic: TCP metrics by service

Alerts and Notifications

Alert Configuration

We use Prometheus Alertmanager for alerts:

  1. Alert rules: Defined in Prometheus
  2. Alert processing: Handled by Alertmanager
  3. Notifications: Sent via configured channels (email, Slack, etc.)

Key Alerts

We have pre-configured alerts for:

  • Service Down: When a service is not responding
  • High Error Rate: When error rate exceeds threshold
  • High Latency: When response time exceeds threshold
  • Resource Saturation: When CPU/memory usage is high
  • Disk Space Low: When disk space is running out

Health Checks

Each microservice implements health checks:

  1. Liveness: Verifies service is running
  2. Readiness: Verifies service can handle requests
  3. Startup: Verifies service has started correctly

Kubernetes uses these health checks to manage container lifecycle.
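A minimal sketch, assuming the standard ASP.NET Core health checks package, of how a service might expose separate liveness and readiness endpoints (the paths and check names are illustrative):

// Program.cs - sketch of liveness/readiness endpoints for Kubernetes probes
using Microsoft.AspNetCore.Diagnostics.HealthChecks;
using Microsoft.Extensions.Diagnostics.HealthChecks;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddHealthChecks()
    // Readiness would also register checks for dependencies (database, message broker, ...)
    .AddCheck("self", () => HealthCheckResult.Healthy(), tags: new[] { "live" });

var app = builder.Build();

// Liveness: the process is up and able to respond
app.MapHealthChecks("/health/live", new HealthCheckOptions
{
    Predicate = check => check.Tags.Contains("live")
});

// Readiness: every registered check (including dependency checks) must pass
app.MapHealthChecks("/health/ready");

app.Run();

The liveness, readiness, and startup probes in the Kubernetes deployment manifests would then point at these paths.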

Custom Monitoring

To add custom metrics:

  1. Add Prometheus metrics to your service:
// Define metrics
private static readonly Counter OrdersProcessed = Metrics.CreateCounter(
    "orders_processed_total", 
    "Number of processed orders");

// Use metrics
OrdersProcessed.Inc();
  2. Ensure your service exposes a metrics endpoint at /metrics
  3. Add scrape configuration to Prometheus
  4. Create dashboards in Grafana for your metrics
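Counters cover event totals; for duration-style metrics, such as the latency percentiles listed earlier, a histogram is usually a better fit. A hypothetical sketch using the same prometheus-net API (the metric and method names are illustrative):

// Hypothetical latency metric using a prometheus-net Histogram
using System;
using Prometheus;

public class OrderMetrics
{
    private static readonly Histogram OrderProcessingDuration = Metrics.CreateHistogram(
        "order_processing_duration_seconds",
        "Time taken to process an order");

    public void Measure(Action processOrder)
    {
        // NewTimer() observes the elapsed time when it is disposed
        using (OrderProcessingDuration.NewTimer())
        {
            processOrder();
        }
    }
}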