Skip to content

Feature Request: Storage Backend Connectivity Health Check #22431

@jarqvi

Description

@jarqvi

Is your feature request related to a problem? Please describe.

Currently, Harbor does not actively monitor the health or connectivity of its configured storage backend (e.g., S3, Azure Blob, GCS, filesystem).
If a network issue, DNS problem, expired credentials, or increased latency occurs, Harbor only detects the problem during a push, pull, or replication operation.
This can lead to user-facing failures and makes it difficult for administrators to proactively address backend connectivity issues.


Describe the solution you'd like

I propose adding a periodic storage backend health check feature directly in Harbor.
This would periodically test connectivity from Harbor to the storage backend (e.g., a HEAD request for S3 or a small read test for filesystem).

Expected behavior:

  • Display the backend status in the /api/v2.0/health endpoint.
  • Expose Prometheus metrics such as:
    • harbor_storage_backend_up
    • harbor_storage_backend_latency_seconds
  • Optionally log or trigger alerts if connectivity fails.

This approach allows administrators to detect connectivity issues from Harbor's perspective before they impact user operations.


Describe the main design/architecture of your solution

The proposed solution introduces a Storage Health Check Module inside Harbor Core.

Key Components

  1. Health Check Module

    • Runs as a background job within Harbor Core.
    • Periodically performs a lightweight connectivity test to the configured storage backend (e.g., HEAD request for S3 or small read/write for filesystem).
    • Records the result, status, and latency.
  2. API Integration

    • Reports the backend status in /api/v2.0/health alongside other Harbor components.
  3. Prometheus Integration

    • Exposes metrics such as harbor_storage_backend_up and harbor_storage_backend_latency_seconds for monitoring.
  4. Configuration

    • All settings (enable/disable, interval, timeout) are configurable via harbor.yml.

High-Level Flow


[Harbor Core]
|
v
[Storage Health Check Module] ---> [Backend Storage (S3 / FS)]
|
+--> /api/v2.0/health
+--> Prometheus metrics

This architecture ensures that Harbor can proactively monitor backend connectivity, report health status, and integrate seamlessly with existing monitoring tools.


Describe the development plan you've considered

Phase 1:

  • Implement the basic probe module and report results in /api/v2.0/health.

Phase 2:

  • Add Prometheus metrics and optional alerting.
  • Add configuration options in harbor.yml (interval, timeout, enable/disable).

Phase 3 (optional):

  • Display backend health status in the Harbor UI (System Health page).
  • Support advanced settings per backend type.

This feature can be implemented incrementally without major architectural changes.


Additional context

This feature will help administrators proactively detect issues such as:

  • Expired or invalid credentials
  • Network or DNS failures
  • Cloud provider endpoint downtime
  • Latency or timeout problems accessing storage

Example /api/v2.0/health output:

{
  "storage": {
    "type": "s3",
    "status": "healthy",
    "latency_ms": 42,
    "last_checked": "2025-10-08T12:45:00Z"
  }
}

Metadata

Metadata

Assignees

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions