Description
Is your feature request related to a problem? Please describe.
Currently, Harbor does not actively monitor the health or connectivity of its configured storage backend (e.g., S3, Azure Blob, GCS, filesystem).
If a network issue, DNS problem, expired credentials, or increased latency occurs, Harbor only detects the problem during a push, pull, or replication operation.
This can lead to user-facing failures and makes it difficult for administrators to proactively address backend connectivity issues.
Describe the solution you'd like
I propose adding a periodic storage backend health check feature directly in Harbor.
This would periodically test connectivity from Harbor to the storage backend (e.g., a HEAD request for S3 or a small read test for filesystem).
Expected behavior:
- Display the backend status in the /api/v2.0/health endpoint.
- Expose Prometheus metrics such as:
  - harbor_storage_backend_up
  - harbor_storage_backend_latency_seconds
- Optionally log or trigger alerts if connectivity fails.
This approach allows administrators to detect connectivity issues from Harbor's perspective before they impact user operations.
Describe the main design/architecture of your solution
The proposed solution introduces a Storage Health Check Module inside Harbor Core.
Key Components
- Health Check Module
  - Runs as a background job within Harbor Core.
  - Periodically performs a lightweight connectivity test to the configured storage backend (e.g., HEAD request for S3 or small read/write for filesystem).
  - Records the result, status, and latency.
- API Integration
  - Reports the backend status in /api/v2.0/health alongside other Harbor components.
- Prometheus Integration
  - Exposes metrics such as harbor_storage_backend_up and harbor_storage_backend_latency_seconds for monitoring.
- Configuration
  - All settings (enable/disable, interval, timeout) are configurable via harbor.yml.
High-Level Flow
[Harbor Core]
      |
      v
[Storage Health Check Module] ---> [Backend Storage (S3 / FS)]
      |
      +--> /api/v2.0/health
      +--> Prometheus metrics
This architecture ensures that Harbor can proactively monitor backend connectivity, report health status, and integrate seamlessly with existing monitoring tools.
Describe the development plan you've considered
Phase 1:
- Implement the basic probe module and report results in /api/v2.0/health.
Phase 2:
- Add Prometheus metrics and optional alerting.
- Add configuration options in harbor.yml (interval, timeout, enable/disable).
Phase 3 (optional):
- Display backend health status in the Harbor UI (System Health page).
- Support advanced settings per backend type.
This feature can be implemented incrementally without major architectural changes.
Additional context
This feature will help administrators proactively detect issues such as:
- Expired or invalid credentials
- Network or DNS failures
- Cloud provider endpoint downtime
- Latency or timeout problems accessing storage
Example /api/v2.0/health output:
{
  "storage": {
    "type": "s3",
    "status": "healthy",
    "latency_ms": 42,
    "last_checked": "2025-10-08T12:45:00Z"
  }
}