Issue Description
The current /health endpoint in cognee/api/client.py only returns a basic 200 status code without checking the actual health of backend components. For production deployments, container orchestration, and monitoring systems, we need a comprehensive health check that verifies all critical backend services are accessible and functioning properly.
Current State
Existing Health Endpoint (/health):
@app.get("/health")
def health_check():
    """Health check endpoint that returns the server status."""
    return Response(status_code=200)
Problems:
- No actual health verification of backend services
- Cannot detect database connectivity issues
- Cannot identify LLM provider failures
- No differentiation between critical and non-critical failures
- Limited monitoring and observability data
- No startup readiness verification
Requirements
1. Backend Components to Health Check
Critical Services (failure should return 503):
- Relational Database: SQLite/PostgreSQL connectivity and schema validation
- Vector Database: LanceDB/Qdrant/PGVector/FalkorDB/ChromaDB connectivity
- Graph Database: Kuzu/Neo4j/FalkorDB/Memgraph connectivity and schema validation
- File Storage: Local filesystem/S3 accessibility and permissions
Non-Critical Services (failure should return 200 with warnings):
- LLM Provider: OpenAI/Ollama/Anthropic/Custom/Gemini API connectivity
- Embedding Service: Embedding engine responsiveness
- Cloud Storage: S3/Azure/GCS extended connectivity (if configured)
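To make the critical vs. non-critical rule concrete, here is a minimal sketch of how component results could be folded into an overall status and HTTP code. `ComponentResult` and `overall_status` are hypothetical names, not existing cognee code, and mapping "200 with warnings" to the `degraded` status is an assumption based on the response format proposed below.

```python
from dataclasses import dataclass


@dataclass
class ComponentResult:
    """Outcome of one backend check (hypothetical structure)."""
    name: str
    healthy: bool
    critical: bool


def overall_status(results: list[ComponentResult]) -> tuple[str, int]:
    """Map component results to (overall status, HTTP status code)."""
    # Any failed critical service makes the whole instance unhealthy.
    if any(r.critical and not r.healthy for r in results):
        return "unhealthy", 503
    # Only non-critical failures: still serve traffic, but flag it.
    if any(not r.healthy for r in results):
        return "degraded", 200
    return "healthy", 200


results = [
    ComponentResult("relational_db", healthy=True, critical=True),
    ComponentResult("vector_db", healthy=True, critical=True),
    ComponentResult("llm_provider", healthy=False, critical=False),
]
print(overall_status(results))  # ('degraded', 200)
```

With this rule an LLM provider outage never takes an instance out of rotation, while a lost relational database does.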
2. Health Check Endpoints
Primary Endpoints:
- GET /health: Basic liveness probe (existing, enhanced)
- GET /health/ready: Readiness probe for Kubernetes
- GET /health/detailed: Comprehensive health status with component details
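One possible split between the liveness and readiness probes is sketched below as plain functions returning `(status_code, body)`. The FastAPI wiring (`@app.get(...)`) is left out. Everything here is an assumption rather than a decided design: the `checks` registry, the `failed` key, the idea that liveness makes no backend calls, and the idea that readiness runs only the critical checks. GET /health/detailed would return the full payload described under Response Format.

```python
from typing import Callable

# Hypothetical registry entry: (is_critical, zero-argument check returning True when healthy).
Check = Callable[[], bool]


def _passes(check: Check) -> bool:
    """Treat an exception raised by a check as a failure."""
    try:
        return bool(check())
    except Exception:
        return False


def liveness() -> tuple[int, dict]:
    """GET /health: the process is up; no backend calls (assumption)."""
    return 200, {"status": "healthy"}


def readiness(checks: dict[str, tuple[bool, Check]]) -> tuple[int, dict]:
    """GET /health/ready: run only the critical checks (assumption)."""
    failed = [
        name
        for name, (critical, check) in checks.items()
        if critical and not _passes(check)
    ]
    if failed:
        return 503, {"status": "unhealthy", "failed": failed}
    return 200, {"status": "healthy"}
```

Keeping liveness free of backend calls avoids Kubernetes restarting a pod just because a database is temporarily unreachable; the readiness probe handles that case by taking the pod out of rotation instead.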
Response Format:
{
  "status": "healthy|degraded|unhealthy",
  "timestamp": "2024-01-15T10:30:45Z",
  "version": "1.0.0",
  "uptime": 3600,
  "components": {
    "relational_db": {
      "status": "healthy|unhealthy",
      "provider": "sqlite|postgres",
      "response_time_ms": 45,
      "details": "Connection successful"
    },
    "vector_db": {
      "status": "healthy|unhealthy",
      "provider": "lancedb|qdrant|pgvector|falkordb|chromadb",
      "response_time_ms": 120,
      "details": "Index accessible"
    },
    "graph_db": {
      "status": "healthy|unhealthy",
      "provider": "kuzu|neo4j|falkordb|memgraph",
      "response_time_ms": 89,
      "details": "Schema validated"
    },
    "file_storage": {
      "status": "healthy|unhealthy",
      "provider": "local|s3|azure|gcs",
      "response_time_ms": 156,
      "details": "Storage accessible"
    },
    "llm_provider": {
      "status": "healthy|unhealthy|degraded",
      "provider": "openai|ollama|anthropic|custom|gemini",
      "response_time_ms": 1250,
      "details": "API responding"
    },
    "embedding_service": {
      "status": "healthy|unhealthy",
      "provider": "openai|huggingface|custom",
      "response_time_ms": 890,
      "details": "Embedding generation working"
    }
  }
}
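As a rough illustration of how the per-component entries above could be produced, the sketch below times each probe, applies a timeout, and assembles the `timestamp`, `version`, `uptime`, and `components` fields. `run_check`, `build_report`, the 5-second default timeout, and the fake probes are all hypothetical; real probes would call the configured cognee adapters. The top-level `status` field is left out here to keep the example focused on timing.

```python
import asyncio
import time
from datetime import datetime, timezone
from typing import Awaitable, Callable

START_TIME = time.monotonic()  # recorded at import; used for "uptime"

Probe = Callable[[], Awaitable[str]]  # returns a human-readable "details" string


async def run_check(provider: str, probe: Probe, timeout_s: float = 5.0) -> dict:
    """Run one probe with a timeout and report status plus latency."""
    started = time.perf_counter()
    try:
        details = await asyncio.wait_for(probe(), timeout=timeout_s)
        status = "healthy"
    except asyncio.TimeoutError:
        status, details = "unhealthy", f"Timed out after {timeout_s}s"
    except Exception as exc:  # any probe error marks the component unhealthy
        status, details = "unhealthy", f"{type(exc).__name__}: {exc}"
    return {
        "status": status,
        "provider": provider,
        "response_time_ms": round((time.perf_counter() - started) * 1000),
        "details": details,
    }


async def build_report(probes: dict[str, tuple[str, Probe]], version: str) -> dict:
    """Run all probes concurrently and assemble the response body."""
    results = await asyncio.gather(
        *(run_check(provider, probe) for provider, probe in probes.values())
    )
    return {
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "version": version,
        "uptime": int(time.monotonic() - START_TIME),
        "components": dict(zip(probes, results)),
    }


async def fake_sqlite_probe() -> str:
    return "Connection successful"


async def fake_llm_probe() -> str:
    raise ConnectionError("API unreachable")


report = asyncio.run(build_report(
    {
        "relational_db": ("sqlite", fake_sqlite_probe),
        "llm_provider": ("openai", fake_llm_probe),
    },
    version="1.0.0",
))
```

Running the probes with `asyncio.gather` keeps the endpoint latency close to the slowest single check rather than the sum of all of them, which matters when an LLM provider takes over a second to respond.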