Commit 30f1657

feat: analytics retention
1 parent ec7e07f commit 30f1657

20 files changed: +2389 -45 lines changed

API_DOCUMENTATION.md

Lines changed: 106 additions & 0 deletions
@@ -352,6 +352,112 @@ Common cases:
- `404 Not Found` — unknown agent id.
- `500 Internal Server Error` — unexpected backend issues.

## Query Insights

The Query Insights API exposes raw interaction logs and lightweight analytics for downstream processing.

### `GET /v1/insights/queries`

Fetch paginated user queries within a specific window.

- `start_date` _(ISO 8601, required)_ — inclusive lower bound.
- `end_date` _(ISO 8601, required)_ — inclusive upper bound.
- `agent_id` _(optional)_ — filter by agent id when provided.
- `limit` _(default `100`)_ — maximum rows returned.
- `offset` _(default `0`)_ — pagination offset.

**Response** `200 OK`

```json
{
  "items": [
    {
      "id": "ad0c2b34-04ab-4d0a-9855-47c19f0f2830",
      "created_at": "2024-04-01T12:30:45.123456+00:00",
      "agent_id": "cairo-coder",
      "final_user_query": "How do I declare a storage variable in Cairo 1?"
    }
  ],
  "total": 1,
  "limit": 100,
  "offset": 0
}
```
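
As a rough illustration of how a client might page through this endpoint, the sketch below builds request URLs from the documented parameters and computes the offsets needed to cover `total` rows. The base URL and the helper names `build_queries_url` and `page_offsets` are assumptions made for this example; they are not part of the commit.

```python
from urllib.parse import urlencode


def build_queries_url(base_url, start_date, end_date, agent_id=None, limit=100, offset=0):
    """Build a GET /v1/insights/queries URL from the documented parameters."""
    params = {"start_date": start_date, "end_date": end_date}
    if agent_id is not None:
        # agent_id is optional, so it is only sent when the caller provides it.
        params["agent_id"] = agent_id
    params["limit"] = limit
    params["offset"] = offset
    return f"{base_url}/v1/insights/queries?{urlencode(params)}"


def page_offsets(total, limit):
    """Offsets needed to cover `total` rows when each page holds `limit` rows."""
    return list(range(0, total, limit))


# Hypothetical base URL; replace with the real server address.
urls = [
    build_queries_url(
        "http://localhost:3001",
        "2024-04-01T00:00:00Z",
        "2024-04-15T23:59:59Z",
        agent_id="cairo-coder",
        offset=offset,
    )
    for offset in page_offsets(total=250, limit=100)
]
```

A client would typically issue the first request, read `total` from the response, and only then compute the remaining offsets.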

### `POST /v1/insights/analyze`

Trigger an asynchronous analysis job. The response returns immediately with the job identifier; the analysis runs in the background.

#### Request

```json
{
  "start_date": "2024-04-01T00:00:00Z",
  "end_date": "2024-04-15T23:59:59Z",
  "agent_id": "cairo-coder"
}
```

**Response** `202 Accepted`

```json
{
  "analysis_id": "88ed4a1e-1bda-45b9-a3e8-5c6df8b6f1f1",
  "status": "pending"
}
```
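
Because the job runs in the background, a client has to poll for the outcome. The following is a minimal sketch of such a loop, assuming a caller-supplied `fetch_job` callable (hypothetical) that performs `GET /v1/insights/analyses/{analysis_id}` and returns the decoded JSON. It treats a job as finished once `status` is `"completed"` or `error_message` is set; the server may report other states that this sketch does not know about.

```python
import time


def wait_for_analysis(fetch_job, analysis_id, interval=2.0, max_attempts=30, sleep=time.sleep):
    """Poll a job until it completes or reports an error.

    `fetch_job` is a hypothetical callable returning the job as a dict.
    `sleep` is injectable so the loop can be exercised without real delays.
    """
    for attempt in range(max_attempts):
        job = fetch_job(analysis_id)
        finished = job.get("status") == "completed" or job.get("error_message") is not None
        if finished:
            return job
        if attempt < max_attempts - 1:
            sleep(interval)
    raise TimeoutError(f"analysis {analysis_id} not finished after {max_attempts} attempts")
```

The interval and attempt limit are arbitrary defaults; a longer date window will likely need a more generous budget.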

### `GET /v1/insights/analyses`

List recent analysis jobs.

**Response** `200 OK`

```json
[
  {
    "id": "88ed4a1e-1bda-45b9-a3e8-5c6df8b6f1f1",
    "created_at": "2024-04-15T12:00:00+00:00",
    "status": "completed",
    "analysis_parameters": {
      "start_date": "2024-04-01T00:00:00+00:00",
      "end_date": "2024-04-15T23:59:59+00:00",
      "agent_id": "cairo-coder"
    }
  }
]
```

### `GET /v1/insights/analyses/{analysis_id}`

Fetch a specific analysis job. If the job completed successfully, `analysis_result` contains the summarized metrics; otherwise `error_message` explains the failure.

**Response** `200 OK`

```json
{
  "id": "88ed4a1e-1bda-45b9-a3e8-5c6df8b6f1f1",
  "created_at": "2024-04-15T12:00:00+00:00",
  "status": "completed",
  "analysis_parameters": {
    "start_date": "2024-04-01T00:00:00+00:00",
    "end_date": "2024-04-15T23:59:59+00:00",
    "agent_id": "cairo-coder"
  },
  "analysis_result": {
    "total_queries": 42,
    "average_word_count": 18.6,
    "top_terms": [
      ["cairo", 7],
      ["storage", 4]
    ]
  },
  "error_message": null
}
```

If the job id is unknown, the server responds with `404 Not Found`.
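
The analysis code itself is not visible in this excerpt of the commit. To make the `analysis_result` shape concrete, here is one way such metrics could be computed. The function name `summarize_queries`, the lowercase whitespace tokenization, the rounding, and the absence of stop-word filtering are all assumptions, so real results may differ.

```python
from collections import Counter


def summarize_queries(queries, top_n=10):
    """Compute metrics shaped like the `analysis_result` example."""
    # Assumed tokenization: lowercase, split on whitespace.
    word_lists = [query.lower().split() for query in queries]
    total = len(word_lists)
    word_count = sum(len(words) for words in word_lists)
    counts = Counter(word for words in word_lists for word in words)
    return {
        "total_queries": total,
        # Guard against division by zero when the window has no queries.
        "average_word_count": round(word_count / total, 1) if total else 0.0,
        # Lists of [term, count] pairs, matching the JSON example.
        "top_terms": [[term, n] for term, n in counts.most_common(top_n)],
    }


result = summarize_queries(["cairo storage", "cairo felt252 type"])
```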

## Versioning & Compatibility

- Current API version: `1.0.0` (see FastAPI metadata).

python/pyproject.toml

Lines changed: 3 additions & 0 deletions
@@ -142,6 +142,9 @@ strict_optional = true
testpaths = ["tests"]
pythonpath = ["src"]
asyncio_mode = "auto"
markers = [
    "db: marks tests that require a database (run by default, use -m 'not db' to skip)",
]
filterwarnings = [
    "ignore::DeprecationWarning",
    "ignore::PendingDeprecationWarning",

python/src/cairo_coder/core/rag_pipeline.py

Lines changed: 5 additions & 0 deletions
@@ -82,6 +82,11 @@ def __init__(self, config: RagPipelineConfig):
        self._current_processed_query: ProcessedQuery | None = None
        self._current_documents: list[Document] = []

    @property
    def last_retrieved_documents(self) -> list[Document]:
        """Documents retrieved during the most recent pipeline execution."""
        return self._current_documents

    async def _aprocess_query_and_retrieve_docs(
        self,
        query: str,
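
One detail worth noting about the new property: it returns the internal list itself, not a copy. The sketch below uses a hypothetical stand-in class, `PipelineSketch`, that models only this property; it shows that callers who mutate the returned list also mutate the pipeline's state, and that taking a shallow copy avoids this.

```python
class PipelineSketch:
    """Hypothetical stand-in for the pipeline class; only the new property is modelled."""

    def __init__(self):
        self._current_documents = []

    @property
    def last_retrieved_documents(self):
        """Documents retrieved during the most recent pipeline execution."""
        return self._current_documents


pipeline = PipelineSketch()
docs = pipeline.last_retrieved_documents
docs.append("doc-1")  # also changes the pipeline's internal list

snapshot = list(pipeline.last_retrieved_documents)  # shallow copy, safe to keep
snapshot.append("doc-2")  # does not affect the pipeline
```

A caller that stores the documents for later, for example when persisting an interaction record, may prefer the snapshot form so that a subsequent pipeline run cannot change what was captured.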
Lines changed: 31 additions & 0 deletions
@@ -0,0 +1,31 @@
"""
Database utilities for the Cairo Coder server.

This package exposes helpers for initializing the asyncpg connection pool and
provides Pydantic representations used when persisting query insights data.
"""

from .models import QueryAnalysis, UserInteraction
from .repository import (
    create_analysis_job,
    create_user_interaction,
    get_analysis_job_by_id,
    get_analysis_jobs,
    get_interactions,
    update_analysis_job,
)
from .session import close_pool, execute_schema_scripts, get_pool

__all__ = [
    "QueryAnalysis",
    "UserInteraction",
    "create_analysis_job",
    "create_user_interaction",
    "get_analysis_job_by_id",
    "get_analysis_jobs",
    "get_interactions",
    "update_analysis_job",
    "close_pool",
    "execute_schema_scripts",
    "get_pool",
]
Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
"""
Pydantic models representing rows stored in the query insights database tables.
"""

from __future__ import annotations

import uuid
from datetime import datetime, timezone
from typing import Any, Optional

from pydantic import BaseModel, Field


class UserInteraction(BaseModel):
    """Represents a record in the user_interactions table."""

    id: uuid.UUID = Field(default_factory=uuid.uuid4)
    created_at: datetime = Field(default_factory=lambda: datetime.now(timezone.utc))
    agent_id: str
    mcp_mode: bool = False
    chat_history: Optional[list[dict[str, Any]]] = None
    query: str
    generated_answer: Optional[str] = None
    retrieved_sources: Optional[list[dict[str, Any]]] = None
    llm_usage: Optional[dict[str, Any]] = None


class QueryAnalysis(BaseModel):
    """Represents a record in the query_analyses table."""

    id: uuid.UUID = Field(default_factory=uuid.uuid4)
    created_at: datetime = Field(default_factory=lambda: datetime.now(timezone.utc))
    status: str = "pending"
    analysis_parameters: dict[str, Any]
    analysis_result: Optional[dict[str, Any]] = None
    error_message: Optional[str] = None
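
Both models default `created_at` to `datetime.now(timezone.utc)`, which produces a timezone-aware value. The standard-library snippet below, independent of Pydantic, illustrates why that matters: an aware datetime serializes with the `+00:00` offset seen in the API examples, while a naive one carries no offset at all. The sample timestamp is taken from the documentation example in this commit.

```python
from datetime import datetime, timezone

aware = datetime(2024, 4, 1, 12, 30, 45, 123456, tzinfo=timezone.utc)
naive = datetime(2024, 4, 1, 12, 30, 45, 123456)

aware_text = aware.isoformat()  # includes the "+00:00" UTC offset
naive_text = naive.isoformat()  # no offset, so the zone is ambiguous

# Same call the models use as their default_factory.
now_utc = datetime.now(timezone.utc)
```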
