feat: Add Query Insights Infrastructure & Database Layer #87
Summary
This PR introduces a comprehensive query insights infrastructure for Cairo Coder, enabling persistent logging of user interactions and providing API endpoints for analytics. The implementation includes a new database layer with asyncpg, migration tools for historical data from LangSmith, and extensive test coverage.
Key Features:
Architecture
Database Layer (`python/src/cairo_coder/db/`)
New modules:
- `models.py` - Pydantic model for `UserInteraction` with fields for agent_id, query, chat_history, generated_answer, retrieved_sources, and LLM usage metrics
- `repository.py` - Data access layer with functions for creating and querying interactions; includes upsert support for migrations
- `session.py` - Asyncpg connection pool management with per-event-loop pooling to handle FastAPI TestClient and AnyIO edge cases
Database schema:
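The per-event-loop pooling described for `session.py` can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `get_pool`, `close_pool`, and the `create_pool` factory argument are hypothetical names, and in real use the factory would presumably wrap `asyncpg.create_pool(...)`. The idea is that a pool is bound to the loop it was created on, so each loop (for example, the separate loops spun up by test clients) gets its own pool.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict

# Hypothetical registry: one pool per event loop (assumption, not the PR's code).
_pools: Dict[asyncio.AbstractEventLoop, Any] = {}


async def get_pool(create_pool: Callable[[], Awaitable[Any]]) -> Any:
    """Return the pool bound to the running loop, creating it on first use."""
    loop = asyncio.get_running_loop()
    pool = _pools.get(loop)
    if pool is None:
        # Note: two tasks racing here on the same loop could both create a pool;
        # a per-loop lock would be needed to rule that out.
        pool = await create_pool()
        _pools[loop] = pool
    return pool


async def close_pool() -> None:
    """Drop the running loop's pool and close it if it exposes an async close()."""
    pool = _pools.pop(asyncio.get_running_loop(), None)
    if pool is not None and hasattr(pool, "close"):
        await pool.close()
```

Keying the registry by loop object means a pool created under one loop is never handed to code running on another, which is the failure mode that tends to surface with `TestClient` and AnyIO.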
API Endpoints (`python/src/cairo_coder/server/insights_api.py`)
New endpoint:
- `GET /v1/insights/queries` - Paginated query retrieval with filters:
  - `start_date`, `end_date` - Time range filtering (ISO 8601)
  - `agent_id` - Filter by specific agent
  - `query_text` - Text search (case-insensitive)
  - `limit`, `offset` - Pagination controls
Returns JSON with structure:

    {
      "items": [
        {
          "id": "...",
          "created_at": "...",
          "agent_id": "...",
          "query": "...",
          "chat_history": [],
          "output": "..."
        }
      ],
      "total": 123,
      "limit": 100,
      "offset": 0
    }

Server Integration (`python/src/cairo_coder/server/app.py`)
- `BackgroundTasks`
- `RagPipeline` to access retrieved sources for logging
Migration Tools (`python/src/cairo_coder_tools/datasets/migrate_langsmith.py`)
Comprehensive LangSmith migration:
- `UserInteraction` model
Key features:
Dataset Analysis (`python/src/cairo_coder_tools/datasets/analysis.py`)
Testing
- `python/tests/integration/test_insights_api.py`
- `python/tests/unit/test_migrate_langsmith.py`
- `python/tests/unit/db/test_repository.py`
Migration Guide
To migrate historical data from LangSmith:

    cd python
    uv run dataset migrate langsmith --days 14
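
Once interactions are stored, the `GET /v1/insights/queries` endpoint can be paged through with `limit` and `offset`. The sketch below is an assumption-laden illustration rather than a client shipped in this PR: `build_query_url` and `iter_items` are hypothetical helpers, the base URL is whatever host the server runs on, and `fetch_json` is any caller-supplied function that performs the HTTP GET and returns the decoded JSON body.

```python
from typing import Any, Callable, Dict, Iterator, Optional
from urllib.parse import urlencode


def build_query_url(base_url: str, *, limit: int = 100, offset: int = 0,
                    **filters: Optional[str]) -> str:
    """Build the request URL; filters left as None are omitted."""
    # Filter names (start_date, end_date, agent_id, query_text) follow the PR description.
    params: Dict[str, Any] = {k: v for k, v in filters.items() if v is not None}
    params["limit"] = limit
    params["offset"] = offset
    return f"{base_url.rstrip('/')}/v1/insights/queries?{urlencode(params)}"


def iter_items(fetch_json: Callable[[str], Dict[str, Any]], base_url: str, *,
               limit: int = 100, **filters: Optional[str]) -> Iterator[Dict[str, Any]]:
    """Yield every item across pages, using the `total` field to know when to stop."""
    offset = 0
    while True:
        page = fetch_json(build_query_url(base_url, limit=limit, offset=offset, **filters))
        items = page["items"]
        yield from items
        offset += len(items)
        # Stop on an empty page or once all `total` items have been seen.
        if not items or offset >= page["total"]:
            break
```

Passing the fetch function in keeps the sketch independent of any particular HTTP library and makes it easy to exercise against a stub.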