Ragstar is in public beta. Expect rapid changes and occasional rough edges.
Ragstar connects to your dbt project, builds a knowledge base from models & documentation, and lets everyone ask data-related questions in plain English via a beautiful web dashboard or Slack. Under the hood Ragstar combines:
- PostgreSQL + pgvector for fast similarity search
- Embeddings + LLMs (OpenAI, Anthropic, etc.) for reasoning
- A modern Next.js frontend & Django backend
```shell
# 1. clone & prepare env file
$ git clone https://github.com/pragunbhutani/ragstar.git && cd ragstar
$ cp .env.example .env && ${EDITOR:-vi} .env  # edit just the vars shown below

# 2. build & run everything
$ docker compose up --build -d
```
When the containers are healthy:
- Frontend: http://localhost:3000 (Next.js)
- API: http://localhost:8000 (Django/DRF)
- Flower: http://localhost:5555 (background tasks)
Run first-time Django tasks:
```shell
# inside the running backend container
$ docker compose exec backend-django \
    uv run python manage.py migrate
```
That's it! Open http://localhost:3000, sign up for a new account, and you're ready to start using Ragstar.
Only a handful of variables are truly required for a local/dev install. The rest are advanced overrides.
Ragstar keeps the default stack as lightweight as possible.
For a local Docker Compose run you only need three variables; everything else has sane fallbacks.
| Var | Example | Purpose |
|---|---|---|
| `NEXTAUTH_SECRET` | `openssl rand -base64 32` | Secret used by next-auth to sign session cookies |
| `NEXTAUTH_URL` | `http://localhost:3000` | Public URL where the frontend is reachable |
| `NEXT_PUBLIC_API_URL` | `http://localhost:8000` | Public URL of the Django API exposed to the browser |

Create a `.env` file in the repo root and paste the three variables above (adjust the URLs if you changed the ports).
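If you'd rather not edit the file by hand, a minimal `.env` can be written in one step. This is a sketch that assumes the default ports and that `openssl` is available:

```shell
# Generate a signing secret and write the three required variables to .env
cat > .env <<EOF
NEXTAUTH_SECRET=$(openssl rand -base64 32)
NEXTAUTH_URL=http://localhost:3000
NEXT_PUBLIC_API_URL=http://localhost:8000
EOF
```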
| Var | Default | When you might set it |
|---|---|---|
| `INTERNAL_API_URL` | `http://backend-django:8000` | Only needed when the frontend talks to the backend across Docker networks or remote hosts. |
| `ENVIRONMENT` | `local` | Switch between `local`, `development`, and `production` behaviour inside Django settings. |
| `APP_HOST` | unset | Extra hostname to append to Django `ALLOWED_HOSTS` & CORS lists, e.g. your public Ngrok / Vercel host. |
| `DATABASE_URL` | auto-generated by Compose | Point the backend & worker at your own Postgres instance. |
| `CELERY_BROKER_URL` | `redis://redis:6379/0` | Use an external Redis / RabbitMQ instead of the bundled one. |
| `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_DEFAULT_REGION` | not set | Only required when you disable LocalStack and want to store secrets in real AWS Parameter Store. |
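For example, pointing Ragstar at your own infrastructure instead of the bundled containers might look like the following `.env` overrides. The hostnames and credentials below are hypothetical; substitute your own:

```shell
# Hypothetical external services: adjust hosts and credentials to your setup
DATABASE_URL=postgresql://ragstar:secret@db.internal.example.com:5432/ragstar
CELERY_BROKER_URL=redis://redis.internal.example.com:6379/0
ENVIRONMENT=development
```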
Provide an API key for each LLM provider you plan to use:

```shell
LLM_OPENAI_API_KEY=...
LLM_ANTHROPIC_API_KEY=...
LLM_GOOGLE_API_KEY=...
```
```shell
POSTGRES_DB=ragstar
POSTGRES_USER=user
POSTGRES_PASSWORD=password
POSTGRES_PORT=5432
```

Set a single `EXTERNAL_POSTGRES_URL=<url>` to bring your own Postgres (with the pgvector extension).
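As a sketch, a bring-your-own-Postgres override is a single line in `.env`. The host and credentials below are hypothetical, and the target database must have the pgvector extension installed:

```shell
EXTERNAL_POSTGres_URL=postgresql://ragstar:secret@db.example.com:5432/ragstar
```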
After logging into the dashboard you'll be guided through these steps:
- Add a dbt project: Projects › New (dbt Cloud recommended; just paste the service token). GitHub or local zip upload is also supported.
- Pick the dbt models you want to use for answering questions, then choose which LLM to use for questions, embeddings, and SQL verification.
- Configure Slack (optional)
- Go to Integrations › Slack.
- Follow the inline manifest to create a Slack app.
- Paste Bot Token, Signing Secret, App Token.
- Ask questions! Use the chat on the dashboard or `/ask` in Slack.
Other integrations (Metabase, Snowflake, MCP) are available under Integrations but are currently beta / experimental. The MCP server is temporarily disabled while we stabilise streaming support.
Common operations are wrapped in one-liners:
```shell
# shell into backend or frontend
$ docker compose exec backend-django bash
$ docker compose exec frontend-nextjs sh

# tail logs
$ docker compose logs -f backend-django

# stop / remove containers
$ docker compose down     # keep volumes
$ docker compose down -v  # destroy DB
```
- Install Python 3.10+, Node 18+, `uv`, `pnpm`, and Postgres 16+ with pgvector.
- Install the dependencies:

```shell
$ uv venv && source .venv/bin/activate && uv pip install -e backend_django/
$ pnpm install --filter frontend_nextjs
```

- Export the same env vars listed above.
- Start services in two terminals:
  - Backend: `cd backend_django && uv run python manage.py runserver 0.0.0.0:8000`
  - Frontend: `cd frontend_nextjs && pnpm dev`
Docker is strongly recommended unless you're hacking on the codebase itself.
We love community PRs. Please file an issue first for major changes. Make sure `ruff`, `black`, `mypy`, and `eslint` pass before opening a pull request.
Ragstar includes a Model Context Protocol (MCP) server that allows LLM clients like Claude.ai to directly access your dbt knowledge base. The MCP server provides secure, OAuth-authenticated access to your dbt models and project information.
⚠️ Important: The MCP server can only be used with self-hosted/open source deployments of Ragstar. This is due to the 1:1 relationship between an MCP client and server: each client needs its own dedicated server instance.
The MCP server acts as a bridge between LLM clients (like Claude.ai) and your Ragstar knowledge base. It provides:
- Secure OAuth 2.0 authentication with PKCE
- Organization-scoped access to dbt models and projects
- Real-time data access without manual imports
- Semantic search capabilities across your dbt documentation
- Detailed model information including SQL, lineage, and metadata
The MCP server exposes these tools to LLM clients:
- `list_dbt_models`: Browse and filter dbt models by project, schema, or materialization
- `search_dbt_models`: Semantic search for relevant models using natural language
- `get_model_details`: Detailed information about a specific model, including SQL and lineage
- `get_project_summary`: Overview of connected dbt projects and their structure
- Self-hosted Ragstar running with Docker Compose
- dbt project connected and models loaded in your knowledge base
- LLM client that supports MCP (e.g., Claude.ai, ChatGPT with MCP support)
The MCP server runs on port `8080` by default. Add these environment variables to your `.env` file:
```shell
# MCP Server Configuration
MCP_AUTHORIZATION_BASE_URL=http://localhost:8000  # Your Django backend URL
DJANGO_BACKEND_URL=http://localhost:8000          # Backend URL for MCP server
ALLOWED_ORIGINS=*                                 # Or specific origins for security
```
The MCP server is included in the Docker Compose stack:
```shell
# Start all services including MCP server
$ docker compose up -d

# Check MCP server health
$ curl http://localhost:8080/health
```
The MCP server implements a complete OAuth 2.0 authorization flow with organization-scoped access. Here's how it works:
```mermaid
sequenceDiagram
    participant Claude as Claude.ai
    participant MCP as MCP Server
    participant Django as Django Backend
    participant User as User Browser

    Claude->>MCP: 1. Request OAuth metadata
    MCP->>Django: 2. Proxy metadata request
    Django->>MCP: 3. Return OAuth configuration
    MCP->>Claude: 4. OAuth server metadata
    Claude->>MCP: 5. Authorization request (with PKCE)
    MCP->>Django: 6. Proxy authorization request
    Django->>User: 7. Redirect to frontend login
    User->>Django: 8. Authenticate via Next.js
    Django->>MCP: 9. Authorization code
    MCP->>Claude: 10. Return authorization code
    Claude->>MCP: 11. Exchange code for tokens
    MCP->>Django: 12. Validate and exchange
    Django->>MCP: 13. Access & refresh tokens
    MCP->>Claude: 14. JWT tokens
    Claude->>MCP: 15. API calls with Bearer token
    MCP->>Django: 16. Validate token & get user context
    Django->>MCP: 17. Organization-scoped data
    MCP->>Claude: 18. Formatted response
```
- OAuth Metadata Discovery: MCP server exposes RFC 8414 compliant metadata endpoints
- PKCE Security: Uses Proof Key for Code Exchange for enhanced security
- Auto-Registration: Automatically registers new OAuth clients (like Claude.ai)
- Organization Scoping: All data access is automatically scoped to the user's organization
- JWT Tokens: Secure, stateless authentication using JSON Web Tokens
- PKCE (Proof Key for Code Exchange): Prevents authorization code interception attacks
- Organization Isolation: Users can only access their organization's data
- Token Validation: All API calls require valid JWT tokens
- Automatic Expiry: Access tokens expire after 1 hour, refresh tokens after 7 days
- Go to Claude.ai settings
- Find the "MCP Servers" or "Model Context Protocol" section
- Add a new server with these details:
  - Server URL: `http://localhost:8080` (or your public URL)
  - OAuth: Enable OAuth 2.0 authentication
  - Auto-discovery: Enable to automatically discover capabilities
- Claude.ai will redirect you to your Ragstar login page
- Sign in with your Ragstar account
- You'll be redirected back to Claude.ai with authorization
- The connection will be linked to your organization
You can now ask Claude.ai to:
- "List all my dbt models"
- "Search for revenue-related models"
- "Show me details about the customer_metrics model"
- "What dbt projects do I have connected?"
```text
You: "What dbt models do I have available?"
Claude: [Calls list_dbt_models tool]
Claude: "You have 23 dbt models across 2 projects:
- analytics_prod: 15 models (staging, marts schemas)
- marketing_analytics: 8 models (staging, reporting schemas)
..."
```

```text
You: "Find models related to customer revenue"
Claude: [Calls search_dbt_models with "customer revenue"]
Claude: "I found 3 relevant models:
- customer_revenue_monthly (similarity: 0.95)
- customer_ltv_calculation (similarity: 0.87)
- revenue_attribution (similarity: 0.82)
..."
```

```text
You: "Show me the SQL for customer_revenue_monthly"
Claude: [Calls get_model_details for "customer_revenue_monthly"]
Claude: "Here's the customer_revenue_monthly model:
- Schema: marts
- Materialization: table
- SQL: SELECT customer_id, DATE_TRUNC('month', order_date) as month, SUM(amount) as revenue FROM..."
```
- Authentication Fails: Check that your Ragstar backend is accessible and you're logged in
- No Models Found: Ensure your dbt project is connected and models are loaded
- Permission Errors: Verify the user has access to the organization's dbt projects
- Network Issues: Check firewall settings and port accessibility
```shell
# Check MCP server health
$ curl http://localhost:8080/health

# Test OAuth metadata
$ curl http://localhost:8080/.well-known/oauth-authorization-server

# View MCP server logs
$ docker compose logs -f mcp-server
```
For production use:
- Use HTTPS: Configure SSL certificates for secure connections
- Restrict Origins: Set specific allowed origins instead of `*`
- Monitor Usage: Track OAuth token usage and API calls
- Scale Considerations: Each client needs its own server instance
- 1:1 Client-Server Relationship: Each MCP client needs its own server instance
- Organization Scoping: Users can only access their own organization's data
- Token Expiry: Access tokens expire after 1 hour (refresh tokens after 7 days)
- Self-Hosted Only: Not available in hosted/SaaS deployments
Ragstar is released under the MIT License; see LICENSE.