Intelligent Document Processing with RAG-Powered Conversations
Nimbus is your Document Mind - a sophisticated AI system that reads, understands, and converses about your documents using advanced RAG (Retrieval-Augmented Generation) technology. Transform any collection of documents into an intelligent knowledge base that you can chat with naturally. Built with Flask, PostgreSQL with pgvector, and Ollama for local LLM inference.
Nimbus doesn't just store your documents; it understands them. Ask questions in natural language and get answers grounded in your actual content, with full source citations.
- Multi-Model Embeddings: Support for multiple embedding models simultaneously (nomic-embed-text, mxbai-embed-large, all-minilm)
- Intelligent Retrieval: Query multiple embedding models and merge results with deduplication
- Context-Aware Responses: The LLM answers only from your documents, reducing hallucinations
- Source Citations: Track which documents were used to generate each answer
- Multiple Parsers:
  - PyMuPDF (fast, standard PDFs)
  - PDFPlumber (tables and structured data)
  - Unstructured (advanced layout detection)
  - OCR Parser (scanned documents, images)
  - OCR + Vision (AI-powered image description using LLaVA)
- Smart Text Splitting:
  - Recursive Character Splitter (balanced chunks)
  - Token-based Splitter (LLM-optimized)
  - Semantic Splitter (embedding-based, natural boundaries)
- Session Management: Organize conversations by topic
- Persistent History: All chats saved to database
- Multi-Model Support: Switch between different LLMs
- Conversation History: Maintains context across messages
- Sidebar Navigation: Quick access to all chat sessions
- Role-based access control (Admin/User)
- Secure password hashing with bcrypt
- User creation, deletion, and password management
- Per-user document isolation
- Custom Nimbus branding with professional logos
- Responsive Bootstrap 5 design
- Dark/light theme support
- Real-time status updates
- Drag-and-drop file upload
- Document preview functionality
Nimbus transforms your documents into an intelligent, searchable knowledge base:
- 📄 Ingestion: Upload documents in various formats (PDF, DOCX, TXT, etc.)
- 🔍 Understanding: Advanced parsers extract text, including OCR for scanned documents
- ✂️ Chunking: Smart text splitting creates semantically meaningful pieces
- 🧭 Vectorization: Multiple embedding models create rich vector representations
- 💬 Conversation: Chat naturally - Nimbus retrieves relevant information and responds intelligently
- 📋 Citation: Every answer includes source references to maintain trust and accuracy
🧠 Nimbus Document Mind
```
┌─────────────────┐
│   Web Browser   │
└────────┬────────┘
         │
┌────────▼────────────────────────────────────┐
│     Flask Application (Document Mind)       │
│  ┌──────────┬──────────┬──────────────┐     │
│  │   Chat   │Documents │    Users     │     │
│  │ Blueprint│Blueprint │  Blueprint   │     │
│  └──────────┴──────────┴──────────────┘     │
└────────┬────────────────────────────────────┘
         │
    ┌────┴────┐
    │         │
┌───▼──┐  ┌──▼──────────┐
│Ollama│  │ PostgreSQL  │
│ LLMs │  │ + pgvector  │
└──────┘  └─────────────┘
💭 AI Mind  🧠 Memory Bank
```
Tech Stack:
- Backend: Flask (Python)
- Database: PostgreSQL 16 + pgvector extension
- LLM/Embeddings: Ollama (local inference)
- Document Processing: PyMuPDF, PDFPlumber, Tesseract OCR, Pillow
- Text Splitting: LangChain, custom implementations
- Frontend: Bootstrap 5, vanilla JavaScript
- Containerization: Docker & Docker Compose
- Docker and Docker Compose installed
- Python 3.12+ (if running without Docker)
- Ollama installed and running with desired models
- Clone the repository
```bash
git clone <your-repo-url>
cd nimbus
```
- Choose your deployment method
Option A: Full Docker Deployment (Recommended for Production)
```bash
# Starts Nimbus app + PostgreSQL + Ollama
docker compose up -d
```
Option B: Development Setup (Local Python + Docker Services)
```bash
# Only starts PostgreSQL + Ollama; run Nimbus locally
docker compose -f docker-compose.dev.yml up -d
```
- If using Option B (Development), set up Python environment
```bash
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
pip install -r requirements.txt
```
- Configure environment variables (Optional)
Create a `.env` file in the project root:
```
# Flask Configuration
FLASK_SECRET_KEY=your-secure-secret-key-here
FLASK_ENV=development
FLASK_DEBUG=true
APP_HOST=0.0.0.0
APP_PORT=8000

# Database
DATABASE_URL=postgresql://postgres:postgres@localhost:5432/nimbus

# Ollama
OLLAMA_URL=http://localhost:11434

# Default Models
DEFAULT_EMBEDDING_MODEL=nomic-embed-text
```
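For reference, here is a minimal sketch of how `config.py` might load these variables, assuming `python-dotenv` is used; the exact loading code and defaults in Nimbus may differ:

```python
# Hypothetical sketch: read .env values with python-dotenv.
# Variable names match the .env example above; defaults are assumptions.
import os

from dotenv import load_dotenv

load_dotenv()  # reads .env from the project root

DATABASE_URL = os.getenv(
    "DATABASE_URL", "postgresql://postgres:postgres@localhost:5432/nimbus")
OLLAMA_URL = os.getenv("OLLAMA_URL", "http://localhost:11434")
DEFAULT_EMBEDDING_MODEL = os.getenv("DEFAULT_EMBEDDING_MODEL", "nomic-embed-text")
```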
- Run the application (Development mode only)
```bash
# Only needed for Option B
python app.py
```
- Access the application
Open your browser and navigate to `http://localhost:8000`.
Default credentials:
- Username: `admin`
- Password: `admin123`
⚠️ Security Note: Change the default password immediately after first login!
Screenshots:
- Nimbus login screen with professional branding
- Main dashboard showing the Document Mind interface
- Easy drag-and-drop document upload interface
- View and manage your uploaded documents
- Choose the appropriate parser for your document type
- Generate embeddings with different models
- Monitor document processing progress
- Natural language chat with your documents
📝 Note: For detailed usage instructions with step-by-step screenshots, see the Usage Guide.
Navigate to the Documents page:
- Click "Upload Document" or drag & drop files
- Supported formats: PDF, TXT, MD, DOCX, PPTX
- Files are associated with your user account
Choose a parser based on your document type:
- PyMuPDF: Best for standard PDFs with selectable text
- PDFPlumber: Excellent for tables and structured data
- Unstructured: Advanced layout analysis
- OCR: For scanned documents or images
- OCR + Vision: Combines text extraction with AI image description
Select a splitting strategy:
- Recursive: Balanced chunks with configurable size/overlap
- Token-based: Optimized for LLM token limits
- Semantic: Uses embeddings to find natural breakpoints
Choose embedding models to create:
- `nomic-embed-text`: Fast, efficient
- `mxbai-embed-large`: High accuracy
- `all-minilm`: Compact, good for large datasets
💡 Tip: Generate multiple embedding models for better retrieval!
Toggle a document's "enabled" switch to include it in the RAG context
Go to the Chat page:
- Select a chat model (e.g., `llama3.2`, `qwen2.5`)
- Start asking questions about your documents
- The AI will retrieve relevant chunks and cite sources (see the sketch after this list)
- Create multiple sessions to organize conversations
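To show how retrieved chunks and source citations come together, here is a minimal, hypothetical sketch of a grounded prompt builder (the wording and structure are illustrative, not Nimbus's actual template):

```python
# Hypothetical sketch: assemble retrieved chunks into a grounded prompt.
def build_prompt(question: str, chunks: list[dict]) -> str:
    """chunks: [{'content': ..., 'source': ...}, ...] from retrieval."""
    context = "\n\n".join(
        f"[{i + 1}] (source: {c['source']})\n{c['content']}"
        for i, c in enumerate(chunks)
    )
    return (
        "Answer using ONLY the context below. If the answer is not in the "
        "context, say you don't know. Cite sources as [n].\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
```

Instructing the model to answer only from the numbered context is what keeps responses grounded and citable.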
All configuration is centralized in `config.py`. Key settings:
```python
RAG_TOP_K_PER_MODEL = 5      # Top chunks per embedding model
RAG_TOP_K_OVERALL = 10       # Total chunks to include in context
RAG_SNIPPET_MAX_CHARS = 800  # Max characters per snippet
DEFAULT_CHUNK_SIZE = 1000    # Characters per chunk
DEFAULT_CHUNK_OVERLAP = 200  # Overlap between chunks
```
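With these defaults, the context sent to the LLM is capped at roughly `RAG_TOP_K_OVERALL` × `RAG_SNIPPET_MAX_CHARS` = 10 × 800 = 8,000 characters, which fits comfortably within typical LLM context windows.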
Define which embedding tables to query for each chat model:
```python
MODEL_EMBEDDING_TABLE_MAP = {
    'llama3:latest': [
        {'table': 'document_embeddings_nomic_embed_text', 'embedding_model': 'nomic-embed-text'},
        {'table': 'document_embeddings_mxbai_embed_large', 'embedding_model': 'mxbai-embed-large'},
    ]
}
```
Full Stack Deployment:
```bash
# Complete deployment with all services
docker compose up -d
```
Development Setup:
```bash
# Only database and Ollama (run Nimbus locally)
docker compose -f docker-compose.dev.yml up -d
python app.py
```
What's Included:
- 🐘 PostgreSQL with pgvector - Vector database for embeddings
- 🤖 Ollama - Local LLM inference server
- 🧠 Nimbus App - Document Mind application (full deployment only)
- 🗄️ Persistent volumes - Data survives container restarts
- 🌐 Internal networking - Services communicate securely
Database initialization scripts in `db/init/`:
- `01_init.sql`: Creates the users table, pgvector extension, and default admin user
- `02_chat_tables.sql`: Creates chat sessions and messages tables with triggers
See DEPLOYMENT.md for detailed deployment options and production setup.
Nimbus queries multiple embedding models simultaneously and intelligently merges the results (sketched below):
- Computes embeddings for user query with each configured model
- Retrieves top-K chunks from each embedding table
- Deduplicates based on content
- Ranks by similarity score
- Sends top results to LLM as context
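A minimal sketch of this retrieve-and-merge step, assuming `requests`, `psycopg2`, Ollama's `/api/embeddings` endpoint, and pgvector's cosine-distance operator `<=>`; the `tables` argument mirrors one entry of `MODEL_EMBEDDING_TABLE_MAP`, while column names are illustrative rather than Nimbus's actual schema:

```python
# Hypothetical sketch of multi-model retrieval; schema and helper names
# are assumptions, not Nimbus's actual code.
import requests

OLLAMA_URL = "http://localhost:11434"

def embed(text: str, model: str) -> list[float]:
    """Fetch an embedding vector for `text` from Ollama."""
    resp = requests.post(f"{OLLAMA_URL}/api/embeddings",
                         json={"model": model, "prompt": text})
    resp.raise_for_status()
    return resp.json()["embedding"]

def retrieve(conn, query: str, tables: list[dict],
             top_k_per_model: int = 5, top_k_overall: int = 10) -> list[str]:
    """Query each embedding table, deduplicate by content, rank by distance."""
    candidates: dict[str, float] = {}
    for cfg in tables:  # e.g. one entry of MODEL_EMBEDDING_TABLE_MAP
        vec = embed(query, cfg["embedding_model"])
        literal = "[" + ",".join(map(str, vec)) + "]"  # pgvector text format
        with conn.cursor() as cur:
            # <=> is pgvector's cosine-distance operator (smaller = closer);
            # the table name comes from trusted config, not user input.
            cur.execute(
                f"SELECT content, embedding <=> %s::vector AS dist "
                f"FROM {cfg['table']} ORDER BY dist LIMIT %s",
                (literal, top_k_per_model),
            )
            for content, dist in cur.fetchall():
                # Deduplicate on content, keeping the best distance seen
                if content not in candidates or dist < candidates[content]:
                    candidates[content] = dist
    ranked = sorted(candidates.items(), key=lambda kv: kv[1])
    return [content for content, _ in ranked[:top_k_overall]]
```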
For image-heavy or scanned documents, the OCR pipeline (sketched below):
- Converts PDF pages to images (300 DPI for OCR)
- Extracts text using Tesseract OCR
- Optionally uses LLaVA vision model to describe images
- Combines textual and visual information
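A minimal sketch of this path, assuming PyMuPDF (`fitz`), `pytesseract`, and Pillow are installed; the optional LLaVA description step is omitted:

```python
# Hypothetical sketch of the OCR path; function names are illustrative.
import io

import fitz  # PyMuPDF
import pytesseract
from PIL import Image

def ocr_pdf(path: str, dpi: int = 300) -> str:
    """Rasterize each PDF page at `dpi` and extract text with Tesseract."""
    doc = fitz.open(path)
    pages = []
    for page in doc:
        pix = page.get_pixmap(dpi=dpi)  # render the page to a raster image
        img = Image.open(io.BytesIO(pix.tobytes("png")))
        pages.append(pytesseract.image_to_string(img))
    return "\n\n".join(pages)
```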
The semantic splitter uses embeddings to find natural boundaries (illustrated below):
- Calculates similarity between consecutive sentences
- Splits at points where similarity drops (semantic shift)
- Creates more coherent chunks than arbitrary character counts
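A minimal sketch of the idea, again using Ollama embeddings; the sentence regex and the 0.75 similarity threshold are illustrative assumptions:

```python
# Hypothetical sketch of semantic splitting; threshold and sentence
# detection are assumptions, not Nimbus's actual implementation.
import math
import re

import requests

def embed(text: str, model: str = "nomic-embed-text") -> list[float]:
    resp = requests.post("http://localhost:11434/api/embeddings",
                         json={"model": model, "prompt": text})
    resp.raise_for_status()
    return resp.json()["embedding"]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def semantic_split(text: str, threshold: float = 0.75) -> list[str]:
    """Start a new chunk wherever consecutive sentences diverge semantically."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    vectors = [embed(s) for s in sentences]
    chunks, current = [], [sentences[0]]
    for prev, cur, sent in zip(vectors, vectors[1:], sentences[1:]):
        if cosine(prev, cur) < threshold:  # similarity drop = semantic shift
            chunks.append(" ".join(current))
            current = []
        current.append(sent)
    chunks.append(" ".join(current))
    return chunks
```

A higher threshold splits more aggressively; a lower one yields fewer, larger chunks.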
```
nimbus/
├── app.py                # Main Flask application
├── config.py             # Centralized configuration
├── requirements.txt      # Python dependencies
├── docker-compose.yml    # Docker setup
│
├── apps/                 # Modular blueprints
│   ├── chat/             # Chat interface & RAG logic
│   ├── documents/        # Document management
│   │   ├── parsers/      # PDF/OCR parsers
│   │   └── splitters/    # Text splitting strategies
│   └── users/            # User management
│
├── db/                   # Database
│   ├── init/             # SQL initialization scripts
│   └── chat.db           # SQLite (if using)
│
├── templates/            # HTML templates
├── static/               # CSS, JS, images
└── uploads/              # User-uploaded files
```
Contributions are welcome! Here's how you can help:
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
Please ensure your code follows the existing style and includes appropriate documentation.
- Hybrid Retrieval + Re-ranker: Combine dense and sparse retrieval with sophisticated re-ranking algorithms
- ANN Indexing for pgvector: Implement HNSW/IVFFlat indexing for faster similarity search at scale
- Semantic Caching & Query Expansion: Cache embeddings and expand queries for better retrieval coverage
- Security Hardening: Move beyond default credentials with OAuth, RBAC, API keys, and audit logging
- Async Ingestion Pipeline: Background processing with job queues for large document batches
- Advanced Analytics Dashboard: Usage metrics, performance monitoring, and system insights
- More File Types + Table Extraction: Excel, CSV, HTML, PowerPoint with advanced table parsing
- Multi-language Support: International document processing and multilingual embeddings
- Batch Document Processing: Efficient handling of large document collections
- API Endpoints: RESTful API for programmatic access and third-party integrations
- Export Chat Conversations: Export functionality for conversations and knowledge artifacts
- Docker Image: Complete containerized application for easy deployment
- Advanced User Management: Organizations, teams, and granular permissions
- Document Versioning: Track changes and maintain document history
- Audit Trails: Complete logging for compliance and monitoring
- Custom Model Integration: Support for private/custom LLM and embedding models
- Default admin login: `admin` / `admin123`
- See SECURITY.md for complete security guidelines
- Follow the security checklist before going live
For security issues, please report responsibly to maintainers directly.
This project is licensed under the MIT License - see the LICENSE file for details.
- Ollama for local LLM inference
- pgvector for PostgreSQL vector extension
- LangChain for semantic splitting utilities
- Bootstrap for UI components
- Tesseract OCR for text extraction
For questions, issues, or feature requests:
- Open an issue on GitLab
- Check the existing documentation in the `/docs` folder
- Review the configuration guide in `CONFIGURATION_GUIDE.md`
Built with ❤️ to be your intelligent Document Mind - transforming how you interact with knowledge 🧠📄