Agentic-RAG architecture system that replicates a mathematical professor.


🎓 Math Professor AI - Agentic RAG System


An intelligent AI-powered mathematics tutoring platform built with Retrieval-Augmented Generation (RAG), Human-in-the-Loop Learning, and Multi-Agent Architecture.

Features • Demo • Installation • Usage • Architecture • Documentation




🌟 Overview

Math Professor AI is a production-ready educational platform that provides intelligent, context-aware mathematical tutoring. The system combines:

  • 🧠 RAG (Retrieval-Augmented Generation) with 9,000+ verified math problems
  • 🔒 Dual-Layer Guardrails for safety and privacy protection
  • 🌐 MCP (Model Context Protocol) for intelligent web search
  • 🔄 Human-in-the-Loop Learning for continuous improvement
  • ⚡ Full-Stack Application with FastAPI + React

Why This Project?

Traditional math tutoring systems are either:

  • ❌ Static (no learning from feedback)
  • ❌ Unsafe (no content filtering)
  • ❌ Limited (small knowledge base)
  • ❌ Unreliable (hallucinations without grounding)

Math Professor AI solves all of these:

  • ✅ Learns immediately from user corrections
  • ✅ Protected by input/output guardrails
  • ✅ Grounded in 9,000+ verified problems (MATH-500 + GSM8K)
  • ✅ Web search fallback via MCP for current topics

✨ Features

Core Capabilities

  • 📚 Knowledge Base Retrieval

    • MATH-500 dataset (competition-level problems)
    • GSM8K dataset (grade school problems)
    • FAISS vector store for fast similarity search
    • HuggingFace embeddings
  • 🌐 Web Search Integration

    • Model Context Protocol (MCP/1.0) compliant
    • Tavily API for current information
    • Intelligent routing (Knowledge Base → Web Search)
    • Source disclaimers for web-sourced content
  • 🛡️ Safety Guardrails

    • Input Guardrails: Block PII, harmful content, off-topic queries
    • Output Guardrails: Validate educational quality and safety
    • Powered by Guardrails AI framework
  • 🔄 Human-in-the-Loop Learning

    • Immediate feedback collection (ratings + corrections)
    • 100% reuse rate for corrected answers
    • Few-shot learning from past corrections
    • JSONL storage for feedback
  • 💻 Production-Ready Application

    • FastAPI backend with automatic API docs
    • React 18 frontend with TailwindCSS
    • Real-time markdown rendering with LaTeX support
    • One-click launcher for easy deployment
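
The Human-in-the-Loop bullets above mention JSONL feedback storage and reuse of corrected answers. The following is a minimal sketch of how that could work, not the project's actual code. The record fields follow the /feedback request shown in this README; the function names and the whitespace/case normalization rule are assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


def normalize(question: str) -> str:
    """Lowercase and collapse whitespace so trivially different phrasings match."""
    return " ".join(question.lower().split())


def save_feedback(path: Path, record: dict) -> None:
    """Append one feedback record as a single JSON line."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")


def find_correction(path: Path, question: str) -> Optional[str]:
    """Return the most recent human-corrected answer for this question, if any."""
    if not path.exists():
        return None
    target = normalize(question)
    latest = None
    with path.open(encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue
            record = json.loads(line)
            same_question = normalize(record.get("question", "")) == target
            if same_question and record.get("improved_response"):
                latest = record["improved_response"]
    return latest


# Demo against a throwaway file (hypothetical path, not the real feedback store).
demo_path = Path(tempfile.mkdtemp()) / "feedback.jsonl"
save_feedback(demo_path, {
    "question": "Solve for x: 2x + 5 = 15",
    "response": "x = 10",
    "rating": 1,
    "comments": "Forgot to divide by 2",
    "improved_response": "x = 5",
})
print(find_correction(demo_path, "solve for x:  2x + 5 = 15"))
```

A lookup like this could run before retrieval, so that a stored correction is returned directly or passed to the LLM as a few-shot example.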

🔧 Technology Stack

Backend

  • Framework: FastAPI (Python 3.12)
  • LLM: Groq (Llama 3.1 70B)
  • Vector Store: FAISS
  • Embeddings: HuggingFace Transformers
  • Web Search: Tavily API (via MCP)
  • Guardrails: Guardrails AI

Frontend

  • Framework: React 18
  • Build Tool: Vite
  • Styling: TailwindCSS
  • Markdown: React-Markdown + remark-gfm
  • Icons: React Icons

Datasets

  • MATH-500: 500 competition-level problems
  • GSM8K: 8,500+ grade school problems
  • JEEBench: 236 JEE Advanced problems (for testing)

πŸ—οΈ Architecture

┌─────────────┐
│   User      │
└──────┬──────┘
       │
       ▼
┌─────────────────────────────────────┐
│   React Frontend (Port 5173)        │
│   - Question Input                  │
│   - Solution Display (Markdown)     │
│   - Feedback Collection (★★★★★)     │
└──────────────┬──────────────────────┘
               │ REST API
               ▼
┌─────────────────────────────────────┐
│   FastAPI Backend (Port 8000)       │
│   - /ask (POST)                     │
│   - /feedback (POST)                │
└──────────────┬──────────────────────┘
               │
               ▼
┌─────────────────────────────────────┐
│   Input Guardrails                  │
│   - PII Detection                   │
│   - Content Safety                  │
└──────────────┬──────────────────────┘
               │
        ┌──────┴──────┐
        ▼             ▼
┌──────────────┐  ┌──────────────┐
│ Knowledge    │  │ Web Search   │
│ Base (FAISS) │  │ (MCP/Tavily) │
│ 9,000+ docs  │  │ Current info │
└──────┬───────┘  └───────┬──────┘
       │                  │
       └────────┬─────────┘
                ▼
        ┌───────────────┐
        │  LLM (Groq)   │
        │  Generation   │
        └───────┬───────┘
                │
                ▼
        ┌───────────────┐
        │ Output        │
        │ Guardrails    │
        └───────┬───────┘
                │
                ▼
        ┌───────────────┐
        │ Response to   │
        │ User          │
        └───────────────┘
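
The flow in the diagram can be sketched as one function. Everything below is a stand-in chosen for illustration: retrieval uses a toy word-overlap (Jaccard) score instead of FAISS embeddings, the 0.3 threshold is an assumed value, and `input_ok`, `web_search`, and `generate` are hypothetical callables standing in for the guardrails, Tavily, and Groq. Only the `knowledge_base` source label appears in this README; `web_search` and `blocked` are assumed labels.

```python
import re
from typing import Callable, Dict, Tuple

# Toy knowledge base; the real system indexes MATH-500 and GSM8K.
KNOWLEDGE_BASE = [
    "Solve for x: 2x + 5 = 15. Subtract 5, then divide by 2, so x = 5.",
    "Solve x^2 + 5x + 6 = 0. Factor as (x + 2)(x + 3), so x = -2 or x = -3.",
]


def tokens(text: str) -> set:
    """Split text into a set of lowercase word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(question: str) -> Tuple[str, float]:
    """Return the best-matching document and its Jaccard overlap score."""
    q = tokens(question)
    best_doc, best_score = "", 0.0
    for doc in KNOWLEDGE_BASE:
        d = tokens(doc)
        score = len(q & d) / len(q | d) if (q | d) else 0.0
        if score > best_score:
            best_doc, best_score = doc, score
    return best_doc, best_score


def answer(
    question: str,
    input_ok: Callable[[str], bool],
    web_search: Callable[[str], str],
    generate: Callable[[str, str], str],
    threshold: float = 0.3,
) -> Dict[str, str]:
    """Input guardrail, then knowledge base or web search, then generation."""
    if not input_ok(question):
        return {"solution": "Please avoid sharing personal information.",
                "context_source": "blocked"}
    doc, score = retrieve(question)
    if score >= threshold:
        context, source = doc, "knowledge_base"
    else:
        context, source = web_search(question), "web_search"
    solution = generate(question, context)
    if not solution.strip():  # placeholder for the output guardrail
        solution = "No answer could be generated."
    return {"solution": solution, "context_source": source}


# Stub components so the sketch runs without any external service.
def demo_input_ok(q: str) -> bool:
    return "credit card" not in q.lower()


def demo_web_search(q: str) -> str:
    return "(web results for: " + q + ")"


def demo_generate(q: str, context: str) -> str:
    return "Answer grounded in: " + context


for q in ("Solve for x: 2x + 5 = 15",
          "What is the latest Fields Medal winner's contribution?",
          "My credit card number is 1234-5678-9012-3456"):
    print(answer(q, demo_input_ok, demo_web_search, demo_generate)["context_source"])
```

Passing the components in as callables keeps the routing decision testable on its own, independent of any API key.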

Key Design Decisions

  1. RAG over Fine-tuning: Faster updates, no retraining needed
  2. MCP Standard: Industry protocol for tool integration
  3. Dual Guardrails: Safety at both input and output
  4. HITL Learning: Immediate adaptation without model retraining
  5. FastAPI: Async performance, auto-generated docs

📦 Installation

Prerequisites

  • Python 3.12+
  • Node.js 18+
  • Conda (recommended) or venv
  • Git

Step 1: Clone Repository

git clone https://github.com/Akashchatterj/Rag-Math-Professor.git
cd Rag-Math-Professor

Step 2: Backend Setup

# Create conda environment
conda create -n mathprofessor python=3.12
conda activate mathprofessor

# Install dependencies
cd backend
pip install -r requirements.txt

Step 3: Configure API Keys

Create backend/.env file:

GROQ_API_KEY=your_groq_api_key_here
TAVILY_API_KEY=your_tavily_api_key_here
OPENAI_API_KEY=your_openai_api_key_here  # Optional for embeddings
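
For illustration, a startup check for these variables might look like the sketch below. It assumes the values are already present in the process environment (for example, loaded from backend/.env by whatever loader the backend uses); the function name and the fail-fast behaviour are hypothetical, not taken from the project.

```python
import os
from typing import Dict, Mapping

REQUIRED_KEYS = ("GROQ_API_KEY", "TAVILY_API_KEY")
OPTIONAL_KEYS = ("OPENAI_API_KEY",)


def load_api_keys(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return the configured keys, failing fast if a required one is missing."""
    missing = [k for k in REQUIRED_KEYS if not env.get(k)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    keys = {k: env[k] for k in REQUIRED_KEYS}
    # Optional keys are included only when they are set to a non-empty value.
    keys.update({k: env[k] for k in OPTIONAL_KEYS if env.get(k)})
    return keys


# Demo with an explicit mapping instead of the real environment.
print(sorted(load_api_keys({"GROQ_API_KEY": "g", "TAVILY_API_KEY": "t"})))
```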

Get API Keys: obtain each key from the corresponding provider (Groq, Tavily, and optionally OpenAI).

Step 4: Frontend Setup

# Navigate to frontend
cd ../frontend

# Install dependencies
npm install

🚀 Usage

Option 1: One-Click Launcher (Windows)

# From project root
run_app.bat

This will:

  1. Activate conda environment
  2. Start FastAPI backend (port 8000)
  3. Start React frontend (port 5173)
  4. Open both in separate terminals

Option 2: Manual Launch

Terminal 1 - Backend:

conda activate mathprofessor
cd backend
uvicorn main:app --reload

Terminal 2 - Frontend:

cd frontend
npm run dev

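Access the Application

  • Frontend: http://localhost:5173
  • Backend API: http://127.0.0.1:8000
  • API docs: http://127.0.0.1:8000/docs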


📖 Usage Examples

Example 1: Basic Algebra (Knowledge Base)

Question:

Solve for x: 2x + 5 = 15

Response:

Step 1: Subtract 5 from both sides
2x = 15 - 5
2x = 10

Step 2: Divide both sides by 2
x = 10/2
x = 5

Final Answer: x = 5

✅ Source: Knowledge Base (high reliability)

Example 2: Recent Information (Web Search)

Question:

What is the latest Fields Medal winner's contribution?

Response:

⚠️ Based on web search results. Please verify independently.

According to recent sources, [answer with citations]...

🌐 Source: Web Search via MCP

Example 3: Guardrails Protection

Question:

My credit card number is 1234-5678-9012-3456

Response:

🚫 Please avoid sharing personal information.

🛡️ Blocked by input guardrails
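
The project relies on the Guardrails AI framework for this check. Purely to illustrate the idea behind Example 3, here is a regex-based sketch that does not use the Guardrails AI API. The two patterns and the function name are assumptions and would miss many real-world PII formats.

```python
import re
from typing import Optional

# Deliberately simple patterns; real PII detection needs far broader coverage.
PII_PATTERNS = {
    "credit card number": re.compile(r"\b(?:\d{4}[- ]?){3}\d{4}\b"),
    "email address": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}


def check_input(question: str) -> Optional[str]:
    """Return a refusal message if the text looks like it contains PII, else None."""
    for label, pattern in PII_PATTERNS.items():
        if pattern.search(question):
            return f"Please avoid sharing personal information ({label} detected)."
    return None


print(check_input("My credit card number is 1234-5678-9012-3456"))
print(check_input("Solve for x: 2x + 5 = 15"))
```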


🔌 API Documentation

Endpoints

POST /ask

Submit a math question.

Request:

{
  "question": "Solve x^2 + 5x + 6 = 0"
}

Response:

{
  "solution": "Step-by-step solution here...",
  "context_source": "knowledge_base"
}
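
A small client for this endpoint, using only the Python standard library, might look like the sketch below. It assumes the backend is reachable at http://127.0.0.1:8000 (the address used for the docs page in this README); `build_ask_request` and `ask` are hypothetical helper names.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # assumed local backend address


def build_ask_request(question: str) -> urllib.request.Request:
    """Build the POST /ask request without sending it."""
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/ask",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str, timeout: float = 30.0) -> dict:
    """Send the question and return the decoded JSON response."""
    request = build_ask_request(question)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Building the request needs no running server; calling ask() does.
request = build_ask_request("Solve x^2 + 5x + 6 = 0")
print(request.get_method(), request.full_url)
```

With the backend running, `ask("Solve x^2 + 5x + 6 = 0")` should return a dictionary with the `solution` and `context_source` fields shown above.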

POST /feedback

Submit user feedback.

Request:

{
  "question": "Original question",
  "response": "System response",
  "rating": 5,
  "comments": "Great explanation!",
  "improved_response": "Optional correction"
}

Response:

{
  "success": true,
  "message": "Feedback saved"
}
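
Before posting feedback, a client could validate the payload locally. The sketch below mirrors the request fields above; the 1 to 5 rating range is an assumption based on the five-star widget in the frontend, and the `Feedback` class is hypothetical rather than the backend's actual model.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class Feedback:
    question: str
    response: str
    rating: int
    comments: str = ""
    improved_response: Optional[str] = None

    def __post_init__(self) -> None:
        # Assumed constraints; the backend may enforce different ones.
        if not 1 <= self.rating <= 5:
            raise ValueError("rating must be between 1 and 5")
        if not self.question.strip():
            raise ValueError("question must not be empty")


payload = asdict(Feedback(
    question="Solve for x: 2x + 5 = 15",
    response="x = 10",
    rating=2,
    comments="Arithmetic slip in step 2",
    improved_response="x = 5",
))
print(payload)
```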

Full API Documentation

Visit http://127.0.0.1:8000/docs when backend is running.


📁 Project Structure

Rag-Math-Professor/
|
|-- run_app.bat                      # One-click launcher
|-- .env.example                     # Environment variables template
|-- .gitignore                       # Git ignore rules
|-- README.md                        # This file
|-- LICENSE                          # MIT License
|
|-- backend/                         # FastAPI Backend
|   |-- main.py                      # API entry point
|   |-- math_professor_rag.py        # Core RAG logic
|   |-- jeebench_evaluator.py        # Benchmark testing
|   |-- requirements.txt             # Python dependencies
|   `-- feedback/
|       `-- feedback.jsonl           # User feedback storage
|
`-- frontend/                        # React Frontend
    |-- package.json                 # NPM dependencies
    |-- vite.config.js               # Vite configuration
    |-- tailwind.config.js           # Tailwind CSS config
    |-- index.html                   # Entry HTML
    `-- src/
        |-- main.jsx                 # React entry
        |-- App.jsx                  # Main component
        `-- index.css                # Global styles

📊 Performance

JEEBench Evaluation Results

Tested on 95 JEE Advanced mathematics problems:

| Metric | Score | Details |
|---|---|---|
| Overall Exact Match | 11.58% | Answers exactly correct |
| Overall Partial Match | 62.11% | Answers partially correct |
| Integer Questions | 50.0% | Best performing type |
| MCQ (Multiple) | 93.0% | High partial match |
| Average Score | 0.22 | Weighted average |

Performance by Question Type

| Type | Exact Match | Partial Match | Avg Score |
|---|---|---|---|
| Integer | 50.0% | 60.0% | 0.57 |
| MCQ | 10.0% | 65.0% | 0.10 |
| MCQ(multiple) | 9.3% | 93.0% | 0.31 |
| Numeric | 0.0% | 0.0% | 0.00 |

Note: The 0% score on numeric questions is a known answer-extraction issue; a fix has been proposed (see Known Issues).
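
This README does not spell out how exact and partial matches are scored. As one plausible reading for multiple-answer MCQs, the sketch below counts a prediction as exact when the chosen option set equals the gold set and as partial when the two sets share at least one option. Both rules and all names are assumptions, not the evaluator's documented behaviour.

```python
from typing import Dict, Iterable, Set, Tuple


def score_mcq_multiple(predicted: Set[str], gold: Set[str]) -> Tuple[bool, bool]:
    """Return (exact, partial) for one multiple-answer question."""
    exact = predicted == gold
    partial = bool(predicted & gold)  # at least one correct option chosen
    return exact, partial


def summarize(pairs: Iterable[Tuple[Set[str], Set[str]]]) -> Dict[str, float]:
    """Aggregate per-question results into percentage scores."""
    results = [score_mcq_multiple(p, g) for p, g in pairs]
    n = len(results)
    if n == 0:
        return {"exact_match": 0.0, "partial_match": 0.0}
    return {
        "exact_match": 100.0 * sum(e for e, _ in results) / n,
        "partial_match": 100.0 * sum(p for _, p in results) / n,
    }


demo = [
    ({"A", "C"}, {"A", "C"}),  # exact and partial
    ({"A"}, {"A", "C"}),       # partial only
    ({"B"}, {"A", "C"}),       # neither
    ({"A", "B"}, {"A", "C"}),  # partial only
]
print(summarize(demo))
```

Under rules like these, a model that usually picks some correct options but rarely the full set would show the pattern in the table: high partial match with low exact match.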

System Metrics

  • Average Query Response Time: ~2-3 seconds
  • Knowledge Base Retrieval: <100ms
  • Guardrails Validation: <50ms
  • Feedback Storage: <10ms
  • Uptime: 99.9% (local deployment)

🧪 Testing

Run JEEBench Evaluation

cd backend
python jeebench_evaluator.py --num_samples 95

Run Quick Test (10 questions)

python jeebench_evaluator.py --quick_test

Results are saved to benchmark_results/.


🛠️ Development

Backend Development

cd backend
uvicorn main:app --reload --log-level debug

Frontend Development

cd frontend
npm run dev

Code Quality

# Backend linting
cd backend
black .
flake8 .

# Frontend linting
cd frontend
npm run lint

🤝 Contributing

Contributions are welcome! Please follow these steps:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Contribution Guidelines

  • Follow PEP 8 for Python code
  • Use ESLint for JavaScript/React
  • Add tests for new features
  • Update documentation
  • Keep commits atomic and well-described

πŸ› Known Issues

  1. Numeric Answer Extraction: 0% accuracy on numeric-type questions

    • Status: Fix proposed (regex improvements)
    • Workaround: Manual formatting in feedback
  2. Answer Format Compliance: LLM adds extra text beyond answer

    • Status: Working on stricter prompts
    • Impact: High partial match but low exact match
  3. Large Dataset Loading: Initial load takes 2-3 minutes

    • Status: Acceptable for demo, needs optimization for production
    • Fix: Pre-computed embeddings cache
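
The first issue mentions regex improvements for numeric answers. The sketch below shows one possible approach and is not the proposed fix itself: it prefers a number following a "Final Answer" marker (the format used in the usage examples) and otherwise falls back to the last number in the text. The pattern and function name are assumptions.

```python
import re
from typing import Optional

# Optional sign, digits with optional thousands separators, decimals, exponent.
NUMBER = r"[-+]?(?:\d{1,3}(?:,\d{3})+|\d+)(?:\.\d+)?(?:[eE][-+]?\d+)?"
FINAL_ANSWER = re.compile(
    r"final answer\s*[:=]?\s*(?:[a-z]\s*=\s*)?(" + NUMBER + ")",
    re.IGNORECASE,
)


def extract_numeric_answer(text: str) -> Optional[float]:
    """Extract the answer as a float, or None if no number is present."""
    match = FINAL_ANSWER.search(text)
    if match:
        raw = match.group(1)
    else:
        found = re.findall(NUMBER, text)
        if not found:
            return None
        raw = found[-1]  # fall back to the last number mentioned
    return float(raw.replace(",", ""))


print(extract_numeric_answer("Step 2: x = 10/2\nFinal Answer: x = 5"))
print(extract_numeric_answer("The area is 1,250.5 square units."))
print(extract_numeric_answer("No number here"))
```

Falling back to the last number is a heuristic and can pick up stray values such as units or step numbers, so a stricter output format in the prompt would still be needed.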

🚀 Future Enhancements

Short-term (1-2 months)

  • Fix numeric answer extraction
  • Improve exact match accuracy (target: 30%+)
  • Add caching for repeated queries
  • Implement rate limiting

Medium-term (3-6 months)

  • DSPy integration for automatic prompt optimization
  • Multi-agent system (specialized per topic)
  • Mobile app (React Native)
  • SymPy integration for symbolic math

Long-term (6-12 months)

  • Fine-tuned math-specific LLM
  • Interactive graphing (Desmos-like)
  • Collaborative learning features
  • Automated assessment system

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


👨‍💻 Author

Akash Chatterji


🙏 Acknowledgments

  • Datasets: MATH-500 (Hendrycks et al.), GSM8K (Cobbe et al.), JEEBench
  • Frameworks: FastAPI, React, LangChain, Guardrails AI
  • LLM Provider: Groq
  • Search API: Tavily
  • Inspiration: Modern RAG architectures and HITL learning systems

📚 References

  1. Hendrycks, D., et al. (2021). "Measuring Mathematical Problem Solving With the MATH Dataset"
  2. Cobbe, K., et al. (2021). "Training Verifiers to Solve Math Word Problems" (GSM8K)
  3. Lewis, P., et al. (2020). "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks"
  4. Guardrails AI Documentation: https://docs.guardrailsai.com
  5. Model Context Protocol: https://modelcontextprotocol.io

📞 Support

For questions, issues, or feedback:

  1. GitHub Issues: Create an issue
  2. Email: [email protected]
  3. Discussion: GitHub Discussions

Made with ❤️ using FastAPI + React + Groq

⭐ Star this repo if you find it helpful!

