MiRAG is an interactive, multi-modal Streamlit application that uses Retrieval-Augmented Generation (RAG) for question answering and summarization across several content types:
- 🌐 Web pages
- 📄 PDF documents
- 📺 YouTube videos
- 📝 Custom user input
Built on LangChain, Gemini (Google Generative AI), and FAISS, MiRAG lets users query unstructured content in natural language and get answers grounded in that content.
- Extract and embed content from any public URL (both JavaScript-rendered and static pages).
- Perform context-aware question answering and summarization.
- Retain memory across conversation turns.
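
Retaining memory across turns usually means replaying earlier exchanges into each new prompt. The following is a minimal sketch of that idea in plain Python. `ChatMemory`, `max_turns`, and the prompt layout are hypothetical names chosen for illustration; MiRAG itself may rely on LangChain's memory utilities instead.

```python
from collections import deque


class ChatMemory:
    """Keeps the most recent question/answer pairs (hypothetical helper)."""

    def __init__(self, max_turns: int = 5):
        # A deque with maxlen drops the oldest turn once it is full.
        self.turns = deque(maxlen=max_turns)

    def add(self, question: str, answer: str) -> None:
        self.turns.append((question, answer))

    def build_prompt(self, context: str, question: str) -> str:
        """Combine retrieved context, prior turns, and the new question."""
        history = "\n".join(f"User: {q}\nAssistant: {a}" for q, a in self.turns)
        return (
            f"Context:\n{context}\n\n"
            f"Conversation so far:\n{history}\n\n"
            f"User: {question}\nAssistant:"
        )


memory = ChatMemory(max_turns=2)
memory.add("What is RAG?", "Retrieval-Augmented Generation.")
memory.add("What is FAISS?", "A vector similarity search library.")
memory.add("What is Streamlit?", "A Python UI framework.")
prompt = memory.build_prompt("Example page text.", "How do they fit together?")
```

Capping the number of stored turns keeps the prompt from growing without bound; with `max_turns=2` the first exchange above is no longer included.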
- Upload any PDF and perform:
  - Contextual Q&A
  - Full-document summarization
  - Chat history export as PDF
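
Contextual Q&A over a long PDF generally requires splitting the extracted text into chunks before embedding. Below is a minimal sketch of overlapping character-based chunking. `split_text`, `chunk_size`, and `overlap` are hypothetical names, and the values are illustrative; the actual splitter in `pdf_utils.py` may differ (for example, a LangChain text splitter).

```python
def split_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks that overlap slightly."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a chunk has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


pages = "abcdefghij" * 3  # stand-in for text extracted from a PDF (30 chars)
chunks = split_text(pages, chunk_size=12, overlap=4)
```

The overlap means a sentence cut at a chunk boundary still appears whole in at least one neighboring chunk, at the cost of some duplicated text.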
- Input any YouTube video URL to fetch its transcript.
- Ask questions and generate a summary.
- Ideal for educational content, lectures, and long-form videos.
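
Transcript lookups are typically keyed by video ID rather than by full URL, so the URL has to be parsed first. The sketch below shows one way to do that with the standard library. `extract_video_id` is a hypothetical helper, not necessarily what `process_youtube.py` does, and it only covers a few common URL shapes.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID from common YouTube URL shapes, or None."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short links: https://youtu.be/<id>
        return parsed.path.lstrip("/").split("/")[0] or None
    if host in {"youtube.com", "www.youtube.com", "m.youtube.com"}:
        if parsed.path == "/watch":
            # Standard links: https://www.youtube.com/watch?v=<id>
            return parse_qs(parsed.query).get("v", [None])[0]
        for prefix in ("/embed/", "/shorts/"):
            if parsed.path.startswith(prefix):
                return parsed.path[len(prefix):].split("/")[0] or None
    return None
```

Returning `None` for unrecognized URLs lets the UI show a clear validation message instead of failing later during transcript retrieval.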
- Use default chatbot mode or paste your own text block.
- Build a temporary vectorstore and perform RAG on your content.
- Memory support with chat history download.
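
To illustrate what "build a temporary vectorstore and perform RAG" means, here is a deliberately simplified, dependency-free sketch. It swaps in word-count vectors for Gemini embeddings and a sorted list for FAISS, so it is not MiRAG's implementation; `embed`, `cosine`, and `retrieve` are hypothetical names.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for Gemini embeddings)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question (stand-in for FAISS)."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


chunks = [
    "FAISS stores vectors for semantic retrieval.",
    "Streamlit renders the user interface.",
    "FPDF generates PDF files for exporting chats.",
]
question = "Which component handles semantic retrieval?"
top = retrieve(question, chunks, k=1)
# The retrieved chunk becomes the context that is sent to the LLM.
prompt = f"Answer using only this context:\n{top[0]}\n\nQuestion: {question}"
```

In a real pipeline the retrieval step stays the same in shape: embed the question, find the nearest chunks, and place them in the prompt ahead of the question.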
- Python 3.10+
- Streamlit – User Interface
- LangChain – Chain and embedding orchestration
- Google Generative AI (Gemini) – LLM & embeddings
- FAISS – Vectorstore for semantic retrieval
- YouTube Transcript API – Transcript extraction
- FPDF – PDF generation for exporting chats
- Clone the repository:

      git clone https://github.com/iamtgiri/MiRAG.git
      cd MiRAG

- Create a virtual environment:

      python -m venv .venv
      source .venv/bin/activate  # On Windows: .venv\Scripts\activate

- Install dependencies:

      pip install -r requirements.txt

- Set environment variable:

      export GOOGLE_API_KEY=your_api_key_here

- Run the app:

      streamlit run app.py
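
If the app starts without `GOOGLE_API_KEY`, the failure may only surface on the first model call. A small startup check like the sketch below can fail earlier with a clearer message. `get_api_key` is a hypothetical helper and is not taken from `app.py`.

```python
import os


def get_api_key(env=os.environ) -> str:
    """Return GOOGLE_API_KEY or raise a readable error if it is missing."""
    key = env.get("GOOGLE_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "GOOGLE_API_KEY is not set. Export it before running "
            "`streamlit run app.py`."
        )
    return key
```

Passing the environment as a parameter keeps the check easy to test with a plain dictionary.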
MiRAG/
├── app.py # Main Streamlit app
├── pdf_utils.py # PDF loading, splitting & summarization
├── process_youtube.py # YouTube video processing & transcript extraction
├── rag_utils.py # Utility functions & chain builders
├── requirements.txt
└── README.md
A preview of the MiRAG application in action across different modules:
Normal Q&A without any context.
Paste custom text, ask questions, and get answers using RAG with memory.
Upload a PDF, ask questions, and download the chat history as a PDF.
Enter a YouTube URL, analyze the transcript, and chat with context.
Summarize the video and export the chat.
- Built with LangChain
- Powered by Google Gemini
- PDF export via FPDF
- Transcripts via YouTube Transcript API
MIT License © 2025 Tanmoy Giri
See LICENCE for details.







