This project is a lightweight yet powerful question-answering system that allows users to:
🧠 Ask questions about the content of any public web page URL, powered by the `mistralai/mistral-small-3.2-24b-instruct` model.
Live demo: available on Hugging Face Spaces.
Demo video: `LLAMAINDEX.mp4`
Without LangChain, this would require 200+ lines of manual data handling and orchestration logic. With it, the entire RAG pipeline fits in roughly 30 lines of code. The system:
- ✅ Takes a URL as input
- ✅ Loads and splits the page into chunks using LangChain
- ✅ Converts chunks into vector embeddings (MiniLM)
- ✅ Performs semantic retrieval using FAISS
- ✅ Uses Mistral 3.2 (24B, free via OpenRouter) to synthesize answers
This isn’t just plugging into an API. The system includes:
- 📄 Document Parsing: URL fetching, text extraction, chunking logic
- 📊 Semantic Search: Uses HuggingFace embeddings + FAISS vector search
- 🔄 Retrieval-Augmented Generation (RAG): Uses retrieved text as context for LLM to answer accurately
- 🧩 LangChain Chains: Modular chaining logic to connect retriever + LLM
This pipeline approximates the behavior of a model fine-tuned for Q&A on the page's content, without any training: retrieval supplies the relevant context at query time.
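To make the "~30 lines" claim concrete, here is a minimal end-to-end sketch of the pipeline. It assumes recent `langchain-community` / `langchain-openai` packages; the chunk sizes, `k=4`, and the `OPENROUTER_API_KEY` variable name are illustrative choices, not necessarily the exact values used in this app.

```python
import os

from langchain.chains import RetrievalQA
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import WebBaseLoader
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_openai import ChatOpenAI

def answer_from_url(url: str, question: str) -> str:
    # 1. Fetch the page and extract its text.
    docs = WebBaseLoader(url).load()
    # 2. Split into overlapping chunks for finer-grained retrieval.
    splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
    chunks = splitter.split_documents(docs)
    # 3. Embed the chunks with MiniLM and index them in FAISS.
    embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
    retriever = FAISS.from_documents(chunks, embeddings).as_retriever(search_kwargs={"k": 4})
    # 4. Point an OpenAI-compatible client at OpenRouter's free Mistral endpoint.
    llm = ChatOpenAI(
        model="mistralai/mistral-small-3.2-24b-instruct:free",
        base_url="https://openrouter.ai/api/v1",
        api_key=os.environ["OPENROUTER_API_KEY"],
    )
    # 5. Retrieve relevant chunks and let the LLM answer from that context.
    chain = RetrievalQA.from_chain_type(llm=llm, chain_type="stuff", retriever=retriever)
    return chain.invoke({"query": question})["result"]
```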
| Component | Tool |
|---|---|
| Text retrieval | LangChain (`WebBaseLoader`, `RecursiveCharacterTextSplitter`) |
| Embeddings | `sentence-transformers/all-MiniLM-L6-v2` |
| Vector store | FAISS |
| LLM (free API) | `mistralai/mistral-small-3.2-24b-instruct:free` |
| Frontend | Gradio |
| Deployment | Hugging Face Spaces |
- Ask questions about documentation or articles
- Educational summary generation
- Build RAG apps without training your own LLM
- Great base for interview-prep bots, study assistants, etc.
To use this on Hugging Face Spaces, the OpenRouter API key is stored securely via Space Secrets (ArjunHF).
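Because Spaces injects secrets as environment variables, the app can read the key at runtime. A minimal sketch (the variable name `OPENROUTER_API_KEY` is an assumption; use whatever name the secret was given):

```python
import os

# Hugging Face Spaces exposes Space Secrets to the running app as
# environment variables; the name below is an assumed convention.
api_key = os.environ.get("OPENROUTER_API_KEY")
if not api_key:
    raise RuntimeError("Add your OpenRouter key as a Space Secret before running.")
```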
LangChain acts as the orchestrator that connects different components—document loaders, chunking logic, embeddings, retrievers, and LLMs—into a single, smart pipeline.
Here’s how LangChain powers the entire flow:
**1. Document loading: `WebBaseLoader`**
- LangChain uses `WebBaseLoader` to fetch and clean the raw content from a given web URL.
- It abstracts away boilerplate scraping code.
- It returns a list of `Document` objects for downstream processing.
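A minimal sketch of this step, assuming the loader lives in `langchain_community` (the URL is a placeholder):

```python
from langchain_community.document_loaders import WebBaseLoader

# Fetch the page and extract its readable text.
loader = WebBaseLoader("https://example.com/some-article")
docs = loader.load()  # list of Document objects with .page_content and .metadata
```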
**2. Text chunking: `RecursiveCharacterTextSplitter`**
- Large documents are split into manageable, overlapping text chunks.
- This improves LLM comprehension and retrieval granularity.
- LangChain handles chunk boundaries intelligently, recursing over separators such as paragraph breaks and newlines before falling back to raw characters.
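A sketch of the chunking step; the `chunk_size` and `chunk_overlap` values are illustrative, not necessarily what this app uses:

```python
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Tries paragraph breaks first, then newlines, then spaces, then characters.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_documents(docs)  # overlapping Document chunks
```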
**3. Embeddings + vector store: `HuggingFaceEmbeddings` + FAISS**
- Each chunk is converted into a dense vector using a pretrained embedding model.
- These embeddings are stored in a FAISS index via LangChain's `VectorStore` interface.
- LangChain lets you reuse this vector store as a retriever later on.
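A sketch of the indexing step, continuing from the `chunks` produced above:

```python
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS

# MiniLM maps each chunk to a 384-dimensional vector; FAISS indexes them in memory.
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
vectorstore = FAISS.from_documents(chunks, embeddings)
```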
**4. Semantic retrieval: `vectorstore.as_retriever()`**
- When a user asks a question, LangChain performs semantic search over the FAISS index to find the most relevant chunks.
- These are passed as context to the LLM for more grounded answers.
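A sketch of the retrieval step; in recent LangChain versions the retriever is a Runnable, so it is queried with `.invoke()` (the example question and `k=4` are illustrative):

```python
# Return the 4 chunks closest to the query in embedding space.
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})
relevant_chunks = retriever.invoke("What does the page say about pricing?")
```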
**5. Answer synthesis: the `RetrievalQA` chain**
- LangChain sets up Retrieval-Augmented Generation (RAG) via `RetrievalQA.from_chain_type()`.
- It plugs in the retriever + the OpenRouter-backed Mistral LLM.
- It automatically forms prompts like: `"Given the context: <retrieved_docs> — answer the question: <user_question>"`