A lightweight RAG pipeline that automatically builds an evaluation dataset and scores it, using LangChain, RAGAS, Giskard, Gemini, and LangSmith.
Author: Daniel Puente Viejo
This repository demonstrates how to quickly evaluate RAG systems without the need to manually create a large dataset.
We use a sample use case: answering questions about popular TV series like Breaking Bad and La Casa de Papel.
The pipeline is built entirely with off-the-shelf tools:
- LangChain – a framework for building LLM applications (retrieval, prompting, chains)
- Gemini – Google's LLM, used as the generation model
- RAGAS – for evaluating RAG responses
- Giskard – to generate the test dataset and detect hallucinations, bias, and robustness issues
- LangSmith – to monitor, debug, and evaluate LLM usage at scale
We simulate a real-world scenario:
A user asks detailed questions about a TV show, such as character arcs, plot developments, or ethical decisions.
The system retrieves summaries of episodes and returns a relevant, accurate response.
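The core retrieval flow looks roughly like the sketch below. This is a minimal illustration, not the repository's exact code: the prompt wording and the model name (`gemini-1.5-flash`) are assumptions, and `vectorstore` is assumed to have been built from the episode summaries as described in the chunking step further down.

```python
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

# Assumes `vectorstore` was built from the episode summaries (see the chunking step below).
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})

llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash", temperature=0)

prompt = ChatPromptTemplate.from_template(
    "Answer the question using only the context below.\n\n"
    "Context:\n{context}\n\nQuestion: {question}"
)

def format_docs(docs):
    # Concatenate the retrieved episode summaries into a single context string.
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

print(rag_chain.invoke("Why does Walter White start cooking meth in Breaking Bad?"))
```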
| Tool | Role |
|---|---|
| LangChain | Build the RAG pipeline (retriever + LLM) |
| Gemini | LLM used to generate the answers |
| RAGAS | Automatically evaluate generated answers |
| Giskard | Generate the test dataset; test outputs for hallucinations, bias, robustness |
| LangSmith | Monitor and log RAG chains and metrics at runtime |
We eliminate the need to create a labeled dataset from scratch by:
- Generating realistic questions and answers using Giskard
- Using RAGAS to compute evaluation metrics (see the sketch after this list):
  - Context Precision
  - Context Recall
  - Faithfulness
  - Answer Similarity
  - Answer Relevancy
- Tracking all generations and context chunks using LangSmith
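As a rough illustration of the scoring step, the sketch below uses the classic `ragas.evaluate` API with the five metrics listed above. The column names (`question`, `contexts`, `answer`, `ground_truth`) follow RAGAS conventions, the example row is invented, and exact imports may differ between RAGAS versions.

```python
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import (
    context_precision,
    context_recall,
    faithfulness,
    answer_similarity,
    answer_relevancy,
)

# One row per evaluated question; `contexts` are the retrieved chunks,
# `ground_truth` is the reference answer generated by Giskard.
eval_dataset = Dataset.from_dict({
    "question": ["Why does Walter White start cooking meth?"],
    "contexts": [["Walter, a chemistry teacher diagnosed with cancer, ..."]],
    "answer": ["He wants to secure his family's finances after his diagnosis."],
    "ground_truth": ["He starts cooking meth to pay for treatment and provide for his family."],
})

# `evaluate` also accepts `llm=` / `embeddings=` arguments if you want the
# judge model to be something other than the RAGAS default.
scores = evaluate(
    eval_dataset,
    metrics=[context_precision, context_recall, faithfulness,
             answer_similarity, answer_relevancy],
)
print(scores)
```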
1. 🔧 Setup
- Install and import dependencies
- Set environment variables (Gemini, LangSmith)
- Initialize clients (sketch below)
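A minimal setup sketch, assuming API keys are provided via environment variables. The LangSmith variables (`LANGCHAIN_TRACING_V2`, `LANGCHAIN_API_KEY`, `LANGCHAIN_PROJECT`) and the Gemini key (`GOOGLE_API_KEY`) follow the libraries' documented conventions; the project name is a placeholder.

```python
import os

# LangSmith tracing: every chain run and its retrieved chunks get logged to this project.
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = "<your-langsmith-api-key>"
os.environ["LANGCHAIN_PROJECT"] = "rag-tv-series-eval"  # hypothetical project name

# Gemini: langchain-google-genai reads the key from GOOGLE_API_KEY.
os.environ["GOOGLE_API_KEY"] = "<your-gemini-api-key>"

from langchain_google_genai import ChatGoogleGenerativeAI, GoogleGenerativeAIEmbeddings

llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash", temperature=0)
embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")
```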
2. 📦 Chunking & vector database creation
- Split the source documents into chunks and build the vector database (sketch below)
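A sketch of the chunking and indexing step, assuming the episode summaries live in plain-text files and using FAISS as the vector store. The directory name, chunk sizes, and store choice are illustrative, not the repository's exact configuration.

```python
from langchain_community.document_loaders import DirectoryLoader, TextLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import FAISS

# Load the episode summaries (hypothetical path).
loader = DirectoryLoader("data/episode_summaries", glob="*.txt", loader_cls=TextLoader)
documents = loader.load()

# Split into overlapping chunks so each retrieved passage fits comfortably in the prompt.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_documents(documents)

# Embed the chunks with Gemini embeddings and index them in FAISS.
vectorstore = FAISS.from_documents(chunks, embeddings)  # `embeddings` from the setup step
vectorstore.save_local("faiss_index")
```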
3. ⚙️ Create dataset
- Use Giskard to generate the evaluation dataset (sketch below)
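The dataset generation step can look roughly like this, using Giskard's RAG Evaluation Toolkit (`giskard.rag`). The number of questions, the agent description, and the output filename are placeholders, and the generator LLM is configured separately in Giskard's settings.

```python
import pandas as pd
from giskard.rag import KnowledgeBase, generate_testset

# Build a knowledge base from the same chunks that feed the vector store.
kb_df = pd.DataFrame({"text": [chunk.page_content for chunk in chunks]})
knowledge_base = KnowledgeBase.from_pandas(kb_df, columns=["text"])

# Generate question / reference-answer pairs grounded in the knowledge base.
testset = generate_testset(
    knowledge_base,
    num_questions=30,
    agent_description="A chatbot answering questions about TV series episodes",
)

testset_df = testset.to_pandas()  # includes the questions and reference answers
testset.save("tv_series_testset.jsonl")
```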
4. 🔄 Retrieve examples & Evaluate
- Use RAGAS to evaluate retrieval quality (Context Precision, Context Recall)
5. 🎯 Answer questions & Evaluate
- Use RAGAS to evaluate the generated answers (Faithfulness, Answer Similarity, Answer Relevancy) (sketch below)
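Putting the pieces together, the answering-and-scoring loop can be sketched as below: run each generated question through the RAG chain, keep the retrieved chunks as `contexts`, and score everything with RAGAS. The names `rag_chain`, `retriever`, and `testset_df` come from the earlier sketches, and the `reference_answer` column name follows Giskard's testset output; adjust if your versions differ.

```python
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import faithfulness, answer_similarity, answer_relevancy

rows = {"question": [], "contexts": [], "answer": [], "ground_truth": []}

for _, row in testset_df.iterrows():
    question = row["question"]
    docs = retriever.invoke(question)     # retrieved episode chunks
    answer = rag_chain.invoke(question)   # Gemini-generated answer (traced in LangSmith)

    rows["question"].append(question)
    rows["contexts"].append([doc.page_content for doc in docs])
    rows["answer"].append(answer)
    rows["ground_truth"].append(row["reference_answer"])  # Giskard's reference answer column

results = evaluate(
    Dataset.from_dict(rows),
    metrics=[faithfulness, answer_similarity, answer_relevancy],
)
print(results.to_pandas().mean(numeric_only=True))
```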