This project implements a lightweight retrieval-augmented generation (RAG) pipeline using Ollama for answering cat-related questions (health, breeds, behavior).
- Curates and embeds a corpus of cat knowledge.
- Uses a vector store for fast semantic retrieval.
- Combines retrieved context with local LLM generation via Ollama (see the sketch after this list).
- Optimized for low-latency, context-aware responses.
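A minimal end-to-end sketch of this flow, assuming the `ollama` Python client (`pip install ollama`) and a locally running Ollama server. The model names (`nomic-embed-text`, `llama3`), the toy corpus, and the in-memory cosine-similarity "vector store" are illustrative stand-ins, not the project's actual choices:

```python
import numpy as np
import ollama

EMBED_MODEL = "nomic-embed-text"  # assumed embedding model
CHAT_MODEL = "llama3"             # assumed generation model

# Toy cat-knowledge corpus; the real project curates a larger one.
corpus = [
    "Maine Coons are one of the largest domesticated cat breeds.",
    "Cats purr at frequencies between roughly 25 and 150 Hz.",
    "Kittens are typically dewormed starting at a few weeks of age.",
]

def embed(text: str) -> np.ndarray:
    """Embed a single text via the Ollama embeddings endpoint."""
    resp = ollama.embeddings(model=EMBED_MODEL, prompt=text)
    return np.array(resp["embedding"], dtype=np.float32)

# Build the "vector store": a matrix of L2-normalized corpus embeddings.
doc_vecs = np.stack([embed(doc) for doc in corpus])
doc_vecs /= np.linalg.norm(doc_vecs, axis=1, keepdims=True)

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the top-k corpus snippets by cosine similarity."""
    q = embed(query)
    q /= np.linalg.norm(q)
    scores = doc_vecs @ q                  # cosine similarity per document
    top = np.argsort(scores)[::-1][:k]
    return [corpus[i] for i in top]

def answer(question: str) -> str:
    """Combine retrieved context with local LLM generation."""
    context = "\n".join(retrieve(question))
    resp = ollama.chat(
        model=CHAT_MODEL,
        messages=[
            {"role": "system",
             "content": f"Answer using only this context:\n{context}"},
            {"role": "user", "content": question},
        ],
    )
    return resp["message"]["content"]

if __name__ == "__main__":
    print(answer("How big do Maine Coons get?"))
```

Keeping retrieval as a plain NumPy matrix multiply keeps the sketch dependency-free; a persistent vector store would replace `doc_vecs` and `retrieve` without changing the rest of the flow.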