
Using MLflow to deploy your RAG pipeline, built with LlamaIndex, LangChain, and Ollama / Hugging Face LLMs / Groq


AnasAber/MLflow_with_RAG


MLflow Deployment of a RAG Pipeline 🥀

This project is for anyone who wants to deploy a RAG pipeline using MLflow.

The project uses:

  • LlamaIndex and LangChain as orchestrators
  • Ollama, Hugging Face LLMs, and Groq as model backends
  • MLflow as an MLOps framework for deployment and tracking

Project Overview Diagram
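To make the moving parts concrete, here is a minimal sketch of the retrieve-then-generate flow that such a pipeline wraps. The retriever and LLM below are stubs standing in for the LlamaIndex/LangChain components and the Ollama / Hugging Face / Groq model calls; none of the function names come from this repository.

```python
# Toy RAG flow: retrieve relevant documents, build a prompt, generate an answer.
# All components here are illustrative stubs, not the project's actual code.

def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Toy retriever: rank documents by word overlap with the query."""
    q_words = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return scored[:top_k]

def generate(prompt: str) -> str:
    """Stub LLM call; a real pipeline would call Ollama, Hugging Face, or Groq here."""
    return f"Answer based on: {prompt}"

def rag_answer(query: str, documents: list[str]) -> str:
    # Stuff the retrieved context into the prompt, then generate.
    context = "\n".join(retrieve(query, documents))
    prompt = f"Context:\n{context}\n\nQuestion: {query}"
    return generate(prompt)

docs = [
    "MLflow tracks experiments and serves models.",
    "LlamaIndex builds indexes over your documents.",
]
print(rag_answer("How does MLflow serve models?", docs))
```

MLflow's role is to package this whole flow as a logged model so it can be versioned, tracked, and served behind an HTTP endpoint.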

How to start

  1. Clone the repository:
git clone https://github.com/AnasAber/RAG_in_CPU.git
  2. Install the dependencies:
pip install -r requirements.txt

Make sure to put your API keys into example.env, then rename it to .env.
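If you prefer not to add a dependency, a .env file of `KEY=value` lines can be loaded with the standard library alone (python-dotenv does the same job). This is a hedged sketch; the variable name `HUGGINGFACE_API_KEY` is illustrative, not taken from the repo's example.env.

```python
# Minimal .env loader using only the standard library.
# Lines look like: HUGGINGFACE_API_KEY=hf_xxx (name is an example, not from the repo).
import os

def load_env(path: str = ".env") -> dict:
    loaded = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue  # skip blanks, comments, and malformed lines
            key, _, value = line.partition("=")
            loaded[key.strip()] = value.strip()
            # Don't clobber variables already set in the real environment.
            os.environ.setdefault(key.strip(), value.strip())
    return loaded

# Example: keys = load_env(); keys.get("HUGGINGFACE_API_KEY")
```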

  3. Prepare the notebook:
  • Put your own data files in the data/ folder.
  • In the notebook, replace "api_key_here" with your Hugging Face API key.
  • If you have a GPU, you're fine; if not, run it on Google Colab and make sure to download the JSON output file at the end of the run.
  4. Go to the deployement folder and open two terminals. In the first, run:
python workflow.py

After the run finishes, open the MLflow UI, copy the run ID, and place it into this command:

mlflow models serve -m runs:/<run id>/rag_deployement -p 5001
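Once the server is up, MLflow exposes the model on POST /invocations. The exact payload key depends on the model's signature; "inputs" (shown below) and "dataframe_split" are the common JSON formats accepted by MLflow 2.x scoring servers. The query text and the assumption that the model takes a single string are illustrative.

```python
# Hedged sketch: query the model served on port 5001 via MLflow's
# /invocations endpoint, using only the standard library.
import json
from urllib import request

def build_payload(query: str) -> bytes:
    # "inputs" is one of the JSON formats MLflow's scoring server accepts;
    # adjust to your model's signature if it differs.
    return json.dumps({"inputs": [query]}).encode("utf-8")

def ask(query: str, url: str = "http://127.0.0.1:5001/invocations") -> str:
    req = request.Request(
        url,
        data=build_payload(query),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return resp.read().decode("utf-8")

# ask("What does this project deploy?")  # requires the serve command above to be running
```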

In the second terminal, run:

python app.py
  5. Open another terminal, move to the frontend folder, and run:
npm start

You should now see the web interface, with both terminals still running.

If you hit errors, check whether anything is missing from requirements.txt.

Enjoy!