
# Multi-Turn RAG Example

## Example Features

This example showcases multi-turn conversational AI in a RAG pipeline. The chain server stores the conversation history and knowledge base in a vector database and retrieves them at runtime to understand contextual queries.

The example supports ingestion of PDF, text, and Markdown files. Documents are ingested into a dedicated document vector store, `multi_turn_rag`, and the prompt is tuned so the model acts as a document chat bot. To maintain conversation history, the chain server stores each previous query and the model's generated answer as a text entry in a separate, dedicated conversation history vector store, `conv_store`. Both vector stores participate in a LangChain LCEL chain as LangChain retrievers. When the chain is invoked with a query, the query passes through both retrievers: one retrieves context from the document vector store, and the other retrieves the closest-matching entries from the conversation history vector store. The retrieved document chunks are then passed through a reranker model to select the most relevant top_k context, and that context, together with the retrieved history, is added to the LLM prompt for response generation.
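As a rough illustration, the following is a minimal LangChain sketch of such a two-retriever chain. It is not the chain server's actual implementation: only the model names and the `multi_turn_rag` and `conv_store` collection names come from this example, while the prompt wording, retrieval counts, and Milvus connection defaults are placeholder assumptions.

```python
# Minimal sketch of the two-retriever chain described above (not the chain
# server's actual code). Prompt text, k/top_n values, and Milvus connection
# defaults are illustrative placeholders.
from langchain_community.vectorstores import Milvus
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnableLambda, RunnablePassthrough
from langchain_nvidia_ai_endpoints import ChatNVIDIA, NVIDIAEmbeddings, NVIDIARerank

embedder = NVIDIAEmbeddings(model="nvidia/nv-embedqa-e5-v5")

# Dedicated vector stores: one for ingested documents, one for conversation history.
# connection_args default to a local Milvus; adjust for your deployment.
doc_store = Milvus(embedding_function=embedder, collection_name="multi_turn_rag")
conv_store = Milvus(embedding_function=embedder, collection_name="conv_store")
doc_retriever = doc_store.as_retriever(search_kwargs={"k": 20})
conv_retriever = conv_store.as_retriever(search_kwargs={"k": 4})

reranker = NVIDIARerank(model="nvidia/nv-rerankqa-mistral-4b-v3", top_n=4)
llm = ChatNVIDIA(model="meta/llama3-8b-instruct")

prompt = ChatPromptTemplate.from_template(
    "You are a document chat bot. Use the conversation history and context to answer.\n"
    "History:\n{history}\n\nContext:\n{context}\n\nQuestion: {question}"
)

def rerank_and_format(inputs: dict) -> dict:
    """Rerank the retrieved document chunks and flatten everything into prompt inputs."""
    top_chunks = reranker.compress_documents(documents=inputs["docs"], query=inputs["question"])
    return {
        "question": inputs["question"],
        "history": "\n".join(d.page_content for d in inputs["history"]),
        "context": "\n\n".join(d.page_content for d in top_chunks),
    }

chain = (
    {  # the user query fans out to both retrievers in parallel
        "question": RunnablePassthrough(),
        "docs": doc_retriever,
        "history": conv_retriever,
    }
    | RunnableLambda(rerank_and_format)
    | prompt
    | llm
    | StrOutputParser()
)

answer = chain.invoke("What does the report conclude?")
# Persist the turn so the next query can retrieve it as conversation history.
conv_store.add_texts([f"Query: What does the report conclude?\nAnswer: {answer}"])
```

Because the history lives in its own collection, only the turns most relevant to the current question are pulled into the prompt rather than the full transcript.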

| Model | Embedding | Ranking (Optional) | Framework | Vector Database | File Types |
|-------|-----------|--------------------|-----------|-----------------|------------|
| meta/llama3-8b-instruct | nvidia/nv-embedqa-e5-v5 | nvidia/nv-rerankqa-mistral-4b-v3 | LangChain | Milvus | TXT, PDF, MD |

## Diagram

## Prerequisites

Complete the common prerequisites.

## Build and Start the Containers

1. Export your NVIDIA API key as an environment variable:

   ```bash
   export NVIDIA_API_KEY="nvapi-<...>"
   ```
2. Start the containers:

   ```bash
   cd RAG/examples/advanced_rag/multi_turn_rag/
   docker compose up -d --build
   ```

   *Example Output*

   ```output
   ✔ Network nvidia-rag                               Created
   ✔ Container milvus-etcd                            Running
   ✔ Container milvus-minio                           Running
   ✔ Container milvus-standalone                      Running
   ✔ Container chain-server                           Started
   ✔ Container rag-playground                         Started
   ```
3. Confirm the containers are running:

   ```bash
   docker ps --format "table {{.ID}}\t{{.Names}}\t{{.Status}}"
   ```

   *Example Output*

   ```output
   CONTAINER ID   NAMES               STATUS
   dd4fc3da6c9c   rag-playground      Up About a minute
   ac1f039a1db8   chain-server        Up About a minute
   cd0a57ee20e0   milvus-standalone   Up 2 hours
   a36370e7ed75   milvus-minio        Up 2 hours (healthy)
   a796a4e59b68   milvus-etcd         Up 2 hours (healthy)
   ```
4. Open a web browser and access http://localhost:8090 to use the RAG Playground.

   Refer to Using the Sample Web Application for information about uploading documents and using the web interface. To query the chain server from a script instead of the web interface, see the sketch after this list.
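If you prefer to exercise the pipeline programmatically, the following hypothetical sketch posts a question to the chain server's HTTP API. The port (`8081`), the `/generate` path, and the payload fields are assumptions based on similar examples in this repository, not values confirmed by this page; check `docker-compose.yaml` and the chain server's interactive `/docs` page for the actual address and schema.

```python
# Hypothetical sketch: query the chain server over HTTP from a script.
# The port, path, and payload fields below are assumptions -- verify them
# against docker-compose.yaml and the chain server's /docs page.
import requests

CHAIN_SERVER = "http://localhost:8081"  # assumed chain server address

payload = {
    "messages": [
        {"role": "user", "content": "Summarize the uploaded document."}
    ],
    "use_knowledge_base": True,  # retrieve context from the ingested documents
}

# The server typically streams its answer; read it line by line as it arrives.
with requests.post(f"{CHAIN_SERVER}/generate", json=payload, stream=True, timeout=120) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines(decode_unicode=True):
        if line:
            print(line)
```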

## Next Steps