Merge pull request #1435 from zc277584121/master
badge
wxywb authored Oct 10, 2024
2 parents b0a1365 + e67f25e commit 984c0df
Showing 1 changed file with 29 additions and 8 deletions.
@@ -2,19 +2,34 @@
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a href=\"https://colab.research.google.com/github/milvus-io/bootcamp/blob/master/bootcamp/tutorials/quickstart/contextual_retrieval_with_milvus.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a> <a href=\"https://github.com/milvus-io/bootcamp/blob/master/bootcamp/tutorials/quickstart/contextual_retrieval_with_milvus.ipynb\" target=\"_blank\">\n",
" <img src=\"https://img.shields.io/badge/View%20on%20GitHub-555555?style=flat&logo=github&logoColor=white\" alt=\"GitHub Repository\"/>"
],
"metadata": {
"collapsed": false,
"pycharm": {
"name": "#%% md\n"
}
}
},
{
"cell_type": "markdown",
"source": [
"# Contextual Retrieval with Milvus\n",
"![image](https://raw.githubusercontent.com/milvus-io/bootcamp/refs/heads/master/images/contextual_retrieval_with_milvus.png)\n",
"[Contextual Retrieval](https://www.anthropic.com/news/contextual-retrieval) is an advanced retrieval method proposed by Anthropic to address the issue of semantic isolation of chunks, which arises in current Retrieval-Augmented Generation (RAG) solutions. In the current practical RAG paradigm, documents are divided into several chunks, and a vector database is used to search for the query, retrieving the most relevant chunks. An LLM then responds to the query using these retrieved chunks. However, this chunking process can result in the loss of contextual information, making it difficult for the retriever to determine relevance. \n",
"[Contextual Retrieval](https://www.anthropic.com/news/contextual-retrieval) is an advanced retrieval method proposed by Anthropic to address the issue of semantic isolation of chunks, which arises in current Retrieval-Augmented Generation (RAG) solutions. In the current practical RAG paradigm, documents are divided into several chunks, and a vector database is used to search for the query, retrieving the most relevant chunks. An LLM then responds to the query using these retrieved chunks. However, this chunking process can result in the loss of contextual information, making it difficult for the retriever to determine relevance.\n",
"\n",
"Contextual Retrieval improves traditional retrieval systems by adding relevant context to each document chunk before embedding or indexing, boosting accuracy and reducing retrieval errors. Combined with techniques like hybrid retrieval and reranking, it enhances Retrieval-Augmented Generation (RAG) systems, especially for large knowledge bases. Additionally, it offers a cost-effective solution when paired with prompt caching, significantly reducing latency and operational costs, with contextualized chunks costing approximately $1.02 per million document tokens. This makes it a scalable and efficient approach for handling large knowledge bases. Anthropic’s solution shows two insightful aspects: \n",
"Contextual Retrieval improves traditional retrieval systems by adding relevant context to each document chunk before embedding or indexing, boosting accuracy and reducing retrieval errors. Combined with techniques like hybrid retrieval and reranking, it enhances Retrieval-Augmented Generation (RAG) systems, especially for large knowledge bases. Additionally, it offers a cost-effective solution when paired with prompt caching, significantly reducing latency and operational costs, with contextualized chunks costing approximately $1.02 per million document tokens. This makes it a scalable and efficient approach for handling large knowledge bases. Anthropic’s solution shows two insightful aspects:\n",
"- `Document Enhancement`: Query rewriting is a crucial technique in modern information retrieval, often using auxiliary information to make the query more informative. Similarly, to achieve better performance in RAG, preprocessing documents with an LLM (e.g., cleaning the data source, complementing lost information, summarizing, etc.) before indexing can significantly improve the chances of retrieving relevant documents. In other words, this preprocessing step helps bring the documents closer to the queries in terms of relevance.\n",
"- `Low-Cost Processing by Caching Long Context`: One common concern when using LLMs to process documents is the cost. The KVCache is a popular solution that allows reuse of intermediate results for the same preceding context. While most hosted LLM vendors make this feature transparent to user, Anthropic gives users control over the caching process. When a cache hit occurs, most computations can be saved (this is common when the long context remains the same, but the instruction for each query changes). For more details, click [here](https://www.anthropic.com/news/prompt-caching).\n",
"\n",
"In this notebook, we will demonstrate how to perform contextual retrieval using Milvus with an LLM, combining dense-sparse hybrid retrieval and a reranker to create a progressively more powerful retrieval system. The data and experimental setup are based on the [contextual retrieval](https://github.com/anthropics/anthropic-cookbook/blob/main/skills/contextual-embeddings/guide.ipynb).\n",
"\n"
],
"metadata": {
"collapsed": false
}
},
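
To make the two bullets above concrete, here is a minimal sketch of the context-generation step, assuming the `anthropic` SDK and an `ANTHROPIC_API_KEY` in the environment. It is illustrative rather than the notebook's actual code: the `situate_chunk` helper, prompt wording, and model choice are assumptions, and on older SDK versions `cache_control` may require Anthropic's prompt-caching beta header or beta namespace.

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment


def situate_chunk(whole_document: str, chunk: str) -> str:
    """Ask the LLM for a short context that situates `chunk` within the document."""
    response = client.messages.create(
        model="claude-3-haiku-20240307",
        max_tokens=200,
        messages=[
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": f"<document>\n{whole_document}\n</document>",
                        # Cache the long document: every chunk of the same
                        # document reuses this cached prefix, so only the short
                        # chunk-specific instruction below is recomputed.
                        "cache_control": {"type": "ephemeral"},
                    },
                    {
                        "type": "text",
                        "text": (
                            "Here is the chunk we want to situate:\n"
                            f"<chunk>\n{chunk}\n</chunk>\n"
                            "Give a short, succinct context situating this chunk "
                            "within the overall document, to improve search "
                            "retrieval. Answer with the context and nothing else."
                        ),
                    },
                ],
            }
        ],
    )
    return response.content[0].text


# The contextualized chunk is what gets embedded and indexed, e.g.:
# contextualized = situate_chunk(doc, chunk) + "\n\n" + chunk
```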
{
"cell_type": "markdown",
@@ -35,6 +50,15 @@
"!pip install anthropic"
]
},
{
"cell_type": "markdown",
"source": [
"> If you are using Google Colab, to enable dependencies just installed, you may need to **restart the runtime** (click on the \"Runtime\" menu at the top of the screen, and select \"Restart session\" from the dropdown menu)."
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "markdown",
"metadata": {},
@@ -80,14 +104,11 @@
"metadata": {},
"outputs": [],
"source": [
"from pymilvus import MilvusClient\n",
"\n",
"from pymilvus.model.dense import VoyageEmbeddingFunction\n",
"from pymilvus.model.hybrid import BGEM3EmbeddingFunction\n",
"from pymilvus.model.reranker import CohereRerankFunction\n",
"\n",
"from typing import List, Dict, Any\n",
"from pymilvus import DataType\n",
"from typing import Callable\n",
"from pymilvus import (\n",
" MilvusClient,\n",
@@ -742,4 +763,4 @@
},
"nbformat": 4,
"nbformat_minor": 2
}
