Commit e9ac41d

Milvus-doc-bot authored and committed
Release new docs to master
1 parent edab0f5 commit e9ac41d

16 files changed: +841 -13 lines changed
v2.5.x/assets/attu_login_page.png

200 KB

v2.5.x/assets/attu_searched_graph.png

471 KB

v2.5.x/assets/attu_searched_table.png

394 KB

v2.5.x/site/en/getstarted/run-milvus-docker/install_standalone-windows.md

+1 -1
@@ -31,7 +31,7 @@ If you are more familiar with PowerShell or Windows Command Prompt, the command
 2. Download the installation script and save it as `standalone.bat`.

 ```powershell
-C:\>Invoke-WebRequest https://github.com/milvus-io/milvus/blob/master/scripts/standalone_embed.bat -OutFile standalone.bat
+C:\>Invoke-WebRequest https://raw.githubusercontent.com/milvus-io/milvus/refs/heads/master/scripts/standalone_embed.bat -OutFile standalone.bat

 ```


@@ -0,0 +1,284 @@
---
id: build_RAG_with_milvus_and_deepseek.md
summary: In this tutorial, we’ll show you how to build a Retrieval-Augmented Generation (RAG) pipeline using Milvus and DeepSeek.
title: Build RAG with Milvus and DeepSeek
---

# Build RAG with Milvus and DeepSeek

<a href="https://colab.research.google.com/github/milvus-io/bootcamp/blob/master/bootcamp/tutorials/integration/build_RAG_with_milvus_and_deepseek.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>
<a href="https://github.com/milvus-io/bootcamp/blob/master/bootcamp/tutorials/integration/build_RAG_with_milvus_and_deepseek.ipynb" target="_blank"><img src="https://img.shields.io/badge/View%20on%20GitHub-555555?style=flat&logo=github&logoColor=white" alt="GitHub Repository"/></a>

[DeepSeek](https://www.deepseek.com/) enables developers to build and scale AI applications with high-performance language models. It offers efficient inference, flexible APIs, and advanced Mixture-of-Experts (MoE) architectures for robust reasoning and retrieval tasks.

In this tutorial, we’ll show you how to build a Retrieval-Augmented Generation (RAG) pipeline using Milvus and DeepSeek.

## Preparation
### Dependencies and Environment

```python
! pip install --upgrade pymilvus[model] openai requests tqdm
```

> If you are using Google Colab, you may need to **restart the runtime** to enable the newly installed dependencies (click the "Runtime" menu at the top of the screen and select "Restart session" from the dropdown menu).

DeepSeek provides an OpenAI-compatible API. Log in to its official website and set your [API key](https://platform.deepseek.com/api_keys) as the environment variable `DEEPSEEK_API_KEY`.

```python
import os

os.environ["DEEPSEEK_API_KEY"] = "***********"
```

### Prepare the data

We use the FAQ pages from the [Milvus Documentation 2.4.x](https://github.com/milvus-io/milvus-docs/releases/download/v2.4.6-preview/milvus_docs_2.4.x_en.zip) as the private knowledge in our RAG; they are a good data source for a simple RAG pipeline.

Download the zip file and extract the documents to the folder `milvus_docs`.

```python
! wget https://github.com/milvus-io/milvus-docs/releases/download/v2.4.6-preview/milvus_docs_2.4.x_en.zip
! unzip -q milvus_docs_2.4.x_en.zip -d milvus_docs
```

We load all markdown files from the folder `milvus_docs/en/faq`. For each document, we simply split the content on "# ", which roughly separates the main sections of the markdown file.

```python
from glob import glob

text_lines = []

# Read every FAQ markdown file and split it into rough sections.
for file_path in glob("milvus_docs/en/faq/*.md", recursive=True):
    with open(file_path, "r") as file:
        file_text = file.read()

    text_lines += file_text.split("# ")
```
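
To make the splitting behavior concrete, here is a minimal sketch on a made-up markdown string (the sample text is hypothetical, not taken from the Milvus docs). It shows that the separator "# " also matches inside "## " and "### " headings, so a chunk can end with stray "#" characters. The cleanup step at the end is an optional assumption, not part of the tutorial.

```python
# Hypothetical sample text, not taken from the Milvus docs.
sample = (
    "# FAQ\n\nIntro text.\n\n"
    "## First question?\n\nFirst answer.\n\n"
    "## Second question?\n\nSecond answer.\n"
)

# Same split as in the tutorial: "# " also matches the tail of "## ".
chunks = sample.split("# ")

for i, chunk in enumerate(chunks):
    print(i, repr(chunk))

# Optional cleanup (an assumption, not part of the tutorial): strip stray
# heading markers and whitespace, and drop chunks that end up empty.
cleaned = [c.strip("#\n ") for c in chunks if c.strip("#\n ")]
print(cleaned)
```

Whether to clean the chunks depends on the data; the tutorial inserts them as they are.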

### Prepare the LLM and Embedding Model

Because DeepSeek provides an OpenAI-compatible API, you can call the LLM through the `openai` client with minor adjustments.

```python
from openai import OpenAI

deepseek_client = OpenAI(
    api_key=os.environ["DEEPSEEK_API_KEY"],
    base_url="https://api.deepseek.com",
)
```

Define an embedding model to generate text embeddings using `milvus_model`. We use the `DefaultEmbeddingFunction` model as an example, which is a pre-trained, lightweight embedding model.

```python
from pymilvus import model as milvus_model

embedding_model = milvus_model.DefaultEmbeddingFunction()
```

Generate a test embedding and print its dimension and first few elements.

```python
test_embedding = embedding_model.encode_queries(["This is a test"])[0]
embedding_dim = len(test_embedding)
print(embedding_dim)
print(test_embedding[:10])
```

    768
    [-0.04836066 0.07163023 -0.01130064 -0.03789345 -0.03320649 -0.01318448
    -0.03041712 -0.02269499 -0.02317863 -0.00426028]

## Load data into Milvus

### Create the Collection

```python
from pymilvus import MilvusClient

milvus_client = MilvusClient(uri="./milvus_demo.db")

collection_name = "my_rag_collection"
```

> As for the argument of `MilvusClient`:
> - Setting the `uri` to a local file, e.g. `./milvus.db`, is the most convenient method, as it automatically uses [Milvus Lite](https://milvus.io/docs/milvus_lite.md) to store all data in this file.
> - If you have a large amount of data, you can set up a more performant Milvus server on [Docker or Kubernetes](https://milvus.io/docs/quickstart.md). In this setup, use the server URI, e.g. `http://localhost:19530`, as your `uri`.
> - If you want to use [Zilliz Cloud](https://zilliz.com/cloud), the fully managed cloud service for Milvus, adjust the `uri` and `token`, which correspond to the [Public Endpoint and API key](https://docs.zilliz.com/docs/on-zilliz-cloud-console#free-cluster-details) in Zilliz Cloud.
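
One possible way to switch between these three setups without editing the code is to read the connection settings from the environment. The sketch below only builds the keyword arguments; the helper name and the environment variable names `MILVUS_URI` and `MILVUS_TOKEN` are hypothetical choices for illustration, not something pymilvus defines.

```python
import os


def milvus_client_kwargs(env=os.environ):
    """Build keyword arguments for MilvusClient from environment settings.

    MILVUS_URI and MILVUS_TOKEN are hypothetical variable names.
    """
    # Fall back to a local file, which makes the client use Milvus Lite.
    kwargs = {"uri": env.get("MILVUS_URI", "./milvus_demo.db")}
    token = env.get("MILVUS_TOKEN")
    if token:
        # Only needed for Zilliz Cloud or a server with authentication.
        kwargs["token"] = token
    return kwargs


print(milvus_client_kwargs({}))
print(milvus_client_kwargs({"MILVUS_URI": "http://localhost:19530"}))
```

With this helper, the client could be created as `MilvusClient(**milvus_client_kwargs())`.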

Check if the collection already exists and drop it if it does.

```python
if milvus_client.has_collection(collection_name):
    milvus_client.drop_collection(collection_name)
```

Create a new collection with the specified parameters.

If we don't specify any field information, Milvus automatically creates a default `id` field as the primary key and a `vector` field to store the vector data. A reserved JSON field stores non-schema-defined fields and their values.

```python
milvus_client.create_collection(
    collection_name=collection_name,
    dimension=embedding_dim,
    metric_type="IP",  # Inner product distance
    consistency_level="Strong",  # Strong consistency level
)
```
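
As a rough illustration of the `IP` metric chosen above, the following standalone sketch computes the inner product of two small made-up vectors. A larger inner product means a closer match, and for unit-normalized vectors the inner product equals cosine similarity. Whether `DefaultEmbeddingFunction` returns normalized vectors is not verified here, so treat that as an assumption to check on your own embeddings.

```python
import math


def inner_product(a, b):
    """Sum of element-wise products of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(inner_product(v, v))
    return [x / norm for x in v]


def cosine(a, b):
    """Cosine similarity computed directly from the raw vectors."""
    return inner_product(a, b) / math.sqrt(inner_product(a, a) * inner_product(b, b))


# Made-up vectors for illustration only.
query = [1.0, 2.0, 2.0]
doc = [2.0, 0.0, 1.0]

raw_ip = inner_product(query, doc)  # depends on vector lengths
unit_ip = inner_product(normalize(query), normalize(doc))  # length-independent

print(raw_ip)
print(unit_ip)
print(cosine(query, doc))
```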

### Insert data

Iterate through the text lines, create embeddings, and then insert the data into Milvus.

The `text` field below is not defined in the collection schema. It is automatically added to the reserved JSON dynamic field, which can be treated as a normal field at a high level.

```python
from tqdm import tqdm

data = []

doc_embeddings = embedding_model.encode_documents(text_lines)

for i, line in enumerate(tqdm(text_lines, desc="Creating embeddings")):
    data.append({"id": i, "vector": doc_embeddings[i], "text": line})

milvus_client.insert(collection_name=collection_name, data=data)
```
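
The tutorial inserts all 72 rows in a single call, which is fine at this size. For a much larger corpus, one option is to insert in batches. The sketch below shows only the batching logic on placeholder rows; the helper name `batched` and the batch size are assumptions for illustration.

```python
def batched(items, batch_size):
    """Yield consecutive slices of items with at most batch_size elements."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Placeholder rows standing in for the tutorial's `data` list.
rows = [{"id": i, "text": f"chunk {i}"} for i in range(10)]

batches = list(batched(rows, 4))
print([len(batch) for batch in batches])
```

In the tutorial's setting, each batch would be passed to `milvus_client.insert(collection_name=collection_name, data=batch)`.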

    Creating embeddings: 0%| | 0/72 [00:00<?, ?it/s]huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
    To disable this warning, you can either:
    - Avoid using `tokenizers` before the fork if possible
    - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
    Creating embeddings: 100%|██████████| 72/72 [00:00<00:00, 246522.36it/s]

    {'insert_count': 72, 'ids': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71], 'cost': 0}

## Build RAG

### Retrieve data for a query

Let's specify a frequently asked question about Milvus.

```python
question = "How is data stored in milvus?"
```

Search for the question in the collection and retrieve the top 3 semantic matches.

```python
search_res = milvus_client.search(
    collection_name=collection_name,
    data=embedding_model.encode_queries(
        [question]
    ),  # Convert the question to an embedding vector
    limit=3,  # Return top 3 results
    search_params={"metric_type": "IP", "params": {}},  # Inner product distance
    output_fields=["text"],  # Return the text field
)
```

Let's take a look at the search results for the query.

```python
import json

retrieved_lines_with_distances = [
    (res["entity"]["text"], res["distance"]) for res in search_res[0]
]
print(json.dumps(retrieved_lines_with_distances, indent=4))
```

    [
        [
            " Where does Milvus store data?\n\nMilvus deals with two types of data, inserted data and metadata. \n\nInserted data, including vector data, scalar data, and collection-specific schema, are stored in persistent storage as incremental log. Milvus supports multiple object storage backends, including [MinIO](https://min.io/), [AWS S3](https://aws.amazon.com/s3/?nc1=h_ls), [Google Cloud Storage](https://cloud.google.com/storage?hl=en#object-storage-for-companies-of-all-sizes) (GCS), [Azure Blob Storage](https://azure.microsoft.com/en-us/products/storage/blobs), [Alibaba Cloud OSS](https://www.alibabacloud.com/product/object-storage-service), and [Tencent Cloud Object Storage](https://www.tencentcloud.com/products/cos) (COS).\n\nMetadata are generated within Milvus. Each Milvus module has its own metadata that are stored in etcd.\n\n###",
            0.6572665572166443
        ],
        [
            "How does Milvus flush data?\n\nMilvus returns success when inserted data are loaded to the message queue. However, the data are not yet flushed to the disk. Then Milvus' data node writes the data in the message queue to persistent storage as incremental logs. If `flush()` is called, the data node is forced to write all data in the message queue to persistent storage immediately.\n\n###",
            0.6312146186828613
        ],
        [
            "How does Milvus handle vector data types and precision?\n\nMilvus supports Binary, Float32, Float16, and BFloat16 vector types.\n\n- Binary vectors: Store binary data as sequences of 0s and 1s, used in image processing and information retrieval.\n- Float32 vectors: Default storage with a precision of about 7 decimal digits. Even Float64 values are stored with Float32 precision, leading to potential precision loss upon retrieval.\n- Float16 and BFloat16 vectors: Offer reduced precision and memory usage. Float16 is suitable for applications with limited bandwidth and storage, while BFloat16 balances range and efficiency, commonly used in deep learning to reduce computational requirements without significantly impacting accuracy.\n\n###",
            0.6115777492523193
        ]
    ]
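
The three hits above have fairly close scores, and the third one is only loosely related to the question. One optional refinement is to drop hits below a score cutoff before building the context. The sketch below uses mock hits with the same keys the tutorial reads from `search_res[0]` (`distance` and `entity["text"]`); the cutoff value and the helper name are hypothetical and would need tuning on real data.

```python
# Mock hits shaped like the entries the tutorial reads from search_res[0].
mock_hits = [
    {"distance": 0.657, "entity": {"text": "Where does Milvus store data?"}},
    {"distance": 0.631, "entity": {"text": "How does Milvus flush data?"}},
    {"distance": 0.612, "entity": {"text": "How does Milvus handle vector data types and precision?"}},
]

MIN_SCORE = 0.62  # hypothetical cutoff; with IP, higher means more similar


def filter_hits(hits, min_score):
    """Return (text, score) pairs for hits scoring at least min_score."""
    return [
        (hit["entity"]["text"], hit["distance"])
        for hit in hits
        if hit["distance"] >= min_score
    ]


kept = filter_hits(mock_hits, MIN_SCORE)
print(kept)
```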

### Use LLM to get a RAG response

Convert the retrieved documents into a string format.

```python
context = "\n".join(
    [line_with_distance[0] for line_with_distance in retrieved_lines_with_distances]
)
```

Define the system and user prompts for the language model. The user prompt is assembled from the documents retrieved from Milvus.

```python
SYSTEM_PROMPT = """
Human: You are an AI assistant. You are able to find answers to the questions from the contextual passage snippets provided.
"""
USER_PROMPT = f"""
Use the following pieces of information enclosed in <context> tags to provide an answer to the question enclosed in <question> tags.
<context>
{context}
</context>
<question>
{question}
</question>
"""
```
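
Because `USER_PROMPT` is an f-string, it is filled in once, when the cell runs, with the current `context` and `question`. To reuse the same template for other questions, one option is to wrap it in a function. This is a minimal sketch; the function name `build_user_prompt` and the sample inputs are illustrative assumptions.

```python
def build_user_prompt(context: str, question: str) -> str:
    """Fill the tutorial's user prompt template with a context and a question."""
    return (
        "Use the following pieces of information enclosed in <context> tags "
        "to provide an answer to the question enclosed in <question> tags.\n"
        f"<context>\n{context}\n</context>\n"
        f"<question>\n{question}\n</question>\n"
    )


# Made-up inputs for illustration.
prompt = build_user_prompt(
    "Milvus stores metadata in etcd.",
    "Where is metadata stored?",
)
print(prompt)
```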

Use the `deepseek-chat` model provided by DeepSeek to generate a response based on the prompts.

```python
response = deepseek_client.chat.completions.create(
    model="deepseek-chat",
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": USER_PROMPT},
    ],
)
print(response.choices[0].message.content)
```
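
The retrieval and generation steps can be combined into one function. The sketch below shows only the control flow, with stub functions standing in for the Milvus search and the DeepSeek call so that it runs without network access; `rag_answer`, `fake_retrieve`, and `fake_generate` are hypothetical names, not part of any library.

```python
from typing import Callable, List


def rag_answer(
    question: str,
    retrieve: Callable[[str], List[str]],
    generate: Callable[[str, str], str],
) -> str:
    """Retrieve passages for a question, join them, and generate an answer."""
    passages = retrieve(question)
    context = "\n".join(passages)
    return generate(context, question)


def fake_retrieve(question: str) -> List[str]:
    # Stub standing in for the Milvus search step.
    return [
        "Metadata are stored in etcd.",
        "Inserted data are stored as incremental logs.",
    ]


def fake_generate(context: str, question: str) -> str:
    # Stub standing in for the DeepSeek chat completion call.
    return f"Q: {question} | context lines: {len(context.splitlines())}"


answer = rag_answer("How is data stored in milvus?", fake_retrieve, fake_generate)
print(answer)
```

In a real pipeline, `retrieve` would wrap `milvus_client.search(...)` and `generate` would wrap `deepseek_client.chat.completions.create(...)` as used in the tutorial.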

    In Milvus, data is stored in two main categories: inserted data and metadata.

    1. **Inserted Data**: This includes vector data, scalar data, and collection-specific schema. The inserted data is stored in persistent storage as incremental logs. Milvus supports various object storage backends for this purpose, such as MinIO, AWS S3, Google Cloud Storage (GCS), Azure Blob Storage, Alibaba Cloud OSS, and Tencent Cloud Object Storage (COS).

    2. **Metadata**: Metadata is generated within Milvus and is specific to each Milvus module. This metadata is stored in etcd, a distributed key-value store.

    Additionally, when data is inserted, it is first loaded into a message queue, and Milvus returns success at this stage. The data is then written to persistent storage as incremental logs by the data node. If the `flush()` function is called, the data node is forced to write all data in the message queue to persistent storage immediately.

Great! We have successfully built a RAG pipeline with Milvus and DeepSeek.

v2.5.x/site/en/integrations/integrate_with_camel.md

+2 -2
@@ -1,7 +1,7 @@
 ---
 id: integrate_with_camel.md
-summary: This guide demonstrates how to use an open-source embedding model and large-language model on BentoCloud with Milvus vector database to build a Retrieval Augmented Generation (RAG) application.
-title: Retrieval-Augmented Generation (RAG) with Milvus and BentoML
+summary: This guide demonstrates how to build a Retrieval-Augmented Generation (RAG) system using CAMEL and Milvus.
+title: Retrieval-Augmented Generation (RAG) with Milvus and Camel
 ---

 # Retrieval-Augmented Generation (RAG) with Milvus and Camel

v2.5.x/site/en/integrations/integrate_with_langfuse.md

+7 -3
@@ -1,16 +1,20 @@
 ---
 id: integrate_with_langfuse.md
 summary: This is a simple cookbook that demonstrates how to use the LlamaIndex Langfuse integration. It uses Milvus Lite to store the documents and Query.
-title: Cookbook LlamaIndex & Milvus Integration
+title: Using Langfuse to Evaluate RAG Quality
 ---

-# Cookbook - LlamaIndex & Milvus Integration
+# Using Langfuse to Trace Queries in RAG

 <a target="_blank" href="https://colab.research.google.com/github/langfuse/langfuse-docs/blob/main/cookbook/integration_llama-index_milvus-lite.ipynb">
 <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
 </a>

-This is a simple cookbook that demonstrates how to use the [LlamaIndex Langfuse integration](https://langfuse.com/docs/integrations/llama-index/get-started). It uses Milvus Lite to store the documents and Query.
+This is a simple cookbook that demonstrates how to use Langfuse to trace your queries in RAG. The RAG pipeline is implemented with LlamaIndex and Milvus Lite to store and retrieve the documents.
+
+In this quickstart, we’ll show you how to set up a LlamaIndex application using Milvus Lite as the vector store. We’ll also show you how to use the Langfuse LlamaIndex integration to trace your application.
+
+[Langfuse](https://github.com/langfuse/langfuse) is an open-source LLM engineering platform that helps teams collaboratively debug, analyze, and iterate on their LLM applications. All platform features are natively integrated to accelerate the development workflow.

 [Milvus Lite](https://github.com/milvus-io/milvus-lite/) is the lightweight version of Milvus, an open-source vector database that powers AI applications with vector embeddings and similarity search.

v2.5.x/site/en/integrations/integrations_overview.md

+1
@@ -59,3 +59,4 @@ This page provides a list of tutorials for you to interact with Milvus and third
 | [Build RAG with Milvus and Gemini](build_RAG_with_milvus_and_gemini.md) | LLMs | Milvus, Gemini |
 | [Build RAG with Milvus and Ollama](build_RAG_with_milvus_and_ollama.md) | LLMs | Milvus, Ollama |
 | [Getting Started with Dynamiq and Milvus](milvus_rag_with_dynamiq.md) | Orchestration | Milvus, Dynamiq |
+| [Build RAG with Milvus and DeepSeek](build_RAG_with_milvus_and_deepseek.md) | LLMs | Milvus, DeepSeek |

v2.5.x/site/en/menuStructure/en.json

+21 -3
@@ -1737,6 +1737,12 @@
       "id": "build_RAG_with_milvus_and_ollama.md",
       "order": 7,
       "children": []
+    },
+    {
+      "label": "DeepSeek",
+      "id": "build_RAG_with_milvus_and_deepseek.md",
+      "order": 8,
+      "children": []
     }
   ]
 },
@@ -2001,7 +2007,7 @@
       "id": "tutorials-overview.md",
       "order": 0,
       "children": []
-    },
+    },
     {
       "label": "Build RAG with Milvus",
       "id": "build-rag-with-milvus.md",
@@ -2013,7 +2019,7 @@
       "id": "how_to_enhance_your_rag.md",
       "order": 2,
       "children": []
-    },
+    },
     {
       "label": "Full-Text Search with Milvus",
       "id": "full_text_search_with_milvus.md",
@@ -2031,7 +2037,7 @@
       "id": "image_similarity_search.md",
       "order": 5,
       "children": []
-    },
+    },
     {
       "label": "Multimodal RAG",
       "id": "multimodal_rag_with_milvus.md",
@@ -2080,6 +2086,18 @@
       "order": 13,
       "children": []
     },
+    {
+      "label": "Quickstart with Attu",
+      "id": "quickstart_with_attu.md",
+      "order": 14,
+      "children": []
+    },
+    {
+      "label": "Use AsyncMilvusClient with asyncio",
+      "id": "use-async-milvus-client-with-asyncio.md",
+      "order": 15,
+      "children": []
+    },
     {
       "label": "Explore More",
       "id": "explore-more",
