update docs
Signed-off-by: AnthonyTsu1984 <[email protected]>
AnthonyTsu1984 committed Jul 9, 2024
2 parents 6202671 + 8860e2c commit 3f896bf
Showing 461 changed files with 36,990 additions and 18,227 deletions.
@@ -28,9 +28,6 @@ Sets the data type to **Float**.
- DOUBLE = 11
Sets the data type to **Double**.

- STRING = 20
Sets the data type to **String**.

- VARCHAR = 21
Sets the data type to **Varchar**.

@@ -59,5 +56,6 @@ Sets the data type to **Float Vector**.
Sets the data type to **Sparse Vector**.

- UNKNOWN = 999
Sets the data type to **Unknown**.

Sets the data type to **Unknown**.
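
The hunk above shrinks from 9 lines to 6, which suggests that the `STRING = 20` entry is the one being dropped. As a rough illustration of what that means for callers, here is a minimal sketch that mirrors only the numeric codes visible in this diff. `DataTypeSketch` and `parse_data_type` are hypothetical names made up for this example; they are not the SDK's own enum or API.

```python
from enum import IntEnum


class DataTypeSketch(IntEnum):
    """Hypothetical mirror of the codes visible in the diff above (not the SDK enum)."""
    DOUBLE = 11
    VARCHAR = 21
    UNKNOWN = 999


def parse_data_type(code: int) -> DataTypeSketch:
    """Map a numeric code to a member, falling back to UNKNOWN for unlisted codes."""
    try:
        return DataTypeSketch(code)
    except ValueError:
        return DataTypeSketch.UNKNOWN


print(parse_data_type(21).name)  # VARCHAR
print(parse_data_type(20).name)  # UNKNOWN, because 20 is not listed in this sketch
```

Falling back to `UNKNOWN` is a design choice of the sketch; raising an error for unlisted codes would be equally reasonable.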

@@ -12,7 +12,7 @@ Constructs a JinaRerankFunction for common use cases.

```python
JinaRerankFunction(
model_name: str = "jina-reranker-v1-base-en",
model_name: str = "jina-reranker-v2-base-multilingual",
api_key: Optional[str] = None
)
```
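
As a hedged sketch, the snippet below shows one way to assemble the two constructor arguments from the signature above before passing them to `JinaRerankFunction`. The helper name `resolve_reranker_args` and the environment variable name `JINAAI_API_KEY` are assumptions made for illustration; neither comes from this commit. The default model name is the newer of the two values shown in the diff.

```python
import os
from typing import Optional

# Newer default taken from the signature shown in the diff above.
DEFAULT_MODEL = "jina-reranker-v2-base-multilingual"


def resolve_reranker_args(
    model_name: Optional[str] = None,
    api_key: Optional[str] = None,
    env_var: str = "JINAAI_API_KEY",  # assumed variable name, for illustration only
) -> dict:
    """Hypothetical helper: fill in the default model and look up the API key."""
    key = api_key or os.environ.get(env_var)
    if not key:
        raise ValueError(f"Pass api_key explicitly or set {env_var}.")
    return {"model_name": model_name or DEFAULT_MODEL, "api_key": key}


args = resolve_reranker_args(api_key="YOUR_JINAAI_API_KEY")
print(args["model_name"])  # jina-reranker-v2-base-multilingual

# With the model package installed, the arguments could then be passed on:
# from pymilvus.model.reranker import JinaRerankFunction
# jina_rf = JinaRerankFunction(**args)
```

Keeping the key out of the source file and reading it from the environment is a common practice rather than something this commit prescribes.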
@@ -21,7 +21,7 @@ JinaRerankFunction(

- **model_name** (*string*)

The name of the Jina AI reranker model to use for encoding. If you leave this parameter unspecified, `jina-reranker-v1-base-en` will be used. For a list of available models, refer to [Jina AI Rerankers](https://jina.ai/reranker/).
The name of the Jina AI reranker model to use for encoding. If you leave this parameter unspecified, `jina-reranker-v2-base-multilingual` will be used. For a list of available models, refer to [Jina AI Rerankers](https://jina.ai/reranker/).

- **api_key** (*string*)

@@ -33,9 +33,9 @@ JinaRerankFunction(
from pymilvus.model.reranker import JinaRerankFunction

jina_rf = JinaRerankFunction(
model_name="jina-reranker-v1-base-en", # Defaults to `jina-reranker-v1-base-en`
model_name="jina-reranker-v2-base-multilingual", # Defaults to `jina-reranker-v2-base-multilingual`
api_key="YOUR_JINAAI_API_KEY"
)
```

<DocCardList />
<DocCardList />
4 changes: 2 additions & 2 deletions preview/CONTRIBUTING.md
@@ -2,7 +2,7 @@

The Milvus docs are open-source just like the database itself and welcome contributions from everyone in the Milvus community.

> This repository is for Milvus technical documentation update and maintenance. Visit [Milvus.io](milvus.io) or [Web content repo](https://github.com/milvus-io/web-content) for fully rendered documents.
> This repository is for Milvus technical documentation update and maintenance. Visit [Milvus.io](https://milvus.io/docs) or [Web content repo](https://github.com/milvus-io/web-content) for fully rendered documents.

## What contributions can I make?

@@ -42,7 +42,7 @@ For detailed information on this workflow, see [Make Your First Contribution](ht

![Folders](assets/folder-structure.png)

- [Languages](#languages)
- [Languages](#language)
- [Pages](#pages)
- [Sidebar](#sidebar)

2 changes: 1 addition & 1 deletion preview/README.md
@@ -2,7 +2,7 @@

Welcome to Milvus documentation!

This repository contains technical documentation for [Milvus](https://github.com/milvus-io/milvus), the world's most advanced open-source vector database. Visit [Milvus.io](milvus.io) or [Web content repo](https://github.com/milvus-io/web-content) for fully rendered documents.
This repository contains technical documentation for [Milvus](https://github.com/milvus-io/milvus), the world's most advanced open-source vector database. Visit [Milvus.io](https://milvus.io/docs) or [Web content repo](https://github.com/milvus-io/web-content) for fully rendered documents.

Each branch corresponds to a Milvus release by name. We've set the branch of the latest Milvus release as the default branch. For documentation of a different Milvus release, switch to the corresponding branch.

29 changes: 20 additions & 9 deletions preview/Variables.json
@@ -1,13 +1,24 @@
{
"milvus_release_version": "2.2.3",
"milvus_release_tag": "2.2.3",
"milvus_release_version": "2.4.5",
"milvus_release_tag": "2.4.5",
"milvus_deb_name": "milvus_2.2.0-1_amd64",
"milvus_rpm_name": "milvus-2.2.0-1.el7.x86_64",
"milvus_python_sdk_version": "2.2.2",
"milvus_node_sdk_version": "2.2.x",
"milvus_go_sdk_version": "2.2.0",
"milvus_java_sdk_version": "2.2.1",
"milvus_operator_version": "0.7.7",
"milvus_image": "2.2.2",
"attu_release": "2.1.1"
"milvus_python_sdk_version": "2.4.x",
"milvus_python_sdk_real_version": "2.4.4",
"milvus_node_sdk_version": "2.4.x",
"milvus_node_sdk_real_version": "v2.4.3",
"milvus_go_sdk_version": "2.3.x",
"milvus_go_sdk_real_version": "2.4.0",
"milvus_java_sdk_version": "2.4.x",
"milvus_java_sdk_real_version": "2.4.1",
"milvus_csharp_sdk_version": "2.2.x",
"milvus_csharp_sdk_real_version": "2.2.14",
"milvus_restful_sdk_version": "2.4.x",
"milvus_restful_sdk_real_version": "2.4.1",
"milvus_operator_version": "0.9.17",
"milvus_helm_chart_version": "4.1.24",
"milvus_image": "2.4.1",
"attu_release": "2.3.10",
"milvus_backup_release": "0.4.12",
"birdwatcher_release": "1.0.3"
}
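
To show how a variables file like this is typically consumed, here is a minimal sketch that substitutes values into documentation text. The `{{var.<name>}}` placeholder syntax and the `render` function are assumptions made for illustration; this commit does not show how the docs build actually reads `Variables.json`. Only two of the keys above are reproduced.

```python
import json
import re

# Two entries copied from the new version of Variables.json shown above.
variables = json.loads("""
{
  "milvus_release_version": "2.4.5",
  "milvus_python_sdk_real_version": "2.4.4"
}
""")

# Assumed placeholder syntax: {{var.some_key}}
PLACEHOLDER = re.compile(r"\{\{\s*var\.(\w+)\s*\}\}")


def render(text: str, values: dict) -> str:
    """Replace each known placeholder with its value; leave unknown ones untouched."""
    def substitute(match: re.Match) -> str:
        return str(values.get(match.group(1), match.group(0)))
    return PLACEHOLDER.sub(substitute, text)


line = "pip install pymilvus=={{var.milvus_python_sdk_real_version}}"
print(render(line, variables))  # pip install pymilvus==2.4.4
```

Leaving unknown placeholders in place, instead of replacing them with an empty string, makes a missing key easy to spot in the rendered page.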
Binary file modified preview/assets/IP_formula.png
Binary file added preview/assets/access_key.jpg
Binary file added preview/assets/arctan.png
Binary file added preview/assets/attu-snapshot.png
Binary file added preview/assets/azure_service.png
Binary file added preview/assets/birdwatcher_overview.png
Binary file modified preview/assets/coordinator_ha.png
Binary file added preview/assets/cosine_similarity.png
Binary file added preview/assets/datasource.jpg
Binary file modified preview/assets/distributed_architecture.jpg
Binary file added preview/assets/dspy-01.png
Binary file modified preview/assets/euclidean_metric.png
Binary file modified preview/assets/gcp.png
Binary file added preview/assets/gpu_index.png
Binary file modified preview/assets/handoff.jpg
246 changes: 246 additions & 0 deletions preview/assets/hello_milvus.py
@@ -0,0 +1,246 @@
# hello_milvus.py demonstrates the basic operations of PyMilvus, a Python SDK of Milvus.
# 1. connect to Milvus
# 2. create collection
# 3. insert data
# 4. create index
# 5. search, query, and hybrid search on entities
# 6. delete entities by PK
# 7. drop collection
import time

import numpy as np
import string
import random

from pymilvus import MilvusClient, DataType

fmt = "\n=== {:30} ===\n"
search_latency_fmt = "search latency = {:.4f}s"
num_entities, dim = 3000, 8

#################################################################################
# 1. connect to Milvus
# Create a MilvusClient connected to the Milvus server at `localhost:19530`.
# If your Milvus server runs at a different address, change the `uri` argument.
print(fmt.format("start connecting to Milvus"))
client = MilvusClient(uri="http://localhost:19530") # Replace with your Milvus server address

has = client.has_collection("hello_milvus")
print(f"Does collection hello_milvus exist in Milvus: {has}")

#################################################################################
# 2. create collection
# We're going to create a collection with 3 fields.
# +-+------------+------------+------------------+------------------------------+
# | | field name | field type | other attributes | field description            |
# +-+------------+------------+------------------+------------------------------+
# |1| "pk"       | VarChar    | is_primary=True  | "primary field"              |
# | |            |            | auto_id=False    |                              |
# +-+------------+------------+------------------+------------------------------+
# |2| "random"   | Double     |                  | "a double field"             |
# +-+------------+------------+------------------+------------------------------+
# |3|"embeddings"| FloatVector| dim=8            | "float vector with dim 8"    |
# +-+------------+------------+------------------+------------------------------+

schema = client.create_schema(
    auto_id=False,
    enable_dynamic_field=True,  # the keyword is `enable_dynamic_field` (singular)
    description="hello_milvus is the simplest demo to introduce the APIs",
)

schema.add_field(field_name="pk", datatype=DataType.VARCHAR, is_primary=True, max_length=100)
schema.add_field(field_name="random", datatype=DataType.DOUBLE)
schema.add_field(field_name="embeddings", datatype=DataType.FLOAT_VECTOR, dim=dim)

print(fmt.format("Create collection `hello_milvus`"))
client.create_collection(
    collection_name="hello_milvus",
    schema=schema,
    consistency_level="Strong"
)

################################################################################
# 3. insert data
# We are going to insert 3000 rows of data into `hello_milvus`
# With MilvusClient, the data is passed as a list of dictionaries, one per entity.
#
# The insert() method returns a dictionary; its `insert_count` key holds the number
# of inserted entities. The primary keys are:
# - either automatically generated by Milvus if auto_id=True in the schema;
# - or taken from the primary key field of the entities if auto_id=False in the schema.

print(fmt.format("Start inserting entities"))

def generate_random_string(length):
    return ''.join(random.choice(string.ascii_letters + string.digits) for _ in range(length))

def generate_random_entities(num_entities, dim):
    entities = []
    for _ in range(num_entities):
        pk = generate_random_string(10)  # Generate a random primary key string of length 10
        random_value = random.random()  # Generate a random double value
        embeddings = np.random.rand(dim).tolist()  # Generate a random float vector of dimension 'dim'
        entities.append({"pk": pk, "random": random_value, "embeddings": embeddings})
    return entities

entities = generate_random_entities(num_entities, dim)

insert_result = client.insert(
    collection_name="hello_milvus",
    data=entities,
)

print(f"Number of entities in Milvus: {insert_result['insert_count']}") # check the num_entities

################################################################################
# 4. create index
# We are going to create an IVF_FLAT index on the `embeddings` field of hello_milvus.
# The `pk` field is also added to the index parameters, without an explicit index type.
print(fmt.format("Start Creating index IVF_FLAT"))

index_params = client.prepare_index_params()

index_params.add_index(
    field_name="pk"
)

index_params.add_index(
    field_name="embeddings",
    index_type="IVF_FLAT",
    metric_type="L2",
    params={"nlist": 128}
)

client.create_index(
    collection_name="hello_milvus",
    index_params=index_params
)

################################################################################
# 5. search, query, and hybrid search
# After the data is inserted into Milvus and indexed, you can perform:
# - search based on vector similarity
# - query based on scalar filtering (boolean, int, etc.)
# - hybrid (filtered) search based on vector similarity and scalar filtering.
#

# Before conducting a search or a query, you need to load the data in `hello_milvus` into memory.
print(fmt.format("Start loading"))
client.load_collection("hello_milvus")

# -----------------------------------------------------------------------------
# search based on vector similarity
print(fmt.format("Start searching based on vector similarity"))
last_entity = entities[-1] # Get the last entity
vectors_to_search = [last_entity["embeddings"]] # Extract the embeddings vector and put it in a list
search_params = {
    "metric_type": "L2",
    "params": {"nprobe": 10},
}

start_time = time.time()
result = client.search(
    collection_name="hello_milvus",
    data=vectors_to_search,
    anns_field="embeddings",
    search_params=search_params,
    limit=3,
    output_fields=["random"]
)
end_time = time.time()

for hits in result:
    for hit in hits:
        print(f"hit: {hit}, random field: {hit.get('entity').get('random')}")
print(search_latency_fmt.format(end_time - start_time))

# -----------------------------------------------------------------------------
# query based on scalar filtering(boolean, int, etc.)
print(fmt.format("Start querying with `random > 0.5`"))

start_time = time.time()
result = client.query(
    collection_name="hello_milvus",
    filter="random > 0.5",
    output_fields=["random", "embeddings"]
)
end_time = time.time()

print(f"query result:\n-{result[0]}")
print(search_latency_fmt.format(end_time - start_time))

# -----------------------------------------------------------------------------
# pagination
r1 = client.query(
    collection_name="hello_milvus",
    filter="random > 0.5",
    limit=4,
    output_fields=["random"]
)
r2 = client.query(
    collection_name="hello_milvus",
    filter="random > 0.5",
    offset=1,
    limit=3,
    output_fields=["random"]
)
print(f"query pagination(limit=4):\n\t{r1}")
print(f"query pagination(offset=1, limit=3):\n\t{r2}")


# -----------------------------------------------------------------------------
# filtered search
print(fmt.format("Start filtered searching with `random > 0.5`"))

start_time = time.time()
result = client.search(
    collection_name="hello_milvus",
    data=vectors_to_search,
    anns_field="embeddings",
    search_params=search_params,
    limit=3,
    filter="random > 0.5",
    output_fields=["random"]
)
end_time = time.time()

for hits in result:
    for hit in hits:
        print(f"hit: {hit}, random field: {hit.get('entity').get('random')}")
print(search_latency_fmt.format(end_time - start_time))

###############################################################################
# 6. delete entities by PK
# You can delete entities by their PK values using boolean expressions.
ids = [entity["pk"] for entity in entities]

expr = f'pk in ["{ids[0]}", "{ids[1]}"]'
print(fmt.format(f"Start deleting with expr `{expr}`"))

result = client.query(
    collection_name="hello_milvus",
    filter=expr,
    output_fields=["random", "embeddings"]
)
print(f"query before delete by expr=`{expr}` -> result: \n-{result[0]}\n-{result[1]}\n")

client.delete(
    collection_name="hello_milvus",
    filter=expr
)

result = client.query(
    collection_name="hello_milvus",
    filter=expr,
    output_fields=["random", "embeddings"]
)
print(f"query after delete by expr=`{expr}` -> result: {result}\n")

###############################################################################
# 7. drop collection
# Finally, drop the hello_milvus collection
print(fmt.format("Drop collection `hello_milvus`"))
client.drop_collection("hello_milvus")
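
For readers who want to see what the search, filtered search, and pagination steps in `hello_milvus.py` compute, the following is a rough, server-free sketch. It runs an exact brute-force L2 search over in-memory entities, which is not how an IVF_FLAT index works internally (that index is approximate and cluster-based). The names `l2_squared` and `brute_force_search` are hypothetical and exist only in this example.

```python
import random


def l2_squared(a, b):
    """Squared Euclidean distance; ranking by it matches ranking by L2 distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def brute_force_search(entities, query, limit, offset=0, predicate=None):
    """Exact nearest-neighbour search with an optional scalar filter and paging."""
    candidates = [e for e in entities if predicate is None or predicate(e)]
    ranked = sorted(candidates, key=lambda e: l2_squared(e["embeddings"], query))
    return ranked[offset:offset + limit]


random.seed(0)
dim = 8
entities = [
    {
        "pk": f"id_{i}",
        "random": random.random(),
        "embeddings": [random.random() for _ in range(dim)],
    }
    for i in range(100)
]

# As in the script, use the last entity's vector as the query vector.
query = entities[-1]["embeddings"]

top3 = brute_force_search(entities, query, limit=3)
print(top3[0]["pk"])  # id_99: the query vector is at distance 0 from itself

# Filtered search: only entities with random > 0.5 are considered.
filtered = brute_force_search(entities, query, limit=3, predicate=lambda e: e["random"] > 0.5)

# Pagination: offset=1, limit=3 returns items 2 to 4 of the limit=4 result.
page = brute_force_search(entities, query, limit=3, offset=1)
```

In this sketch the paged result always equals the tail of the larger result because the ranking is deterministic; against a real server that holds only if the result order is stable between calls.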
Binary file added preview/assets/install-databricks-library.png
Binary file added preview/assets/integrate_with_pytorch.png
Binary file added preview/assets/map-data-to-schema.png
Binary file added preview/assets/milvus-adopters.png
Binary file added preview/assets/milvus-cdc-architecture.png
Binary file added preview/assets/milvus-cdc-dashboard.png
Binary file added preview/assets/milvus-cdc-workflow.png
Binary file added preview/assets/milvus_architecture.png
Binary file added preview/assets/milvus_backup_architecture.png
Binary file added preview/assets/milvuslog.jpg
Binary file added preview/assets/multi-vector-rerank.png
Binary file added preview/assets/query.png
Binary file added preview/assets/results.png
Binary file added preview/assets/rrf-ranker.png
Binary file added preview/assets/scalar_index_inverted.png
Binary file added preview/assets/snowflake-01.png
Binary file added preview/assets/snowflake-02.png
Binary file added preview/assets/snowflake-03.png
Binary file added preview/assets/snowflake-04.png
Binary file added preview/assets/snowflake-05.png
Binary file added preview/assets/snowflake-06.png
Binary file modified preview/assets/standalone_architecture.jpg
Binary file modified preview/assets/substructure.png
Binary file modified preview/assets/superstructure.png
Binary file added preview/assets/weighted-reranker.png