Add multimodal rag notebook
Signed-off-by: Jael Gu <[email protected]>
jaelgu committed Jul 31, 2024
1 parent 85f63f2 commit 215fd78
Showing 2 changed files with 645 additions and 7 deletions.
32 changes: 25 additions & 7 deletions bootcamp/tutorials/quickstart/cir_with_milvus.ipynb
@@ -34,6 +34,13 @@
"!pip install --upgrade pymilvus openai datasets timm einops ftfy peft tqdm"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To enable the latest code of Visualized BGE, we will clone the official repository and install it from source."
]
},
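The clone-and-install step described above would look roughly like the following; the repository location (`FlagOpen/FlagEmbedding`, where Visualized BGE is hosted) is an assumption, so verify it against the project's README:

```shell
# Clone the FlagEmbedding repository (home of Visualized BGE) and
# install it from source so the latest code is available.
git clone https://github.com/FlagOpen/FlagEmbedding.git
pip install -e FlagEmbedding
```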
{
"cell_type": "code",
"execution_count": 2,
@@ -57,8 +64,6 @@
"source": [
"### Download Data\n",
"\n",
"In this tutorial, we will use \n",
"\n",
"The following command downloads the example data and extracts it to a local folder \"./images_folder\", which includes:\n",
"\n",
"- **amazon_fashion**: A subset of [Amazon Reviews 2023](https://github.com/hyp1231/AmazonReviews2023) containing approximately 145 images from the category \"Amazon_Fashion\".\n",
@@ -81,7 +86,7 @@
"source": [
"### Load Embedding Model\n",
"\n",
"In this tutorial, we will use the Visualized BGE model \"bge-visualized-base-en-v1.5\" to generate embeddings for both images and text. \n",
"We will use the Visualized BGE model \"bge-visualized-base-en-v1.5\" to generate embeddings for both images and text. \n",
"\n",
"**1. Download the weight**"
]
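The weight download typically amounts to fetching the checkpoint from the Hugging Face hub; the repository and file name below (`BAAI/bge-visualized`, `Visualized_base_en_v1.5.pth`) are assumptions, so check them against the model card before running:

```shell
# Download the Visualized BGE base-en-v1.5 checkpoint (assumed location)
wget https://huggingface.co/BAAI/bge-visualized/resolve/main/Visualized_base_en_v1.5.pth
```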
@@ -139,7 +144,11 @@
"source": [
"## Load Data\n",
"\n",
"### Generate embeddings"
"This section loads the example images into the database along with their corresponding embeddings.\n",
"\n",
"### Generate embeddings\n",
"\n",
"Load all JPEG images from the data directory and apply the encoder to convert them into embeddings."
]
},
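As a rough sketch of this step, the snippet below walks the data directory and maps each image path to an embedding. `StubEncoder` is a stand-in for the notebook's Visualized BGE wrapper, and the directory layout under `./images_folder` is an assumption:

```python
import os
from glob import glob

class StubEncoder:
    """Stand-in for the Visualized BGE wrapper used in the notebook.

    A real encoder would load the image and run the model; this stub
    derives a tiny deterministic vector from the file name instead.
    """

    def encode(self, image_path):
        return [float(len(os.path.basename(image_path))), 0.0, 1.0]

encoder = StubEncoder()

data_dir = "./images_folder"  # assumed location of the extracted data
image_list = glob(os.path.join(data_dir, "images", "*.jpg"))

# Map each image path to its embedding vector
image_dict = {path: encoder.encode(image_path=path) for path in image_list}
```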
{
@@ -196,7 +205,14 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Insert into Milvus"
"### Insert into Milvus\n",
"\n",
"Insert the images with their corresponding paths and embeddings into the Milvus collection.\n",
"\n",
"> As for the arguments of `MilvusClient`:\n",
"> - Setting the `uri` as a local file, e.g. `./milvus_demo.db`, is the most convenient method, as it automatically utilizes [Milvus Lite](https://milvus.io/docs/milvus_lite.md) to store all data in this file.\n",
"> - If you have a large amount of data, you can set up a more performant Milvus server on [Docker or Kubernetes](https://milvus.io/docs/quickstart.md). In this setup, please use the server URI, e.g. `http://localhost:19530`, as your `uri`.\n",
"> - If you want to use [Zilliz Cloud](https://zilliz.com/cloud), the fully managed cloud service for Milvus, adjust the `uri` and `token`, which correspond to the [Public Endpoint and API key](https://docs.zilliz.com/docs/on-zilliz-cloud-console#free-cluster-details) in Zilliz Cloud."
]
},
{
@@ -248,7 +264,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Multimodal Search"
"## Multimodal Search\n",
"\n",
"Now we are ready to perform advanced image search with a query composed of both an image and a text instruction."
]
},
{
@@ -260,7 +278,7 @@
"query_image = os.path.join(\n",
" data_dir, \"clothes.jpg\"\n",
") # Change to your own query image path\n",
"query_text = \"black tops of this style\"\n",
"query_text = \"black tops of this style\" # Change to your own text instruction, or None\n",
"\n",
"# Generate query embedding given image and text instructions\n",
"query_vec = encoder.encode_query(image_path=query_image, text=query_text)\n",
