diff --git a/docs/pinecone-quickstart.ipynb b/docs/pinecone-quickstart.ipynb index 72fcfbe3..6264aac2 100644 --- a/docs/pinecone-quickstart.ipynb +++ b/docs/pinecone-quickstart.ipynb @@ -10,7 +10,7 @@ "\n", "# Pinecone Database quickstart\n", "\n", - "This notebook shows you how to set up and use Pinecone Database for high-performance similarity search." + "This notebook shows you how to set up and use Pinecone Database for high-performance semantic search." ] }, { @@ -54,7 +54,7 @@ }, { "cell_type": "code", - "execution_count": 2, + "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/", @@ -93,359 +93,295 @@ "source": [ "## Initialize a client\n", "\n", - "Use the generated API key to intialize a client connection to Pinecone:" + "Use the generated API key to initialize a Pinecone client:" ] }, { "cell_type": "code", - "execution_count": 3, + "execution_count": null, "metadata": { "id": "e9rr_u6ZIvZ-" }, "outputs": [], "source": [ - "from pinecone import Pinecone, ServerlessSpec\n", + "# Import the Pinecone library\n", + "from pinecone import Pinecone\n", "\n", + "# Initialize a Pinecone client with your API key\n", "api_key = os.environ.get(\"PINECONE_API_KEY\")\n", - "\n", "pc = Pinecone(api_key=api_key)" ] }, { "cell_type": "markdown", - "metadata": { - "id": "bN9Rl7GP258C" - }, "source": [ - "## Generate vectors\n", + "## Create an index\n", "\n", - "A [vector embedding](https://www.pinecone.io/learn/vector-embeddings/) is a numerical representation of data that enables similarity-based search in vector databases like Pinecone. 
To convert data into this format, you use an embedding model.\n", + "In Pinecone, there are two types of indexes for storing vector data: [Dense indexes](https://docs.pinecone.io/guides/indexes/understanding-indexes#dense-indexes) store dense vectors for semantic search, and [sparse indexes](https://docs.pinecone.io/guides/indexes/understanding-indexes#sparse-indexes) store sparse vectors for lexical/keyword search.\n", "\n", - "For this quickstart, use the [`multilingual-e5-large`](https://docs.pinecone.io/models/multilingual-e5-large) embedding model hosted by Pinecone to [convert](https://docs.pinecone.io/guides/inference/generate-embeddings) four sentences about apples into vectors, three related to health, one related to cultivation." - ] + "For this quickstart, create a dense index that is integrated with an [embedding model hosted by Pinecone](https://docs.pinecone.io/guides/inference/understanding-inference#embedding-models). With integrated models, you upsert and search with text and have Pinecone generate vectors automatically.\n", + "\n", + "**Note:** If you prefer to use external embedding models, see [Bring your own vectors](https://docs.pinecone.io/guides/indexes/understanding-indexes#bring-your-own-vectors)." 
+ ], + "metadata": { + "id": "yfnoFFihfoY4" + } }, { "cell_type": "code", - "execution_count": null, + "source": [ + "# Create a dense index with integrated embedding\n", + "index_name = \"dense-index\"\n", + "if not pc.has_index(index_name):\n", + " pc.create_index_for_model(\n", + " name=index_name,\n", + " cloud=\"aws\",\n", + " region=\"us-east-1\",\n", + " embed={\n", + " \"model\":\"llama-text-embed-v2\",\n", + " \"field_map\":{\"text\": \"chunk_text\"}\n", + " }\n", + " )\n" + ], "metadata": { - "id": "ZIclo2UK3NFE" + "id": "FMyeNo6Afh4z" }, - "outputs": [], - "source": [ - "# Define a sample dataset where each item has a unique ID, text, and category\n", - "data = [\n", - " {\n", - " \"id\": \"rec1\",\n", - " \"text\": \"Apples are a great source of dietary fiber, which supports digestion and helps maintain a healthy gut.\",\n", - " \"category\": \"digestive system\"\n", - " },\n", - " {\n", - " \"id\": \"rec2\",\n", - " \"text\": \"Apples originated in Central Asia and have been cultivated for thousands of years, with over 7,500 varieties available today.\",\n", - " \"category\": \"cultivation\"\n", - " },\n", - " {\n", - " \"id\": \"rec3\",\n", - " \"text\": \"Rich in vitamin C and other antioxidants, apples contribute to immune health and may reduce the risk of chronic diseases.\",\n", - " \"category\": \"immune system\"\n", - " },\n", - " {\n", - " \"id\": \"rec4\",\n", - " \"text\": \"The high fiber content in apples can also help regulate blood sugar levels, making them a favorable snack for people with diabetes.\",\n", - " \"category\": \"endocrine system\"\n", - " }\n", - "]\n", - "\n", - "# Convert the text into numerical vectors that Pinecone can index\n", - "embeddings = pc.inference.embed(\n", - " model=\"multilingual-e5-large\",\n", - " inputs=[d[\"text\"] for d in data],\n", - " parameters={\n", - " \"input_type\": \"passage\",\n", - " \"truncate\": \"END\"\n", - " }\n", - ")\n", - "\n", - "print(embeddings)" - ] + "execution_count": null, + 
"outputs": [] }, { "cell_type": "markdown", "metadata": { - "id": "VpgIIsLlJGFf" + "id": "bN9Rl7GP258C" }, "source": [ - "## Create an index\n", + "## Upsert text\n", "\n", - "In Pinecone, you store data in an [index](https://docs.pinecone.io/guides/indexes/understanding-indexes).\n", - "\n", - "Create a serverless index that matches the dimension (`1024`) and similarity metric (`cosine`) of the `multilingual-e5-large` model you used in the previous step, and choose a [cloud and region](https://docs.pinecone.io/guides/indexes/understanding-indexes#cloud-regions) for hosting the index:" - ] - }, - { - "cell_type": "code", - "execution_count": 5, - "metadata": { - "id": "Buo2K1h8O_fN" - }, - "outputs": [], - "source": [ - "index_name = \"docs-quickstart-notebook\"" + "Prepare a sample dataset of factual statements from different domains like history, physics, technology, and music. Format the data as records with an ID, text, and category." ] }, { "cell_type": "code", - "execution_count": 6, + "execution_count": null, "metadata": { - "id": "MaqbcsI4I1gU" + "id": "ZIclo2UK3NFE" }, "outputs": [], "source": [ - "import time\n", - "\n", - "if not pc.has_index(index_name):\n", - " pc.create_index(\n", - " name=index_name,\n", - " dimension=1024,\n", - " metric=\"cosine\",\n", - " spec=ServerlessSpec(\n", - " cloud='aws',\n", - " region='us-east-1'\n", - " )\n", - " )\n", - "\n", - "# Wait for the index to be ready\n", - "while not pc.describe_index(index_name).status['ready']:\n", - " time.sleep(1)" + "# Define your dataset\n", + "records = [\n", + " { \"_id\": \"rec1\", \"chunk_text\": \"The Eiffel Tower was completed in 1889 and stands in Paris, France.\", \"category\": \"history\" },\n", + " { \"_id\": \"rec2\", \"chunk_text\": \"Photosynthesis allows plants to convert sunlight into energy.\", \"category\": \"science\" },\n", + " { \"_id\": \"rec3\", \"chunk_text\": \"Albert Einstein developed the theory of relativity.\", \"category\": \"science\" },\n", + " { \"_id\": 
\"rec4\", \"chunk_text\": \"The mitochondrion is often called the powerhouse of the cell.\", \"category\": \"biology\" },\n", + " { \"_id\": \"rec5\", \"chunk_text\": \"Shakespeare wrote many famous plays, including Hamlet and Macbeth.\", \"category\": \"literature\" },\n", + " { \"_id\": \"rec6\", \"chunk_text\": \"Water boils at 100°C under standard atmospheric pressure.\", \"category\": \"physics\" },\n", + " { \"_id\": \"rec7\", \"chunk_text\": \"The Great Wall of China was built to protect against invasions.\", \"category\": \"history\" },\n", + " { \"_id\": \"rec8\", \"chunk_text\": \"Honey never spoils due to its low moisture content and acidity.\", \"category\": \"food science\" },\n", + " { \"_id\": \"rec9\", \"chunk_text\": \"The speed of light in a vacuum is approximately 299,792 km/s.\", \"category\": \"physics\" },\n", + " { \"_id\": \"rec10\", \"chunk_text\": \"Newton’s laws describe the motion of objects.\", \"category\": \"physics\" },\n", + " { \"_id\": \"rec11\", \"chunk_text\": \"The human brain has approximately 86 billion neurons.\", \"category\": \"biology\" },\n", + " { \"_id\": \"rec12\", \"chunk_text\": \"The Amazon Rainforest is one of the most biodiverse places on Earth.\", \"category\": \"geography\" },\n", + " { \"_id\": \"rec13\", \"chunk_text\": \"Black holes have gravitational fields so strong that not even light can escape.\", \"category\": \"astronomy\" },\n", + " { \"_id\": \"rec14\", \"chunk_text\": \"The periodic table organizes elements based on their atomic number.\", \"category\": \"chemistry\" },\n", + " { \"_id\": \"rec15\", \"chunk_text\": \"Leonardo da Vinci painted the Mona Lisa.\", \"category\": \"art\" },\n", + " { \"_id\": \"rec16\", \"chunk_text\": \"The internet revolutionized communication and information sharing.\", \"category\": \"technology\" },\n", + " { \"_id\": \"rec17\", \"chunk_text\": \"The Pyramids of Giza are among the Seven Wonders of the Ancient World.\", \"category\": \"history\" },\n", + " { \"_id\": 
\"rec18\", \"chunk_text\": \"Dogs have an incredible sense of smell, much stronger than humans.\", \"category\": \"biology\" },\n", + " { \"_id\": \"rec19\", \"chunk_text\": \"The Pacific Ocean is the largest and deepest ocean on Earth.\", \"category\": \"geography\" },\n", + " { \"_id\": \"rec20\", \"chunk_text\": \"Chess is a strategic game that originated in India.\", \"category\": \"games\" },\n", + " { \"_id\": \"rec21\", \"chunk_text\": \"The Statue of Liberty was a gift from France to the United States.\", \"category\": \"history\" },\n", + " { \"_id\": \"rec22\", \"chunk_text\": \"Coffee contains caffeine, a natural stimulant.\", \"category\": \"food science\" },\n", + " { \"_id\": \"rec23\", \"chunk_text\": \"Thomas Edison invented the practical electric light bulb.\", \"category\": \"inventions\" },\n", + " { \"_id\": \"rec24\", \"chunk_text\": \"The moon influences ocean tides due to gravitational pull.\", \"category\": \"astronomy\" },\n", + " { \"_id\": \"rec25\", \"chunk_text\": \"DNA carries genetic information for all living organisms.\", \"category\": \"biology\" },\n", + " { \"_id\": \"rec26\", \"chunk_text\": \"Rome was once the center of a vast empire.\", \"category\": \"history\" },\n", + " { \"_id\": \"rec27\", \"chunk_text\": \"The Wright brothers pioneered human flight in 1903.\", \"category\": \"inventions\" },\n", + " { \"_id\": \"rec28\", \"chunk_text\": \"Bananas are a good source of potassium.\", \"category\": \"nutrition\" },\n", + " { \"_id\": \"rec29\", \"chunk_text\": \"The stock market fluctuates based on supply and demand.\", \"category\": \"economics\" },\n", + " { \"_id\": \"rec30\", \"chunk_text\": \"A compass needle points toward the magnetic north pole.\", \"category\": \"navigation\" },\n", + " { \"_id\": \"rec31\", \"chunk_text\": \"The universe is expanding, according to the Big Bang theory.\", \"category\": \"astronomy\" },\n", + " { \"_id\": \"rec32\", \"chunk_text\": \"Elephants have excellent memory and strong social 
bonds.\", \"category\": \"biology\" },\n", + " { \"_id\": \"rec33\", \"chunk_text\": \"The violin is a string instrument commonly used in orchestras.\", \"category\": \"music\" },\n", + " { \"_id\": \"rec34\", \"chunk_text\": \"The heart pumps blood throughout the human body.\", \"category\": \"biology\" },\n", + " { \"_id\": \"rec35\", \"chunk_text\": \"Ice cream melts when exposed to heat.\", \"category\": \"food science\" },\n", + " { \"_id\": \"rec36\", \"chunk_text\": \"Solar panels convert sunlight into electricity.\", \"category\": \"technology\" },\n", + " { \"_id\": \"rec37\", \"chunk_text\": \"The French Revolution began in 1789.\", \"category\": \"history\" },\n", + " { \"_id\": \"rec38\", \"chunk_text\": \"The Taj Mahal is a mausoleum built by Emperor Shah Jahan.\", \"category\": \"history\" },\n", + " { \"_id\": \"rec39\", \"chunk_text\": \"Rainbows are caused by light refracting through water droplets.\", \"category\": \"physics\" },\n", + " { \"_id\": \"rec40\", \"chunk_text\": \"Mount Everest is the tallest mountain in the world.\", \"category\": \"geography\" },\n", + " { \"_id\": \"rec41\", \"chunk_text\": \"Octopuses are highly intelligent marine creatures.\", \"category\": \"biology\" },\n", + " { \"_id\": \"rec42\", \"chunk_text\": \"The speed of sound is around 343 meters per second in air.\", \"category\": \"physics\" },\n", + " { \"_id\": \"rec43\", \"chunk_text\": \"Gravity keeps planets in orbit around the sun.\", \"category\": \"astronomy\" },\n", + " { \"_id\": \"rec44\", \"chunk_text\": \"The Mediterranean diet is considered one of the healthiest in the world.\", \"category\": \"nutrition\" },\n", + " { \"_id\": \"rec45\", \"chunk_text\": \"A haiku is a traditional Japanese poem with a 5-7-5 syllable structure.\", \"category\": \"literature\" },\n", + " { \"_id\": \"rec46\", \"chunk_text\": \"The human body is made up of about 60% water.\", \"category\": \"biology\" },\n", + " { \"_id\": \"rec47\", \"chunk_text\": \"The Industrial 
Revolution transformed manufacturing and transportation.\", \"category\": \"history\" },\n", + " { \"_id\": \"rec48\", \"chunk_text\": \"Vincent van Gogh painted Starry Night.\", \"category\": \"art\" },\n", + " { \"_id\": \"rec49\", \"chunk_text\": \"Airplanes fly due to the principles of lift and aerodynamics.\", \"category\": \"physics\" },\n", + " { \"_id\": \"rec50\", \"chunk_text\": \"Renewable energy sources include wind, solar, and hydroelectric power.\", \"category\": \"energy\" }\n", + "]" ] }, { "cell_type": "markdown", - "metadata": { - "id": "tNAgla6IKWie" - }, "source": [ - "## Upsert vectors\n", + "[Upsert](https://docs.pinecone.io/guides/data/upsert-data) the sample dataset into a new namespace in your index.\n", "\n", - "Target your index and use the [`upsert`](https://docs.pinecone.io/guides/data/upsert-data) operation to load your vector embeddings into a new namespace.\n", - "\n", - "**Note:** [Namespaces](https://docs.pinecone.io/guides/get-started/key-features#namespaces) let you partition records within an index and are essential for [implementing multitenancy](https://docs.pinecone.io/guides/get-started/implement-multitenancy) when you need to isolate the data of each customer/user.\n" - ] + "Because your index is integrated with an embedding model, you provide the textual statements and Pinecone converts them to dense vectors automatically." 
+ ], + "metadata": { + "id": "f5vNb1pugnR5" + } }, { "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "Ri6RX7FEiV4C" - }, - "outputs": [], "source": [ "# Target the index\n", - "# In production, target an index by its unique DNS host, not by its name\n", - "# See https://docs.pinecone.io/guides/data/target-an-index\n", - "index = pc.Index(index_name)\n", + "dense_index = pc.Index(index_name)\n", "\n", - "# Prepare the records for upsert\n", - "# Each contains an 'id', the vector 'values',\n", - "# and the original text and category as 'metadata'\n", - "records = []\n", - "for d, e in zip(data, embeddings):\n", - " records.append({\n", - " \"id\": d[\"id\"],\n", - " \"values\": e[\"values\"],\n", - " \"metadata\": {\n", - " \"source_text\": d[\"text\"],\n", - " \"category\": d[\"category\"]\n", - " }\n", - " })\n", - "\n", - "# Upsert the records into the index\n", - "index.upsert(\n", - " vectors=records,\n", - " namespace=\"example-namespace\"\n", - ")" - ] - }, - { - "cell_type": "markdown", + "# Upsert into a namespace\n", + "dense_index.upsert_records(\"example-namespace\", records)" + ], "metadata": { - "id": "fqVA4OrlidX2" + "id": "WqDOcyz5gp1Z" }, - "source": [ - "**Note:** To load large amounts of data, [import from object storage](https://docs.pinecone.io/guides/data/understanding-imports) or [upsert in large batches](https://docs.pinecone.io/guides/data/upsert-data#upsert-records-in-batches)." - ] + "execution_count": null, + "outputs": [] }, { "cell_type": "markdown", - "metadata": { - "id": "AsVqrR2YipPM" - }, "source": [ - "## Check the index\n", + "## Check index stats\n", "\n", - "Pinecone is eventually consistent, so there can be a delay before your upserted vectors are available to query. 
Use the [`describe_index_stats`](https://docs.pinecone.io/guides/data/data-freshness/check-data-freshness) operation to check if the current vector count matches the number of vectors you upserted:" - ] + "Pinecone is eventually consistent, so there can be a slight delay before new or changed records are visible to queries. You can [view index stats](https://docs.pinecone.io/guides/data/check-data-freshness#verify-record-counts) to check if the current vector count matches the number of vectors you upserted (50):" + ], + "metadata": { + "id": "cxM2hdTjg1tS" + } }, { "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "ANfVNxzDivEY" - }, - "outputs": [], "source": [ - "time.sleep(10) # Wait for the upserted vectors to be indexed\n", + "import time\n", "\n", - "print(index.describe_index_stats())" - ] + "# Wait for the upserted vectors to be indexed\n", + "time.sleep(10)\n", + "\n", + "# View stats for the index\n", + "stats = dense_index.describe_index_stats()\n", + "print(stats)" + ], + "metadata": { + "id": "z3B4RkEqg1US" + }, + "execution_count": null, + "outputs": [] }, { "cell_type": "markdown", "metadata": { - "id": "6cNHN6_xjYm-" + "id": "VpgIIsLlJGFf" }, "source": [ - "## Search the index\n", + "## Semantic search\n", "\n", - "Now, let’s say you want to search your index for information related to \"health risks\".\n", + "[Search the dense index](https://docs.pinecone.io/guides/data/query-data#semantic-search) for ten records that are most semantically similar to the query, “Famous historical structures and monuments”.\n", "\n", - "Use the the `multilingual-e5-large` model hosted by Pinecone *to* convert your query into a vector embedding, and then use the [`query`](https://docs.pinecone.io/guides/data/query-data) operation to search for the three vectors in the index that are most semantically similar to the query vector:" + "Again, because your index is integrated with an embedding model, you provide the query as text and Pinecone 
converts the text to a dense vector automatically." ] }, { "cell_type": "code", "execution_count": null, "metadata": { - "id": "RyP4EQX8jcLn" + "id": "Buo2K1h8O_fN" }, "outputs": [], "source": [ - "# Define your query\n", - "query = \"Health risks\"\n", - "\n", - "# Convert the query into a numerical vector that Pinecone can search with\n", - "query_embedding = pc.inference.embed(\n", - " model=\"multilingual-e5-large\",\n", - " inputs=[query],\n", - " parameters={\n", - " \"input_type\": \"query\"\n", - " }\n", - ")\n", + "# Define the query\n", + "query = \"Famous historical structures and monuments\"\n", "\n", - "# Search the index for the three most similar vectors\n", - "results = index.query(\n", + "# Search the dense index\n", + "results = dense_index.search(\n", " namespace=\"example-namespace\",\n", - " vector=query_embedding[0].values,\n", - " top_k=3,\n", - " include_values=False,\n", - " include_metadata=True\n", + " query={\n", + " \"top_k\": 10,\n", + " \"inputs\": {\n", + " 'text': query\n", + " }\n", + " }\n", ")\n", "\n", - "print(results)" + "# Print the results\n", + "for hit in results['result']['hits']:\n", + " print(f\"id: {hit['_id']}, score: {round(hit['_score'], 2)}, text: {hit['fields']['chunk_text']}, category: {hit['fields']['category']}\")" ] }, { "cell_type": "markdown", - "metadata": { - "id": "9jAJDjSAjsvA" - }, "source": [ - "Notice that the response includes only records related to health, not the cultivation of apple." - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "ayZib8aEUYR_" - }, - "source": [ - "## Add reranking\n", - "\n", - "You can increase the accuracy of your search by reranking results based on their relevance to the query.\n", + "Notice that most of the results are about historical structures and monuments. 
However, a few unrelated statements are included as well and are ranked high in the list, for example, statements about Shakespeare and renewable energy.\n", "\n", - "Use the `rerank` operation and the `bge-reranker-v2-m3` reranking model hosted by Pinecone to rerank the values of the documents.source_text fields:" - ] + "To get a more accurate ranking, search again but this time [rerank the initial results](https://docs.pinecone.io/guides/data/query-data#rerank-results) based on their relevance to the query." + ], + "metadata": { + "id": "SFXzoE1thnXa" + } }, { "cell_type": "code", "execution_count": null, "metadata": { - "id": "SyPG_OmwUjtm" + "id": "MaqbcsI4I1gU" }, "outputs": [], "source": [ - "# Rerank the search results based on their relevance to the query\n", - "ranked_results = pc.inference.rerank(\n", - " model=\"bge-reranker-v2-m3\",\n", - " query=\"Disease prevention\",\n", - " documents=[\n", - " {\"id\": \"rec3\", \"source_text\": \"Rich in vitamin C and other antioxidants, apples contribute to immune health and may reduce the risk of chronic diseases.\"},\n", - " {\"id\": \"rec1\", \"source_text\": \"Apples are a great source of dietary fiber, which supports digestion and helps maintain a healthy gut.\"},\n", - " {\"id\": \"rec4\", \"source_text\": \"The high fiber content in apples can also help regulate blood sugar levels, making them a favorable snack for people with diabetes.\"}\n", - " ],\n", - " top_n=3,\n", - " rank_fields=[\"source_text\"],\n", - " return_documents=True,\n", - " parameters={\n", - " \"truncate\": \"END\"\n", + "# Search the dense index and rerank results\n", + "reranked_results = dense_index.search(\n", + " namespace=\"example-namespace\",\n", + " query={\n", + " \"top_k\": 10,\n", + " \"inputs\": {\n", + " 'text': query\n", + " }\n", + " },\n", + " rerank={\n", + " \"model\": \"bge-reranker-v2-m3\",\n", + " \"top_n\": 10,\n", + " \"rank_fields\": [\"chunk_text\"]\n", " }\n", ")\n", "\n", - "print(ranked_results)\n" + "# 
Print the reranked results\n", + "for hit in reranked_results['result']['hits']:\n", + " print(f\"id: {hit['_id']}, score: {round(hit['_score'], 2)}, text: {hit['fields']['chunk_text']}, category: {hit['fields']['category']}\")" ] }, { "cell_type": "markdown", - "metadata": { - "id": "QTkhBFJHUnj0" - }, "source": [ - "Notice that the two records specifically related to \"health risks\" (chronic disease and diabetes) are now ranked highest." - ] + "Notice that all of the most relevant results about historical structures and monuments are now ranked highest." + ], + "metadata": { + "id": "lfaATEz7hvqC" + } }, { "cell_type": "markdown", "metadata": { - "id": "nGjpffT5UrrL" + "id": "AsVqrR2YipPM" }, "source": [ - "## Add filtering\n", + "## Improve results\n", "\n", - "You can use a [metadata filter](https://docs.pinecone.io/guides/data/understanding-metadata) to limit your search to records matching a filter expression.\n", + "Reranking results is one of the most effective ways to improve search accuracy and relevance, but there are many other techniques to consider. For example:\n", "\n", - "Your upserted records contain a `category` metadata field. 
Now use that field as a filter to search for records in the “digestive system” category:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "KkH7Wre3Ux5B" - }, - "outputs": [], - "source": [ - "# Search the index with a metadata filter\n", - "filtered_results = index.query(\n", - " namespace=\"example-namespace\",\n", - " vector=query_embedding.data[0].values,\n", - " filter={\n", - " \"category\": {\"$eq\": \"digestive system\"}\n", - " },\n", - " top_k=3,\n", - " include_values=False,\n", - " include_metadata=True\n", - ")\n", + "* [Filtering by metadata](https://docs.pinecone.io/guides/data/query-data#filter-by-metadata): When records contain additional metadata, you can limit the search to records matching a filter expression.\n", "\n", - "print(filtered_results)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "awumy10tU2Lv" - }, - "source": [ - "Notice that the response includes only the one record in the “digestive system” category." + "* [Hybrid search](https://docs.pinecone.io/guides/data/query-data#hybrid-search): You can add lexical search to capture precise keyword matches (e.g., product SKUs, email addresses, domain-specific terms) in addition to semantic matches.\n", + "\n", + "* [Chunking strategies](https://www.pinecone.io/learn/chunking-strategies/): You can chunk your content in different ways to get better results. Consider factors like the length of the content, the complexity of queries, and how results will be used in your application." 
] }, { @@ -456,12 +392,12 @@ "source": [ "## Clean up\n", "\n", - "When you no longer need the `docs-quickstart-notebook` index, use the [`delete_index`](https://docs.pinecone.io/reference/api/control-plane/delete_index) operation to delete it:" + "When you no longer need your example index, delete it as follows:" ] }, { "cell_type": "code", - "execution_count": 13, + "execution_count": null, "metadata": { "id": "1iHV2Y0ujy0y" }, @@ -494,4 +430,4 @@ }, "nbformat": 4, "nbformat_minor": 0 -} +} \ No newline at end of file