Add Weaviate integration #1360
Merged
Changes from 10 commits
17 commits
2e7cd20 Add weaviate.py and update various files (hunteritself)
fb13606 update various files (hunteritself)
3316518 update weaviate.py (hunteritself)
1f3e6d8 add test file_cs6422_test_Yang Yang.ipynb (hunteritself)
88dfbaf Resolved merge conflicts (hunteritself)
68932cf Merge branch 'staging' of https://github.com/hunteritself/evadb into … (hunteritself)
b904f6f Remove .ipynb file (hunteritself)
93669e1 Update weaviate.py (hunteritself)
d365c80 Optimize the code format (hunteritself)
a974914 Optimize the code format (hunteritself)
f9136ce commit local changes before merge (hunteritself)
2517089 Merge remote-tracking branch 'upstream/staging' into staging (hunteritself)
07239d9 Apply code formatting (hunteritself)
5043876 Manually fix whitespace issue (hunteritself)
98722b2 Fix linter (xzdandy)
ba3d373 Merge branch 'staging' into hunter (xzdandy)
2fac652 Fix link (xzdandy)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,173 @@ | ||
# coding=utf-8 | ||
# Copyright 2018-2023 EvaDB | ||
# | ||
# Licensed under the Apache License, Version 2.0 (the "License"); | ||
# you may not use this file except in compliance with the License. | ||
# You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
import os | ||
from typing import List | ||
|
||
from evadb.third_party.vector_stores.types import ( | ||
VectorIndexQuery, | ||
VectorStore, | ||
) | ||
from evadb.utils.generic_utils import try_to_import_weaviate_client | ||
|
||
required_params = [] | ||
_weaviate_init_done = False | ||
|
||
|
||
class WeaviateVectorStore(VectorStore): | ||
def __init__(self, **kwargs) -> None: | ||
try_to_import_weaviate_client() | ||
global _weaviate_init_done | ||
|
||
# Get the API key. | ||
self._api_key = kwargs.get("WEAVIATE_API_KEY") | ||
|
||
if not self._api_key: | ||
self._api_key = os.environ.get("WEAVIATE_API_KEY") | ||
|
||
assert ( | ||
self._api_key | ||
), "Please set your Weaviate API key in evadb.yml file (third_party, weaviate_api_key) or " \ | ||
"environment variable (WEAVIATE_API_KEY). It can be found at the Details tab in WCS Dashboard." | ||
|
||
# Get the API Url. | ||
self._api_url = kwargs.get("WEAVIATE_API_URL") | ||
|
||
if not self._api_url: | ||
self._api_url = os.environ.get("WEAVIATE_API_URL") | ||
|
||
assert ( | ||
self._api_url | ||
), "Please set your Weaviate API Url in evadb.yml file (third_party, weaviate_api_url) or " \ | ||
"environment variable (WEAVIATE_API_URL). It can be found at the Details tab in WCS Dashboard." | ||
|
||
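# Create a client for every instance so that self._client is always bound, | ||
# including when _weaviate_init_done is already True | ||
import weaviate | ||
|
||
client = weaviate.Client( | ||
url=self._api_url, | ||
auth_client_secret=weaviate.AuthApiKey(api_key=self._api_key), | ||
) | ||
|
||
if not _weaviate_init_done: | ||
# Verify the connection once per process by fetching the schema | ||
client.schema.get() | ||
_weaviate_init_done = True | ||
|
||
self._client = client | ||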
|
||
def create_weaviate_class(self, class_name: str, vectorizer: str, module_config: dict, properties: list) -> None: | ||
# Unlike Pinecone, Weaviate has no explicit vector index creation or management step | ||
# Typically, the schema is defined to hold the vectors and data is inserted accordingly | ||
|
||
""" | ||
Create a Weaviate class with the specified configuration. | ||
|
||
Args: | ||
class_name (str): The name of the class to create, e.g., "Article". | ||
vectorizer (str): The vectorizer module to use, e.g., "text2vec-cohere". | ||
module_config (dict): Configuration for vectorizer and generative module, e.g., | ||
{ | ||
"text2vec-cohere": { | ||
"model": "embed-multilingual-v2.0", | ||
}, | ||
} | ||
properties (list): List of dictionaries specifying class properties, e.g., | ||
[ | ||
{ | ||
"name": "title", | ||
"dataType": ["text"] | ||
}, | ||
{ | ||
"name": "body", | ||
"dataType": ["text"] | ||
}, | ||
] | ||
|
||
Returns: | ||
None | ||
""" | ||
# If the class already exists, delete it (and its data) so it can be recreated | ||
if self._client.schema.exists(class_name): | ||
self._client.schema.delete_class(class_name) | ||
|
||
# Define the class object with provided parameters | ||
class_obj = { | ||
"class": class_name, | ||
"vectorizer": vectorizer, | ||
"moduleConfig": module_config, | ||
"properties": properties | ||
} | ||
|
||
# Call the Weaviate API to create the class | ||
self._client.schema.create_class(class_obj) | ||
|
||
def delete_weaviate_class(self, class_name: str) -> None: | ||
""" | ||
Delete a Weaviate class and its data. | ||
|
||
Args: | ||
class_name (str): The name of the Weaviate class to delete. | ||
|
||
Returns: | ||
None | ||
""" | ||
# Call the Weaviate API to delete the class | ||
self._client.schema.delete_class(class_name) | ||
|
||
def add_to_weaviate_class(self, class_name: str, data_objects: List[dict]) -> None: | ||
""" | ||
Add objects to the specified Weaviate class. | ||
|
||
Args: | ||
class_name (str): The name of the Weaviate class to add objects to. | ||
data_objects (List[dict]): A list of dictionaries, | ||
where each dictionary contains property names and values. | ||
|
||
Returns: | ||
None | ||
""" | ||
# Iterate over each data object and add it to the Weaviate class | ||
for data_object in data_objects: | ||
self._client.data_object.create(data_object, class_name) | ||
|
||
def query_weaviate_class(self, class_name: str, properties_to_retrieve: List[str], query: VectorIndexQuery) -> List[dict]: | ||
""" | ||
Perform a similarity-based search in Weaviate. | ||
|
||
Args: | ||
class_name (str): The name of the Weaviate class to perform the search on. | ||
properties_to_retrieve (List[str]): A list of property names to retrieve. | ||
query (VectorIndexQuery): A query object for similarity search, containing the query vector and top_k. | ||
|
||
Returns: | ||
List[dict]: A list of dictionaries containing the retrieved properties. | ||
""" | ||
# Define the similarity search query | ||
response = ( | ||
self._client.query | ||
.get(class_name, properties_to_retrieve) | ||
.with_near_vector({ | ||
"vector": query.embedding | ||
}) | ||
.with_limit(query.top_k) | ||
.with_additional(["distance"]) | ||
.do() | ||
) | ||
|
||
data = response.get("data") or {} | ||
|
||
# Extract the results; fall back to an empty list when the response has no hits | ||
results = (data.get("Get") or {}).get(class_name) or [] | ||
|
||
return results |
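
For reviewers who want to check the result handling in `query_weaviate_class` without a running Weaviate instance, here is a minimal sketch of how the returned dictionary could be unpacked. It assumes the response shape used in the diff (`data` → `Get` → class name) and that `.with_additional(["distance"])` places the distance under an `_additional` key on each hit. The helper names `extract_results` and `split_hit`, the top-level `errors` check, and the sample response are all hypothetical and not part of this PR.

```python
from typing import Any, Dict, List, Optional, Tuple


def extract_results(response: Dict[str, Any], class_name: str) -> List[Dict[str, Any]]:
    """Return the hits for `class_name`, or an empty list when there are none."""
    # Assumption: a failed query carries a top-level "errors" entry.
    if response.get("errors"):
        raise RuntimeError(f"Weaviate query failed: {response['errors']}")
    # Tolerate a missing or null "data" / "Get" / class entry.
    data = response.get("data") or {}
    return (data.get("Get") or {}).get(class_name) or []


def split_hit(hit: Dict[str, Any]) -> Tuple[Dict[str, Any], Optional[float]]:
    """Separate the requested properties from the distance of a single hit."""
    additional = hit.get("_additional") or {}
    properties = {key: value for key, value in hit.items() if key != "_additional"}
    return properties, additional.get("distance")


# Hand-written sample for illustration; not output captured from Weaviate.
sample_response = {
    "data": {
        "Get": {
            "Article": [
                {"title": "First", "_additional": {"distance": 0.12}},
                {"title": "Second", "_additional": {"distance": 0.34}},
            ]
        }
    }
}

hits = extract_results(sample_response, "Article")
for hit in hits:
    properties, distance = split_hit(hit)
    print(properties["title"], distance)
```

Returning an empty list for a response without hits keeps callers from having to catch a `KeyError`, while raising on an explicit error entry avoids silently treating a failed query as "no matches".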