-
Notifications
You must be signed in to change notification settings - Fork 37
feat: Support Importing Precomputed Embeddings #199
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Hey @ChuckHend ! I'm first time contributor to tembo. Please can you review my implementation to resolve the issue. |
Hey @Ayushjhawar8, thank you for the PR. The title of this PR says "Add document chunking capability to vectorize.table()" but the changes and bounty look like more closely related to "Bring your own embeddings?" |
Hi @ChuckHend. Sry for bad pr title..i changed it....also im giving my idea behind the changes...so i implemented the import_embeddings feature to let users bring their pre-computed embeddings directly into pg_vectorize. I designed it to work with both join and append table methods since I wanted to maintain compatibility with existing workflows. I built this specifically thinking about users who might be using Sentence-Transformers or similar models and want to take the management capabilities without redoing their embedding work. I also made sure to document it clearly in the API docs and added comprehensive tests to verify it works as intended across different scenarios. |
Don't worry about any tests failing because of missing API keys or permissions, such as the two below.
Although this one and others I think needs a fix? |
okay I will be working to fix that |
hey @ChuckHend please can you check i tried to fix some hmm less error now =) |
@Ayushjhawar8 , I left a few comments. This is coming along nicely! |
…th JOINs & bulk ops.
Hey @ChuckHend ,Thank you for your review, I tried to implement all your suggestion as:
Hoping it fixes the test, Also please review it further if needed.
Thanks :) |
umm why install dependency one is failing now Also its only format issue and integration test now ? 😃 |
Looks like some things in our CI environment need to be updated. I'll fix that today. |
Getting closer! Remove those unused variable declarations and resolve this test error
|
Nice, sounds good, I will try to fix both 👍🏻 |
hm still that test failing let me see how can i work around it 🤔 |
@Ayushjhawar8 , I pushed a few changes in 53c2ace
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think all that remains for a test to cover vectorize.table_from()
? Looks like import_embeddings() is functioning as intended now.
wow thanks for the improvements! That makes things much clearer. Appreciate the help! 💖 |
Love to hear that, im on the test🫡 |
I'm not sure what is this error about? We are using |
I think its happening during the init_table() call earlier in from_table(). I think init_table() will need to be modified such that it can accept a value of "manual" so that it will not init the cron job or table triggers. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
awesome, nice work, thank you!
/claim #149
closes #149