Add retrieval RequestProcessor and end-to-end RAG examples #148
Signed-off-by: Fred Reiss <[email protected]>
@frreiss Do you mind fixing the gate issues?
The CI problems with the base data are fixed now, but I am seeing issues with using an embedding model from SentenceTransformers.
@hickeyma @markstur is there a special trick to get models into the Hugging Face cache directory for CI?
Move to test workflow as it's not Ollama-specific. Signed-off-by: Martin Hickey <[email protected]>
@frreiss I pushed commit ce00ef3, which fixes the embedding model download issue. There are now the following issues:
Do you mind addressing those issues?
Tests are passing now. Did an internal review of the notebook.
Thanks @frreiss for the PR.
Overall, it looks good. Small nit inline, and maybe the 2 notebooks could be squashed into 1, as rag.ipynb incorporates retrieval.ipynb.
You mentioned that you wanted to update the notebooks. I am going to merge for now, and let's do that in a follow-up PR.
os.makedirs(target_root)

part_num = 1
repo_root = "https://github.com/frreiss/mt-rag-embeddings"
OK for the moment. However, we need to find a repo that is part of the community.
This PR adds a RequestProcessor that performs the retrieval phase of the RAG pattern. There is a basic implementation that uses an in-memory vector database and an extension point for adding support for other vector databases in the future.
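To make the shape of that design concrete, here is a minimal sketch of a retrieval step backed by an in-memory vector store, with an abstract base class standing in for the extension point. It is not the code from this PR: the names `Retriever`, `InMemoryRetriever`, and `add_retrieved_documents`, the dictionary-based request, and the use of precomputed embedding vectors are all assumptions made for illustration.

```python
import math
from abc import ABC, abstractmethod


class Retriever(ABC):
    """Hypothetical extension point: other vector databases would subclass this."""

    @abstractmethod
    def retrieve(self, query_vec, top_k):
        """Return the top_k documents most similar to query_vec."""


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class InMemoryRetriever(Retriever):
    """Brute-force search over vectors held in a Python list (assumed design)."""

    def __init__(self, docs, vectors):
        if len(docs) != len(vectors):
            raise ValueError("docs and vectors must have the same length")
        self.docs = list(docs)
        self.vectors = list(vectors)

    def retrieve(self, query_vec, top_k=2):
        # Score every stored document, then keep the best top_k.
        scored = [(cosine(query_vec, v), d) for d, v in zip(self.docs, self.vectors)]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [doc for _, doc in scored[:top_k]]


def add_retrieved_documents(request, retriever, query_vec, top_k=2):
    """Return a copy of the request with retrieved documents attached.

    The request is a plain dict here; the real request type is not shown
    in this conversation, so this structure is an assumption.
    """
    augmented = dict(request)
    augmented["documents"] = retriever.retrieve(query_vec, top_k)
    return augmented


retriever = InMemoryRetriever(
    docs=["doc a", "doc b", "doc c"],
    vectors=[[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]],
)
result = add_retrieved_documents({"messages": ["hello"]}, retriever, [1.0, 0.0])
```

In this sketch, supporting another vector database would mean writing a second `Retriever` subclass; the function that augments the request would not need to change.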
The PR also includes tests for the new functionality.
I have refactored the RequestProcessor for hallucinations so that it can also be used to perform query rewrite.
This PR also includes a notebook that shows several end-to-end RAG examples that use different combinations of intrinsics.