Add retrieval RequestProcessor and end-to-end RAG examples #148

Merged
hickeyma merged 8 commits into ibm-granite:main on Apr 18, 2025

frreiss
Collaborator

@frreiss frreiss commented Apr 16, 2025

This PR adds a RequestProcessor that performs the retrieval phase of the RAG pattern. There is a basic implementation that uses an in-memory vector database and an extension point for adding support for other vector databases in the future.
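As a rough illustration of the design described above (an in-memory vector store behind an extension point), here is a minimal sketch. The names `Retriever`, `InMemoryRetriever`, `add`, and `retrieve` are hypothetical and are not taken from this PR; the sketch uses plain Python lists and brute-force cosine similarity rather than whatever the PR actually implements.

```python
import math
from abc import ABC, abstractmethod
from typing import Dict, List, Sequence, Tuple


class Retriever(ABC):
    """Hypothetical extension point: one subclass per vector database."""

    @abstractmethod
    def retrieve(self, query_embedding: Sequence[float], top_k: int = 3) -> List[Dict]:
        """Return up to top_k documents, most similar first."""


class InMemoryRetriever(Retriever):
    """Brute-force cosine-similarity search over embeddings kept in a list."""

    def __init__(self) -> None:
        self._rows: List[Tuple[str, Sequence[float]]] = []

    def add(self, text: str, embedding: Sequence[float]) -> None:
        # Store the document text next to its embedding vector.
        self._rows.append((text, embedding))

    @staticmethod
    def _cosine(a: Sequence[float], b: Sequence[float]) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        # A zero vector has no direction, so treat its similarity as 0.
        return dot / norm if norm else 0.0

    def retrieve(self, query_embedding: Sequence[float], top_k: int = 3) -> List[Dict]:
        scored = [
            {"text": text, "score": self._cosine(query_embedding, emb)}
            for text, emb in self._rows
        ]
        scored.sort(key=lambda hit: hit["score"], reverse=True)
        return scored[:top_k]


# Usage with toy two-dimensional embeddings (illustrative values only).
retriever = InMemoryRetriever()
retriever.add("doc about cats", [1.0, 0.0])
retriever.add("doc about dogs", [0.0, 1.0])
hits = retriever.retrieve([0.9, 0.1], top_k=1)
```

Support for another vector database would, under this assumed design, be a second subclass of `Retriever` that forwards `retrieve` to that database's own search call.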

The PR also includes tests for the new functionality.
I have refactored the RequestProcessor for hallucinations so that it can also be used to perform query rewrite.

This PR also includes a notebook that shows several end-to-end RAG examples that use different combinations of intrinsics.

@frreiss frreiss requested review from hickeyma and markstur April 16, 2025 02:03
Collaborator

@hickeyma hickeyma left a comment

@frreiss Do you mind fixing the gate issues?

frreiss added 2 commits April 16, 2025 09:56
Signed-off-by: Fred Reiss <[email protected]>
Signed-off-by: Fred Reiss <[email protected]>
@frreiss
Collaborator Author

frreiss commented Apr 16, 2025

Linter issues fixed.

@hickeyma and @markstur I'm seeing test failures because the test cases can't download the data files they need. Is there a way to get around that limitation?

@frreiss
Collaborator Author

frreiss commented Apr 16, 2025

The CI problems with the base data are fixed now, but I am seeing issues with using an embedding model from SentenceTransformers:

    @staticmethod
    def load(input_path) -> Pooling:
>       with open(os.path.join(input_path, "config.json")) as fIn:
E       FileNotFoundError: [Errno 2] No such file or directory: '/home/runner/.cache/huggingface/hub/models--sentence-transformers--multi-qa-mpnet-base-dot-v1/snapshots/4633e80e17ea975bc090c97b049da26062b054d3/1_Pooling/config.json'

@hickeyma @markstur is there a special trick to get models into the Hugging Face cache directory for CI?
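One common way to get a model into the Hugging Face cache before tests run is to download the whole repository up front in a separate CI step. The sketch below assumes the `huggingface_hub` package is installed on the runner and takes the model id from the cache path in the traceback above; the helper name `warm_model_cache` is hypothetical, and the thread does not say whether the eventual fix worked this way.

```python
def warm_model_cache(
    repo_id: str = "sentence-transformers/multi-qa-mpnet-base-dot-v1",
) -> str:
    """Download every file of the model repo into the local Hugging Face cache."""
    # Imported lazily so this module still loads where huggingface_hub is absent.
    from huggingface_hub import snapshot_download

    # snapshot_download fetches all files of the repo (including nested files
    # such as 1_Pooling/config.json) and returns the local snapshot directory.
    return snapshot_download(repo_id=repo_id)
```

Calling `warm_model_cache()` once before the test step needs network access; later loads of the same model can then read from the cache.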

frreiss and others added 3 commits April 16, 2025 13:22
Signed-off-by: Fred Reiss <[email protected]>
Move to test workflow as its not Ollama specific.

Signed-off-by: Martin Hickey <[email protected]>
@hickeyma
Collaborator

hickeyma commented Apr 17, 2025

@frreiss I pushed commit ce00ef3, which fixes the embedding model download issue.

There are now the following issues:

  • Test failure: AssertionError: assert [-0.110380716...87530518, ...] == approx([-0.11...17 ± 1.2e-08])
  • NVIDIA GPUs are not available on the runner: RuntimeError: Found no NVIDIA driver on your system. The following check could potentially be used: if torch.cuda.is_available():

Do you mind addressing those issues?
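Both issues listed above have well-known workarounds, sketched here under stated assumptions. The helper names `pick_device` and `all_close` and the tolerance values are hypothetical, and the thread does not record how the failures were actually resolved. The first helper falls back to the CPU when no GPU is visible; the second compares float vectors with an explicit tolerance, since embedding values can differ slightly between hardware and library versions.

```python
import math
from typing import Sequence


def pick_device() -> str:
    """Return "cuda" only when PyTorch is installed and can see a GPU."""
    try:
        import torch  # optional dependency; may be missing on some runners
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def all_close(
    actual: Sequence[float],
    expected: Sequence[float],
    rel_tol: float = 1e-4,  # assumed tolerance, looser than pytest.approx defaults
    abs_tol: float = 1e-6,
) -> bool:
    """Element-wise comparison of two float vectors within a tolerance."""
    if len(actual) != len(expected):
        return False
    return all(
        math.isclose(a, e, rel_tol=rel_tol, abs_tol=abs_tol)
        for a, e in zip(actual, expected)
    )


# Illustrative values only; these are not the numbers from the failing test.
device = pick_device()
same = all_close([0.5, -0.25], [0.5000001, -0.2500001])
```

In a pytest suite the same effect is available by passing `rel` or `abs` to `pytest.approx` instead of relying on its defaults.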

Signed-off-by: Fred Reiss <[email protected]>
@frreiss
Collaborator Author

frreiss commented Apr 17, 2025

Tests are passing now.

This morning I did an internal review of the notebook rag.ipynb with the researchers who created the models involved; they recommended some additional changes before we merge this PR.

Collaborator

@hickeyma hickeyma left a comment

Thanks @frreiss for the PR.

Overall, it looks good. There is a small nit inline, and the 2 notebooks could perhaps be squashed into 1, as rag.ipynb incorporates retrieval.ipynb.

You mentioned that you wanted to update the notebooks. I am going to merge for now, and let's do that in a follow-up PR.

os.makedirs(target_root)

part_num = 1
repo_root = "https://github.com/frreiss/mt-rag-embeddings"
Collaborator

OK for the moment. However, we need to find a repo that is part of the community.

@hickeyma hickeyma merged commit 18b663a into ibm-granite:main Apr 18, 2025
12 checks passed

2 participants