trim input to TGI, moved clustering and summarization to dataprep and store in DB #893

Open
wants to merge 13 commits into
base: main
from

Conversation

rbrugaro
Collaborator

Several enhancements:

  1. Trim the input to TGI when the community partial answers do not all fit in the input context of the final answer generation. Previously this caused an error.
  2. In the previous implementation, clustering and summary extraction were done at query time, resulting in a slow time to first token. Clustering and full-dataset summarization are now moved to the dataprep step. In addition to storing the graph in Neo4j, we now also store the entity_info and the community_summaries so the retriever code can fetch them with Cypher queries.
  3. Fix gateway input.
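
For readers unfamiliar with item 1, here is a minimal sketch of one way to trim a list of partial answers to a token budget before sending them to the final generation step. It is not the code from this PR: the function names `trim_partial_answers` and `approx_token_count`, the `max_input_tokens` parameter, and the keep-in-order policy are all assumptions made for illustration, and the word-count tokenizer is only a stand-in for the model's real tokenizer.

```python
from typing import Callable, List


def approx_token_count(text: str) -> int:
    """Hypothetical stand-in for a real tokenizer: counts whitespace-separated words."""
    return len(text.split())


def trim_partial_answers(
    partial_answers: List[str],
    max_input_tokens: int,
    count_tokens: Callable[[str], int] = approx_token_count,
) -> List[str]:
    """Keep answers in their given order until the next one would exceed the budget."""
    kept: List[str] = []
    used = 0
    for answer in partial_answers:
        cost = count_tokens(answer)
        if used + cost > max_input_tokens:
            break  # stop at the first answer that does not fit
        kept.append(answer)
        used += cost
    return kept


# Example: a budget of 5 "tokens" keeps the first two answers (3 + 2 words).
answers = ["alpha beta gamma", "delta epsilon", "zeta eta theta iota"]
trimmed = trim_partial_answers(answers, max_input_tokens=5)
print(trimmed)
```

If the answers are sorted by relevance before the call, this policy drops the least relevant ones first. Passing a real tokenizer's counting function as `count_tokens` would make the budget match the model's actual input limit.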


codecov bot commented Nov 12, 2024

Codecov Report

Attention: Patch coverage is 0% with 1 line in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| comps/cores/mega/gateway.py | 0.00% | 1 Missing ⚠️ |

| Files with missing lines | Coverage Δ |
| --- | --- |
| comps/cores/mega/gateway.py | 29.82% <0.00%> (ø) |

@rbrugaro rbrugaro added the WIP label Nov 12, 2024
@rbrugaro rbrugaro marked this pull request as ready for review November 12, 2024 22:00
@rbrugaro rbrugaro added this to the v1.1 milestone Nov 14, 2024
@ashahba ashahba removed the WIP label Nov 14, 2024

4 participants