hybrid search #31

Open
heehehe opened this issue Dec 16, 2023 · 4 comments

@heehehe
Owner

heehehe commented Dec 16, 2023

  • (경륜's idea)
  • Combine the existing keyword search with vector search
  • Vectorize the keywords, have GPT interpret the user's query, then find and retrieve the matching region of the vector space
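
One common way to combine a keyword ranking with a vector ranking is reciprocal rank fusion (RRF). The sketch below is only an illustration, not something decided in this issue: the function name `rrf_fuse`, the constant `k=60`, and the sample document IDs are all assumptions.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Merge several ranked lists of doc IDs with reciprocal rank fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used default, assumed here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results from the two retrievers for the same query.
keyword_hits = ["job_3", "job_1", "job_7"]
vector_hits = ["job_1", "job_5", "job_3"]
fused = rrf_fuse([keyword_hits, vector_hits])
```

Documents that rank well in both lists rise to the top, and RRF only needs ranks, so the keyword and vector scores don't have to be on the same scale.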
@heehehe heehehe added the enhancement New feature or request label Dec 16, 2023
@ryuni-dev ryuni-dev self-assigned this Feb 12, 2024
@ryuni-dev
Collaborator

ryuni-dev commented Feb 12, 2024

https://python.langchain.com/docs/integrations/document_loaders/google_bigquery
https://www.datascienceengineer.com/blog/post-chat-with-bigquery
https://medium.com/@shivansh.kaushik/talk-to-your-database-using-rag-and-llms-42eb852d2a3c

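This isn't hybrid search. Instead, the LLM interprets the user's question, generates a BigQuery query, and builds its answer from the query results. It looks quicker to do than the work needed for hybrid search, though, so I'm leaving the links here, haha.

Shall we just run a quick experiment with it?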
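
For reference, the text-to-SQL flow in those links boils down to putting the table schema and the user's question into a prompt, running the generated SQL, and answering from the rows. Here is a minimal sketch of just the prompt-building step, with no LLM or BigQuery call; the table name `job_postings`, its columns, and `build_sql_prompt` are hypothetical.

```python
def build_sql_prompt(table, columns, question):
    """Build a text-to-SQL prompt from a table schema and a user question."""
    schema = "\n".join(f"- {name} ({dtype})" for name, dtype in columns)
    return (
        "Write one BigQuery SQL query that answers the question.\n"
        f"Table: {table}\n"
        f"Columns:\n{schema}\n"
        f"Question: {question}\n"
        "SQL:"
    )


# Hypothetical schema; the real table is not shown in this issue.
columns = [("title", "STRING"), ("company", "STRING"), ("deadline", "DATE")]
prompt = build_sql_prompt(
    "job_postings", columns, "Which companies are hiring data engineers?"
)
```

The prompt grows with every column (and any sample rows) that gets included, so wide tables eat into the context window quickly.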

@ryuni-dev
Collaborator

ryuni-dev commented Feb 12, 2024

I'm running a quick test, and it does look like vectorizing and doing a similarity search would be the better option, haha..
Reference material

@ryuni-dev
Collaborator

ryuni-dev commented Feb 12, 2024

https://github.com/heehehe/job-trend/blob/feature/31-search/search/search-test.ipynb

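I ran two tests:

  1. SQL generation
    -> This looks difficult because of token count limits, SQL accuracy problems, and so on.

  2. Vector Search (w/ FAISS)

  • I embedded the data, built a vector store from the vectors, and ran vector search; the results seem fairly decent.. (the queries need more experimenting)
  • However, an indexing batch job has to run alongside it!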
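
The idea behind the vector search test, sketched in plain Python so it runs without extra packages. This is not the notebook's code: it swaps FAISS for a brute-force cosine-similarity scan and uses tiny hand-made vectors in place of real embeddings; `VectorStore`, `add`, and `search` are illustrative names.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class VectorStore:
    """In-memory store: doc ID -> embedding, searched by exhaustive scan."""

    def __init__(self):
        self.vectors = {}

    def add(self, doc_id, vector):
        # Re-adding an ID overwrites it, so an indexing batch can be re-run safely.
        self.vectors[doc_id] = vector

    def search(self, query, top_k=2):
        scored = [(cosine(query, v), doc_id) for doc_id, v in self.vectors.items()]
        scored.sort(key=lambda s: (-s[0], s[1]))
        return [doc_id for _, doc_id in scored[:top_k]]


store = VectorStore()
# Made-up 3-d "embeddings"; a real embedding model would produce these.
store.add("backend", [0.9, 0.1, 0.0])
store.add("data_engineer", [0.1, 0.9, 0.2])
store.add("designer", [0.0, 0.1, 0.9])
results = store.search([0.2, 0.8, 0.1], top_k=2)
```

In FAISS, the equivalent exhaustive search is a flat index such as `faiss.IndexFlatIP` over L2-normalized vectors.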

Also, something I noticed while experimenting: I think we need to add a batch job that deletes closed job postings too..!
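
A sketch of what that cleanup batch could look like, assuming each posting carries a `deadline` date. The field name, the `purge_closed` helper, and the sample data are hypothetical, since the actual schema isn't shown here.

```python
from datetime import date


def purge_closed(postings, today):
    """Split postings into (still open, closed IDs) by comparing deadline to today.

    A posting whose deadline is today still counts as open. Postings with no
    deadline (None) are kept, on the assumption that they are open-ended.
    """
    open_postings, closed_ids = [], []
    for p in postings:
        deadline = p.get("deadline")
        if deadline is not None and deadline < today:
            closed_ids.append(p["id"])
        else:
            open_postings.append(p)
    return open_postings, closed_ids


postings = [
    {"id": "job_1", "deadline": date(2024, 2, 1)},
    {"id": "job_2", "deadline": date(2024, 3, 1)},
    {"id": "job_3", "deadline": None},
]
open_postings, closed_ids = purge_closed(postings, today=date(2024, 2, 12))
```

The closed IDs would also have to be removed from the vector index so they stop showing up in search results.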

@ryuni-dev
Collaborator

ryuni-dev commented Feb 12, 2024

There are probably more factors worth experimenting with to get better results; I'll write them up and share them before the next meeting!
