learning-to-hybrid-search-haystack-us-25

Repository accompanying the Haystack US 2025 workshop "Learning to Hybrid Search"

Requirements

  • Docker to run OpenSearch and OpenSearch Dashboards
  • Python and pip
  • Dataset: the notebooks assume the ESCI dataset has already been downloaded. If you store it somewhere else, change the dataset path in the notebooks accordingly.
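One way to avoid editing the path in several notebooks is to resolve it in a single place. The sketch below is only an illustration: the environment variable `ESCI_DATA_DIR`, the default folder `data/esci`, and both helper functions are hypothetical names that do not come from this repository.

```python
import os
from pathlib import Path

# Hypothetical names: neither the variable nor the default folder
# is defined by the repository; adapt both to your setup.
ENV_VAR = "ESCI_DATA_DIR"
DEFAULT_DIR = Path("data") / "esci"


def resolve_dataset_dir(env=None):
    """Return the dataset folder: the environment override if set, else the default."""
    env = os.environ if env is None else env
    override = env.get(ENV_VAR)
    return Path(override).expanduser() if override else DEFAULT_DIR


def missing_files(folder, expected):
    """List the expected file names that are not present in the folder."""
    return [name for name in expected if not (Path(folder) / name).is_file()]
```

Calling `missing_files(resolve_dataset_dir(), [...])` with the file names a notebook expects gives an early, readable error instead of a failure halfway through a cell.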

Run the Notebooks

Run OpenSearch

Execute the following command to fire up OpenSearch and OpenSearch Dashboards:

docker compose up -d
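The containers can take a moment to become available after the command returns. A minimal readiness check might look like the sketch below. It assumes OpenSearch is reachable at `http://localhost:9200` without TLS or authentication, which is an assumption about the compose file rather than something stated here; adjust the URL and add credentials if your setup differs.

```python
import json
import urllib.request

# Assumption: default OpenSearch port, no TLS, no authentication.
OPENSEARCH_URL = "http://localhost:9200"


def is_ready(health):
    """Treat the cluster as usable once its status is green or yellow."""
    return health.get("status") in {"green", "yellow"}


def fetch_health(base_url=OPENSEARCH_URL, timeout=5.0):
    """Call the cluster health endpoint and decode the JSON response."""
    url = f"{base_url}/_cluster/health"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

In a notebook cell or Python shell, `is_ready(fetch_health())` returns `True` once the cluster responds. A single-node cluster often reports `yellow` rather than `green`, which is why both are accepted. A connection error (an `OSError` subclass) usually means the container is still starting.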

Install Requirements and Start Jupyter

Create a virtual environment:

python3 -m venv .venv

Activate the virtual environment:

source .venv/bin/activate

Install the requirements:

pip3 install -r requirements.txt

Start Jupyter:

jupyter notebook

Open http://localhost:8888 in your browser. If that address does not work, try http://127.0.0.1:8888 instead.

Notebooks

  1. Prepare OpenSearch: setup steps required to generate embeddings at index and query time.
  2. Index ESCI Data: load the product data.
  3. Queries, queries, queries: run lexical and hybrid queries.
  4. Baseline Search & Metrics: calculate search quality metrics for the baseline.
  5. Best Hybrid Search Configuration: identify the best configuration parameters for the arithmetic combination used in hybrid search.
  6. Dynamic Hybrid Search Optimization - Model Training and Evaluation: engineer features and evaluate which feature combinations work well.
  7. Calculate Search Metrics with Dynamic Optimizer: run the trained model on the test set.
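
To make the idea behind notebook 5 concrete, here is a minimal sketch of one common way to combine lexical and neural results: min-max normalize each score list, then take a weighted arithmetic mean. This is an illustration, not the notebooks' code. The function names, the `neural_weight` parameter, and the toy scores are all assumptions made for the example. In OpenSearch itself this kind of combination is configured on a search pipeline rather than computed client-side.

```python
def min_max_normalize(scores):
    """Rescale {doc_id: score} into [0, 1]; identical scores all map to 1.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def combine(lexical, neural, neural_weight):
    """Weighted arithmetic mean of normalized lexical and neural scores.

    A document missing from one result list contributes 0.0 for that side.
    Returns (doc_id, score) pairs sorted by descending combined score.
    """
    lex = min_max_normalize(lexical)
    neu = min_max_normalize(neural)
    combined = {
        doc: (1.0 - neural_weight) * lex.get(doc, 0.0)
        + neural_weight * neu.get(doc, 0.0)
        for doc in set(lex) | set(neu)
    }
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


# Toy scores, invented for illustration only.
lexical_hits = {"p1": 12.0, "p2": 7.0, "p3": 2.0}
neural_hits = {"p2": 0.9, "p3": 0.8, "p4": 0.5}
ranking = combine(lexical_hits, neural_hits, neural_weight=0.5)
```

Sweeping `neural_weight` over a grid and scoring each resulting ranking against relevance judgments is one way to search for a single best global configuration. Predicting the weight per query instead is the step that notebooks 6 and 7 describe as dynamic optimization.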
