SIGIR 2025 LiveRAG Challenge

Code for running the SIGIR 2025 LiveRAG Challenge pipeline. It has been tested on an M1 Mac and on Windows (with CUDA).

Setup

Set up virtual environment

Mac

python -m venv .venv
source .venv/bin/activate
# the system-specific PyTorch build is not part of requirements.txt, so install it first
pip install torch==2.7.0
pip install -r requirements.txt

Windows PowerShell

python -m venv .venv
.venv\Scripts\Activate.ps1
# PyTorch for CUDA 12.8, per https://pytorch.org/get-started/locally/
pip install torch==2.7.0 --index-url https://download.pytorch.org/whl/cu128
pip install -r requirements.txt

Environment Variables

Copy .env-template to .env and replace the respective values.

Download data

Index Snapshots

We use bm25s for BM25 (lexical) retrieval and Snowflake/arctic-embed-l embeddings in a usearch kNN index for dense (vector) retrieval.
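To make the dense half of this concrete, here is a minimal sketch of what a kNN lookup returns. It is an exact brute-force search in plain Python, not the usearch API and not the repository's code; the toy two-dimensional vectors and the names `cosine` and `knn` are assumptions for illustration. A real index such as usearch answers the same question approximately and much faster.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def knn(query_vec, doc_vecs, k=2):
    """Return the k (doc_id, similarity) pairs most similar to the query."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy embeddings; real embeddings have many more dimensions.
doc_vecs = {"d1": [1.0, 0.0], "d2": [0.7, 0.7], "d3": [0.0, 1.0]}
top = knn([1.0, 0.1], doc_vecs, k=2)
print([doc_id for doc_id, _ in top])
```

The query vector points mostly along the first axis, so `d1` ranks ahead of `d2`, and `d3` falls outside the top two.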

You don't need to re-create these embeddings and indices; a prebuilt version is available for download. The download is about 60 GB, so it will take some time.

Mac

./01_download.sh

Windows

./01_download.ps1

Run Processing

The pipeline runs in four steps:

1. Retrieval: BM25 and kNN results for both the original question and a Falcon-generated HyDE (hypothetical document) passage
2. Result fusion: reciprocal rank fusion (RRF) of the four previously retrieved result sets
3. Reranking: re-ranking the fused results using a reranker model
4. Answer generation
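The fusion step can be sketched as follows. This is a generic reciprocal rank fusion in plain Python, not the repository's implementation: the function name `rrf`, the document IDs, and the constant `k=60` (the value commonly used since the original RRF paper) are assumptions for illustration. The four input lists stand in for the two retrievers applied to the two queries.

```python
from collections import defaultdict


def rrf(result_lists, k=60):
    """Fuse ranked lists of doc IDs: score(d) = sum of 1 / (k + rank of d)."""
    scores = defaultdict(float)
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by doc ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result sets: two retrievers times two queries.
result_sets = {
    "bm25_question": ["d1", "d2", "d3"],
    "knn_question": ["d2", "d1", "d4"],
    "bm25_hyde": ["d2", "d3", "d1"],
    "knn_hyde": ["d2", "d1", "d4"],
}
fused = rrf(result_sets.values())
print(fused)
```

Because RRF uses only rank positions, the BM25 and kNN scores never need to be normalized onto a common scale before fusion.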

Run the script for your platform to execute all four steps sequentially.

Mac

./02_run.sh

Windows

./02_run.ps1

Note that the reranking step uses unicamp-dl/InRanker-base, which is slow on platforms without CUDA.
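To check which accelerator PyTorch can see before starting a long run, a sketch like the following may help. The function name `pick_device` is hypothetical and not taken from the repository; it relies only on `torch.cuda.is_available()` and `torch.backends.mps.is_available()`, and falls back to the CPU when neither backend (or PyTorch itself) is available.

```python
def pick_device():
    """Return "cuda", "mps", or "cpu" depending on what PyTorch reports."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment.
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    # The MPS backend (Apple Silicon) may be missing in older PyTorch builds.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


print(pick_device())
```

If this prints anything other than `cuda`, expect the reranking step to take noticeably longer.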

The final output is written to liverag_step4.jsonl. Intermediate results are stored as .parquet files.
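A JSONL file holds one JSON object per line, so the output can be inspected with the standard library alone. The sketch below makes no assumption about the field names in liverag_step4.jsonl; it only prints the record count and the keys of the first record. The helper name `read_jsonl` is hypothetical.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Read a JSONL file into a list of dicts, skipping blank lines."""
    records = []
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


output_path = Path("liverag_step4.jsonl")
if output_path.exists():
    records = read_jsonl(output_path)
    print(len(records), "records")
    if records:
        print("fields:", sorted(records[0].keys()))
```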

