This repository contains information about the baselines used in the original XOR QA paper.
- Update as of 01/2020: Code for running the DPR-based baselines is ready.
- Update as of 04/2020: The final versions of the DPR models are now available. You can download them by running the scripts `download_multilingual_models.sh` and `download_trans_test_models.sh`.
- Update as of 04/2020: The final prediction files on XOR QA full (development set) are available here. We also release the retrieval and final QA prediction results of the DPR models here.
In our experiments, we tried three different models: a term-based retriever, a term-based retriever followed by a neural paragraph ranker, and an end-to-end neural retriever. The code for each baseline is available below:
- BM25: We use Elasticsearch's Python client to retrieve documents in English or in the target languages. The code is available here.
- Dense Passage Retriever (Karpukhin et al., 2020): The code (the original DPR implementation with some minor modifications) is here.
- Path Retriever (Asai et al., 2020)
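
As a rough illustration of the BM25 baseline (this is a minimal sketch, not the code in this repository), a query through Elasticsearch's Python client could look like the following. Elasticsearch scores `match` queries with BM25 by default. The index name, the `text` field, and the helper names `build_match_query` and `search_bm25` are assumptions made for this example.

```python
from typing import Any, Dict, List


def build_match_query(question: str, field: str = "text", k: int = 10) -> Dict[str, Any]:
    """Build a full-text `match` query body that returns the top-k hits.

    The field name "text" is a hypothetical choice for this sketch.
    """
    return {"size": k, "query": {"match": {field: question}}}


def search_bm25(client: Any, index: str, question: str,
                field: str = "text", k: int = 10) -> List[Dict[str, Any]]:
    """Run the query with an Elasticsearch client and flatten the hits.

    `client` is expected to behave like `elasticsearch.Elasticsearch`,
    i.e. to expose `search(index=..., body=...)`. Newer client versions
    may prefer passing `query=` and `size=` as keyword arguments instead.
    """
    response = client.search(index=index, body=build_match_query(question, field, k))
    return [
        {
            "id": hit["_id"],
            "score": hit["_score"],
            "text": hit["_source"].get(field, ""),
        }
        for hit in response["hits"]["hits"]
    ]
```

With a running Elasticsearch instance, `client` would be created with `elasticsearch.Elasticsearch(...)` and `index` would name a Wikipedia passage index in English or in a target language (for example, a hypothetical `wiki_ja`).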
For machine translation, we trained models with fairseq on the OPUS corpus and also used Hugging Face models from Helsinki-NLP. The source code is available at XOR_QA_MTPipeline.