Building a Swedish Question-Answering Model -- Datasets

Hannes von Essen, Daniel Hesslow

Datasets for the PaM2020 paper "Building a Swedish Question-Answering Model" [Paper link]

This repository contains the datasets for Swedish and Spanish question-answering generated from the SQuAD dataset using the novel cross-lingual projection method introduced in our PaM2020 paper.

They can be used to train an NLP model for extractive question answering, such as Multilingual BERT. Check out the Transformers library for more details.
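As a rough starting point for working with these files, the sketch below flattens a SQuAD-style JSON file into one record per answer and checks that each answer span matches its context. It assumes the files follow the SQuAD v1.1 layout (`data` → `paragraphs` → `qas` → `answers` with `text` and `answer_start`), which this README does not state explicitly. The helper names `load_squad_file`, `flatten_squad`, and `span_is_consistent` are hypothetical, and the toy example is made up rather than taken from the datasets.

```python
import json
from typing import Dict, List


def load_squad_file(path: str) -> dict:
    """Read a SQuAD-style JSON file, e.g. swedish_squad_train.json."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def flatten_squad(dataset: dict) -> List[Dict[str, object]]:
    """Flatten the nested layout (assumed SQuAD v1.1) into one record per answer."""
    records = []
    for article in dataset["data"]:
        for paragraph in article["paragraphs"]:
            context = paragraph["context"]
            for qa in paragraph["qas"]:
                for answer in qa.get("answers", []):
                    records.append({
                        "id": qa.get("id"),
                        "question": qa["question"],
                        "context": context,
                        "answer_text": answer["text"],
                        "answer_start": answer["answer_start"],
                    })
    return records


def span_is_consistent(record: Dict[str, object]) -> bool:
    """Check that the character offset really points at the answer text."""
    start = record["answer_start"]
    end = start + len(record["answer_text"])
    return record["context"][start:end] == record["answer_text"]


# Made-up toy example in the assumed layout, not an entry from the datasets.
toy = {
    "data": [{
        "title": "Stockholm",
        "paragraphs": [{
            "context": "Stockholm är Sveriges huvudstad.",
            "qas": [{
                "id": "toy-1",
                "question": "Vad är Sveriges huvudstad?",
                "answers": [{"text": "Stockholm", "answer_start": 0}],
            }],
        }],
    }]
}

records = flatten_squad(toy)
```

To try this on a real file, something like `flatten_squad(load_squad_file("swedish_squad_train.json"))` should work if the layout assumption holds. The flattened question, context, and answer-offset records are the usual input to the tokenization step of an extractive question-answering model.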

When used to train Multilingual BERT, the Spanish dataset achieves a new state of the art on the XQuAD and MLQA question-answering benchmarks.

Files in this repository:

Building_a_Swedish_Question_Answering_Model_Preprint.pdf
README.md
english+spanish_squad_train.json
english+swedish_squad_train.json
spanish_squad_train.json
swedish_squad_dev.json
swedish_squad_train.json

Repository: Vottivott/building-a-swedish-qa-model

Folders and files

Latest commit

History

Repository files navigation

Building a Swedish Question-Answering Model -- Datasets

Hannes von Essen, Daniel Hesslow

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Packages