Datasets for PaM2020 "Building a Swedish Question-Answering Model"

Vottivott/building-a-swedish-qa-model

Building a Swedish Question-Answering Model -- Datasets

Hannes von Essen, Daniel Hesslow

Datasets for PaM2020 "Building a Swedish Question-Answering Model" [Paper link]

This repository contains the datasets for Swedish and Spanish question-answering generated from the SQuAD dataset using the novel cross-lingual projection method introduced in our PaM2020 paper.

They can be used to train a model such as Multilingual BERT for extractive question answering. See the Transformers library for details.
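As a rough sketch of how the data might be prepared for such training, the snippet below flattens a SQuAD-style JSON structure into one record per question. It assumes the dataset files keep the SQuAD v1.1 layout (`data`, then `paragraphs`, then `qas`), which this README does not confirm. The names `flatten_squad` and `load_squad_file` and the sample record are hypothetical and used only for illustration.

```python
import json


def flatten_squad(squad):
    """Yield one flat record per question from a SQuAD-v1.1-style dict.

    Assumption: the input follows the SQuAD v1.1 JSON layout.
    """
    for article in squad["data"]:
        for paragraph in article["paragraphs"]:
            context = paragraph["context"]
            for qa in paragraph["qas"]:
                answers = qa.get("answers", [])
                yield {
                    "id": qa["id"],
                    "question": qa["question"],
                    "context": context,
                    "answer_texts": [a["text"] for a in answers],
                    "answer_starts": [a["answer_start"] for a in answers],
                }


def load_squad_file(path):
    """Read a SQuAD-style JSON file (hypothetical path) and flatten it."""
    with open(path, encoding="utf-8") as f:
        return list(flatten_squad(json.load(f)))


# Made-up in-memory record, not taken from the actual datasets.
sample = {
    "data": [
        {
            "title": "Stockholm",
            "paragraphs": [
                {
                    "context": "Stockholm är Sveriges huvudstad.",
                    "qas": [
                        {
                            "id": "example-0",
                            "question": "Vad är Sveriges huvudstad?",
                            "answers": [
                                {"text": "Stockholm", "answer_start": 0}
                            ],
                        }
                    ],
                }
            ],
        }
    ]
}

examples = list(flatten_squad(sample))

# Sanity check: each answer span must match the context at its offset.
for ex in examples:
    for text, start in zip(ex["answer_texts"], ex["answer_starts"]):
        assert ex["context"][start:start + len(text)] == text
```

Records in this flat shape (question, context, answer text, character offset) are what extractive question-answering fine-tuning pipelines typically consume, so checking the offsets before training can catch misaligned answer spans early.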

Multilingual BERT trained on the Spanish dataset achieves a new state of the art on the XQuAD and MLQA question-answering benchmarks.
