MultiSpanQA: A Dataset for Multi-Span Question Answering

H Li; M Vasardani; M Tomko; T Baldwin

Conference Proceedings

NAACL 2022: 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference | Association for Computational Linguistics (ACL) | Published: 2022

DOI: 10.18653/v1/2022.naacl-main.90

Open access

Abstract

Most existing reading comprehension datasets focus on single-span answers, which can be extracted as a single contiguous span from a given text passage. Multi-span questions, i.e., questions whose answer is a series of multiple discontiguous spans in the text, are common in real life but are less studied. In this paper, we present MultiSpanQA, a new dataset that focuses on questions with multi-span answers. Raw questions and contexts are extracted from the Natural Questions (Kwiatkowski et al., 2019) dataset. After multi-span re-annotation, MultiSpanQA consists of over 6,000 multi-span questions in the basic version, and over 19,000 examples with unanswerable questions, and questi..
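
To make the notion of a multi-span answer concrete, the following is a minimal sketch of how several discontiguous answer spans could be recovered from a token-level labelling of a passage. The B/I/O tagging scheme, the function name `extract_spans`, and the example sentence are assumptions chosen for illustration; they are not taken from the abstract and do not claim to reflect the dataset's actual format.

```python
from typing import List, Tuple


def extract_spans(tokens: List[str], tags: List[str]) -> List[Tuple[int, int, str]]:
    """Collect (start, end, text) for every answer span in a tagged passage.

    Assumed tagging scheme (illustrative only): "B" opens a span,
    "I" continues it, and "O" marks tokens outside any answer.
    The end index is exclusive.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    spans: List[Tuple[int, int, str]] = []
    start = None  # index where the currently open span began, if any

    for i, tag in enumerate(tags):
        if tag == "B":
            # A new span begins; close the previous one if it is still open.
            if start is not None:
                spans.append((start, i, " ".join(tokens[start:i])))
            start = i
        elif tag == "I":
            # Tolerate an "I" with no preceding "B" by opening a span here.
            if start is None:
                start = i
        else:
            # Any other tag is treated as "O" and closes an open span.
            if start is not None:
                spans.append((start, i, " ".join(tokens[start:i])))
                start = None

    # Close a span that runs to the end of the passage.
    if start is not None:
        spans.append((start, len(tokens), " ".join(tokens[start:])))
    return spans


# Hypothetical example: one question ("What colours are on the flag?")
# whose answer consists of three separate spans in the passage.
tokens = "The flag of Italy is green , white and red .".split()
tags = ["O", "O", "O", "O", "O", "B", "O", "B", "O", "B", "O"]

for start, end, text in extract_spans(tokens, tags):
    print(start, end, text)
```

A single-span question is the special case in which this function returns exactly one tuple, which is why a multi-span setting generalises the usual extractive one.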

University of Melbourne Researchers

Martin Tomko Author

Tim Baldwin Author

Related Projects (2)

Making human place knowledge digestible by computers

This project aims to develop the tools that will enable people to interact intuitively with computers about places and the relations between..

A High-Performance Cloud Resource for Computational Modelling

This project aims to build a relatively low-cost graphical-processing-unit-based cloud-accessible facility. Much current cutting-edge resear..

Grants

Awarded by Australian Research Council

Funding Acknowledgements

The authors would like to thank the anonymous reviewers for their constructive reviews. This research was undertaken using the LIEF HPC-GPGPU Facility hosted at the University of Melbourne. This Facility was established with the assistance of LIEF Grant LE170100200. This research was supported by Australian Research Council grant DP170100109.

Citation metrics

Scopus: 50

Web of Science: 13

Dimensions: 24

Keywords

Computer Science, Interdisciplinary Applications

Science & Technology

Linguistics

Social Sciences

Computer Science, Artificial Intelligence

Computer Science

Technology