
1 Background

Question Answering (QA) is a fundamental task in Artificial Intelligence whose goal is to build systems that can automatically answer natural language questions. In the last decade, the development of QA techniques has been greatly advanced by both academia and industry, and many QA-related topics have been well studied by researchers from all over the world.

To further advance QA-related research in China, we have organized this open-domain QA shared task series via NLPCC over the past several years. This year, we release the following three sub-tasks: (1) Chinese Knowledge-based Question Answering (KBQA); (2) Chinese Knowledge-based Question Generation (KBQG); and (3) English Knowledge-based Question Understanding (KBQU). Compared to the previous two editions of the shared task, we retain the KBQA task and add KBQG and KBQU as two new tasks. We add these two tasks because we believe that the capabilities of proactively asking questions and deeply understanding user utterances are essential to building human-computer interaction engines such as search engines, chitchat bots, and task bots.

2 Task Description

The NLPCC 2018 open-domain QA shared task includes two sub-tasks for Chinese, KBQA and KBQG, and one sub-task for English, KBQU.

2.1 KBQA Task

For the KBQA task, we provide a training set and a test set. In the training set, both questions and their gold answers are provided; in the test set, only questions are provided. The participating teams should predict an answer for each question in the test set, based on a given large-scale Chinese KB. If no answer can be predicted for a given question, the value of <answer id="X"> should simply be set to an empty string. The quality of a KBQA system is evaluated by answer exact match. An example from the training set is given below:

(Figure: a sample question-answer pair from the KBQA training set.)
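
Answer exact match treats a prediction as correct only when the predicted answer string is identical to the gold answer. The following is a minimal sketch of such an evaluator in Python; the data layout (dicts mapping question ids to answer strings) and function name are illustrative assumptions, not the official scoring script.

# Minimal sketch of answer exact-match evaluation for the KBQA task.
# Assumption (not the official scorer): gold and pred map each
# question id to an answer string; "" means no answer was predicted.

def exact_match_accuracy(gold: dict, pred: dict) -> float:
    """Fraction of questions whose prediction exactly matches the gold answer."""
    correct = 0
    for qid, gold_answer in gold.items():
        answer = pred.get(qid, "").strip()  # missing predictions count as empty
        if answer == gold_answer.strip():
            correct += 1
    return correct / len(gold) if gold else 0.0

# Example: one of two answers matches exactly, so the accuracy is 0.5.
print(exact_match_accuracy({"1": "北京", "2": "1989"}, {"1": "北京", "2": ""}))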

We provide a large-scale Chinese KB to the participating teams; it consists of knowledge triples crawled from the web. Each knowledge triple has the form <Subject, Predicate, Object>, where 'Subject' denotes a subject entity, 'Predicate' denotes a relation, and 'Object' denotes an object entity. A sample of the knowledge triples is given in Fig. 1, and the statistics of the Chinese KB are given in Table 1.

Fig. 1. An example of the Chinese KB.

Table 1. Statistics of the Chinese KB.
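
A simple way to use such a KB for answer prediction is to index its triples by subject entity, so that once a question's topic entity is identified, the candidate predicates and answer objects can be retrieved directly. Below is a minimal sketch of this indexing step; the file name and the assumption that each line stores one triple as three tab-separated fields are ours, since the exact layout of the released KB file may differ.

from collections import defaultdict

# Minimal sketch of indexing the Chinese KB for subject-based lookup.
# Assumption: each line of the KB file holds one <Subject, Predicate,
# Object> triple as three tab-separated fields.

def load_kb(kb_path: str) -> dict:
    """Map each subject entity to its list of (predicate, object) pairs."""
    index = defaultdict(list)
    with open(kb_path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip("\n").split("\t")
            if len(parts) != 3:
                continue  # skip malformed lines
            subject, predicate, obj = parts
            index[subject].append((predicate, obj))
    return index

# Usage (hypothetical file name): after recognizing the topic entity of a
# question, its answer candidates are the objects of that entity's triples.
# kb = load_kb("nlpcc-2018-kb.txt")
# candidates = kb.get("刘德华", [])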

2.2 KBQG Task

For the KBQG task, we provide a training set and a test set. In the training set, both triples and their gold questions are provided; in the test set, only triples are provided. The participating teams should generate a natural language question for each triple in the test set, such that the generated question can be answered by the object entity of the given triple. The quality of a KBQG system is evaluated by BLEU-4. An example from the training set is given below:

(Figure: a sample triple-question pair from the KBQG training set.)
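
BLEU-4 scores a generated question by its n-gram overlap (up to 4-grams) with the reference question. The sketch below uses NLTK's sentence-level BLEU; the character-level tokenization for Chinese and the smoothing method are our assumptions and may differ from the official evaluation settings.

from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

# Minimal sketch of BLEU-4 scoring for the KBQG task.
# Assumptions: Chinese text is tokenized at the character level, and
# smoothing method 1 keeps short questions from scoring zero when a
# higher-order n-gram has no match.

def bleu4(reference: str, hypothesis: str) -> float:
    ref_tokens = list(reference)  # character-level tokens
    hyp_tokens = list(hypothesis)
    return sentence_bleu(
        [ref_tokens],  # a list of reference token lists
        hyp_tokens,
        weights=(0.25, 0.25, 0.25, 0.25),  # uniform weights up to 4-grams
        smoothing_function=SmoothingFunction().method1,
    )

print(bleu4("刘德华的妻子是谁", "刘德华的妻子是谁"))  # 1.0 for an exact match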

2.3 KBQU Task

For the KBQU task, we provide a training set and a test set. In the training set, both questions and their gold logical forms are provided; in the test set, only questions are provided. The participating teams should predict a logical form for each question in the test set. The quality of a KBQU system is evaluated by logical form exact match. An example from the training set is given below:

<question id="X">

what is fight songs of Maryland

<logical form id="X">

(lambda ?x (sports.team.fight_song Maryland ?x))
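
Logical form exact match compares the predicted expression against the gold one as a whole. A minimal sketch is given below; the whitespace normalization applied before comparison is our assumption, since the official scorer may compare the raw strings.

# Minimal sketch of logical form exact-match evaluation for the KBQU task.
# Assumption (not the official scorer): two logical forms match if they
# are identical after collapsing runs of whitespace.

def normalize(logical_form: str) -> str:
    return " ".join(logical_form.split())  # collapse whitespace, trim ends

def exact_match(gold: str, predicted: str) -> bool:
    return normalize(gold) == normalize(predicted)

# The two forms below differ only in spacing, so they match.
print(exact_match(
    "(lambda ?x (sports.team.fight_song Maryland ?x))",
    "(lambda ?x  (sports.team.fight_song Maryland ?x))",
))  # True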

3 Evaluation Results

We received 19 submissions for the KBQA task; Table 2 lists the evaluation results.

Table 2. Evaluation results of the KBQA task.

We received 9 submissions for the KBQG task; Table 3 lists the evaluation results.

Table 3. Evaluation results of the KBQG task.

We received 3 submissions for the KBQU task; Table 4 lists the evaluation results.

Table 4. Evaluation results of the KBQU task.

4 Conclusion

This paper briefly presents an overview of this year's three open-domain QA shared tasks. In the future, we plan to build more datasets for QA research, such as multi-turn QA and cross-lingual QA datasets.