Abstract
We give an overview of the open-domain QA shared task at NLPCC 2018. This year, we release three sub-tasks: a Chinese knowledge-based question answering (KBQA) task, a Chinese knowledge-based question generation (KBQG) task, and an English knowledge-based question understanding (KBQU) task. The evaluation results of the final submissions from the participating teams are presented in the experimental part.
1 Background
Question Answering (QA) is a fundamental task in Artificial Intelligence, whose goal is to build systems that can automatically answer natural language questions. In the last decade, the development of QA techniques has been greatly promoted by both academia and industry, and many QA-related topics have been well studied by researchers from all over the world.
To further advance QA-related research in China, we have organized this open-domain QA shared task series via NLPCC over the past several years. This year, we release the following three sub-tasks: (1) Chinese Knowledge-based Question Answering (KBQA); (2) Chinese Knowledge-based Question Generation (KBQG); and (3) English Knowledge-based Question Understanding (KBQU). Compared to the previous two shared tasks, we retain the KBQA task and add KBQG and KBQU as two new tasks. We add these two tasks because the capabilities of proactively asking questions and deeply understanding user utterances are very important for building human-computer interaction engines such as search engines, chitchat bots, and task bots.
2 Task Description
The NLPCC 2018 open-domain QA shared task includes two sub-tasks for Chinese (KBQA and KBQG) and one sub-task for English (KBQU).
2.1 KBQA Task
For the KBQA task, we provide a training set and a test set. In the training set, both questions and their gold answers are provided; in the test set, only questions are provided. Participating teams should predict an answer for each question in the test set based on a given large-scale Chinese KB. If no answer can be predicted for a given question, the value of <answer id="X"> should simply be set to an empty string. The quality of a KBQA system is evaluated by answer exact match. An example from the training set is given below:
We provide a large-scale Chinese KB to the participating teams, which includes knowledge triples crawled from the web. Each knowledge triple has the form <Subject, Predicate, Object>, where 'Subject' denotes a subject entity, 'Predicate' denotes a relation, and 'Object' denotes an object entity. A sample of knowledge triples is given in Fig. 1, and the statistics of the Chinese KB are given in Table 1.
2.2 KBQG Task
For the KBQG task, we provide a training set and a test set. In the training set, both triples and their gold questions are provided; in the test set, only triples are provided. Participating teams should generate a natural language question for each triple in the test set, such that the generated question can be answered by the object entity of the given triple. The quality of a KBQG system is evaluated by BLEU-4. An example from the training set is given below:
2.3 KBQU Task
For the KBQU task, we provide a training set and a test set. In the training set, both questions and their gold logical forms are provided; in the test set, only questions are provided. Participating teams should predict a logical form for each question in the test set. The quality of a KBQU system is evaluated by logical form exact match. An example from the training set is given below:
<question id="X"> | what is fight songs of Maryland |
<logical form id="X"> | (lambda ?x (sports.team.fight_song Maryland ?x)) |
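A plausible reading of logical form exact match is token-level equality of the s-expression, insensitive to whitespace and parenthesis spacing. The sketch below implements that reading; whether the official scorer also normalizes variable names or argument order is an assumption we do not make here, so variable renaming counts as a mismatch.

```python
import re


def normalize(lf):
    """Tokenize an s-expression-style logical form.

    Parentheses become standalone tokens, so '( lambda ?x ... )' and
    '(lambda ?x ...)' normalize identically.
    """
    return re.findall(r"[()]|[^\s()]+", lf)


def exact_match(predicted, gold):
    """True iff the two logical forms have identical token sequences."""
    return normalize(predicted) == normalize(gold)
```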
4 Conclusion
This paper briefly introduces this year's three open-domain QA shared tasks. In the future, we plan to build more datasets for QA research in the Chinese QA field, such as a multi-turn QA dataset and a cross-lingual QA dataset.
© 2018 Springer Nature Switzerland AG
Duan, N. (2018). Overview of the NLPCC 2018 Shared Task: Open Domain QA. In: Zhang, M., Ng, V., Zhao, D., Li, S., Zan, H. (eds) Natural Language Processing and Chinese Computing. NLPCC 2018. Lecture Notes in Computer Science(), vol 11109. Springer, Cham. https://doi.org/10.1007/978-3-319-99501-4_43
Print ISBN: 978-3-319-99500-7
Online ISBN: 978-3-319-99501-4