WDAqua-core0: A Question Answering Component for the Research Community

Diefenbach, Dennis; Singh, Kamal; Maret, Pierre

doi:10.1007/978-3-319-69146-6_8

Part of the book series: Communications in Computer and Information Science (CCIS, volume 769)

Included in the following conference series:

Semantic Web Evaluation Challenge

Abstract

We describe and present a new Question Answering (QA) component that can be easily used by the QA research community.

It can be used to answer questions over DBpedia and Wikidata. Language support over DBpedia is restricted to English, while over Wikidata it can answer questions in four languages, namely English, French, German and Italian. Moreover, it supports both full natural language queries and keyword queries.

We describe the interfaces to access and reuse it and the services it can be combined with. Moreover, we show the evaluation results we achieved on the QALD-7 benchmark.

1 Introduction

Question answering (QA) is a long-standing research field in computer science. In the last two decades, thanks to the development of the Semantic Web, a large amount of new structured data has become available on the web in the form of knowledge bases (KBs). Nowadays, there are KBs about media, publications, geography, life science and more^{Footnote 1}. The idea behind a QA system over KBs is to find, in a KB, the information that the user requests in natural language. This is generally addressed by translating a natural language question into a SPARQL query that can be used to retrieve the desired information. We present here a QA component that answers questions over DBpedia and Wikidata and accepts both full natural language questions and keyword questions. It is integrated into the Qanary Ecosystem [4] so that, first, it can be easily reused by the research community and, second, it takes advantage of the services available in Qanary.

2 Related Work

In the context of QA, a large number of systems have been developed in recent years. For example, more than twenty QA systems were evaluated against the QALD benchmark^{Footnote 2}. While many systems query DBpedia, we are aware of only one system querying Wikidata, namely Platypus^{Footnote 3}. Moreover, most works address full natural language questions, while only a few address keyword questions. One exception is SINA [7].

The fact that QA systems often reuse existing techniques led to the idea of developing QA systems in a modular way. Four frameworks have tried to achieve this goal: QALL-ME [5], openQA [6], the Open Knowledge Base and Question-Answering (OKBQA) challenge^{Footnote 4} and Qanary [1, 4, 8]. We integrated our QA component into the Qanary Ecosystem since this makes it easily reusable by the research community and offers a series of off-the-shelf services related to QA systems.

3 Description of WDAqua-core0

Our SPARQL creation algorithm uses a combinatorial approach based on the semantics encoded in the underlying KB. The full details will be disclosed in an upcoming publication, as this is only a challenge submission. In the following, we briefly describe the capabilities of WDAqua-core0. WDAqua-core0 can answer questions on both DBpedia and Wikidata. Note that the Wikidata dump^{Footnote 5} contains binary and non-binary relationships. For example, the non-binary relationship stating that Berlin has been the capital of Germany since 1990 is represented in the dump in two versions.
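As an illustration of the two versions, the following minimal sketch writes them as plain Python tuples. It assumes the usual Wikidata RDF prefixes and the identifiers Q183 (Germany), P36 (capital), Q64 (Berlin) and P580 (start time). The statement node name `wds:Q183-example`, the simplified date literal and the helper `only_wdt` are placeholders introduced here for illustration, not values taken from the dump or from WDAqua-core0.

```python
# Simplified sketch of the two representations of "the capital of Germany
# is Berlin (since 1990)". The statement node identifier is a made-up
# placeholder, and the date literal is simplified to a plain string.
STATEMENT_VERSION = [
    ("wd:Q183", "p:P36", "wds:Q183-example"),    # Germany -> statement node
    ("wds:Q183-example", "ps:P36", "wd:Q64"),    # statement node -> Berlin
    ("wds:Q183-example", "pq:P580", '"1990"'),   # qualifier: start time
]

# The direct version drops the qualifier and links subject and object.
DIRECT_VERSION = [
    ("wd:Q183", "wdt:P36", "wd:Q64"),
]


def only_wdt(triples):
    """Keep only the triples whose predicate is in the wdt namespace."""
    return [t for t in triples if t[1].startswith("wdt:")]


if __name__ == "__main__":
    for triple in only_wdt(STATEMENT_VERSION + DIRECT_VERSION):
        print(" ".join(triple), ".")
```

Filtering on the predicate prefix keeps only the direct triple, which carries no temporal information.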

The first version uses properties with the namespaces p and ps, while the second loses the temporal information and uses the namespace wdt. WDAqua-core0 queries only the triples containing properties with the namespace wdt. WDAqua-core0 can answer both keyword questions and questions in natural language. The complexity of the generated queries is limited to queries containing at most two triple patterns. The generated queries can be of type SELECT or ASK. The modifiers are limited to the COUNT operator. Thus, questions with superlatives and comparatives can in general not be answered. Finally, it supports English on DBpedia and four languages over Wikidata, namely English, French, German and Italian. The evaluation is shown in Sect. 5.
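The query shapes described above can be sketched as follows. This is not the SPARQL creation algorithm of WDAqua-core0, whose details are not given here. It is only a hypothetical helper, `build_query`, that assembles strings of the permitted forms: SELECT or ASK, at most two triple patterns, and an optional COUNT. The Wikidata identifiers in the usage note (Q183 for Germany, P36 for capital, Q64 for Berlin) are illustrative.

```python
def build_query(patterns, form="SELECT", count=False, var="?x"):
    """Assemble a SPARQL string with at most two triple patterns.

    patterns: list of (subject, predicate, object) strings.
    form:     "SELECT" or "ASK".
    count:    wrap the projected variable in COUNT (SELECT only).
    """
    if not 1 <= len(patterns) <= 2:
        raise ValueError("only one or two triple patterns are supported")
    body = " ".join(f"{s} {p} {o} ." for s, p, o in patterns)
    if form == "ASK":
        return f"ASK WHERE {{ {body} }}"
    if form == "SELECT":
        projection = f"(COUNT({var}) AS ?count)" if count else var
        return f"SELECT {projection} WHERE {{ {body} }}"
    raise ValueError(f"unsupported query form: {form}")


if __name__ == "__main__":
    capital = [("wd:Q183", "wdt:P36", "?x")]
    print(build_query(capital))
    print(build_query(capital, count=True))
    print(build_query([("wd:Q183", "wdt:P36", "wd:Q64")], form="ASK"))
```

The strings omit PREFIX declarations, so an endpoint that does not predefine the wd and wdt prefixes would need them prepended.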

4 Integration in Qanary

Qanary is a framework for integrating QA components, with the goal of making existing research in the QA field reusable. The QA component presented here is integrated into Qanary. A running version is registered with the Qanary service running at:

http://www.wdaqua.eu/qanary

In particular, the component can be executed through RESTful interfaces. To run the service over a new question, the RESTful interface at:

http://www.wdaqua.eu/qanary/startquestionansweringwithtextquestion

can be used. Besides the generated answer, the top-30 generated queries can also be retrieved.
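A request to this interface could be prepared roughly as in the sketch below, which uses only the Python standard library. The form field names `question` and `componentlist[]` and the component name `wdaqua-core0` are assumptions made for illustration; they are not specified in this paper, and the deployed service may expect different parameters. The sketch only builds the request object; sending it is left to the caller.

```python
import urllib.parse
import urllib.request

# Endpoint as given in the text above.
ENDPOINT = "http://www.wdaqua.eu/qanary/startquestionansweringwithtextquestion"


def build_request(question, component="wdaqua-core0"):
    """Build (but do not send) a form-encoded POST request.

    The field names "question" and "componentlist[]" and the default
    component name are assumptions, not documented values.
    """
    form = urllib.parse.urlencode(
        [("question", question), ("componentlist[]", component)]
    ).encode("utf-8")
    return urllib.request.Request(ENDPOINT, data=form, method="POST")


if __name__ == "__main__":
    request = build_request("capital of Germany")
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
    # To actually call the service, if it is reachable:
    # with urllib.request.urlopen(request, timeout=30) as response:
    #     print(response.read().decode("utf-8"))
```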

The integration into Qanary allows WDAqua-core0 to be combined with the other components and services that are already integrated into Qanary. In particular, it can be combined with a speech recognition component and a language detection component. Additionally, it can be used together with a number of services built around Qanary. These include a reusable front-end called Trill [2]. A demo of Trill that uses WDAqua-core0 in the back-end can be found at www.wdaqua.eu/qa. Figure 1 shows a screenshot of Trill. Moreover, WDAqua-core0 can be used together with some user-feedback interfaces that are integrated into Trill [3]. One such feedback interface can be seen in Fig. 2. As a consequence, WDAqua-core0 can be used by end users and can, for example, help drive forward research in human-computer interaction. Finally, Qanary has an interface that allows QA pipelines to be evaluated using Gerbil for QA^{Footnote 6}. This means that WDAqua-core0 can be evaluated by the research community at any time, especially when new benchmarks arise.

5 Evaluation over QALD-7

In this section we show the results of WDAqua-core0 over QALD-7 task 1 and task 4. We evaluate over both the keyword and the full natural language questions.

Moreover, we extended the training set of task 4 and introduced a new type of multilingual QA benchmark. QALD-7 task 1 requires answering questions in multiple languages using data contained in the English DBpedia. In particular, using the Italian DBpedia to answer the Italian questions of QALD-7 task 1 does not work in general. The fact that the Italian questions must be answered using the English dataset forces systems to use translations. Instead, we translated the questions of QALD-7 task 4 into French, German and Italian and tried to answer them using Wikidata. This is fundamentally different, since in Wikidata the knowledge is the same across languages and only the labels change. In particular, no translation is required: one can answer the Italian questions using an Italian dataset.

The global (or macro) precision, recall and F-measure achieved over QALD-7 can be found in Table 1. Note that WDAqua-core0 does not use a machine learning algorithm, so over-fitting to the dataset is not an issue.
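For readers unfamiliar with macro-averaged measures, the sketch below shows one common way to compute them: precision, recall and F-measure are computed per question from the gold and system answer sets and then averaged over all questions. The treatment of empty answer sets (scored as 0) and the averaging of per-question F-measures are assumptions of this sketch; the exact conventions of the QALD-7 evaluation may differ.

```python
def question_scores(gold, predicted):
    """Return (precision, recall, F-measure) for a single question."""
    gold, predicted = set(gold), set(predicted)
    correct = len(gold & predicted)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


def macro_scores(pairs):
    """Average the per-question scores over (gold, predicted) pairs."""
    if not pairs:
        raise ValueError("at least one question is required")
    scores = [question_scores(gold, predicted) for gold, predicted in pairs]
    return tuple(sum(s[i] for s in scores) / len(scores) for i in range(3))


if __name__ == "__main__":
    # Toy answer sets, not benchmark data.
    toy = [({"a"}, {"a"}), ({"a", "b"}, {"a", "c"})]
    print(macro_scores(toy))
```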

Table 1. The table shows the results of WDAqua-core0 over the QALD-7 training set.

6 Conclusion

We have presented a QA component integrated into the Qanary Ecosystem that can be easily reused by the QA community. In particular, it can be used to push forward research in directions such as the integration of speech recognition systems with QA systems and the interaction with users.

We have evaluated the component against QALD-7 in multiple respects. We have shown its performance over both DBpedia and Wikidata with respect to keyword and full natural language queries. Moreover, we have introduced a new type of multilingual QA benchmark that does not require translation because the questions and the KB are in the same language, and we have reported our results over this benchmark.

References

1. Both, A., Diefenbach, D., Singh, K., Shekarpour, S., Cherix, D., Lange, C.: Qanary a methodology for vocabulary-driven open question answering systems. In: ESWC 2016 (2016)
2. Diefenbach, D., Amjad, S., Both, A., Singh, K., Maret, P.: Trill: a reusable front-end for QA systems. In: ESWC P&D (2017)
3. Diefenbach, D., Hormozi, N., Amjad, S., Both, A.: Introducing feedback in Qanary: how users can interact with QA systems. In: ESWC P&D (2017)
4. Diefenbach, D., Singh, K., Both, A., Cherix, D., Lange, C., Auer, S.: The Qanary ecosystem: getting new insights by composing question answering pipelines. In: Cabot, J., Virgilio, R., Torlone, R. (eds.) ICWE 2017. LNCS, vol. 10360, pp. 171–189. Springer, Cham (2017). doi:10.1007/978-3-319-60131-1_10
5. Ferrández, Ó., Spurk, C., Kouylekov, M., Dornescu, I., et al.: The QALL-ME framework: a specifiable-domain multilingual question answering architecture. J. Web Sem. 9(2) (2011). Elsevier
6. Marx, E., Usbeck, R., Ngonga Ngomo, A., Höffner, K., Lehmann, J., Auer, S.: Towards an open question answering architecture. In: SEMANTiCS (2014)
7. Shekarpour, S., Marx, E., Ngomo, A.C.N., Auer, S.: Sina: semantic interpretation of user queries for question answering on interlinked data. Web Semant. Sci. Serv. Agents World Wide Web 30 (2015)
8. Singh, K., Both, A., Diefenbach, D., Shekarpour, S.: Towards a message-driven vocabulary for promoting the interoperability of question answering systems. In: ICSC 2016 (2016)


Acknowledgments

Parts of this work received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 642795, project: Answering Questions using Web Data (WDAqua).

Author information

Authors and Affiliations

Université de Lyon, CNRS UMR 5516 Laboratoire Hubert Curien, 42023, Saint-Etienne, France
Dennis Diefenbach, Kamal Singh & Pierre Maret

Corresponding author

Correspondence to Dennis Diefenbach.

Editor information

Editors and Affiliations

Fondazione Bruno Kessler, Povo, Trento, Italy
Mauro Dragoni
University of Oxford, Oxford, United Kingdom
Monika Solanki
Linköping University, Linköping, Sweden
Eva Blomqvist

About this paper

Cite this paper

Diefenbach, D., Singh, K., Maret, P. (2017). WDAqua-core0: A Question Answering Component for the Research Community. In: Dragoni, M., Solanki, M., Blomqvist, E. (eds) Semantic Web Challenges. SemWebEval 2017. Communications in Computer and Information Science, vol 769. Springer, Cham. https://doi.org/10.1007/978-3-319-69146-6_8

DOI: https://doi.org/10.1007/978-3-319-69146-6_8
Published: 31 October 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-69145-9
Online ISBN: 978-3-319-69146-6
eBook Packages: Computer Science, Computer Science (R0)
