Abstract
In this paper we tackle the problem of answering SPARQL queries over virtually integrated databases. We assume that the entity resolution problem has already been solved and that explicit information is available about which records in the different databases refer to the same real-world entity. To the best of our knowledge, there has been no attempt to extend the standard Ontology-Based Data Access (OBDA) setting to take these database links into account for SPARQL query answering and consistency checking. This is partly because the built-in OWL property owl:sameAs, the most natural representation of links between datasets, is not included in OWL 2 QL, the de facto ontology language for OBDA. We formally treat several fundamental questions in this context: how links over database identifiers can be represented in terms of owl:sameAs statements, how to recover the rewritability of SPARQL into SQL (which is lost because of owl:sameAs statements), and how to check consistency. Moreover, we investigate how our solution can be made to scale up to large enterprise datasets. We have implemented the approach and carried out an extensive set of experiments showing its scalability.
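To make the first of these questions concrete, the following is a minimal sketch of how rows of a link table might be read as owl:sameAs statements and closed under symmetry and transitivity. It is not the approach of the paper, which works by rewriting SPARQL into SQL rather than by materializing the closure. It is a naive in-memory illustration only. The URI templates, the `http://example.org/` namespace, and the function names `links_to_same_as` and `equivalence_classes` are all assumptions introduced for this example.

```python
from collections import defaultdict

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def links_to_same_as(link_rows, template_a, template_b):
    """Turn (id_a, id_b) link-table rows into owl:sameAs triples.

    The templates are hypothetical URI patterns, one per database.
    """
    return [
        (template_a.format(a), OWL_SAME_AS, template_b.format(b))
        for a, b in link_rows
    ]


def equivalence_classes(same_as_triples):
    """Close owl:sameAs under symmetry and transitivity with union-find.

    Returns a dict mapping every linked URI to the frozenset of URIs
    that denote the same entity.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for subject, _, obj in same_as_triples:
        root_s, root_o = find(subject), find(obj)
        if root_s != root_o:
            parent[root_s] = root_o

    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)
    return {
        node: frozenset(members)
        for members in groups.values()
        for node in members
    }


# Hypothetical link tables: db1 <-> db2 and db2 <-> db3.
triples = links_to_same_as(
    [(1, "x7")], "http://example.org/db1/{}", "http://example.org/db2/{}"
) + links_to_same_as(
    [("x7", 42)], "http://example.org/db2/{}", "http://example.org/db3/{}"
)
classes = equivalence_classes(triples)
```

In this sketch, the record `1` in the first database and the record `42` in the third end up in the same class even though no link table relates them directly. Materializing such classes is exactly the cost that a rewriting-based approach tries to avoid on large datasets.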
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Calvanese, D., Giese, M., Hovland, D., Rezk, M. (2015). Ontology-Based Integration of Cross-Linked Datasets. In: Arenas, M., et al. The Semantic Web - ISWC 2015. ISWC 2015. Lecture Notes in Computer Science, vol 9366. Springer, Cham. https://doi.org/10.1007/978-3-319-25007-6_12
DOI: https://doi.org/10.1007/978-3-319-25007-6_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-25006-9
Online ISBN: 978-3-319-25007-6
eBook Packages: Computer Science, Computer Science (R0)