Selecting Materialized Views for RDF Data

Castillo, Roger; Leser, Ulf

doi:10.1007/978-3-642-16985-4_12

Roger Castillo¹⁸ &
Ulf Leser¹⁸

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 6385))

Included in the following conference series:

International Conference on Web Engineering

2421 Accesses
10 Citations

Abstract

In the design of a relational database, the administrator has to decide, given a fixed or estimated workload, which indexes should be created. This so called index selection problem is an non-trivial optimization problem in relational databases. In this paper we describe a novel approach for index selection on RDF data sets. We propose an algorithm to automatically suggest a set of indexes as materialized views based on a workload of SPARQL queries. The selected set of indexes aims to decrease the cost of the workload. We provide a cost model to estimate the potential impact of candidate indexes on query performance and an algorithm to select an optimal set of indexes. This algorithm may be integrated into an existing SPARQL query engine. We experimentally evaluate our approach on a standard query processor. We claim that our approach is the first comprehensive suggestion for the index selection problem in RDF.

Download to read the full chapter text

Chapter PDF

Optimizing Aggregate SPARQL Queries Using Materialized RDF Views

ROSIE: Runtime Optimization of SPARQL Queries over RDF Using Incremental Evaluation

Efficient SPARQL Query Evaluation via Automatic Data Partitioning

Keywords

References

Comer, D.: The difficulty of optimum index selection. ACM Trans. Database Syst. 3(4), 440–445 (1978)
Article Google Scholar
Manola, F., Miller, E.: RDF Primer, W3C Recommendation (February 2004)
Google Scholar
Prud’hommeaux, E., Seaborne, A.: SPARQL Query Language for RDF, W3C Recommendation (April 2008)
Google Scholar
Castillo, R., Leser, U., Rothe, C.: RDFMatView: Indexing RDF Data for SPARQL Queries. Technical report, Humboldt University of Berlin (2010)
Google Scholar
UniProt: RDF Dataset, http://dev.isb-sib.ch/projects/uniprot-rdf/
W3C SWEO Community Project, Linking open data on the semantic web, http://esw.w3.org/topic/sweoig/taskforces/communityprojects/linkingopendata/
Stenzhorn, H., Srinivas, K., Samwald, M., Ruttenberg, A.: Simplifying access to large-scale health care and life sciences datasets. In: Bechhofer, S., Hauswirth, M., Hoffmann, J., Koubarakis, M. (eds.) ESWC 2008. LNCS, vol. 5021, pp. 864–868. Springer, Heidelberg (2008)
Chapter Google Scholar
Caprara, A., Fischetti, M., Maio, D.: Exact and approximate algorithms for the index selection problem in physical database design. IEEE Transactions on Knowledge and Data Engineering 7(6), 955–967 (1995)
Article Google Scholar
Chaudhuri, S., Narasayya, V.R.: An Efficient Cost-Driven Index Selection Tool for Microsoft SQL Server. In: VLDB 1997: Proceedings of the 23rd International Conference on Very Large Data Bases, pp. 146–155. Morgan Kaufmann Publishers Inc., San Francisco (1997)
Google Scholar
Agrawal, S., Chaudhuri, S., Narasayya, V.R.: Automated Selection of Materialized Views and Indexes in SQL Databases. In: VLDB 2000: Proceedings of the 26th International Conference on Very Large Data Bases, pp. 496–505. Morgan Kaufmann Publishers Inc., San Francisco (2000)
Google Scholar
Groppe, J., Groppe, S., Ebers, S., Linnemann, V.: Efficient Processing of SPARQL Joins in Memory by Dynamically Restricting Triple Patterns. In: Proceedings of the 24th ACM Symposium on Applied Computing (ACM SAC 2009), Honolulu, Hawaii, March 8-12, pp. 1231–1238. ACM, New York (2009)
Google Scholar
Groppe, S., Groppe, J., Linnemann, V.: Using an Index of Precomputed Joins in order to speed up SPARQL Processing. In: Cardoso, J., Cordeiro, J., Filipe, J. (eds.) Proceedings 9th International Conference on Enterprise Information Systems, ICEIS 2007 (1), Funchal, Madeira, Portugal, June 12-16, DISI, pp. 13–20. INSTICC (2007)
Google Scholar
Jinghua Groppe, S.G., Kolbaum, J.: Optimization of SPARQL by Using coreSPARQL. In: Cordeiro, J., Filipe, J. (eds.) Proceedings of the 11th International Conference on Enterprise Information Systems (ICEIS 2009), Milano, Italien, Mai 6 - 10, DISI, pp. 107–112. INSTICC (2009)
Google Scholar
Olaf Hartig, R.H.: The SPARQL Query Graph Model for Query Optimization. In: Franconi, E., Kifer, M., May, W. (eds.) ESWC 2007. LNCS, vol. 4519, pp. 564–578. Springer, Heidelberg (2007)
Chapter Google Scholar
Matono, A., Amagasa, T., Yoshikawa, M., Uemura, S.: An Indexing Scheme for RDF and RDF Schema based on Suffix Arrays. In: Proceedings of SWDB 2003, pp. 151–168 (2003)
Google Scholar
Udrea, O., Pugliese, A., Subrahmanian, V.S.: GRIN: A Graph Based RDF Index. In: AAAI, pp. 1465–1470 (2007)
Google Scholar
Weiss, C., Karras, P., Bernstein, A.: Hexastore: sextuple indexing for semantic web data management. Proc. VLDB Endow. 1(1), 1008–1019 (2008)
Article Google Scholar
Yan, X., Yu, P.S., Han, J.: Graph indexing based on discriminative frequent structure analysis. ACM Trans. Database Syst. 30(4), 960–993 (2005)
Article Google Scholar
Neumann, T., Weikum, G.: RDF-3X: a RISC-style engine for RDF. Proc. VLDB Endow. 1(1), 647–659 (2008)
Article Google Scholar
Halevy, A.Y.: Answering queries using views: A survey. The VLDB Journal 10(4), 270–294 (2001)
Article MATH Google Scholar
Dantzig, G.: Linear Programming and Extensions. Princeton University Press, Princeton (August 1998)
MATH Google Scholar
Bizer, C., Schultz, A.: The Berlin SPARQL Benchmark. International Journal on Semantic Web and Information Systems - Special Issue on Scalability and Performance of Semantic Web Systems (2009)
Google Scholar
ARQJena: ARQ - A SPARQL Processor for Jena (2010), http://jena.sourceforge.net/ARQ/

Download references

Author information

Authors and Affiliations

Humboldt Universtiy of Berlin, Berlin, Germany
Roger Castillo & Ulf Leser

Authors

Roger Castillo
View author publications
You can also search for this author in PubMed Google Scholar
Ulf Leser
View author publications
You can also search for this author in PubMed Google Scholar

Editor information

Editors and Affiliations

Dipartimento di Ingegneria dell’Informazione, University of Trento, Via Sommarive 14, 38100, Povo, Italy
Florian Daniel
CREATE-NET, Via alla Cascata 56/D, 38123, Povo, Trento, Italy
Federico Michele Facca

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Castillo, R., Leser, U. (2010). Selecting Materialized Views for RDF Data. In: Daniel, F., Facca, F.M. (eds) Current Trends in Web Engineering. ICWE 2010. Lecture Notes in Computer Science, vol 6385. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16985-4_12

Download citation

DOI: https://doi.org/10.1007/978-3-642-16985-4_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16984-7
Online ISBN: 978-3-642-16985-4
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

Selecting Materialized Views for RDF Data

Abstract

Chapter PDF

Similar content being viewed by others

Optimizing Aggregate SPARQL Queries Using Materialized RDF Views

ROSIE: Runtime Optimization of SPARQL Queries over RDF Using Incremental Evaluation

Efficient SPARQL Query Evaluation via Automatic Data Partitioning

Keywords

References

Author information

Authors and Affiliations

Editor information

Editors and Affiliations

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Publish with us

Navigation

Selecting Materialized Views for RDF Data

Abstract

Chapter PDF

Similar content being viewed by others

Optimizing Aggregate SPARQL Queries Using Materialized RDF Views

ROSIE: Runtime Optimization of SPARQL Queries over RDF Using Incremental Evaluation

Efficient SPARQL Query Evaluation via Automatic Data Partitioning

Keywords

References

Author information

Authors and Affiliations

Editor information

Editors and Affiliations

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Share this paper

Publish with us

Search

Navigation