Abstract
The Resource Description Framework (RDF) is the W3C-recommended standard for data on the Semantic Web, and the SPARQL Protocol and RDF Query Language (SPARQL) is the query language used to retrieve RDF triples. RDF data often contain valuable information that can only be queried through filter functions. A SPARQL query can include filter clauses that define specific data criteria, such as full-text searches, numerical filtering, and constraints and relationships between data resources. The downside is that SPARQL filter queries frequently execute slowly. This paper presents FILT (Filtering Indexed Lucene Triples), a SPARQL filter query-processing engine for conventional triplestores. FILT is built on top of the Apache Lucene framework for storing and retrieving indexed documents, and it is compatible with unmodified SPARQL queries. The objective of FILT was to decrease the execution time of SPARQL filter queries. This was evaluated by benchmarking FILT against the Joseki triplestore in two use cases: SPARQL regular-expression filtering in medical data, and SPARQL numerical/logical filtering of geo-coordinates in geographical locations.
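To make the two filter styles named in the abstract concrete, the following is a minimal Python sketch, not FILT's implementation and not the Lucene API. It contrasts a naive scan that evaluates a regular-expression or numeric-range filter against every triple with a lookup in a precomputed token index, which is the general idea behind index-backed filtering. The triples, prefixes, and all function names are hypothetical and chosen only for illustration.

```python
import re
from collections import defaultdict

# Hypothetical toy data: (subject, predicate, object) triples.
TRIPLES = [
    ("ex:drug1", "rdfs:label", "Aspirin tablet"),
    ("ex:drug2", "rdfs:label", "Ibuprofen gel"),
    ("ex:city1", "geo:lat", 60.39),
    ("ex:city2", "geo:lat", 41.90),
]


def regex_filter(triples, predicate, pattern):
    """Naive scan, roughly analogous to FILTER regex(?o, pattern, "i")."""
    rx = re.compile(pattern, re.IGNORECASE)
    return [
        s for s, p, o in triples
        if p == predicate and isinstance(o, str) and rx.search(o)
    ]


def range_filter(triples, predicate, low, high):
    """Naive scan, roughly analogous to FILTER(?o >= low && ?o <= high)."""
    return [
        s for s, p, o in triples
        if p == predicate and isinstance(o, (int, float)) and low <= o <= high
    ]


def build_token_index(triples):
    """Build a toy inverted index: (predicate, lowercase token) -> subjects."""
    index = defaultdict(set)
    for s, p, o in triples:
        if isinstance(o, str):
            for token in o.lower().split():
                index[(p, token)].add(s)
    return index


def indexed_lookup(index, predicate, token):
    """Answer a whole-token text filter from the index, without a scan."""
    return sorted(index.get((predicate, token.lower()), set()))


if __name__ == "__main__":
    print(regex_filter(TRIPLES, "rdfs:label", "aspirin"))
    print(range_filter(TRIPLES, "geo:lat", 55, 65))
    print(indexed_lookup(build_token_index(TRIPLES), "rdfs:label", "Aspirin"))
```

The scan functions touch every triple on every query, whereas the index is built once and then answers a token query with a single dictionary lookup. The sketch only supports whole-token matches, so it is narrower than a general regular-expression filter.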
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Stuhr, M., Veres, C. (2013). FILT – Filtering Indexed Lucene Triples – A SPARQL Filter Query Processing Engine–. In: Larsen, H.L., Martin-Bautista, M.J., Vila, M.A., Andreasen, T., Christiansen, H. (eds) Flexible Query Answering Systems. FQAS 2013. Lecture Notes in Computer Science, vol 8132. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40769-7_3
DOI: https://doi.org/10.1007/978-3-642-40769-7_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40768-0
Online ISBN: 978-3-642-40769-7
eBook Packages: Computer Science, Computer Science (R0)