Abstract
Recently, Triple Pattern Fragments (TPFs) were introduced as a low-cost server-side interface for situations in which high numbers of clients need to evaluate SPARQL queries. Scalability is achieved by moving part of the query execution to the client, at the cost of elevated query times. Since the TPF interface purposely does not support complex constructs such as SPARQL filters, queries that use them need to be executed mostly on the client, resulting in long execution times. We therefore investigated the impact of adding a literal substring matching feature to the TPF interface, with the goal of improving query performance while maintaining low server cost. In this paper, we discuss the client/server setup and compare the performance of SPARQL queries on multiple implementations, including Elasticsearch and a case-insensitive FM-index. Our evaluations indicate that these improvements allow for faster query execution without significantly increasing the load on the server. Offering the substring feature on TPF servers allows users to obtain faster responses for filter-based SPARQL queries. Furthermore, substring matching can be used to support other filters such as complete regular expressions or range queries.
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Van Herwegen, J., De Vocht, L., Verborgh, R., Mannens, E., Van de Walle, R. (2015). Substring Filtering for Low-Cost Linked Data Interfaces. In: Arenas, M., et al. The Semantic Web - ISWC 2015. ISWC 2015. Lecture Notes in Computer Science, vol 9366. Springer, Cham. https://doi.org/10.1007/978-3-319-25007-6_8
DOI: https://doi.org/10.1007/978-3-319-25007-6_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-25006-9
Online ISBN: 978-3-319-25007-6
eBook Packages: Computer Science (R0)