Abstract
Huge RDF datasets are currently exchanged in textual RDF formats, so consumers must post-process them with RDF stores before local consumption, for example to index them and run SPARQL queries. This is a painful task that demands considerable time and computational resources. A first approach to lightweight data exchange is HDT, a compact (binary) RDF serialization format. In this paper, we show how to enhance the exchanged HDT with additional structures that support some basic forms of SPARQL query resolution without the need to "unpack" the data. Experiments show that i) exchange efficiency outperforms universal compression, ii) post-processing becomes a fast process, and iii) query performance at consumption is competitive.
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Martínez-Prieto, M.A., Arias Gallego, M., Fernández, J.D. (2012). Exchange and Consumption of Huge RDF Data. In: Simperl, E., Cimiano, P., Polleres, A., Corcho, O., Presutti, V. (eds) The Semantic Web: Research and Applications. ESWC 2012. Lecture Notes in Computer Science, vol 7295. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30284-8_36
DOI: https://doi.org/10.1007/978-3-642-30284-8_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-30283-1
Online ISBN: 978-3-642-30284-8
eBook Packages: Computer Science (R0)