Abstract
Graphs are increasingly used in analytic environments, with applications in areas such as social network analysis, fraud detection, industrial management, and knowledge analysis. Graph databases are an important option to consider for managing large datasets. The course addresses four important aspects of graph management. First, it characterizes graphs and the most common operations applied to them. Second, it reviews the technologies for graph management, focusing on the particular case of Sparksee. Third, it analyzes in depth some important applications and how graphs are used to solve them. Fourth, it explains how benchmarking is used to make the requirements of the user compatible with the growth of the technologies for graph management.
Copyright information
© 2014 Springer International Publishing Switzerland
About this chapter
Larriba-Pey, J.L., Martínez-Bazán, N., Domínguez-Sal, D. (2014). Introduction to Graph Databases. In: Koubarakis, M., et al. Reasoning Web. Reasoning on the Web in the Big Data Era. Reasoning Web 2014. Lecture Notes in Computer Science, vol 8714. Springer, Cham. https://doi.org/10.1007/978-3-319-10587-1_4
DOI: https://doi.org/10.1007/978-3-319-10587-1_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10586-4
Online ISBN: 978-3-319-10587-1
eBook Packages: Computer Science (R0)