Abstract
Migrating large-scale data sets (e.g., social graphs) from one cluster to another while maintaining high system uptime is a challenging task that requires fast bulk import. We address this problem with our “Demand-driven Bulk Loading” scheme, which is based on the data and query distributions tracked from Facebook’s social graphs. A client-side coordinator and a hybrid store consisting of MySQL and HBase engines work together to make small, “hot” data available quickly in MySQL and massive, “cold” data available incrementally in HBase on demand. The experimental results show that our approach achieves the fastest system start-up time while guaranteeing high query throughput.
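The division of labor between the coordinator and the hybrid store can be illustrated with a minimal sketch. It is not the implementation described in the paper: in-memory dictionaries stand in for the MySQL and HBase engines, and the class name `DemandDrivenLoader`, the `chunk_of` partitioning function, and the `hot_fraction` parameter are all hypothetical names chosen for illustration. The sketch assumes that "hot" keys are picked from an access log and loaded eagerly, while "cold" data is imported one chunk at a time when a query first touches it.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, Optional, Set


class DemandDrivenLoader:
    """Toy coordinator; dicts stand in for the MySQL and HBase engines."""

    def __init__(self,
                 source_chunks: Dict[int, Dict[str, str]],
                 chunk_of: Callable[[str], int]) -> None:
        # Data still sitting in the old cluster, grouped into chunks (assumption).
        self.source_chunks = source_chunks
        # Hypothetical partitioning function: key -> chunk id.
        self.chunk_of = chunk_of
        self.hot_store: Dict[str, str] = {}   # stands in for MySQL
        self.cold_store: Dict[str, str] = {}  # stands in for HBase
        self.loaded_chunks: Set[int] = set()

    def load_hot(self, access_log: Iterable[str], hot_fraction: float = 0.1) -> None:
        """Eagerly load the most frequently accessed keys into the hot store."""
        counts = Counter(access_log)
        total_keys = sum(len(chunk) for chunk in self.source_chunks.values())
        n_hot = max(1, int(total_keys * hot_fraction))
        for key, _ in counts.most_common(n_hot):
            chunk = self.source_chunks.get(self.chunk_of(key), {})
            if key in chunk:
                self.hot_store[key] = chunk[key]

    def get(self, key: str) -> Optional[str]:
        """Serve hot keys directly; import the owning cold chunk on first access."""
        if key in self.hot_store:
            return self.hot_store[key]
        chunk_id = self.chunk_of(key)
        if chunk_id not in self.loaded_chunks:
            self._load_chunk(chunk_id)
        return self.cold_store.get(key)

    def _load_chunk(self, chunk_id: int) -> None:
        """Bulk-import one chunk into the cold store and mark it as available."""
        self.cold_store.update(self.source_chunks.get(chunk_id, {}))
        self.loaded_chunks.add(chunk_id)
```

In this sketch the system can answer queries for hot keys as soon as `load_hot` returns, and each cold chunk becomes available only after the first query that needs it, which mirrors the fast and incremental availability described above in a highly simplified form.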
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Qu, W., Dessloch, S. (2014). A Demand-Driven Bulk Loading Scheme for Large-Scale Social Graphs. In: Manolopoulos, Y., Trajcevski, G., Kon-Popovska, M. (eds) Advances in Databases and Information Systems. ADBIS 2014. Lecture Notes in Computer Science, vol 8716. Springer, Cham. https://doi.org/10.1007/978-3-319-10933-6_11
DOI: https://doi.org/10.1007/978-3-319-10933-6_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10932-9
Online ISBN: 978-3-319-10933-6
eBook Packages: Computer Science (R0)