Abstract
Mining and analyzing graphs are challenging tasks, especially with today’s fast-growing graphs such as Web and social networks. In the case of Web and social networks an effective approach have been using compressed representations that enable basic navigation over the compressed structure. In this paper, we first present a parallel algorithm for reducing the number of edges of Web graphs adding virtual nodes over a cluster using BSP (Bulk Synchronous Processing) model. Applying another compression technique on edge-reduced Web graphs we achieve the best state-of-the-art space/time tradeoff for accessing out/in-neighbors. Second, we present a scalable parallel algorithm over BSP for extracting dense subgraphs and represent them with compact data structures. Our algorithm uses summarized information for implementing dynamic load balance avoiding idle time on processors. We show that our algorithms are scalable and keep compression efficiency.
Partially funded by Millennium Nucleus Information and Coordination in Networks ICM/FIC P10-024F.
Partially funded by FONDEF IDeA CA12i10314.
The original version of this chapter was revised: The copyright line was incorrect. This has been corrected. The Erratum to this chapter is available at DOI: 10.1007/978-3-319-02432-5_33
Access provided by Autonomous University of Puebla. Download to read the full chapter text
Chapter PDF
Similar content being viewed by others
References
Hernández, C., Navarro, G.: Compressed representations for web and social graphs. To appear in Knowledge and Information Systems (2013), http://springerlink.bibliotecabuap.elogim.com/article/10.1007/s10115-013-0648-4
Boldi, P., Rosa, M., Santini, M., Vigna, S.: Layered label propagation: A multiresolution coordinate-free ordering for compressing social networks. In: WWW, pp. 587–596 (2011)
Apostolico, A., Drovandi, G.: Graph compression by bfs. Algorithms 2(3), 1031–1044 (2009)
Brisaboa, N.R., Ladra, S., Navarro, G.: k2-Trees for compact web graph representation. In: Karlgren, J., Tarhio, J., Hyyrö, H. (eds.) SPIRE 2009. LNCS, vol. 5721, pp. 18–30. Springer, Heidelberg (2009)
Malewicz, G., Austern, M.H., Bik, A.J.C., Dehnert, J.C., Horn, I., Leiser, N., Czajkowski, G.: Pregel: A system for large-scale graph processing. In: SIGMOD Conference, pp. 135–146 (2010)
Kang, U., Tsourakakis, C.E., Faloutsos, C.: Pegasus: mining peta-scale graphs. Knowl. Inf. Syst. 27(2), 303–325 (2011)
Dean, J., Ghemawat, S.: Mapreduce: Simplified data processing on large clusters. In: OSDI, pp. 137–150 (2004)
Pace, M.F.: Bsp vs mapreduce. Procedia CS 9, 246–255 (2012)
Ladra, S.: Algorithms and compressed data structures for information retrieval. Ph.D. Thesis, University of A. Coruña (2011)
Hernández, C., Navarro, G.: Compressed representation of web and social networks via dense subgraphs. In: Calderón-Benavides, L., González-Caro, C., Chávez, E., Ziviani, N. (eds.) SPIRE 2012. LNCS, vol. 7608, pp. 264–276. Springer, Heidelberg (2012)
Claude, F., Ladra, S.: Practical representations for web and social graphs. In: CIKM, pp. 1185–1190 (2011)
Buehrer, G., Chellapilla, K.: A scalable pattern mining approach to web graph compression with communities. In: WSDM, pp. 95–106 (2008)
Maserrat, H., Pei, J.: Neighbor query friendly compression of social networks. In: KDD, pp. 533–542 (2010)
Schmidt, M.C., Samatova, N.F., Thomas, K., Park, B.-H.: A scalable, parallel algorithm for maximal clique enumeration. J. Parallel Distrib. Comput. 69(4), 417–428 (2009)
Kumar, R., Raghavan, P., Rajagopalan, S., Tomkins, A.: Trawling the Web for emerging cyber-communities. Computer Networks 31(11-16), 1481–1493 (1999)
Dourisboure, Y., Geraci, F., Pellegrini, M.: Extraction and classification of dense communities in the web. In: WWW, pp. 461–470 (2007)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hernández, C., Marín, M. (2013). Discovering Dense Subgraphs in Parallel for Compressing Web and Social Networks. In: Kurland, O., Lewenstein, M., Porat, E. (eds) String Processing and Information Retrieval. SPIRE 2013. Lecture Notes in Computer Science, vol 8214. Springer, Cham. https://doi.org/10.1007/978-3-319-02432-5_20
Download citation
DOI: https://doi.org/10.1007/978-3-319-02432-5_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-02431-8
Online ISBN: 978-3-319-02432-5
eBook Packages: Computer ScienceComputer Science (R0)