Abstract
In this paper we benchmark two distinct algorithms for extracting community structure from social networks represented as graphs, and we consider how an online social network (OSN) graph can be sampled representatively while maintaining its community structure. We also evaluate the optimum modularity value reached by each extraction algorithm for the number of communities, using five well-known benchmark datasets, two of which contain real OSN data. We further consider how the filtering and sampling criteria are assigned for each dataset. We find that the extraction algorithms work well for finding the major communities in both the original and the sampled datasets. The quality of the results is measured with a Normalized Mutual Information (NMI) type metric, which identifies the degree of correspondence between the communities generated from the original data and those generated from the sampled data. We find that a representative sampling is possible that preserves the key community structures of an OSN graph, significantly reducing computational cost and making the resulting graph structure easier to visualize. Finally, we compare the communities generated by each algorithm and identify their degree of correspondence.
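As an illustration of the kind of comparison described above, the following minimal sketch computes one common NMI variant, 2·I(A;B) / (H(A) + H(B)), between the community labels found on an original graph and those found on a sample. The abstract only speaks of an "NMI type metric", so the exact normalization, the function name `nmi`, and the dictionary input format are assumptions made here for illustration, not the authors' implementation.

```python
from collections import Counter
from math import log


def nmi(labels_a, labels_b):
    """Normalized mutual information between two community assignments.

    labels_a, labels_b: dicts mapping node -> community id (hypothetical
    input format). Only nodes present in both dicts are compared, e.g. the
    nodes that survive sampling.
    Assumed normalization: 2 * I(A;B) / (H(A) + H(B)).
    """
    common = set(labels_a) & set(labels_b)
    n = len(common)
    if n == 0:
        raise ValueError("the two assignments share no nodes")

    # Community sizes in each assignment and sizes of their overlaps.
    count_a = Counter(labels_a[v] for v in common)
    count_b = Counter(labels_b[v] for v in common)
    count_ab = Counter((labels_a[v], labels_b[v]) for v in common)

    def entropy(counts):
        return -sum((c / n) * log(c / n) for c in counts.values())

    h_a, h_b = entropy(count_a), entropy(count_b)
    mutual = sum(
        (c / n) * log(c * n / (count_a[a] * count_b[b]))
        for (a, b), c in count_ab.items()
    )
    if h_a + h_b == 0:
        # Both assignments put every shared node in a single community.
        return 1.0
    return 2 * mutual / (h_a + h_b)


# Toy example: communities on the full graph vs. on a 3-node sample.
original = {1: "x", 2: "x", 3: "y", 4: "y"}
sampled = {1: 0, 2: 0, 3: 1}
score = nmi(original, sampled)
```

A score near 1 indicates that the sampled graph reproduces the community assignment of the shared nodes, while a score near 0 indicates that the two assignments are statistically independent. Community labels do not need to match by name, only by grouping.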
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Arqué, N.M., Nettleton, D.F. (2012). Analysis of On-Line Social Networks Represented as Graphs – Extraction of an Approximation of Community Structure Using Sampling. In: Torra, V., Narukawa, Y., López, B., Villaret, M. (eds) Modeling Decisions for Artificial Intelligence. MDAI 2012. Lecture Notes in Computer Science, vol 7647. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34620-0_15
DOI: https://doi.org/10.1007/978-3-642-34620-0_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-34619-4
Online ISBN: 978-3-642-34620-0
eBook Packages: Computer Science, Computer Science (R0)