Network-based naive Bayes model for social network

Huang, Danyang; Guan, Guoyu; Zhou, Jing; Wang, Hansheng

doi:10.1007/s11425-017-9209-6

Published: 29 December 2017

Volume 61, pages 627–640, (2018)

Science China Mathematics

Danyang Huang¹,
Guoyu Guan²,
Jing Zhou¹ &
…
Hansheng Wang³

Abstract

Naive Bayes (NB) is one of the most popular classification methods. It is particularly useful when the dimension of the predictor is high and the data are generated independently. Meanwhile, social network data are becoming increasingly accessible, owing to the rapid development of social network services and websites. In contrast to the independence assumed by NB, data generated by a social network are most likely dependent, and the dependency is mainly determined by the social network relationships. How to extend the classical NB method to social network data is therefore a problem of great interest. To this end, we propose a network-based naive Bayes (NNB) method, which generalizes the classical NB model to social network data. The key advantage of the NNB method is that it takes the network relationships into consideration. Its computational efficiency makes the NNB method feasible even for large-scale social networks. The statistical properties of the NNB model are investigated theoretically. Simulation studies are conducted to demonstrate its finite-sample performance. A real data example is also analyzed for illustration.
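
For orientation, the classical NB classifier that the abstract takes as its starting point can be sketched as follows. This is a minimal illustration only, assuming binary predictors, Laplace smoothing, and a toy data set; the function names (fit_bernoulli_nb, log_posterior, predict) and the data are hypothetical and are not taken from the paper. The sketch treats observations as independent, which is the assumption the NNB method is meant to relax. The abstract does not specify the form of the NNB model, so it is not attempted here.

```python
import math

def fit_bernoulli_nb(X, y, alpha=1.0):
    """Estimate class priors and per-feature Bernoulli parameters.

    X is a list of binary feature vectors, y the list of class labels.
    alpha is a Laplace smoothing constant (an assumption of this sketch).
    """
    classes = sorted(set(y))
    n_features = len(X[0])
    priors, theta = {}, {}
    for c in classes:
        rows = [x for x, label in zip(X, y) if label == c]
        priors[c] = len(rows) / len(X)
        # Smoothed estimate of P(x_j = 1 | class c) for each feature j.
        theta[c] = [
            (sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
            for j in range(n_features)
        ]
    return priors, theta

def log_posterior(x, priors, theta):
    """Unnormalized log posterior of each class, treating features as
    conditionally independent given the class (the "naive" assumption)."""
    scores = {}
    for c in priors:
        s = math.log(priors[c])
        for xj, p in zip(x, theta[c]):
            s += math.log(p if xj else 1.0 - p)
        scores[c] = s
    return scores

def predict(x, priors, theta):
    """Return the class with the largest log posterior."""
    scores = log_posterior(x, priors, theta)
    return max(scores, key=scores.get)

# Toy data (hypothetical): six observations, three binary predictors.
X_train = [[1, 1, 0], [1, 0, 0], [1, 1, 1], [0, 0, 1], [0, 1, 1], [0, 0, 0]]
y_train = [1, 1, 1, 0, 0, 0]

priors, theta = fit_bernoulli_nb(X_train, y_train)
label = predict([1, 1, 0], priors, theta)
```

Because the log posterior is a sum over features, fitting and prediction cost time linear in the number of predictors, which is why NB remains practical when the predictor dimension is high.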


Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant Nos. 11701560, 11501093, 11631003, 11690012, 71532001, 11525101), the Fundamental Research Funds for the Central Universities and the Research Funds of Renmin University of China (Grant No. 16XNLF01), the Beijing Municipal Social Science Foundation (Grant No. 17GLC051), the Fund for Building World-Class Universities (Disciplines) of Renmin University of China, the Fundamental Research Funds for the Central Universities (Grant Nos. 130028613, 130028729 and 2412017FZ030), China’s National Key Research Special Program (Grant No. 2016YFC0207700) and the Center for Statistical Science at Peking University.

Author information

Authors and Affiliations

School of Statistics, Renmin University of China, Beijing, 100872, China
Danyang Huang & Jing Zhou
KLAS of MOE, and School of Economics, Northeast Normal University, Changchun, 130034, China
Guoyu Guan
Guanghua School of Management, Peking University, Beijing, 100872, China
Hansheng Wang

Corresponding author

Correspondence to Guoyu Guan.

About this article

Cite this article

Huang, D., Guan, G., Zhou, J. et al. Network-based naive Bayes model for social network. Sci. China Math. 61, 627–640 (2018). https://doi.org/10.1007/s11425-017-9209-6

Received: 22 June 2017
Accepted: 18 November 2017
Published: 29 December 2017
Issue Date: April 2018
DOI: https://doi.org/10.1007/s11425-017-9209-6
