Abstract
Learning Bayesian network structure is one of the most exciting challenges in machine learning. Discovering a correct skeleton of a directed acyclic graph (DAG) is the foundation of dependency-analysis algorithms for this problem. Because high-order conditional independence (CI) tests are unreliable, the key to an efficient and accurate dependency-analysis algorithm is to perform as few CI tests as possible and to keep the conditioning sets as small as possible. Motivated by these observations and inspired by the PC algorithm, we present an algorithm, named fast and efficient PC (FEPC), for learning the adjacent neighbourhood of every variable. FEPC performs the CI tests in three kinds of orders, which significantly reduces the number of high-order CI tests. The experimental results show that, compared with existing algorithms, FEPC achieves better accuracy with fewer CI tests and smaller conditioning sets. The largest reduction in the number of CI tests is 83.3% for FEPC compared with the PC algorithm.
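The PC-style skeleton search that FEPC builds on can be sketched as follows. This is a minimal generic PC skeleton with a Fisher-z CI test on Gaussian data, not FEPC itself; the function names and the three-variable chain example are illustrative assumptions, and FEPC's three test orderings are not reproduced here.

```python
import math
from itertools import combinations
import numpy as np

def ci_test(data, i, j, cond, alpha=0.05):
    """Fisher-z CI test via partial correlation.
    Returns True when X_i is judged independent of X_j given cond."""
    idx = [i, j] + list(cond)
    sub = np.corrcoef(data[:, idx], rowvar=False)
    prec = np.linalg.pinv(sub)                     # precision matrix of the subset
    r = -prec[0, 1] / math.sqrt(prec[0, 0] * prec[1, 1])
    r = max(-0.999999, min(0.999999, r))           # guard the log below
    n = data.shape[0]
    z = 0.5 * math.log((1 + r) / (1 - r)) * math.sqrt(n - len(cond) - 3)
    p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return p > alpha                               # independent if we fail to reject

def pc_skeleton(data, alpha=0.05):
    """PC-style skeleton search: try conditioning sets of increasing order,
    drawn only from the current neighbours of each tested edge."""
    d = data.shape[1]
    adj = {v: set(range(d)) - {v} for v in range(d)}
    order, n_tests = 0, 0
    while any(len(adj[v]) - 1 >= order for v in range(d)):
        for i in range(d):
            for j in sorted(adj[i]):
                if j < i:
                    continue
                for cond in combinations(sorted(adj[i] - {j}), order):
                    n_tests += 1
                    if ci_test(data, i, j, cond, alpha):
                        adj[i].discard(j)          # remove the edge i-j
                        adj[j].discard(i)
                        break
        order += 1
    return adj, n_tests
```

On a chain X → Y → Z, the edge X-Z survives the order-0 (marginal) tests but is removed by the order-1 test conditioning on Y, which illustrates why keeping conditioning sets small and few is the main cost lever that FEPC targets.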
Additional information
Foundation item: Supported by the National Natural Science Foundation of China (61403290, 11301408, 11401454), the Foundation for Youths of Shaanxi Province (2014JQ1020), the Foundation of Baoji City (2013R7-3) and the Foundation of Baoji University of Arts and Sciences (ZK15081)
Biography: LI Yanying, female, Ph.D. candidate, research direction: Bayesian network.
Cite this article
Li, Y., Yang, Y., Zhu, X. et al. Towards fast and efficient algorithm for learning Bayesian network. Wuhan Univ. J. Nat. Sci. 20, 214–220 (2015). https://doi.org/10.1007/s11859-015-1084-y