Structure Learning of Bayesian Network from the Data

Bhimagavni, Naveen Kumar; Adilakshmi, T.

doi:10.1007/978-981-99-2742-5_35

Part of the book series: Cognitive Science and Technology ((CSAT))

Included in the following conference series:

International Conference on Information and Management Engineering

279 Accesses

Abstract

Learning the structure of Bayesian network is a two-step process, one is parameter learning, and other is finding the best structure among search space using uncertain and incomplete data. Structure learning is the most important and complex task (NP hard problem) in estimation theory. However, existing techniques such as K3 algorithm require topological order of the nodes or a constraint on the maximum number of parents for a node, and also, most of them are generating all possible graphs even for small number of random variables and consume large amount of space and time complexity (Buntine in IEEE Trans Knowl Data Eng 8:195–210 [1]; Chickering in Learning from data. Springer, pp 121–130 [2]) to verify each of the structure. In this work, we propose an algorithm that generates comparatively small number of graphs as a heuristic search technique, and Bayesian score is calculated for each candidate structure to find the best network structure. The proposed algorithm consumes comparatively less time complexity to discover network structure using the data.

Access provided by Autonomous University of Puebla. Download conference paper PDF

A Structure Learning Algorithm for Bayesian Network Using Prior Knowledge

Article 08 July 2015

Structure learning of Bayesian Networks using global optimization with applications in data classification

Article 04 October 2014

A survey of Bayesian Network structure learning

Article Open access 17 January 2023

Keywords

1 Introduction

Bayesian network is a compact representation of joined probability distribution and referred as directed acyclic graph (DAG) that consists of vertices and edges. Each vertex denotes a random variable, and edge from (A → B) represents the conditional dependency between random variables A and B. BN is a compact representation of joint probability distribution of all random variables [3].

1.1 Structure Learning

Bayesian network can be constructed manually by domain experts using prior structure knowledge especially in medical domain. The structure can then be refined manually or by automatically using techniques such as refinement algorithm and ExpertBayes [4]. However, manual construction may not be feasible for all domains. Another approach is to use available information. Structure of BN can also be estimated directly from the data. There are two approaches for the structure learning: score-based approach and constraint-based approach [5].

Constraint-Based Approach

The constraint-based case employs the independence test [6] to identify a set of edge constraints [7] for the graph and then finds the best DAG that satisfies the constraints [8].

Score-Based Approach

The score-based approach first defines a criterion to evaluate how well the Bayesian network fits the data and then searches over the space of DAGs [9] for a structure with maximal score [10].

Most of the existing techniques [1] follow the below steps to find the best structure. First, generate all possible graphs (which consumes large time complexity [2]) and then apply any scoring function for each candidate structure; the structure with the highest Bayesian score is considered as the best Bayesian network structure. However, the scoring-based approach generates exponent number of candidate graphs.

2 Proposed Algorithm

In this work, a new heuristic search technique is proposed that generates optimal number of graphs, and then, scoring function is applied on each of the graph to find the best structure.

Step 1: Generate optimal number of candidate structure.
Step 2: Calculate Bayesian score for each candidate structure using sufficient statistics of the input data.
Step 3: The structure with the highest Bayesian score is the best Bayesian network structure.

Algorithm 1: Proposed algorithm

Data: Input data for covid 19 Result: Bayesian Network structure 1. begin 2. Read Input data for covid 19; 3. Best_Score = 0; 4. Best_structure = 0; 5. Calculate Sufficient statistics (S) for the input data; 6. Call G = GenerateOptimalGraphs(n); // n is number of random variables in the Input data 7. for each graph g_i in G begin 8. Calculate Bayesian Score(BS(g_i))for each graph g_i using Sufficient statistics (S); 9. if BS(g_i) > Best_Score then Best_Score = BS(g_i); Best_structure = g_i end 10. end

3 Calculate Bayesian Score for Each Structure

Bayesian score [3] can be used as scoring function for all possible graphs [11, 12]. It is the function which takes one graph as an input and computes the Bayesian score for that structure [13, 14].

$$ P\left( \frac{G}{D} \right) = \frac{{P\left( \frac{D}{G} \right)P\left( G \right)}}{P\left( D \right)} $$

(1)

P(D | G) is the marginal likelihood of the graph structure G. P(G) is the prior probability of the graph structure and can be considered as uniform distribution G in the candidate structure space. P(D) is the probability of the data which acts as a normalizing constants the score function can be written as below; it primarily depends on the marginal likelihood P(D | G)

$$ {\text{ScoreB}}\left( {G:D} \right) = \log P\left( \frac{D}{G} \right) $$

(2)

3.1 Data Preprocessing

The data was collected from source kaggle [15] for the pandemic disease Covid-19 that consists of 35 million records; the dataset contains 27 features with missing values. These values are handled using Python package. Test data contains missing values for mixed data types (numeric or strings), and SimpleImputer from the Python package sklearn.impute can be used.

Ex: imp_mean = SimpleImputer (missing_values=np.nan, strategy=“most_frequent”)

When strategy is set to “most_frequent”, it replaces missing values using the most frequent value along each column. It can be used with strings or numeric data (mixed data types) (Fig. 1).

An illustration of the Bayesian network by the proposed algorithm of the covid dataset. The K leads to D then further to R, Z, and w. The B and S lead to Y, C leads to T and M, X leads to F and E, J leads to U, and O lead to V. The N leads to A, Q, and Z. The G, H, I, L, and P are single datasets. — **Fig. 1**

4 Goodness Fit of the Model

Proposed heuristic algorithm can be compared with the existing structure learning algorithms such as hill-climbing learnt by bnlearn package. Bayesian network models can be compared with the following metrics.

1.
score comparison
2.
hold-out cross-validation
3.
precision.

1.
Score Comparison

The Bayesian scoring function is used to compute the score of the Bayesian network with 27 nodes learnt using bnlearn package (hill-climbing approach)

$$ {\text{score}}\left( {{\text{bnlearn}}\_{\text{BN}},26\;{\text{nodes}}} \right) = 3240.513 $$

Score of the Bayesian network with 20 nodes learnt using heuristic algorithm

$$ {\text{score}}\left( {{\text{heuristic}}\_{\text{BN}},26\;{\text{nodes}}} \right) = 3576.921 $$

2.
Hold-Out Cross-Validation

Cross-validation is a standard method used to estimate a model’s goodness of fit. In k-fold cross-validation, data is randomly divided into k subsets. For each subset k, a model is evaluated on k having been trained on (k − 1) subsets (Fig. 2; Tables 1 and 2).

A graph plots log-likelihood loss versus cluster B N and bnlearn B N. The expected loss for bnlearn is 14.80372 and the expected loss for cluster B N is 14.26. All values are approximated. — **Fig. 2**

Table 1 Expected loss for the Bayesian network created by bnlearn package

Full size table

Table 2 Expected loss for the Bayesian network created by heuristic algorithm

Full size table

3.
Precision Analysis

Precision: It computes the proportion of positive identifications was actually correct out of total predicted positives. 3 × 3 confusion matrix created for the node sore throat which has three levels, namely low, moderate and high, diagonal elements represent the true positives along that column, and the remaining values in the same column represent false positives. Precision values have been computed with various threshold values ranging from 0.2 to 1.0 for the below confusion matrices from precision comparison graph shown in Fig. 3 (Table 3).

A line graph plots precision versus threshold levels. The line starts at 0.94, drops to 0.87, rises to 0.95, again drops to 0.74, then rises to 0.93, and again drops. All values are estimated. — **Fig. 3**

Table 3 3 × 3 confusion matrix created for the node sore throat

Full size table

5 Conclusion

The proposed algorithm generates less number of candidate structures, thereby consumes comparatively less time complexity and finds the best structure in optimal time. Bayesian score is used as a scoring function to compute the score of the each candidate structure. The goodness fit of the best Bayesian network structure is measured in terms of Bayesian score, expected loss and precision values. It is observed that score of the best Bayesian network structure has improved by 10%. At the same time, expected loss is reduced by 1.0098 percentages.

However, present work can be extended for large number of nodes in the graph, and Bayesian score can be calculated for Bayesian networks having both discrete and continuous random variables.

References

Buntine W (1996) A guide to the literature on learning probabilistic networks from data. IEEE Trans Knowl Data Eng 8(2):195–210
Article Google Scholar
Chickering DM (1996) Learning Bayesian networks is NP-complete. In: Learning from data. Springer, pp 121–130
Google Scholar
Koller D, Friedman N (2009) Probabilistic graphical models: principles and techniques. MIT Press
Google Scholar
Beretta S, Castelli M, Gonçalves I, Henriques R, Ramazzotti D. Learning the structure of Bayesian networks: a quantitative assessment of the effect of different algorithmic schemes
Google Scholar
Bernstein D, Saeed B, Squires C, Uhler C (2020, June) Ordering-based causal structure learning in the presence of latent variables. In: International conference on artificial intelligence and statistics, pp 4098–4108. PMLR
Google Scholar
Chen Y, Tian J (2014, June) Finding the k-best equivalence classes of Bayesian network structures for model averaging. Proc AAAI Conf Artif Intell 28(1)
Google Scholar
Chen EYJ, Shen Y, Choi A, Darwiche A (2016) Learning Bayesian networks with ancestral constraints. Adv Neural Inf Process Syst 29:2325–2333
Google Scholar
Cheng J, Greiner R, Kelly J, Bell D, Liu W (2002) Learning Bayesian networks from data: an information-theory based approach. Artif Intell 137(1–2):43–90
Article MathSciNet MATH Google Scholar
Chickering D (2002) Learning equivalence classes of Bayesian-network structures. J Mach Learn Res 2:445–498
MathSciNet MATH Google Scholar
Chickering DM, Meek C (2002, August) Finding optimal Bayesian networks. In: Proceedings of the eighteenth conference on uncertainty in artificial intelligence, pp 94–102
Google Scholar
Chobtham K, Constantinou AC (2020) Bayesian network structure learning with causal effects in the presence of latent variables. In: Proceedings of the 10th international conference on probabilistic graphical models, in proceedings of machine learning research, vol 138, pp 101–112
Google Scholar
Claassen T, Mooij JM, Heskes T (2013, August) Learning sparse causal models is not NP-hard. In: Proceedings of the twenty-ninth conference on uncertainty in artificial intelligence, pp 172–181
Google Scholar
Colombo D, Maathuis MH (2014) Order-independent constraint-based causal structure learning. J Mach Learn Res 15(1):3741–3782
MathSciNet MATH Google Scholar
Colombo D, Maathuis MH, Kalisch M, Richardson TS (2012) Learning high-dimensional directed acyclic graphs with latent and selection variables. Ann Stat:294–321
Google Scholar
Kaggle data source for Covid19. https://www.kaggle.com/iamhungundji/covid19-symptoms-checker

Download references

Author information

Authors and Affiliations

University College of Engineering, Osmania University, Hyderabad, India
Naveen Kumar Bhimagavni
Vasavi College of Engineering, Osmania University, Hyderabad, India
T. Adilakshmi

Authors

Naveen Kumar Bhimagavni
View author publications
You can also search for this author in PubMed Google Scholar
T. Adilakshmi
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to T. Adilakshmi .

Editor information

Editors and Affiliations

BioAxis DNA Research Centre Private Limited, Hyderabad, Andhra Pradesh, India
Amit Kumar
Department of Computer Science, Brunel University, Uxbridge, UK
Gheorghita Ghinea
CMR College of Engineering and Technology, Hyderabad, India
Suresh Merugu

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Bhimagavni, N.K., Adilakshmi, T. (2023). Structure Learning of Bayesian Network from the Data. In: Kumar, A., Ghinea, G., Merugu, S. (eds) Proceedings of the 2nd International Conference on Cognitive and Intelligent Computing. ICCIC 2022. Cognitive Science and Technology. Springer, Singapore. https://doi.org/10.1007/978-981-99-2742-5_35

Download citation

DOI: https://doi.org/10.1007/978-981-99-2742-5_35
Published: 27 September 2023
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-2741-8
Online ISBN: 978-981-99-2742-5
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

Structure Learning of Bayesian Network from the Data

Abstract

Similar content being viewed by others

A Structure Learning Algorithm for Bayesian Network Using Prior Knowledge

Structure learning of Bayesian Networks using global optimization with applications in data classification

A survey of Bayesian Network structure learning

Keywords