1 Introduction

The Simple Propagation (SP) algorithm was introduced in [1] as a new algorithm for belief update in Bayesian networks. SP is similar to Lazy Propagation (LP) in that it proceeds by message passing in a join tree or a junction tree, taking advantage of a decomposition of clique and separator potentials, but it takes a simpler approach to the computation of potentials in the message passing phase of belief update. In SP, message construction is based on a one-in, one-out principle, meaning a potential must have at least one non-evidence variable in the separator and at least one non-evidence variable not in the separator.

In this paper, we empirically analyze the role that the choice of the tree structure plays when SP conducts inference. We consider three kinds of tree structures, namely, optimal (or believed to be close to optimal) junction trees, junction trees produced by the fill-in-weight heuristic, and maximal prime subgraph decomposition (MPD) trees [4]. When using optimal junction trees or those produced by the fill-in-weight heuristic, the experimental results show that in fewer than half the cases the space cost of SP is almost invariant to the choice between the two. Moreover, the time cost of SP is approximately the same for these two categories in 14 out of 28 cases. On the other hand, SP often runs out of memory on MPD trees.

2 Preliminaries and Notation

A Bayesian network (BN) \(\mathcal{{N}}=(\mathcal{{X}}, G, \mathcal{{P}})\) consists of a set of discrete random variables \(\mathcal{{X}}=\{X_1,\ldots ,X_n\}\), where \(\mathrm {dom}(X)\) is the state space of X and \(||X||=|\mathrm {dom}(X)|\), a directed acyclic graph (DAG) \(G=(V,E)\), where \(V\sim \mathcal{{X}}\) is the set of vertices and E is the set of edges, and a set \(\mathcal{{P}}\) of conditional probability distributions (CPDs). A BN represents a decomposition of the joint probability distribution \(P(\mathcal{{X}})= \prod _{X\in \mathcal{{X}}}P(X\!\mid \!\mathrm {pa}(X))\), where \(\mathrm {pa}(X)\) denotes the parents of X in G and \(\mathrm {fa}(X)=\mathrm {pa}(X)\cup \{X\}\). For example, a BN \(\mathcal{{N}}=(\mathcal{{X}}, G, \mathcal{{P}})\) over six variables is shown by G in Fig. 1, where we assume \(||X_1||=||X_4||=3\), \(||X_2||=||X_5||=||X_6||=4\), and \(||X_3||=2\).
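For instance, with the CPDs used in Sect. 4 below (and assuming \(X_6\) has parents \(X_4\) and \(X_5\), which is consistent with the clique \(X_4X_5X_6\) in Fig. 2), the decomposition reads

$$\begin{aligned} P(\mathcal{{X}}) = P(X_2)\,P(X_1\!\mid \!X_2)\,P(X_3\!\mid \!X_2)\,P(X_4\!\mid \!X_1)\,P(X_5\!\mid \!X_3)\,P(X_6\!\mid \!X_4,X_5). \end{aligned}$$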

Belief update is the task of computing the posterior marginal probability distribution \(P(X\!\mid \!\epsilon )\) for each non-observed variable \(X\in \mathcal{{X}}\setminus \mathcal{{X}}_\epsilon \) given evidence \(\epsilon \), assumed to be instantiations of the variables \(\mathcal{{X}}_\epsilon \). A variable X is barren w.r.t. a set \(T\subseteq \mathcal{{X}}\), evidence \(\epsilon \), and DAG G, if \(X\not \in T\), \(X\not \in \mathcal{{X}}_\epsilon \), and X only has barren descendants in G, if any [6]. The notion of barren variables can be extended to graphs with both directed and undirected edges [3].
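As a minimal illustration of the definition (not part of the SP implementation), a recursive membership test might look as follows, assuming the DAG is given as a child-adjacency dictionary:

```python
def is_barren(x, targets, evidence, children):
    """Check whether x is barren w.r.t. targets, evidence and a DAG.

    children maps each variable to the list of its children in G;
    a variable is barren iff it is neither a target nor observed
    and all of its descendants (if any) are barren as well.
    """
    if x in targets or x in evidence:
        return False
    return all(is_barren(c, targets, evidence, children)
               for c in children.get(x, []))

# Child lists inferred from the CPDs listed in Sect. 4 (an assumption).
children = {"X1": ["X4"], "X2": ["X1", "X3"], "X3": ["X5"],
            "X4": ["X6"], "X5": ["X6"], "X6": []}
print(is_barren("X4", targets={"X1"}, evidence=set(), children=children))  # True
```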

A junction tree \(T=(\mathcal{{C}},\mathcal{{S}})\) of \(\mathcal{{N}}\) is created by moralization and triangulation of G (see, e.g., [2]), where \(\mathcal{{C}}\) denotes the cliques and \(\mathcal{{S}}\) denotes the separators of T. The state space size of clique or separator A is defined as \(s(A)=\prod _{X\in A}||X||\).
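For example, with the cardinalities assumed above, the clique \(X_4X_5X_6\) appearing in Fig. 2 has

$$\begin{aligned} s(X_4X_5X_6) = ||X_4||\cdot ||X_5||\cdot ||X_6|| = 3\cdot 4\cdot 4 = 48. \end{aligned}$$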

3 Simple Propagation

SP [1] can be considered a simplification of LP [5] with respect to how messages are computed. The basic idea is to maintain a decomposition of clique and separator potentials and to exploit independence relations induced by evidence and barren variables during belief update. Each CPD \(P\in \mathcal{{P}}\) is associated with a clique C s.t. \(\mathrm {dom}(P)\subseteq C\), where P is reduced to reflect \(\epsilon \). Next, messages are passed over the computational tree structure relative to a selected root of T.

Algorithm 1. The Simple Message Computation (SMC) algorithm.

In SP, the one-in, one-out principle is applied when a clique sends a message to a neighbouring clique over a separator S. The principle states that if a probability potential \(\phi \) has at least one non-evidence variable in S and at least one non-evidence variable X not in S, then SP can eliminate X. The computation of messages in SP is performed using the Simple Message Computation (SMC) algorithm shown as Algorithm 1. The \(\textsc {{SumOut}}\) algorithm corresponds to Variable Elimination: it sums out X from the product of \(\varPhi _X=\{\phi \in \mathcal{{F}}\mid X\in \mathrm {dom}(\phi )\}\) and replaces \(\varPhi _X\) in \(\mathcal{{F}}\) with the result.
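As an illustration of what \(\textsc {{SumOut}}\) computes, a minimal Python sketch might look as follows, assuming a hypothetical Factor class with multiply and marginalize operations (an illustration, not the implementation used in the experiments):

```python
def sum_out(x, factors):
    """Variable Elimination step: sum variable x out of the factor set.

    Multiplies all factors whose domain contains x (the set Phi_X) and
    marginalizes x from the product; the remaining factors are untouched.
    """
    phi_x = [f for f in factors if x in f.domain]
    rest = [f for f in factors if x not in f.domain]
    product = phi_x[0]
    for f in phi_x[1:]:
        product = product.multiply(f)          # pointwise product
    return rest + [product.marginalize(x)]     # sum x out of the product
```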

In addition to passing the potentials created by Algorithm 1, SP also passes any potential \(\phi \) for which \(\mathrm {dom}(\phi )\subseteq S\), i.e., the potentials where the domain is a subset of the separator. After a full round of message passing, the posterior marginal \(P(X\!\mid \!\epsilon )\) can be computed from any clique or separator containing X.
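Combining \(\textsc {{SumOut}}\) with the one-in, one-out principle, message construction over a separator S can be sketched as follows (again using the hypothetical Factor class, with set-valued domains; evidence handling is omitted for brevity):

```python
def compute_message(factors, separator):
    """Sketch of Simple Message Computation (Algorithm 1).

    Repeatedly applies the one-in, one-out principle: while some factor
    has a variable in the separator and a variable X outside it,
    eliminate X with sum_out. Finally, collect every factor whose
    domain is contained in the separator; this set is the message.
    """
    factors = list(factors)
    while True:
        candidate = next(
            (f for f in factors
             if f.domain & separator and f.domain - separator), None)
        if candidate is None:
            break
        x = next(iter(candidate.domain - separator))   # pick some X not in S
        factors = sum_out(x, factors)
    return [f for f in factors if f.domain <= separator]
```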

4 Computational Tree Structure

The computational tree structure induces a partial order on the set of possible (implicit) elimination orders produced by the SMC algorithm during the message passing step of belief update. As SP maintains a decomposition of clique and separator potentials, it is sensitive to the order in which variables are eliminated during message passing. The number of variables to be eliminated when sending a message from A to B is determined by \(A\setminus B\), whose size is influenced by |A| and |B|. Thus, the impact of the (implicit) elimination order tends to increase with \(|A \setminus B|\). As SP uses the same potentials for message computation as LP, we can rely on the correctness considerations of [4].

Fig. 1. A DAG G with \(G^m\) and triangulations using fill-in-size and fill-in-weight.

To illustrate the impact of using different computational tree structures, consider the moral graph \(G^m\) and the two triangulations of \(G^m\) shown in Fig. 1. The fill-in-size heuristic may produce the elimination order \(\sigma _{fis}=(X_6,X_1,X_5,X_2,X_3,X_4)\), while the fill-in-weight heuristic produces the order \(\sigma _{fiw}=(X_6,X_5,X_2,X_1,X_3,X_4)\), which is optimal w.r.t. s(T).
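Greedy triangulation heuristics of this kind can be sketched generically; in the sketch below the cost function determines the variant, and the fill_in_weight score shown is one plausible reading of the fill-in-weight heuristic, not necessarily the exact scoring used in the experiments:

```python
import itertools

def greedy_order(adj, card, cost):
    """Greedy elimination ordering over an undirected (moral) graph.

    adj maps a variable to the set of its neighbours, card maps a
    variable to its state space size ||X||, and cost scores eliminating
    a variable next; the heuristic variant (fill-in-size, fill-in-weight,
    ...) is determined entirely by the cost function passed in.
    """
    adj = {v: set(ns) for v, ns in adj.items()}
    order = []
    while adj:
        v = min(adj, key=lambda u: cost(u, adj, card))
        # Connect v's neighbours pairwise (the fill-in edges), then remove v.
        for a, b in itertools.combinations(adj[v], 2):
            adj[a].add(b)
            adj[b].add(a)
        for n in adj[v]:
            adj[n].discard(v)
        del adj[v]
        order.append(v)
    return order

def fill_in_weight(v, adj, card):
    """One plausible fill-in-weight score: total weight of missing edges."""
    return sum(card[a] * card[b]
               for a, b in itertools.combinations(adj[v], 2)
               if b not in adj[a])
```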

Fig. 2. Three different computational tree structures.

The junction trees \(T_{fis}\) and \(T_{fiw}\) produced by these elimination orders are shown in Fig. 2. The figure also shows the MPD tree \(T_{MPD}\) for \(\mathcal{{N}}\), constructed from \(T_{fiw}\) by merging cliques connected by separators that are incomplete in \(G^m\). For simplicity, we assume that clique \(X_4X_5X_6\) is selected as the root of each tree.
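A sketch of this merging step, assuming the junction tree is given as a list of (i, j, separator) links between indexed cliques and that moral-graph adjacency is available (an illustration only), might look as follows:

```python
import itertools

def mpd_cliques(cliques, links, moral_adj):
    """Merge junction tree cliques into MPD clusters (sketch).

    links is a list of (i, j, separator) triples indexing into cliques;
    two cliques are merged whenever their separator is incomplete in the
    moral graph, i.e. some pair of separator variables is non-adjacent.
    """
    def complete(sep):
        return all(b in moral_adj[a]
                   for a, b in itertools.combinations(sep, 2))

    # Start with every clique in its own cluster; merge across incomplete
    # separators by updating the membership of every affected clique.
    cluster = {i: {i} for i in range(len(cliques))}
    for i, j, sep in links:
        if not complete(sep):
            merged = cluster[i] | cluster[j]
            for k in merged:
                cluster[k] = merged
    clusters = {frozenset(c) for c in cluster.values()}
    return [set().union(*(cliques[k] for k in c)) for c in clusters]
```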

Due to space limitations, we only consider the first message passed in each of the computational trees shown in Fig. 2. For \(T_{fis}\), the first message is from clique \(X_1X_2X_4\) to \(X_2X_3X_4\). The clique potential \(\pi _{X_1X_2X_4} = (\{P(X_2),P(X_1\!\mid \!X_2),P(X_4\!\mid \!X_1)\})\) has two CPDs satisfying the one-in, one-out principle: \(P(X_1\!\mid \!X_2)\) and \(P(X_4\!\mid \!X_1)\). Notice that \(P(X_2)\) does not satisfy the principle as all of its domain variables are in S. Both \(P(X_1\!\mid \!X_2)\) and \(P(X_4\!\mid \!X_1)\) satisfy the condition in line 2 of the SMC algorithm, and variable \(X_1\) can be eliminated using the \(\textsc {{SumOut}}\) algorithm, producing \(\phi (X_4\!\mid \!X_2)\). The message is then

$$\begin{aligned} \pi _{X_1X_2X_4\rightarrow X_2X_3X_4} = (\{P(X_2), \phi (X_4\!\mid \!X_2)\}). \end{aligned}$$
(1)
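Here \(\phi (X_4\!\mid \!X_2)\) is obtained by the elimination

$$\begin{aligned} \phi (X_4\!\mid \!X_2) = \sum _{X_1} P(X_1\!\mid \!X_2)\,P(X_4\!\mid \!X_1). \end{aligned}$$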

For \(T_{fiw}\), the first message is from clique \(X_1X_2X_3\) to \(X_1X_3X_4\). The potential \(\pi _{X_1X_2X_3} = (\{P(X_2),P(X_1\!\mid \!X_2),P(X_3\!\mid \!X_2)\})\) has two CPDs satisfying the one-in, one-out principle: \(P(X_1\!\mid \!X_2)\) and \(P(X_3\!\mid \!X_2)\). Notice that \(P(X_2)\) does not satisfy the principle. Both \(P(X_1\!\mid \!X_2)\) and \(P(X_3\!\mid \!X_2)\) satisfy the condition in line 2 of the SMC algorithm, and variable \(X_2\) can be eliminated using the \(\textsc {{SumOut}}\) algorithm, producing \(\phi (X_1,X_3)\). The message is then

$$\begin{aligned} \pi _{X_1X_2X_3\rightarrow X_1X_3X_4}=(\{\phi (X_1,X_3)\}). \end{aligned}$$
(2)
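In this case the elimination of \(X_2\) involves all three potentials:

$$\begin{aligned} \phi (X_1,X_3) = \sum _{X_2} P(X_2)\,P(X_1\!\mid \!X_2)\,P(X_3\!\mid \!X_2). \end{aligned}$$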

For \(T_{MPD}\), the first message is from clique \(X_1X_2X_3X_4X_5\) to \(X_4X_5X_6\). The potential \(\pi _{X_1X_2X_3X_4X_5}=(\{P(X_2),\) \(P(X_1\!\mid \!X_2),\) \(P(X_3\!\mid \!X_2),\) \(P(X_4\!\mid \!X_1),\) \(P(X_5\!\mid \!X_3)\})\) has two CPDs satisfying the one-in, one-out principle: \(P(X_4\!\mid \!X_1)\) and \(P(X_5\!\mid \!X_3)\). Assume variable \(X_1\) is selected for elimination first, producing the updated set \(\mathcal{{F}}=\{P(X_2),P(X_3\!\mid \!X_2),P(X_5\!\mid \!X_3), \phi (X_4\!\mid \!X_2)\}\). Next, \(P(X_5\!\mid \!X_3)\) and \(\phi (X_4\!\mid \!X_2)\) satisfy the condition in line 2. Assume variable \(X_2\) is selected for elimination, producing the updated set \(\mathcal{{F}}=\{P(X_5\!\mid \!X_3),\phi (X_3,X_4)\}\). Next, \(P(X_5\!\mid \!X_3)\) and \(\phi (X_3,X_4)\) satisfy the condition in line 2. Variable \(X_3\) is the last variable eliminated, producing the updated set \(\mathcal{{F}}=\{\phi (X_4,X_5)\}\). The message is then

$$\begin{aligned} \pi _{X_1X_2X_3X_4X_5\rightarrow X_4X_5X_6}=(\{\phi (X_4,X_5)\}). \end{aligned}$$
(3)
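The three eliminations correspond to

$$\begin{aligned} \phi (X_4\!\mid \!X_2)&= \sum _{X_1} P(X_1\!\mid \!X_2)\,P(X_4\!\mid \!X_1),\\ \phi (X_3,X_4)&= \sum _{X_2} P(X_2)\,P(X_3\!\mid \!X_2)\,\phi (X_4\!\mid \!X_2),\\ \phi (X_4,X_5)&= \sum _{X_3} P(X_5\!\mid \!X_3)\,\phi (X_3,X_4). \end{aligned}$$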

The important point to notice is that different tree structures lead to different messages being constructed by SP, as shown in Eqs. (1), (2), and (3). The next section reports on an empirical evaluation of the performance impact of different tree structures.

5 Experimental Analysis

Table 1 shows statistics on a sample of 10 out of the 28 BNs used in the experiments, along with information on the optimal (or believed to be close to optimal) junction tree \(\hat{T}\), the junction tree \(T_{fiw}\) produced by the fill-in-weight heuristic, and the MPD tree \(T_{MPD}\), where sizes are reported on a \(\log _{10}\) scale. Table 2 shows the average size of the largest factor created by SP during belief update over 100 sets of random evidence (this includes computing marginals using Variable Elimination). Empty entries denote cases where SP ran out of memory on some evidence sets.

Table 1. Information on 10 out of the 28 Bayesian networks and the computational tree structures used in the experiments.
Table 2. Average space cost of belief update using SP.

In fewer than half the cases, the space cost of SP is almost invariant to the use of \(\hat{T}\) or \(T_{fiw}\) as the computational structure. These are the cases where \(s(\hat{T})\approx s(T_{fiw})\). In the cases where \(s(\hat{T})\ll s(T_{fiw})\), there is a large difference in the average size of the largest factor created by SP. Using \(T_{MPD}\) as the computational structure, SP is not able to perform belief update in 20 out of the 28 networks. The eight networks for which SP can perform belief update have an average largest factor of fewer than 100,000 probabilities.

Table 3 shows the average computation cost (wall-clock time) of belief update using SP. In 14 out of 28 cases, the time cost of SP using \(T_{fiw}\) is approximately the same as the time cost of SP using \(\hat{T}\); the difference is less than five per cent. Since SP using \(T_{MPD}\) is only able to solve the simplest networks, its time cost in these cases is low and often comparable to the time cost using \(\hat{T}\) and \(T_{fiw}\) (at least in absolute terms).

Table 3. Average time cost of belief update using SP (mean and standard deviation).

6 Conclusion

This paper has investigated the impact of the secondary computational structure used for belief update on time and space performance of SP. The results of a preliminary empirical performance evaluation on a set of real-world Bayesian networks indicate that the tree structure used to control the message passing can have a significant impact on both time and space performance.