1 Introduction

Multi-state behavior is a typical characteristic of advanced engineering systems and products [1,2,3,4]. Many technical systems that perform their intended tasks/missions with multiple (more than two) distinguishable states between perfectly functioning and completely failed can be regarded as multi-state systems (MSSs) [1]. MSS reliability models, first introduced in the mid-1970s, have received considerable attention in the past few decades because they can characterize the complicated deterioration processes of engineering systems more precisely than traditional binary-state system (BSS) reliability models [1, 5]. For example, based on the length of flank wear, the health status of a cutting tool can be classified approximately into five discrete states: normal (perfectly functioning), nominally sharp (<0.1 mm), part worn (0.1–0.15 mm), severely worn (>0.15 mm), and fractured/chipped (completely failed) [6]. As another example, a power generating system can function at multiple levels of generating capacity [1]. Similar treatments can also be found in diverse engineering settings, e.g., manufacturing systems, networked systems, grid systems, spacecraft, and municipal infrastructure.

Because components, subsystems, and the entire system can all manifest multiple states, MSS reliability models are considerably more complicated than their binary-state counterparts. The approaches to MSS reliability modeling and assessment can be roughly classified into four categories [1].

  • An extension of the Boolean models to the multi-value case. These methods are natural extensions of the Boolean methods that are well established for BSSs, such as the multi-state fault tree [7], multi-state minimal cuts/paths [8, 9], and multi-value decision diagram [10].

  • Stochastic models. The stochastic models, such as homogeneous/non-homogeneous Markov [11] and semi-Markov [12, 13] models, are more general for characterizing the degradation processes of MSSs. However, due to the curse of dimensionality, the stochastic models are only suited to relatively small-scale MSSs, because the number of system states increases dramatically with the number of components and component states. Another severe restriction on implementing the stochastic models is the computational complexity, because one inevitably has to solve a system of differential equations (for homogeneous/non-homogeneous Markov models) or a system of integral equations (for semi-Markov models).

  • Universal generating functions (UGFs). The UGF technique is efficient: it uses a fast algebraic procedure to identify the state probability distribution of the entire system from the state probability distributions of all the components [14]. However, it is a “static” approach that cannot characterize the dynamic degradation profiles of MSSs.

  • Simulation-based methods. The degradation behaviors of most MSSs in real-world situations can be simulated by the Monte Carlo method [15]. Nevertheless, the time required to develop and execute the simulation models is oftentimes unaffordable when a highly accurate result is needed.

Recursive algorithms were also developed to evaluate the reliability of generalized multi-state k-out-of-n systems and multi-state weighted k-out-of-n systems [16, 17]. It was shown that the recursive algorithms can outperform the UGF approaches, with or without collecting like terms, for the reliability assessment of multi-state weighted k-out-of-n systems. In addition, the degradation process of each multi-state component in an MSS can be characterized by the stochastic models, and thus the state probability distribution of the component at any particular time can be obtained. By combining the stochastic models and UGF approaches, the state probability distribution of the entire system at any particular time can be readily obtained, even for relatively large-scale systems.

Apart from the aforementioned methods and tools, Bayesian networks (BNs) [18], as probabilistic graphical models, are capable of handling various uncertainty problems effectively based on probabilistic information representation and inference. BNs have gained considerable popularity in MSS reliability modeling and assessment over the last decade, and interest in using BNs in the reliability community is still growing. This chapter presents a holistic framework for MSS reliability modeling and assessment with BNs. The contributions of this chapter are threefold: (1) the proposed framework can effectively characterize the dynamic behaviors of various MSSs; (2) the proposed framework can effectively characterize various dependencies in MSSs; (3) the proposed framework can effectively aggregate multi-level observation data to dynamically assess the reliability of MSSs.

The remainder of this chapter is organized as follows. In Sect. 2, the basic characteristics of MSSs and BNs are reviewed. The detailed procedures of constructing the BN models for a diversity of MSSs are provided in Sect. 3. A reliability assessment method that fuses multi-level observation data is developed in Sect. 4. A brief closure is given in Sect. 5.

2 Preliminaries

2.1 Multi-state Systems

An MSS herein is composed of \( M^{\text{c}} \) homogeneous or heterogeneous multi-state components. The states of each component are distinguished by its performance capacities or degradation levels. Suppose that component \( l \) can possess \( N_{l}^{\text{c}} \) mutually ordered states; then the sets of the performance capacities and states of component \( l \) can be denoted as \( {\mathbf{g}}_{l}^{\text{c}} = \{ g_{l,1} ,g_{l,2} , \ldots ,g_{{l,N_{l}^{\text{c}} }} \} \) and \( {\mathbf{s}}_{l}^{\text{c}} = \{ 1,2, \ldots ,N_{l}^{\text{c}} \} \), respectively. States 1 and \( N_{l}^{\text{c}} \) are the best state and worst state of component \( l \), respectively. The performance capacity and state of component \( l \) at time \( t \) are denoted as \( G_{l}^{\text{c}} (t) \) (\( G_{l}^{\text{c}} (t) \in {\mathbf{g}}_{l}^{\text{c}} \)) and \( C_{l} (t) \) (\( C_{l} (t) \in {\mathbf{s}}_{l}^{\text{c}} \)), respectively. If component \( l \) sojourns in state \( i \) at time \( t \), i.e., \( C_{l} (t) = i \), the performance capacity is \( G_{l}^{\text{c}} (t) = g_{l,i} \). In this chapter, states \( \{ 1,2, \ldots ,N_{l}^{\text{c}} - 1\} \) are acceptable states; therefore, component \( l \) is viewed as failed if it sojourns in state \( N_{l}^{\text{c}} \).

In this chapter, the degradation profile of each component is assumed to follow a homogeneous discrete-time Markov process. Other stochastic models, such as the non-homogeneous Markov process and semi-Markov process, can also be adopted. As each component degrades only from better states toward worse states, without repair, the one-step state transition matrix is upper triangular and the worst state is absorbing. The one-step state transition matrix of the Markov model for component \( l \) is represented as:

$$ {\mathbf{P}}_{l} = \left[ {\begin{array}{*{20}c} {p_{l,(1,1)} } & {p_{l,(1,2)} } & \ldots & {p_{{l,(1,N_{l}^{\text{c}} )}} } \\ 0 & {p_{l,(2,2)} } & \ldots & {p_{{l,(2,N_{l}^{\text{c}} )}} } \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \ldots & {p_{{l,(N_{l}^{\text{c}} ,N_{l}^{\text{c}} )}} } \\ \end{array} } \right] , $$

where \( p_{l,(i,j)} = \Pr \{ C_{l} (t + \Delta t) = j|C_{l} (t) = i\} \) (\( 1 \le i \le j \le N_{l}^{\text{c}} \)) is the state transition probability of component \( l \) from state \( i \) to state \( j \) within a basic time interval \( \Delta t \). The state probability distribution of component \( l \) at time \( t \) is denoted by a probability vector \( {\mathbf{p}}_{l} (t) = [p_{l,1} (t),p_{l,2} (t), \ldots ,p_{{l,N_{l}^{\text{c}} }} (t)] \), where \( p_{l,i} (t) = \Pr \{ C_{l} (t) = i\} \). With the known state probability distribution of component \( l \) at time \( t \), i.e., \( {\mathbf{p}}_{l} (t) \), the state probability distribution of the component at time \( t + k\Delta t \) can be computed as follows:

$$ {\mathbf{p}}_{l} (t + k\Delta t) = {\mathbf{p}}_{l} (t) \cdot ({\mathbf{P}}_{l} )^{k} . $$
(1)
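
Eq. (1) can be illustrated with a minimal sketch in plain Python. The three-state transition matrix and the initial distribution below are hypothetical values chosen only for illustration, and the helper names `step` and `propagate` are not from the chapter.

```python
def step(p, P):
    """One-step update: row vector p times transition matrix P."""
    n = len(P)
    return [sum(p[i] * P[i][j] for i in range(n)) for j in range(n)]


def propagate(p, P, k):
    """Return p(t + k*dt) = p(t) * P^k by applying k one-step updates."""
    for _ in range(k):
        p = step(p, P)
    return p


# Hypothetical three-state component: state 1 is the best state and
# state 3 is the worst (failed) state, which is absorbing.
P = [
    [0.90, 0.08, 0.02],
    [0.00, 0.85, 0.15],
    [0.00, 0.00, 1.00],
]
p0 = [1.0, 0.0, 0.0]  # the component starts in the best state

p2 = propagate(p0, P, 2)

# Component reliability: probability of not being in the worst state.
reliability = sum(p2[:-1])
```

Repeated one-step updates are used here instead of an explicit matrix power because they keep the sketch dependency-free; for long horizons a matrix-power routine would be the more economical choice.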

Based on the physical configuration and/or functional relations between components, the components in an MSS can be divided into \( M^{\text{sub}} \) subgroups that are considered as \( M^{\text{sub}} \) subsystems. The numbers of states that subsystem \( m \) and the entire system can have are \( N_{m}^{\text{sub}} \) and \( N^{\text{sys}} \), respectively. Likewise, the performance capacity and state of subsystem \( m \) at time \( t \) are denoted as \( G_{m}^{\text{s}} (t) \) and \( S_{m} (t) \), respectively; the performance capacity and state of the entire system at time \( t \) are denoted as \( G(t) \) and \( S(t) \), respectively; state 1 is the best state of each subsystem and of the entire system; states \( N_{m}^{\text{sub}} \) and \( N^{\text{sys}} \) are the worst states of subsystem \( m \) and the entire system, respectively.

The states of each subsystem and the entire system are completely determined by the state combinations of their corresponding constituents. The structure function \( \phi_{m} ( \cdot ) \) that identifies the relation between subsystem \( m \) and its constituents is deterministic and known; the structure function \( \phi ( \cdot ) \) that identifies the relation between the entire system and its constituents is also deterministic and known. It is common that more than one state combination of components results in the same subsystem and/or system state [19]. An MSS is considered reliable if the system sojourns in the acceptable states during the operation period. Therefore, the reliability of an MSS is defined as the sum of the probabilities of the system sojourning in the acceptable states.

2.2 Bayesian Networks

BNs [18], also known as belief networks, Bayesian belief networks, and causal networks, are inherently compact representations of multivariate statistical distribution functions. A BN contains a qualitative part, i.e., the directed acyclic graph (DAG), and a quantitative part, i.e., a set of conditional probability tables (CPTs). The DAG of a BN consists of a set of nodes denoting random variables \( \{ X_{1} ,\;X_{2} ,\; \ldots ,\;X_{n} \} \) and a set of links characterizing the probabilistic dependencies among nodes. The terms node and random variable are used interchangeably hereinafter. Based on the types of all nodes, a BN can be classified into one of three categories [20], i.e., discrete BNs, continuous BNs, and hybrid BNs. This chapter limits the treatment to discrete BNs, in which all nodes are discrete.

Each node in a BN can manifest a finite number of mutually exclusive states. A link, as a directed edge from \( X_{j} \) to \( X_{i} \), represents that \( X_{j} \) has a direct causal effect on \( X_{i} \). Therefore, \( X_{j} \) is considered a parent of \( X_{i} \), which can be denoted as \( X_{j} \; \in \;{\mathbf{pa}}(X_{i} ) \), whereas \( X_{i} \) is regarded as a child of \( X_{j} \). In particular, a node without any parent nodes is called a root node, and a node without any child nodes is called a leaf node. The DAG of a BN reflects the causal relations between all nodes, whereas the CPTs of the BN characterize the strength of these causal relations quantitatively. For a node \( X_{i} \) with a parent set \( {\mathbf{pa}}(X_{i} ) \), the CPT of \( X_{i} \), denoted as \( \Pr \{ X_{i} |{\mathbf{pa}}(X_{i} )\} \), represents the conditional probability mass function of \( X_{i} \) given \( {\mathbf{pa}}(X_{i} ) \). In particular, a set of marginal probability tables (MPTs) needs to be assigned to all root nodes. An illustrative BN with six nodes is shown in Fig. 1. \( X_{1} \) and \( X_{3} \) are root nodes, whereas \( X_{5} \) and \( X_{6} \) are leaf nodes. The parent nodes of \( X_{2} \) and \( X_{4} \) are denoted as \( {\mathbf{pa}}(X_{2} )\; = \;X_{1} \) and \( {\mathbf{pa}}(X_{4} ) = \{ X_{1} ,X_{2} ,X_{3} \} \), respectively. The parent node of both \( X_{5} \) and \( X_{6} \) is \( X_{4} \).

Fig. 1. An illustrative BN

Based on the chain rule, the joint probability distribution of all the random variables in a BN can be decomposed into the product of a set of conditional probability distributions, and it is given by:

$$ \Pr \{ X_{1} ,X_{2} , \ldots ,X_{n} \} = \prod\limits_{i = 1}^{n} {\Pr \{ X_{i} |{\mathbf{pa}}(X_{i} )\} } . $$
(2)

As an example, the joint probability distribution of the BN in Fig. 1 is represented as:

$$ \begin{aligned} & \Pr \{ X_{1} ,X_{2} , \ldots ,X_{6} \} \\ & = \Pr \{ X_{1} \} \Pr \{ X_{3} \} \Pr \{ X_{2} |X_{1} \} \Pr \{ X_{4} |X_{1} ,X_{2} ,X_{3} \} \Pr \{ X_{5} |X_{4} \} \Pr \{ X_{6} |X_{4} \} \\ \end{aligned} . $$
(3)
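
The factorization in Eq. (3) can be checked numerically with a small sketch. All nodes are assumed to be binary here, and every probability value below is hypothetical; only the parent structure is taken from Fig. 1. The helper names `bern` and `joint` are illustrative.

```python
from itertools import product

# Hypothetical binary CPTs for the six-node BN of Fig. 1 (states 0 and 1).
# Each entry is Pr{X = 1 | parents}; Pr{X = 0 | parents} is its complement.
p_x1 = 0.3
p_x3 = 0.6
p_x2 = {0: 0.2, 1: 0.7}                       # keyed by x1
p_x4 = {(a, b, c): 0.1 + 0.25 * (a + b + c)   # keyed by (x1, x2, x3)
        for a, b, c in product((0, 1), repeat=3)}
p_x5 = {0: 0.1, 1: 0.8}                       # keyed by x4
p_x6 = {0: 0.4, 1: 0.9}                       # keyed by x4


def bern(p_one, value):
    """Probability of a binary value given Pr{value = 1} = p_one."""
    return p_one if value == 1 else 1.0 - p_one


def joint(x1, x2, x3, x4, x5, x6):
    """Eq. (3): product of each node's probability given its parents."""
    return (bern(p_x1, x1) * bern(p_x3, x3) * bern(p_x2[x1], x2)
            * bern(p_x4[(x1, x2, x3)], x4)
            * bern(p_x5[x4], x5) * bern(p_x6[x4], x6))


# A valid factorization sums to one over all 2^6 joint configurations.
total = sum(joint(*xs) for xs in product((0, 1), repeat=6))
```

The six local tables hold far fewer numbers than a full joint table over 64 configurations, which is the compactness that the DAG structure provides.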

When one or more nodes are observed/instantiated, that is, when evidence is entered into these nodes, BNs are capable of updating the probability distributions of the unobserved nodes via inference. Various efficient algorithms for exact or approximate probabilistic inference can be utilized to update the entire BN, such as the variable elimination algorithm, junction tree algorithm, and Markov chain Monte Carlo (MCMC) methods. The details of the BN inference algorithms can be found in the books by Jensen and Nielsen [18], and Koller and Friedman [21].
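
As a minimal illustration of evidence updating, the sketch below performs exact inference by brute-force enumeration on a hypothetical three-node BN (two binary component nodes and one system node). Enumeration is used only because it is the simplest exact method to write down; it is not one of the scalable algorithms named above, and the priors and node names are assumptions.

```python
# Hypothetical two-component series structure with binary states
# (1 = working, 2 = failed). The priors are illustrative values.
prior_c1 = {1: 0.9, 2: 0.1}
prior_c2 = {1: 0.8, 2: 0.2}


def pr_s_given(c1, c2, s):
    """Deterministic CPT: the system works (s = 1) only if both components work."""
    works = (c1 == 1 and c2 == 1)
    return 1.0 if (s == 1) == works else 0.0


def posterior_c1(evidence_s):
    """Pr{C1 | S = evidence_s}: sum the joint over C2, then normalize."""
    unnorm = {}
    for c1 in prior_c1:
        unnorm[c1] = sum(
            prior_c1[c1] * prior_c2[c2] * pr_s_given(c1, c2, evidence_s)
            for c2 in prior_c2
        )
    z = sum(unnorm.values())
    return {c1: v / z for c1, v in unnorm.items()}


# Evidence: the system is observed in the failed state.
post = posterior_c1(2)
```

Observing a system failure raises the failure probability of component 1 above its prior, which is the kind of backward (diagnostic) reasoning that BN inference supports.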

The BN in Fig. 1 is essentially a static model that can only represent the causal relations among nodes at a particular time instant. To characterize the evolving behaviors of random variables over time, a local model needs to be constructed for each unit of time. A local BN model at a particular time is called a time slice. Temporal links, which are also directed edges, are introduced to constitute a full model by connecting all the time slices in chronological order. The full model is called a dynamic Bayesian network (DBN). The detailed procedures of constructing DBN models will be elaborated in Sect. 3.2.

Numerous software tools can be utilized to model practical problems with BNs or DBNs. An overview of the available software is provided herein without pretending to be exhaustive.

  • Various software tools with integrated and intuitive graphical interfaces are powerful and user-friendly, e.g., BayesiaLab, GeNIe, Hugin, Netica, and AgenaRisk. These tools make BNs accessible to engineers without programming skills.

  • A diversity of packages in different programming environments are also available, e.g., various R packages on CRAN, BNT in MATLAB, and BayesPy in Python. These packages are efficient, flexible, and extendable for engineers with proficient programming skills.

  • BUGS (Bayesian inference Using Gibbs Sampling) refers to a family of flexible software packages that implement approximate Bayesian inference using MCMC methods. The well-known WinBUGS, OpenBUGS, and JAGS all belong to this family.

BNs can represent and characterize various uncertainties and dependencies in reliability engineering in an intuitive, flexible, and effective manner; therefore, BNs have become a very popular tool to address diverse practical reliability problems [22,23,24,25,26,27]. The reported works on BN applications in BSSs can essentially be extended to multi-state cases. As each node of a BN can have multiple (more than two) mutually exclusive states, BNs have recently gained considerable attention in MSS reliability modeling and assessment. Compared with classical reliability formalisms, such as fault trees [28,29,30], BNs have shown significant advantages in both modeling and analysis features. Therefore, BNs have been applied to a diversity of engineering cases, such as search and rescue operations [31], medium voltage air insulated switch operation [32], axle and vehicle [33], power generating systems [19], cutter feeding control systems [34, 35], water distribution systems [36, 37], bridge condition modeling [38], and subsea blowout preventers [39,40,41].

Due to the powerful capabilities in modeling and reasoning, BNs were utilized to characterize both random and epistemic uncertainties as well as various dependencies in the context of MSSs. For example, to deal with epistemic uncertainty in reliability evaluation, BNs and DBNs were extended to evidential networks and dynamic evidential networks based on Dempster-Shafer evidence theory [34, 35, 42], respectively. Various failure dependencies between components, such as common cause failures (CCFs) [34, 43, 44] and cascading failures [45], were also modeled by BNs. As the system reliability can be updated based on BN inference algorithms if a component node is instantiated, BNs were adopted in the importance measure analysis [35, 46]. In addition to these aspects, BNs have also been extensively applied to system maintenance management in which BNs were used to infer the condition of a system or its components if some components and/or the entire system can be observed before maintenance decision-making [23, 47,48,49,50,51].

The temporal BN model of a system can be constructed to characterize the degradation/failure profile (temporal dependency) of the system. In general, temporal models can be divided into two broad categories based on the time representation, i.e., event-based approaches and time-sliced approaches. Based on the event-based approaches, Boudali and Dugan [52, 53] constructed discrete-time BN and continuous-time BN reliability modeling and analysis frameworks. Based on the time-sliced approaches, various DBN models were constructed to evaluate the reliability of a system over time [19, 43, 54,55,56,57,58]. For example, Cai et al. [43] constructed a multi-phase DBN model to determine the safety integrity level of a safety instrumented system with CCFs. Jiang and Liu [19], and Xu et al. [57] developed data-driven reliability assessment methods based on DBNs by aggregating multi-level observation data. Khakzad [58] developed a DBN model to characterize the dynamic behaviors of wildfire spread in wildland-industrial interfaces. Additionally, by decomposing an entire system model into several smaller modules, object-oriented BNs were built for large-scale, complex, and hierarchical systems [59,60,61,62,63].

To improve the modeling and inference efficiencies, various improved algorithms were proposed for BNs and DBNs, such as the topology optimization algorithm [64], dynamic discretization method [65], discretization of continuous random variables [66], compression inference algorithm [67], and improved compression inference algorithm [68].

3 Reliability Modeling by BNs

This section provides general procedures of constructing BN and DBN models for various typical MSSs, e.g., series systems, parallel systems, series-parallel systems, bridge systems, and phased-mission systems. Two kinds of failure dependencies among components are also considered in the BN and DBN models.

3.1 BN Models of Typical MSSs

The states of the components, subsystems, and the entire system of an MSS at a particular time are all inherently random variables. To characterize an MSS in the framework of BNs, the components, subsystems, and the entire system of an MSS are represented by nodes. For an MSS consisting of \( M^{\text{c}} \) components that can be divided into \( M^{\text{sub}} \) subsystems, a corresponding BN model of the system can be constructed using a set of nodes, denoted as \( \Upomega \; = \;\{ C_{1} ,\;C_{2} ,\; \ldots ,\;C_{{M^{\text{c}} }} ;\;S_{1} ,\;S_{2} ,\; \ldots ,\;S_{{M^{\text{sub}} }} ;\;S\} \). In the BN model, nodes \( C_{l} \) (\( l \in \{ 1,2, \ldots ,M^{\text{c}} \} \)), \( S_{m} \) (\( m \in \{ 1,2, \ldots ,M^{\text{sub}} \} \)), and \( S \) correspond to component \( l \), subsystem \( m \), and the entire system, respectively. Directed edges that link different nodes are added based on the relations between the states of components, subsystems, and the entire system. For an MSS with all the components being s-independent, node \( C_{l} \) (\( l \in \{ 1,2, \ldots ,M^{\text{c}} \} \)) is a root node, whereas node \( S \) is a leaf node. If subsystem \( m \) (or the entire system) is composed of components \( \{ l_{1} ,l_{2} , \ldots ,l_{k} \} \) and subsystems \( \{ m_{1} ,\;m_{2} ,\; \ldots ,\;m_{n} \} \), directed edges from nodes \( \{ C_{{l_{1} }} ,\;C_{{l_{2} }} ,\; \ldots ,\;C_{{l_{k} }} ;\;S_{{m_{1} }} ,\;S_{{m_{2} }} ,\; \ldots ,\;S_{{m_{n} }} \} \) to node \( S_{m} \) (or node \( S \)) are added into the BN model.

The CPTs and MPTs of the nodes quantify the directed edges in a BN model. For an MSS in which all the components are s-independent, the MPT of each root node \( C_{l} \) is the state probability distribution of component \( l \) at a particular time. The CPTs of each node \( S_{m} \) and leaf node \( S \) can be obtained from the structure functions of subsystem \( m \) and the entire system, respectively.

To provide the detailed procedures of constructing various BN models, several typical MSSs, i.e., series systems, parallel systems, series-parallel systems, bridge systems, and phased-mission systems, shown in Fig. 2, are used for illustration hereinafter. For a better comparison of the BN models for different system types, the components that constitute these systems are set to be the same. Each system is composed of five s-independent components, and the performance capacities of each component corresponding to its states are tabulated in Table 1.

Fig. 2. Configurations of several typical MSSs

Table 1. Performance capacities of each component

3.1.1 BN Models of the Illustrative Series System

As all five components are connected in series, two candidate BN models of the illustrative series system, shown in Fig. 3, can be constructed. Although both candidate BN models are correct, candidate BN model 2 is superior to model 1 because each of its child nodes has a simpler CPT. For candidate BN model 1, all the component nodes, i.e., nodes \( \{ C_{1} ,C_{2} , \ldots ,C_{5} \} \), are linked to the system node, i.e., node \( S \), directly; therefore, the CPT of node \( S \) must cover all \( N_{1}^{\text{c}} \; \times \;N_{2}^{\text{c}} \; \times \; \ldots \; \times \;N_{5}^{\text{c}} \) state combinations of its parents. For candidate BN model 2, three additional child nodes of the component nodes, i.e., nodes \( S_{1} \), \( S_{2} \), and \( S_{3} \), are added to avoid an oversized CPT of the system node. The numbers of parent state combinations covered by the CPTs of nodes \( S_{1} \), \( S_{2} \), \( S_{3} \), and \( S \) in model 2 are \( N_{1}^{\text{c}} \times N_{2}^{\text{c}} \), \( N_{3}^{\text{c}} \times N_{1}^{\text{sub}} \), \( N_{4}^{\text{c}} \times N_{2}^{\text{sub}} \), and \( N_{5}^{\text{c}} \times N_{3}^{\text{sub}} \), respectively. In this regard, candidate BN model 2 of the illustrative series system is preferable and will be used for further analysis hereinafter. Interested readers can find more details in [64], where a topology optimization algorithm was proposed to address the inefficiency of a converging BN structure.
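
The saving in CPT size can be made concrete with a short count. The sketch assumes, purely for illustration, that every component, subsystem, and the system has three states; the actual state counts depend on Table 1 and the structure functions. The helper name `cpt_entries` is hypothetical.

```python
from math import prod


def cpt_entries(parent_state_counts, child_state_count):
    """Number of CPT entries: child states times parent state combinations."""
    return child_state_count * prod(parent_state_counts)


# Assumption: every component, subsystem, and the system has three states.
n = 3

# Candidate model 1: C1..C5 all point directly to S.
model_1 = cpt_entries([n] * 5, n)

# Candidate model 2: S1 <- (C1, C2), S2 <- (C3, S1), S3 <- (C4, S2), S <- (C5, S3).
model_2 = sum(cpt_entries([n, n], n) for _ in range(4))
```

Under this assumption, model 1 needs 729 entries in a single table, whereas model 2 needs 108 entries spread over four tables, and the gap widens as the number of components or states grows.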

Fig. 3. Two candidate BN models of the series and parallel systems

For any multi-state series system consisting of \( n \) s-independent components \( \{ C_{{l_{1} }} ,C_{{l_{2} }} , \ldots ,C_{{l_{n} }} \} \), the system performance capacity at any time is determined by the performance capacities of all the components and is equal to \( G(t)\; = \;\hbox{min} \{ G_{{l_{1} }}^{\text{c}} (t),\;G_{{l_{2} }}^{\text{c}} (t),\; \ldots ,\;G_{{l_{n} }}^{\text{c}} (t)\} \). Therefore, the system state can be obtained based on the state combinations of all the components. As an example, for the series system in Fig. 2, the performance capacity of subsystem 1 at any time is completely determined by components 1 and 2. The relations between the performance capacities (states) of subsystem 1 and the corresponding state combinations of components 1 and 2 are given in Table 2. As a result, the CPT of node \( S_{1} \) in the BN model of the series system, shown in Table 3, can be obtained. Each element in Table 3 is a conditional probability of node \( S_{1} \) given a particular state combination of nodes \( C_{1} \) and \( C_{2} \). In a similar manner, the CPTs of nodes \( S_{2} \), \( S_{3} \), and \( S \) can be obtained readily. The MPT of node \( C_{l} \) (\( l \in \{ 1,2, \ldots ,5\} \)) is essentially the state probability distribution of component \( l \) at a particular time, and it can be obtained by Eq. (1).

Table 2. Performance capacities and states of subsystem 1 of the series system
Table 3. CPT of node \( S_{1} \) of the series system
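
The construction of a deterministic CPT for node \( S_{1} \) of a series structure can be sketched as follows. The component capacities and MPTs are hypothetical stand-ins rather than the values of Table 1, so the resulting table only illustrates the shape of Tables 2 and 3, not their contents.

```python
from itertools import product

# Hypothetical capacities for states 1 (best) to 3 (failed); these are
# illustrative stand-ins, not the values of Table 1.
cap_c1 = [10, 5, 0]
cap_c2 = [8, 4, 0]

# Capacity of S1 for every parent state pair (series: the minimum).
g_s1 = {(i, j): min(g1, g2)
        for (i, g1), (j, g2) in product(enumerate(cap_c1, 1), enumerate(cap_c2, 1))}

# Distinct capacity levels, best first, define the states of S1.
levels = sorted(set(g_s1.values()), reverse=True)

# Deterministic CPT: one column per parent pair, with a single 1 in it.
cpt_s1 = {pair: [1.0 if g == level else 0.0 for level in levels]
          for pair, g in g_s1.items()}

# Hypothetical MPTs of C1 and C2 at one time instant (e.g., from Eq. (1)).
mpt_c1 = [0.7, 0.2, 0.1]
mpt_c2 = [0.6, 0.3, 0.1]

# Marginal of S1: sum over parent pairs of Pr{S1 | C1, C2} Pr{C1} Pr{C2}.
pr_s1 = [sum(cpt_s1[(i, j)][s] * mpt_c1[i - 1] * mpt_c2[j - 1] for (i, j) in cpt_s1)
         for s in range(len(levels))]
```

With these numbers, several parent pairs map to the same subsystem state (for example, pairs (1, 2) and (2, 2) both give capacity 4), which is why the subsystem has fewer states than parent state combinations.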

3.1.2 BN Models of the Illustrative Parallel System

For the illustrative parallel system that is composed of five s-independent components, two candidate BN models, shown in Fig. 3, can also be constructed. It is worth noting that the DAGs of the two candidate BN models for the illustrative parallel system are exactly the same as those for the illustrative series system. Likewise, candidate BN model 2 of the illustrative parallel system is preferable and will be used for further analysis hereinafter.

For any multi-state parallel system consisting of \( n \) s-independent components \( \{ C_{{l_{1} }} ,C_{{l_{2} }} , \ldots ,C_{{l_{n} }} \} \), the system performance capacity at any time is determined by the performance capacities of all the components and is equal to \( G(t) = \sum\nolimits_{i = 1}^{n} {G_{{l_{i} }}^{\text{c}} (t)} \). Therefore, the system state can be obtained based on the state combinations of all the components. As an example, for the parallel system in Fig. 2, the performance capacity of subsystem 1 at any time is completely determined by components 1 and 2. The relations between the performance capacities (states) of subsystem 1 and the corresponding state combinations of components 1 and 2 are given in Table 4. As a result, the CPT of node \( S_{1} \) in the BN model of the parallel system, shown in Table 5, can be obtained. It can be seen that the CPT of node \( S_{1} \) of the parallel system is different from the CPT of node \( S_{1} \) of the series system. The CPTs of nodes \( S_{2} \), \( S_{3} \), and \( S \) can be obtained readily in the same fashion.

Table 4. Performance capacities and states of subsystem 1 of the parallel system
Table 5. CPT of node \( S_{1} \) of the parallel system
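
The difference between the parallel and series CPTs of node \( S_{1} \) comes only from the operator applied to the parent capacities. The sketch below contrasts the two for a pair of hypothetical identical components; the capacity values and the helper name `child_state_map` are assumptions, not data from Tables 1, 4, or 5.

```python
from itertools import product

# Hypothetical capacities for states 1 (best) to 3 (failed).
cap_c1 = [2, 1, 0]
cap_c2 = [2, 1, 0]


def child_state_map(cap1, cap2, op):
    """Map each parent state pair (i, j) to the child state index (1 = best)."""
    values = {(i, j): op(g1, g2)
              for (i, g1), (j, g2) in product(enumerate(cap1, 1), enumerate(cap2, 1))}
    levels = sorted(set(values.values()), reverse=True)
    return {pair: levels.index(g) + 1 for pair, g in values.items()}, levels


# Parallel: capacities add. Series: the smaller capacity is the bottleneck.
parallel_map, parallel_levels = child_state_map(cap_c1, cap_c2, lambda a, b: a + b)
series_map, series_levels = child_state_map(cap_c1, cap_c2, min)
```

For these values, the parallel subsystem has five capacity levels while the series subsystem has three, so the same DAG carries CPTs of different sizes and contents.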

3.1.3 BN Model of the Illustrative Series-Parallel System

Based on the system structure of the illustrative series-parallel system in Fig. 2, two candidate BN models, shown in Fig. 4, can also be constructed. Although candidate BN model 1 is intuitive, the CPTs of the child nodes in candidate BN model 2 are simplified by adding a subsystem node, i.e., node \( S_{2} \) of candidate BN model 2. Therefore, candidate BN model 2 of the illustrative series-parallel system is preferable and will be used for further analysis hereinafter.

Fig. 4. Candidate BN models of the series-parallel system

Corresponding to candidate BN model 2, subsystem 1 is composed of components 1 and 2 in parallel; subsystem 2 is composed of component 3 and subsystem 1 in series; subsystem 3 is composed of components 4 and 5 in parallel; the entire system consists of subsystems 2 and 3 in series. Consequently, the performance capacities of subsystems 1, 2, and 3 at time \( t \) are computed as \( G_{1}^{\text{s}} (t) = G_{1}^{\text{c}} (t) + G_{2}^{\text{c}} (t) \), \( G_{2}^{\text{s}} (t)\; = \;\hbox{min} \{ G_{3}^{\text{c}} (t),\;G_{1}^{\text{s}} (t)\} \), and \( G_{3}^{\text{s}} (t) = G_{4}^{\text{c}} (t) + G_{5}^{\text{c}} (t) \), respectively; the performance capacity of the entire system at time \( t \) is computed as \( G(t) = \hbox{min} \{ G_{2}^{\text{s}} (t),G_{3}^{\text{s}} (t)\} \). Analyses similar to those leading to Tables 3 and 5 can be carried out to obtain the CPTs of nodes \( S_{1} \), \( S_{2} \), \( S_{3} \), and \( S \).
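
The capacity relations above, together with the reliability definition of Sect. 2.1, can be combined in a brute-force sketch. It assumes five identical, s-independent three-state components with hypothetical capacities and state probabilities, and it enumerates all state combinations instead of running BN inference.

```python
from itertools import product


def series_parallel_capacity(g1, g2, g3, g4, g5):
    """System capacity of the series-parallel structure described above."""
    g_s1 = g1 + g2          # subsystem 1: components 1 and 2 in parallel
    g_s2 = min(g3, g_s1)    # subsystem 2: component 3 in series with subsystem 1
    g_s3 = g4 + g5          # subsystem 3: components 4 and 5 in parallel
    return min(g_s2, g_s3)  # system: subsystems 2 and 3 in series


# Hypothetical data: identical three-state components (illustrative only).
capacities = [2, 1, 0]         # capacity in states 1, 2, 3
state_probs = [0.7, 0.2, 0.1]  # state distribution at one time instant

# Enumerate all 3^5 state combinations of the s-independent components.
capacity_dist = {}
for combo in product(range(3), repeat=5):
    prob = 1.0
    for state in combo:
        prob *= state_probs[state]
    g = series_parallel_capacity(*(capacities[s] for s in combo))
    capacity_dist[g] = capacity_dist.get(g, 0.0) + prob

# The worst system state is the lowest capacity level; all others are acceptable.
reliability = 1.0 - capacity_dist[min(capacity_dist)]
```

Enumeration is feasible here because there are only 243 combinations; the BN with intermediate subsystem nodes reaches the same distribution through much smaller local tables.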

3.1.4 BN Model of the Illustrative Bridge System

The BN model of the illustrative bridge system can be constructed based on the minimal success paths [42]. For the illustrative bridge system in Fig. 2, there exist four minimal success paths, i.e., \( \{ C_{1} ,C_{4} \} \), \( \{ C_{1} ,C_{3} ,C_{5} \} \), \( \{ C_{2} ,C_{5} \} \), and \( \{ C_{2} ,C_{3} ,C_{4} \} \). Therefore, the bridge system can be decomposed into two simplified sub-models, and the BN model of the bridge system can be constructed as shown in Fig. 5. In the BN model, nodes \( S_{3} \) and \( S_{6} \) represent sub-models 1 and 2, respectively; node \( S \) represents the entire bridge system. The performance capacities of subsystems 1, 2, and 3 at time \( t \) can be computed as \( G_{1}^{\text{s}} (t) = \hbox{min} \{ G_{3}^{\text{c}} (t),\;G_{5}^{\text{c}} (t)\} \), \( G_{2}^{\text{s}} (t)\; = \;G_{4}^{\text{c}} (t)\; + \;G_{1}^{\text{s}} (t) \), and \( G_{3}^{\text{s}} (t)\; = \;\hbox{min} \{ G_{1}^{\text{c}} (t),G_{2}^{\text{s}} (t)\} \), respectively; the performance capacities of subsystems 4, 5, and 6 at time \( t \) can be computed as \( G_{4}^{\text{s}} (t)\; = \;\hbox{min} \{ G_{3}^{\text{c}} (t),\;G_{4}^{\text{c}} (t)\} \), \( G_{5}^{\text{s}} (t)\; = \;G_{5}^{\text{c}} (t)\; + \;G_{4}^{\text{s}} (t) \), and \( G_{6}^{\text{s}} (t)\; = \;\hbox{min} \{ G_{2}^{\text{c}} (t),\;G_{5}^{\text{s}} (t)\} \), respectively; the performance capacity of the entire system at time \( t \) can be computed as \( G(t) = G_{3}^{\text{s}} (t) + G_{6}^{\text{s}} (t) \). Consequently, the CPTs of all the child nodes in the BN model can be obtained readily.
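
The capacity relations of the two sub-models can be transcribed directly into a small function. The function name `bridge_capacity` and the test capacities are illustrative; the formulas are copied from the paragraph above without modification.

```python
def bridge_capacity(g1, g2, g3, g4, g5):
    """System capacity from the two sub-models of the bridge decomposition."""
    # Sub-model 1 (node S3)
    g_s1 = min(g3, g5)
    g_s2 = g4 + g_s1
    g_s3 = min(g1, g_s2)
    # Sub-model 2 (node S6)
    g_s4 = min(g3, g4)
    g_s5 = g5 + g_s4
    g_s6 = min(g2, g_s5)
    # Entire system (node S)
    return g_s3 + g_s6


# Hypothetical unit capacities: all components working, then single failures.
all_working = bridge_capacity(1, 1, 1, 1, 1)
bridge_failed = bridge_capacity(1, 1, 0, 1, 1)  # component 3 has zero capacity
input_failed = bridge_capacity(0, 1, 1, 1, 1)   # component 1 has zero capacity
```

Evaluating this function over all parent state combinations of each intermediate node gives the deterministic CPTs of nodes \( S_{1} \) to \( S_{6} \) and \( S \).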

Fig. 5. Decomposition and BN model of the bridge system

3.1.5 BN Model of the Illustrative Phased-Mission System

The multi-state phased-mission system herein is intended to perform a mission with \( H \) phases. The system may reconfigure in different phases to meet varying mission demands, resulting in a distinct system structure in each phase. If a component is suspended in a phase, the component is assumed not to deteriorate during the phase and its state remains unchanged. The number of subsystems in phase \( h \) is denoted by \( M_{h}^{\text{sub}} \) (\( h \in \{ 1,2, \ldots ,H\} \)); the numbers of the states of subsystem \( m \) and the entire system in phase \( h \) are denoted by \( N_{m,h}^{\text{sub}} \) and \( N_{h}^{\text{sys}} \), respectively. The performance capacities of component \( l \), subsystem \( m \), and the entire system at time \( t \) in phase \( h \) are denoted by \( G_{l,h}^{\text{c}} (t) \), \( G_{m,h}^{\text{s}} (t) \), and \( G_{h} (t) \), respectively. The states of component \( l \), subsystem \( m \), and the entire system at time \( t \) in phase \( h \) are denoted by \( C_{l,h} (t) \), \( S_{m,h} (t) \), and \( S_{h} (t) \), respectively. It should be noted that \( S_{m} (t) \) hereinbefore denotes the state of subsystem \( m \) at time \( t \) for a general MSS, whereas \( S_{h} (t) \) herein represents the state of the entire system at time \( t \) in phase \( h \) for a multi-state phased-mission system. The duration of phase \( h \) is \( T_{h} \) (\( h \in \{ 1,2, \ldots ,H\} \)) times the basic time interval.

The system survival at the end of a phase is determined not only by the system state at the end of that phase, but also by whether the system has survived to the end of the previous phase. Therefore, a binary-state node, denoted as \( D_{h} \) (\( h \in \{ 1,2, \ldots ,H\} \)), is introduced herein to indicate whether the system survives at the end of phase \( h \). By linking the nodes \( D_{h} \) of adjacent phases with directed edges, the probability of the system surviving in each phase can be characterized. Let \( D_{h} = 1 \) and \( D_{h} = 2 \) denote the system being in the functioning state and failure state at the end of phase \( h \), respectively. Therefore, the conditional probabilities of node \( D_{h} \) can be represented as follows:

$$ \Pr \{ D_{h} = 1|D_{h - 1} ,S_{h} \} = \left\{ {\begin{array}{*{20}c} 1 & {D_{h - 1} = 1\,{\text{and}}\, S_{h} {\text{ is acceptable}}} \\ 0 & {\text{otherwise}} \\ \end{array} } \right. , $$
(4)
$$ \Pr \{ D_{h} = 2|D_{h - 1} ,S_{h} \} = \left\{ {\begin{array}{*{20}c} 1 & {D_{h - 1} = 2 \,{\text{or}}\, S_{h} {\text{ is unacceptable}}} \\ 0 & {\text{otherwise}} \\ \end{array} } \right. . $$
(5)
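
As an illustration of Eqs. (4) and (5), the deterministic CPT of node \( D_{h} \) can be sketched in a few lines of Python. The function name `survival_cpt` and the Boolean flag standing for "\( S_{h} \) is acceptable" are hypothetical names introduced only for this sketch; which system states count as acceptable depends on the user demand and is not modeled here.

```python
def survival_cpt(d_prev: int, s_h_acceptable: bool) -> dict:
    """Return Pr{D_h | D_{h-1}, S_h} as {1: functioning, 2: failed}.

    d_prev is the state of D_{h-1} (1 = functioning, 2 = failed);
    s_h_acceptable tells whether S_h is an acceptable system state.
    """
    survives = (d_prev == 1) and s_h_acceptable  # condition of Eq. (4)
    return {1: 1.0, 2: 0.0} if survives else {1: 0.0, 2: 1.0}


# Enumerate all parent configurations of node D_h.
for d_prev in (1, 2):
    for acceptable in (True, False):
        print(d_prev, acceptable, survival_cpt(d_prev, acceptable))
```

Only the configuration in which the previous phase was survived and the current system state is acceptable yields \( D_{h} = 1 \); every other configuration yields \( D_{h} = 2 \).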

A set of nodes, which is denoted as \( {\mathbf{\Upomega }} = \{ \Upomega_{1} ,\Upomega_{2} , \ldots ,\Upomega_{H} \} \), are used to construct the BN model of a phased-mission system. \( \Upomega_{h} \; = \;\{ C_{1,h} ,\;C_{2,h} ,\; \ldots ,\;C_{{M^{\text{c}} ,h}} ;\;S_{1,h} ,\;S_{2,h} ,\; \ldots ,\;S_{{M^{\text{sub}} ,h}} ;\;S_{h} ;\;D_{h} \} \) (\( h \in \{ 1,2, \ldots ,H\} \)) is the set of nodes in phase \( h \), where nodes \( C_{l,h} \;(l\; \in \;\{ 1,\;2,\; \ldots ,\;M^{\text{c}} \} ), \) \( S_{m,h} (m\; \in \;\{ 1,\;2,\; \ldots ,\;M_{h}^{\text{sub}} \} ), \) and \( S_{h} \) correspond to component \( l \), subsystem \( m \), and the entire system in phase \( h \), respectively. In each phase, a local BN model can be constructed first based on the corresponding system structure. Directed edges are then added between component nodes across different phases to characterize the relations between different phases. The BN model of the illustrative phased-mission system in Fig. 2 can be constructed as shown in Fig. 6. The local BN model of phase \( h \) characterizes the phased-mission system at the end of phase \( h \). In phase 1, components 1, 2, and 3 are in operation, whereas components 4 and 5 are suspended. The performance capacities of subsystem 1 and the entire system at time \( t \) in phase 1 are denoted as \( G_{1,1}^{\text{s}} (t) = G_{1,1}^{\text{c}} (t) + G_{2,1}^{\text{c}} (t) \) and \( G_{1} (t)\; = \;\hbox{min} \{ G_{3,1}^{\text{c}} (t),\;G_{1,1}^{\text{s}} (t)\} \), respectively. Likewise, the performance capacities of the subsystem(s) and the entire system in phases 2 and 3 can be obtained. Consequently, the CPTs of all the subsystem nodes and the system nodes in the BN model can be obtained readily.

Fig. 6. BN model of the phased-mission system

If a component node in a phase is a root node, the MPT of the component node in the phase is essentially the state probability distribution of the corresponding component at a particular time. Nevertheless, if a component node in a phase is a child node, the CPT of the component node in the phase is essentially the state transition matrix of the corresponding component. As an example, for the BN model of the illustrative phased-mission system in Fig. 6, the CPTs of nodes \( C_{1,2} \) and \( C_{2,2} \) are the \( T_{2} \)-step state transition matrices of components 1 and 2, respectively; the CPTs of nodes \( C_{4,3} \) and \( C_{5,3} \) are the \( T_{3} \)-step state transition matrices of components 4 and 5, respectively; the CPT of node \( C_{3,3} \) is the \( T_{3} \)-step state transition matrix of component 3.
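
A minimal sketch of how such a multi-step CPT could be computed is given below. It assumes a homogeneous Markov component, so that the \( T \)-step transition matrix is the \( T \)-th power of the one-step matrix. The matrix `P1` and the duration `T2 = 18` are taken from the numerical example in Sect. 4.2; the helper names `mat_mul` and `mat_pow` are hypothetical.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(p, steps):
    """Return the steps-step transition matrix p ** steps."""
    n = len(p)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(steps):
        result = mat_mul(result, p)
    return result


# One-step transition matrix of component 1 (Sect. 4.2).
P1 = [[0.9185, 0.0564, 0.0251],
      [0.0, 0.9608, 0.0392],
      [0.0, 0.0, 1.0]]
T2 = 18  # duration of phase 2 in basic time intervals (Sect. 4.2)

# CPT of node C_{1,2} given its parent C_{1,1}: row = parent state, column = child state.
cpt_c12 = mat_pow(P1, T2)
for row in cpt_c12:
    print([round(x, 4) for x in row])
```

Each row of the resulting CPT is a probability distribution, and the failure state remains absorbing.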

3.2 DBN Models of Typical MSSs

The BN models in Sect. 3.1 are all static models that can only represent an MSS at a particular time. The DBN model of an MSS can characterize the degradation process of the MSS during the operation period. By using a time slice to represent an MSS at a particular time, the DBN model of the MSS is inherently a discrete-time model. In a DBN model, every time slice is a copy of the BN model of the MSS at a particular time. The time interval between two adjacent time slices is a basic time interval, i.e., \( \Delta t \). If the operation period is \( T \cdot \Delta t \), the number of time slices is equal to \( T + 1 \). Time slice \( t \) (\( t \in \{ 0,1, \ldots ,T\} \)) represents the local BN model at time \( t \). A set of nodes, denoted as \( {\mathbf{\Upomega }}\; = \;\{ \Upomega (0),\;\Upomega (1),\; \ldots ,\;\Upomega (T)\} \), are used to construct the DBN model of an MSS. \( \Upomega (t)\; = \;\{ C_{1} (t),\;C_{2} (t),\; \ldots ,\;C_{{M^{\text{c}} }} (t);\;S_{1} (t),\;S_{2} (t),\; \ldots ,\;S_{{M^{\text{sub}} }} (t);\;S(t)\} \) is the set of nodes in time slice \( t \), where \( C_{l} (t)\;(l\; \in \;\{ 1,\;2,\; \ldots ,\;M^{\text{c}} \} ), \) \( S_{m} (t)\;(m\; \in \;\{ 1,\;2,\; \ldots ,\;M^{\text{sub}} \} ), \) and \( S(t) \) correspond to component \( l \), subsystem \( m \), and the entire system at time \( t \), respectively. The MPT of node \( C_{l} (0) \) is the state probability distribution of component \( l \) at the beginning of use.

A temporal link from node \( C_{l} (t)\;(l\; \in \;\{ 1,\;2,\; \ldots ,\;M^{\text{c}} \} ,\;t\; \in \;\{ 0,\;1,\; \ldots ,\;T - 1\} ) \) to node \( C_{l} (t + 1) \) is added to connect the two component nodes between two adjacent time slices, and it characterizes the degradation profiles of component \( l \) within a basic time interval. The strength of the temporal link from node \( C_{l} (t) \) to node \( C_{l} (t + 1) \) is quantified by the CPT of node \( C_{l} (t + 1) \) which is equivalent to the state transition matrix of component \( l \). The illustrative systems in Fig. 2 are used herein to provide detailed procedures of constructing the DBN models of various systems.

The DBN models of the illustrative series system in an extended form and an abstract form are shown in Fig. 7. In the extended form of the DBN model, all the time slices from time slice 0 to time slice \( T \) are displayed. In the abstract form of the DBN model, only a particular time slice, i.e., time slice \( t \), is displayed. The number attached to each temporal link, i.e., “1”, represents the number of time slices used for the temporal dependency. The number in the square box, i.e., \( T + 1 \), represents the total number of time slices in the DBN model. The DBN models of the illustrative parallel system in an extended form and an abstract form are also shown in Fig. 7. As discussed in Sect. 3.1, although the DAGs of the DBN models for the series and parallel systems are identical, the CPTs of the subsystem nodes and system nodes, i.e., nodes \( S_{m} (t)\;(m\; \in \;\{ 1,\;2,\; \ldots ,\;M^{\text{sub}} \} ,\;t\; \in \;\{ 0,\;1,\; \ldots ,\;T\} ) \) and \( S(t)\;(t\; \in \;\{ 0,\;1,\; \ldots ,\;T\} ) \) are distinct. In a similar manner, the abstract forms of the DBN models of the illustrative series-parallel and bridge systems are shown in Fig. 8.
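
The extended form described above can be generated mechanically from the abstract form. The following sketch unrolls a single time slice into \( T + 1 \) slices and adds one temporal link per component between adjacent slices. The two-component slice (`C1`, `C2`, and a system node `S`) is a hypothetical stand-in, since the exact structure of Fig. 7 is not reproduced here; the function name `unroll_dbn` is likewise an assumption.

```python
def unroll_dbn(intra_slice_edges, component_nodes, T):
    """Return the directed edges of the extended DBN with T + 1 time slices.

    Each node is represented as a (name, t) pair.
    """
    edges = []
    for t in range(T + 1):
        # Edges inside time slice t (the local BN model).
        for parent, child in intra_slice_edges:
            edges.append(((parent, t), (child, t)))
        # Temporal links C_l(t) -> C_l(t + 1), spanning one time slice.
        if t < T:
            for c in component_nodes:
                edges.append(((c, t), (c, t + 1)))
    return edges


components = ["C1", "C2"]               # hypothetical component nodes
intra = [("C1", "S"), ("C2", "S")]      # hypothetical slice: both components feed S
edges = unroll_dbn(intra, components, T=3)
nodes = {node for edge in edges for node in edge}
print(len(nodes), "nodes,", len(edges), "edges")
```

Because the series and parallel systems share the same DAG, the same edge list would serve both; only the CPTs attached to the subsystem and system nodes differ.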

Fig. 7. DBN models of the series and parallel systems

Fig. 8. DBN models of the series-parallel and bridge systems

For a phased-mission system, a set of nodes, denoted as \( {\mathbf{\Upomega }}\; = \;\{ {\mathbf{\Upomega }}_{1} ,\;{\mathbf{\Upomega }}_{2} ,\; \ldots ,\;{\mathbf{\Upomega }}_{H} \} , \) are used to construct the DBN model, where \( {\mathbf{\Upomega }}_{1} \; = \;\{ \Upomega_{1} (0),\;\Upomega_{1} (1),\; \ldots ,\;\Upomega_{1} (T_{1} )\} \) and \( {\mathbf{\Upomega }}_{h} = \{ \Upomega_{h} (\sum\nolimits_{k = 1}^{h - 1} {T_{k} } + 1),\;\Upomega_{h} (\sum\nolimits_{k = 1}^{h - 1} {T_{k} } + 2),\; \ldots ,\;\Upomega_{h} (\sum\nolimits_{k = 1}^{h} {T_{k} } )\} \) \( (h\; \in \;\{ 2,\;3,\; \ldots ,\;H\} ) \) represent the local DBN models of the system in phase 1 and phase \( h \), respectively. \( \Upomega_{h} (t)\; = \;\{ C_{1,h} (t),\;C_{2,h} (t),\; \ldots ,\;C_{{M^{\text{c}} ,h}} (t);\;S_{1,h} (t),\;S_{2,h} (t),\; \ldots ,\;S_{{M^{\text{sub}} ,h}} (t);\;S_{h} (t);\;D_{h} (t)\} \) \( (h\; \in \;\{ 1,\;2,\; \ldots ,\;H\} ) \) represents time slice \( t \) in phase \( h \), where \( C_{l,h} (t)\;(l\; \in \;\{ 1,\;2,\; \ldots ,\;M^{\text{c}} \} ), \) \( S_{m,h} (t)\;(m\; \in \;\{ 1,\;2,\; \ldots ,\;M_{h}^{\text{sub}} \} ), \) \( S_{h} (t) \), and \( D_{h} (t) \) correspond to component \( l \), subsystem \( m \), the entire system, and system survival at time \( t \) in phase \( h \), respectively. Here, \( t \) is the elapsed time from the beginning of use.
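
The time indices covered by each phase follow directly from the node-set definition above. A short sketch, assuming the phase durations of the numerical example in Sect. 4.2 (\( T_{1} = 12 \), \( T_{2} = 18 \), \( T_{3} = 20 \)) and a hypothetical helper name `phase_slice_ranges`, is:

```python
def phase_slice_ranges(durations):
    """Return the (first, last) time index of the slices in each phase.

    Phase 1 covers t = 0, ..., T_1; phase h >= 2 covers
    t = T_1 + ... + T_{h-1} + 1, ..., T_1 + ... + T_h.
    """
    ranges = []
    elapsed = 0
    for h, t_h in enumerate(durations, start=1):
        first = 0 if h == 1 else elapsed + 1
        last = elapsed + t_h
        ranges.append((first, last))
        elapsed = last
    return ranges


ranges = phase_slice_ranges([12, 18, 20])
print(ranges)  # [(0, 12), (13, 30), (31, 50)]
print([last - first + 1 for first, last in ranges])  # [13, 18, 20]
```

Phase 1 thus contains \( T_{1} + 1 \) time slices because it includes the slice at the beginning of use, whereas every later phase \( h \) contains \( T_{h} \) slices.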

The DBN model of the illustrative phased-mission system is shown in Fig. 9. In each phase, a local DBN model is constructed to characterize the degradation process of the system in the phase. \( T_{1} + 1 \), \( T_{2} \), and \( T_{3} \) time slices are repeated in phases 1, 2, and 3, respectively. In particular, the adjacent time slices at the end of phase \( h \) (\( h \in \{ 1,2\} \)) and the beginning of phase \( h + 1 \) are depicted to show the detailed temporal dependencies between the two adjacent phases. As components 1 and 2 are in operation in both phases 1 and 2, in time slices \( T_{1} + 1 \) and \( T_{1} + 2 \), only two temporal links are added to the corresponding component nodes, i.e., the directed edge from node \( C_{l,1} (T_{1} + 1) \) to node \( C_{l,2} (T_{1} + 2) \) (\( l \in \{ 1,2\} \)). Likewise, as components 4 and 5 are in operation in both phases 2 and 3, in time slices \( T_{1} + T_{2} + 1 \) and \( T_{1} + T_{2} + 2 \), only two temporal links are added to the corresponding component nodes, i.e., the directed edge from node \( C_{l,2} (T_{1} + T_{2} + 1) \) to node \( C_{l,3} (T_{1} + T_{2} + 2) \) (\( l \in \{ 4,5\} \)). As components 4 and 5 are suspended in phase 1, the state probability distributions of nodes \( C_{4,2} (T_{1} + 2) \) and \( C_{5,2} (T_{1} + 2) \) are actually the corresponding state probability distributions of components 4 and 5 at the beginning of use, respectively. Nevertheless, as component 3 is in operation in phases 1 and 3 and is idle in phase 2, the state probability distribution of node \( C_{3,3} (T_{1} + T_{2} + 2) \) is equal to the state probability distribution of node \( C_{3,1} (T_{1} + 1) \).

Fig. 9. DBN model of the phased-mission system

3.3 Failure Dependencies in BN and DBN Models

The components in the BN and DBN models presented in Sects. 3.1 and 3.2 are assumed to be s-independent. Nevertheless, in real-world situations, the failure processes of the components are often s-dependent. BNs are a powerful tool to cope with various dependencies and can be utilized to model failure dependencies between components during their degradation processes. Two typical failure dependencies, i.e., CCFs [44, 69,70,71,72] and immediate failure dependence (IFD) [73, 74], are considered in the illustrative series-parallel system herein.

CCFs are the failures of multiple dependent components within a system because of a shared root cause or a common cause (CC) [69, 71], such as extreme environmental conditions or human errors. The presence of CCFs tends to increase the joint failure probability of a system, contributing significantly to the overall unreliability of systems subject to CCFs. Therefore, it is crucial to incorporate CC effects into the reliability modeling and assessment of systems subject to CCFs to avoid overestimation of system reliability measures.

An MSS can be subject to CCFs because of various elementary CCs. CCs are mutually exclusive and are external to the system. In general, CCs existing in an MSS can be denoted as \( \{ CC_{1} ,\;CC_{2} ,\; \ldots ,\;CC_{{n_{\text{CC}} }} \} \), where \( n_{\text{CC}} \) represents the number of elementary CCs. Therefore, a set of nodes, denoted as \( \Upomega \; = \;\{ C_{1} ,\;C_{2} ,\; \ldots ,\;C_{{M^{\text{c}} }} ;\;S_{1} ,\;S_{2} ,\; \ldots ,\;S_{{M^{\text{sub}} }} ;\;S;\;CC_{1} ,\;CC_{2} ,\; \ldots ,\;CC_{{n_{\text{CC}} }} ;\;U_{1} ,\;U_{2} ,\; \ldots ,\;U_{{M^{\text{c}} }} \} \), can be used to construct the BN model of an MSS with CCFs. Node \( C_{l} \) (\( l \in \{ 1,2, \ldots ,M^{\text{c}} \} \)) denotes the state probability distribution of component \( l \) caused by its own degradation, whereas node \( U_{l} \) denotes the state probability distribution of component \( l \) incorporating the effects of CCs. Node \( U_{l} \) can be a null node if component \( l \) is not affected by any CCs. Node \( CC_{k} \) (\( k \in \{ 1,2, \ldots ,n_{\text{CC}} \} \)) has two states, i.e., \( CC_{k} \in \{ 1,2\} \). States 1 and 2 of node \( CC_{k} \) represent the non-occurrence and occurrence of the \( k \)th CC, respectively. The inter-arrival time of the \( k \)th CC is assumed to be exponentially distributed with parameter \( \lambda_{k}^{\text{CC}} \). If component \( l \) is affected by \( n \) CCs \( \{ CC_{{k_{1} }} ,CC_{{k_{2} }} , \ldots ,CC_{{k_{n} }} \} \), the conditional probabilities of node \( U_{l} \) can be represented as follows:

$$ \Pr \{ U_{l} = N_{l}^{\text{c}} |C_{l} ;CC_{{k_{1} }} ,CC_{{k_{2} }} , \ldots ,CC_{{k_{n} }} \} = \left\{ {\begin{array}{*{20}c} 1 & {C_{l} = N_{l}^{\text{c}} {\text{ or }}\exists CC_{{k_{j} }} = 2} \\ 0 & {C_{l} \ne N_{l}^{\text{c}} {\text{ and }}\forall CC_{{k_{j} }} = 1} \\ \end{array} } \right. , $$
(6)
$$ \Pr \{ U_{l} = i|C_{l} ;CC_{{k_{1} }} ,CC_{{k_{2} }} , \ldots ,CC_{{k_{n} }} \} = \left\{ {\begin{array}{*{20}c} 1 & {C_{l} = i{\text{ and }}\forall CC_{{k_{j} }} = 1} \\ 0 & {C_{l} \ne i{\text{ or }}\exists CC_{{k_{j} }} = 2} \\ \end{array} } \right., \, i \ne N_{l}^{\text{c}} . $$
(7)
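
Equations (6) and (7) define a deterministic CPT, which can be enumerated programmatically. The sketch below is only an illustration: the function name `ccf_cpt` is hypothetical, states are numbered from 1 with state \( N_{l}^{\text{c}} \) as the failure state, and the example call assumes a three-state component affected by a single CC, as is the case for component 1 in the illustrative series-parallel system.

```python
from itertools import product


def ccf_cpt(n_states, n_cc):
    """Build the CPT of node U_l from Eqs. (6) and (7).

    Keys are parent configurations (c_l, cc_1, ..., cc_n); values are the
    distribution of U_l over states 1, ..., n_states as a list.
    """
    cpt = {}
    parent_ranges = [range(1, n_states + 1)] + [(1, 2)] * n_cc
    for parents in product(*parent_ranges):
        c_l, ccs = parents[0], parents[1:]
        # U_l is failed if the component itself failed or any CC occurred.
        failed = (c_l == n_states) or any(cc == 2 for cc in ccs)
        u_l = n_states if failed else c_l
        cpt[parents] = [1.0 if i == u_l else 0.0
                        for i in range(1, n_states + 1)]
    return cpt


cpt_u1 = ccf_cpt(n_states=3, n_cc=1)
for parents, dist in sorted(cpt_u1.items()):
    print(parents, dist)
```

Whenever the CC occurs, all probability mass moves to the failure state regardless of the component's own degradation state; otherwise \( U_{l} \) simply copies \( C_{l} \).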

In the DBN model with CCFs, the temporal links are added to the CC nodes to characterize the occurrence of the CCs. The marginal probabilities of node \( CC_{k} (0) \) in time slice 0 can be denoted as \( \Pr \{ CC_{k} (0) = 1\} = 1 \) and \( \Pr \{ CC_{k} (0) = 2\} = 0 \). Due to the memoryless property of the exponential distribution, the conditional probabilities of node \( CC_{k} (t) \) (\( t \in \{ 1,2, \ldots ,T\} \)) can be represented as follows:

$$ \Pr \{ CC_{k} (t) = 1|CC_{k} (t - 1)\} = \exp ( - \lambda_{k}^{\text{CC}} \cdot \Delta t) , $$
(8)
$$ \Pr \{ CC_{k} (t) = 2|CC_{k} (t - 1)\} = 1 - \exp ( - \lambda_{k}^{\text{CC}} \cdot \Delta t) . $$
(9)

In addition, if component \( l \) is not affected by CCs, a temporal link is added from node \( C_{l} (t - 1) \) to node \( C_{l} (t) \) (\( t \in \{ 1,2, \ldots ,T\} \)). In contrast, if component \( l \) is affected by CCs, a temporal link is added from node \( U_{l} (t - 1)\;(t\; \in \;\{ 1,\;2,\; \ldots ,\;T\} ) \) to node \( C_{l} (t) \), since \( U_{l} (t - 1) \) represents the actual condition of component \( l \).

The illustrative series-parallel system in Fig. 2 is used herein for further analysis. Suppose that two CCs exist in the system; \( CC_{1} \) affects components 1 and 2; \( CC_{2} \) affects components 4 and 5. Consequently, the BN and DBN models of the illustrative series-parallel system with CCFs are shown in Fig. 10. In the BN model, node \( U_{3} \) is omitted since component 3 is not affected by any CCs. As an example, based on Eqs. (6) and (7), the CPT of \( U_{1} \) is tabulated in Table 6. Two time slices of the DBN model are shown in Fig. 11 to present the details of the temporal links between the CC nodes and component nodes. In the DBN model, the CPT of node \( C_{l} (t)\;(t\; \in \;\{ 1,\;2,\; \ldots ,\;T\} ) \) is always the one-step transition matrix of component \( l \) regardless of the parent node of node \( C_{l} (t) \). Furthermore, the BN and DBN models with CCFs can be extended to more generalized cases, such as probabilistic CCFs [71, 72] and the case in which a CC can manifest multiple states [43].

Fig. 10. BN and DBN models of the series-parallel system with CCFs

Table 6. CPT of node \( U_{1} \) in the BN model with CCFs

Fig. 11. Two time slices of the DBN model with CCFs

IFD is common in real-world situations. It refers to the case in which the failure of a component (the influencing component) may cause immediate failures of some other components (the affected components) [73, 74]. For instance, the failure of an electrical component creates a voltage spike that immediately triggers the failures of the neighboring components.

A set of nodes, denoted as \( \Upomega \; = \;\{ C_{1} ,\;C_{2} ,\; \ldots ,\;C_{{M^{\text{c}} }} ;\;S_{1} ,\;S_{2} ,\; \ldots ,\;S_{{M^{\text{sub}} }} ;\;S;\;U_{1} ,\;U_{2} ,\; \ldots ,\;U_{{M^{\text{c}} }} \} \), can be used to construct the BN model of an MSS with IFD. Node \( C_{l} \) (\( l \in \{ 1,2, \ldots ,M^{\text{c}} \} \)) denotes the state probability distribution of component \( l \) caused by its own degradation, whereas node \( U_{l} \) denotes the state probability distribution of component \( l \) incorporating the effects of IFD. Likewise, node \( U_{l} \) can be a null node if component \( l \) is not affected by other components. Suppose that an immediate failure of affected component \( l \) occurs with probability \( p_{l}^{\text{IF}} \) if component \( l \) has not failed and one of its influencing components fails. If the failure of any of \( n \) components \( \{ C_{{k_{1} }} ,C_{{k_{2} }} , \ldots ,C_{{k_{n} }} \} \) can cause the failure of component \( l \), the conditional probabilities of node \( U_{l} \) can be represented as follows:

$$ \Pr \{ U_{l} = N_{l}^{\text{c}} |C_{l} ;C_{{k_{1} }} ,C_{{k_{2} }} , \ldots ,C_{{k_{n} }} \} = \left\{ {\begin{array}{*{20}c} 1 & {C_{l} = N_{l}^{\text{c}} } \\ {p_{l}^{\text{IF}} } & {C_{l} \ne N_{l}^{\text{c}} {\text{ and }}\exists C_{{k_{j} }} = N_{{k_{j} }}^{\text{c}} } \\ 0 & {C_{l} \ne N_{l}^{\text{c}} {\text{ and }}\forall C_{{k_{j} }} \ne N_{{k_{j} }}^{\text{c}} } \\ \end{array} } \right. , $$
(10)
$$ \Pr \{ U_{l} = i|C_{l} ;C_{{k_{1} }} ,C_{{k_{2} }} , \ldots ,C_{{k_{n} }} \} = \left\{ {\begin{array}{*{20}c} 1 & {C_{l} = i{\text{ and }}\forall C_{{k_{j} }} \ne N_{{k_{j} }}^{\text{c}} } \\ {1 - p_{l}^{\text{IF}} } & {C_{l} = i{\text{ and }}\exists C_{{k_{j} }} = N_{{k_{j} }}^{\text{c}} } \\ 0 & {C_{l} \ne i \, } \\ \end{array} } \right., \, i \ne N_{l}^{\text{c}} . $$
(11)
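
The probabilistic CPT of Eqs. (10) and (11) can be enumerated in the same spirit. In the following sketch, the function name `ifd_cpt` and the value \( p_{l}^{\text{IF}} = 0.8 \) are hypothetical, and each influencing component is assumed to be failed when it occupies its own highest-numbered state. The example call mimics node \( U_{2} \), whose parents are \( C_{2} \) and the influencing component \( C_{1} \), both with three states.

```python
from itertools import product


def ifd_cpt(n_l, n_influencers, p_if):
    """Build the CPT of node U_l under immediate failure dependence.

    n_l: number of states of the affected component l.
    n_influencers: list with the number of states of each influencing component.
    p_if: probability that a failed influencing component fails component l.
    Keys are (c_l, c_k1, ..., c_kn); values are distributions over 1..n_l.
    """
    cpt = {}
    ranges = [range(1, n_l + 1)] + [range(1, n + 1) for n in n_influencers]
    for parents in product(*ranges):
        c_l, others = parents[0], parents[1:]
        triggered = any(c == n for c, n in zip(others, n_influencers))
        dist = [0.0] * n_l
        if c_l == n_l:            # component l has already failed by itself
            dist[n_l - 1] = 1.0
        elif triggered:           # some influencing component has failed
            dist[n_l - 1] = p_if
            dist[c_l - 1] = 1.0 - p_if
        else:                     # no influence: U_l copies C_l
            dist[c_l - 1] = 1.0
        cpt[parents] = dist
    return cpt


cpt_u2 = ifd_cpt(n_l=3, n_influencers=[3], p_if=0.8)  # p_if is an assumed value
for parents, dist in sorted(cpt_u2.items()):
    print(parents, dist)
```

Unlike the CCF case, the rows triggered by a failed influencing component are not deterministic: the mass is split between the failure state and the component's own degradation state.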

In the DBN model with IFD, if component \( l \) is not affected by other components, a temporal link is added from node \( C_{l} (t - 1) \) to node \( C_{l} (t) \) (\( t \in \{ 1,2, \ldots ,T\} \)). In contrast, if the failure of any of \( n \) components \( \{ C_{{k_{1} }} ,C_{{k_{2} }} , \ldots ,C_{{k_{n} }} \} \) can cause the failure of component \( l \), a temporal link is added from node \( U_{l} (t - 1)\;(t\; \in \;\{ 1,\;2,\; \ldots ,\;T\} ) \) to node \( C_{l} (t) \), since \( U_{l} (t - 1) \) represents the actual condition of component \( l \).

The illustrative series-parallel system in Fig. 2 is used herein for further analysis. Suppose that the failure of component 1 can cause an immediate failure of component 2; the failure of component 4 can also lead to a failure of component 5 immediately. Consequently, the BN and DBN models of the illustrative series-parallel system with IFD are shown in Fig. 12. In the BN model, nodes \( U_{1} \), \( U_{3} \), and \( U_{4} \) are omitted. As an example, based on Eqs. (10) and (11), the CPT of node \( U_{2} \) is tabulated in Table 7.

Fig. 12. BN and DBN models of the series-parallel system with IFD

Table 7. CPT of node \( U_{2} \) in the BN model with IFD

4 Reliability Assessment by DBNs

In this section, the system reliability of the preceding MSSs is assessed based on their DBN models. If no evidence is inserted, the state probability distribution of the entire system at any time can be obtained by marginalizing the system node in the corresponding time slice. If some nodes are instantiated, the state probability distribution of the system node in any time slice can be updated by BN inference algorithms. Subsequently, by defining the acceptable states of an MSS, the reliability of the entire system can be estimated for any time instant.

4.1 BN Inference

Let the node set \( {\mathbf{\Upomega }}\; = \;\{ \Upomega (0),\;\Upomega (1),\; \ldots ,\;\Upomega (T)\} \) denote the DBN model of an MSS, where \( \Upomega (t)\; = \;\{ C_{1} (t),\;C_{2} (t),\; \ldots ,\;C_{{M^{\text{c}} }} (t);\;S_{1} (t),\;S_{2} (t),\; \ldots ,\;S_{{M^{\text{sub}} }} (t);\;S(t)\} \). The joint probability of the DBN model can be expressed as:

$$ \begin{aligned} \Pr({\varvec{\Omega}}) & = \Pr \left\{ {\bigcup\limits_{t = 0}^{T} {\left[ {\bigcup\limits_{l = 1}^{{M^{\text{c}} }} {C_{l} (t)} ;\bigcup\limits_{m = 1}^{{M^{\text{sub}} }} {S_{m} (t)} ;S(t)} \right]} } \right\} \\ & { = }\prod {\left\{ {\begin{array}{*{20}c} {\prod\limits_{l = 1}^{{M^{\text{c}} }} {\Pr \{ C_{l} (0)\} \prod\limits_{t = 1}^{T} {\Pr \{ C_{l} (t)|C_{l} (t - 1)\} } } } & {\text{component nodes}} \\ {\prod\limits_{m = 1}^{{M^{\text{sub}} }} {\prod\limits_{t = 0}^{T} {\Pr \{ S_{m} (t)|{\mathbf{pa}}(S_{m} (t))\} \quad \quad \quad \;} } } & {\text{subsystem nodes}} \\ {\prod\limits_{t = 0}^{T} {\Pr \{ S(t)|{\mathbf{pa}}(S(t))\} } \quad \quad \quad \quad \quad \;} & {\text{system nodes}} \\ \end{array} } \right.} \\ \end{aligned} . $$
(12)

The state probability distribution of the entire system at time \( t \) can be obtained by marginalizing node \( S(t) \), which is represented as [18]:

$$ \Pr \{ S(t)\} = \sum\nolimits_{{{\varvec{\Omega}}\backslash S(t)}} {\Pr \{ {\varvec{\Omega}}\} } . $$
(13)

During the operation period, the states of some components, subsystems, and the entire system can be observed by conducting condition monitoring periodically or non-periodically. If a component, subsystem, or the entire system is observed in a particular state at a particular time, the corresponding node in the DBN model of the system is instantiated with the observed state. Suppose that \( ne \) nodes in a DBN model are instantiated. The evidence of the DBN model is then denoted as \( {\mathbf{e}} = \{ e_{{X_{1} }} ,e_{{X_{2} }} , \ldots ,e_{{X_{ne} }} \} \), where \( e_{{X_{i} }} \) (\( i \in \{ 1,2, \ldots ,ne\} \)) denotes the evidence of node \( X_{i} \), i.e., the observed state of a component, subsystem, or the entire system at a particular time. Consequently, when evidence \( {\mathbf{e}} \) is entered into a DBN model, on the basis of the Bayes formula, the posterior probability distribution of the system state at time \( t \) can be obtained by marginalizing node \( S(t) \), and it is represented as follows [18]:

$$ \Pr \{ S(t)|{\mathbf{e}}\} = \frac{{\sum\nolimits_{{{\varvec{\Omega}}\backslash S(t)}} {\Pr \{ {\varvec{\Omega}},{\mathbf{e}}\} } }}{{\Pr \{ {\mathbf{e}}\} }} , $$
(14)

where \( \Pr \{ {\mathbf{e}}\} \) is the prior probability of evidence \( {\mathbf{e}} \). \( \Pr \{ {\mathbf{e}}\} \) can be calculated by marginalizing the instantiated nodes, i.e., nodes \( \{ X_{1} ,X_{2} , \ldots ,X_{ne} \} \), which is represented as follows [18]:

$$ \Pr \{ {\mathbf{e}}\} = \sum\nolimits_{{{\varvec{\Omega}}\backslash \{ X_{1} ,X_{2} , \ldots ,X_{ne} \} }} {\Pr \{ {\varvec{\Omega}},{\mathbf{e}}\} } . $$
(15)

Equations (13)–(15) can be evaluated by various BN inference algorithms, such as the variable elimination algorithm and the junction tree algorithm. Details of the BN inference algorithms can be found in the books by Jensen and Nielsen [18] and Koller and Friedman [21]. Consequently, the state probability distribution of the entire system at time \( t \) can be evaluated by Eq. (13).
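
For very small models, Eqs. (13)–(15) can also be evaluated by brute-force enumeration of the joint probability in Eq. (12), which makes the marginalization explicit. The sketch below is not one of the inference algorithms named above; it is a naive enumeration under several assumptions: a hypothetical two-component system with the one-step matrices \( {\mathbf{P}}_{1} \) and \( {\mathbf{P}}_{2} \) of Sect. 4.2, both components starting in their best state, \( T = 2 \), and a hypothetical deterministic system node that is acceptable only if neither component has failed. States are indexed from 0 in the code.

```python
from itertools import product

P1 = [[0.9185, 0.0564, 0.0251], [0.0, 0.9608, 0.0392], [0.0, 0.0, 1.0]]
P2 = [[0.9231, 0.0474, 0.0295], [0.0, 0.9734, 0.0266], [0.0, 0.0, 1.0]]
P = {1: P1, 2: P2}
P0 = [1.0, 0.0, 0.0]   # assumed MPT of C_l(0): best state with certainty
T = 2                  # number of temporal steps (T + 1 time slices)
STATES = (0, 1, 2)     # zero-based component states; 2 is the failure state


def system_state(c1, c2):
    """Hypothetical deterministic CPT of S(t): 0 = acceptable, 1 = unacceptable."""
    return 0 if (c1 < 2 and c2 < 2) else 1


def joint(trajs):
    """Joint probability of the component trajectories, as in Eq. (12)."""
    pr = 1.0
    for l, traj in trajs.items():
        pr *= P0[traj[0]]
        for t in range(1, T + 1):
            pr *= P[l][traj[t - 1]][traj[t]]
    return pr


def system_distribution(t, evidence=None):
    """Pr{S(t)} or Pr{S(t) | e}; evidence maps (component, time) -> state."""
    evidence = evidence or {}
    unnormalized = [0.0, 0.0]
    for traj1, traj2 in product(product(STATES, repeat=T + 1), repeat=2):
        trajs = {1: traj1, 2: traj2}
        # Skip configurations that contradict the evidence.
        if any(trajs[l][tt] != s for (l, tt), s in evidence.items()):
            continue
        unnormalized[system_state(traj1[t], traj2[t])] += joint(trajs)
    pr_e = sum(unnormalized)            # Pr{e}, Eq. (15); equals 1 without evidence
    return [u / pr_e for u in unnormalized]


prior = system_distribution(2)                      # Eq. (13)
posterior = system_distribution(2, {(1, 1): 1})     # Eq. (14): C_1(1) observed degraded
print(prior, posterior)
```

The cost of this enumeration grows exponentially with the number of nodes and time slices, which is why variable elimination or junction tree algorithms are preferred for models of realistic size.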

4.2 Reliability Assessment by Aggregating Multi-level Observation Data

The degradation processes of the components, subsystems, and the entire system of an MSS can be inspected by collecting condition monitoring data from sensors that are mounted at various physical levels of the system (component level, subsystem level, and system level). Observation data can be collected from multiple levels of an MSS simultaneously or asynchronously during the operation stage [7, 19, 75,76,77]. If an inspection is conducted at a particular time, the state probability distribution and reliability of an MSS can be updated by aggregating multi-level observation data. Moreover, if inspections are conducted chronologically during the operation period, the state probability distribution and reliability of an MSS can be updated dynamically. The evidence in the DBN model of an MSS is essentially the collected multi-level observation data. Therefore, the state probability distribution and reliability of an MSS can be updated using Eq. (14) once evidence is inserted into the DBN model of the MSS. More details on updating system reliability dynamically by observation data during the operation period can be found in [77,78,79]. The illustrative systems in Fig. 2 are used herein for further analysis.

For each of the five systems in Fig. 2, i.e., the series system, parallel system, series-parallel system, bridge system, and phased-mission system, the system is considered reliable if the performance capacity of the entire system is not less than a user demand. The user demand of each of the five systems is set to 3. The one-step state transition matrices of all the components are given as follows:

$$ {\mathbf{P}}_{1} = \left[ {\begin{array}{*{20}c} {0.9185} & {0.0564} & {0.0251} \\ 0 & {0.9608} & {0.0392} \\ 0 & 0 & 1 \\ \end{array} } \right],\,{\mathbf{P}}_{2} = \left[ {\begin{array}{*{20}c} {0.9231} & {0.0474} & {0.0295} \\ 0 & {0.9734} & {0.0266} \\ 0 & 0 & 1 \\ \end{array} } \right], $$
$$ {\mathbf{P}}_{4} = \left[ {\begin{array}{*{20}c} {0.9579} & {0.0290} & {0.0131} \\ 0 & {0.9724} & {0.0276} \\ 0 & 0 & 1 \\ \end{array} } \right],\,{\mathbf{P}}_{5} = \left[ {\begin{array}{*{20}c} {0.9550} & {0.0308} & {0.0142} \\ 0 & {0.9704} & {0.0296} \\ 0 & 0 & 1 \\ \end{array} } \right], $$
$$ {\mathbf{P}}_{3} = \left[ {\begin{array}{*{20}c} {0.9465} & {0.0286} & {0.0149} & {0.0100} \\ 0 & {0.9589} & {0.0281} & {0.0130} \\ 0 & 0 & {0.9802} & {0.0198} \\ 0 & 0 & 0 & 1 \\ \end{array} } \right]. $$

The duration of the operation period is set at \( T = 50 \) units of time. For the phased-mission system, the durations of the three phases are set at \( T_{1} = 12 \) units of time, \( T_{2} = 18 \) units of time, and \( T_{3} = 20 \) units of time. Consequently, for the series, parallel, series-parallel, and bridge systems, system reliabilities at time \( t \), denoted as \( R(t) \), are shown in Fig. 13; for the phased-mission system, system reliability at time \( t \) is shown in Fig. 14.
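
To indicate how a reliability curve \( R(t) \) without evidence can be obtained, the following sketch propagates the component state distributions slice by slice and sums the probabilities of the acceptable configurations. It does not reproduce Fig. 13: the system is a hypothetical parallel pair of components 1 and 2 whose capacities add up, and the state capacities in `CAPACITY` are assumed values, since the capacity table of the illustrative systems is not restated here. Only \( {\mathbf{P}}_{1} \), \( {\mathbf{P}}_{2} \), the user demand of 3, and \( T = 50 \) are taken from the text.

```python
P1 = [[0.9185, 0.0564, 0.0251], [0.0, 0.9608, 0.0392], [0.0, 0.0, 1.0]]
P2 = [[0.9231, 0.0474, 0.0295], [0.0, 0.9734, 0.0266], [0.0, 0.0, 1.0]]
CAPACITY = [2.0, 1.0, 0.0]  # hypothetical performance capacity of each state
DEMAND = 3.0                # user demand (Sect. 4.2)
T = 50                      # operation period in basic time intervals


def step(dist, p):
    """Propagate a state distribution over one temporal link: dist * p."""
    n = len(dist)
    return [sum(dist[i] * p[i][j] for i in range(n)) for j in range(n)]


def reliability_curve():
    """Return [R(0), ..., R(T)] for the hypothetical parallel pair."""
    d1 = [1.0, 0.0, 0.0]    # assumed initial distribution of component 1
    d2 = [1.0, 0.0, 0.0]    # assumed initial distribution of component 2
    curve = []
    for _ in range(T + 1):
        # Components are s-independent, so joint probabilities factorize.
        r = sum(d1[i] * d2[j]
                for i in range(3) for j in range(3)
                if CAPACITY[i] + CAPACITY[j] >= DEMAND)
        curve.append(r)
        d1, d2 = step(d1, P1), step(d2, P2)
    return curve


R = reliability_curve()
print([round(r, 4) for r in R[:5]])
```

Because the components can only degrade, the resulting curve is non-increasing in \( t \); inserting evidence would require the inference step of Eq. (14) instead of this forward propagation.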

Fig. 13. Original and updated system reliabilities of the four systems

Fig. 14. Original and updated system reliabilities of the phased-mission system

When one or more inspections are conducted chronologically, the system reliability of an MSS will be updated dynamically. The system-level or multi-level observation data of the five systems collected at two different time instants, i.e., \( t_{1} = 8 \) units of time and \( t_{2} = 20 \) units of time, are listed in Table 8. As a result, for each of the five systems, system reliability can be updated dynamically at the two inspection time instants. The updated system reliabilities of the series, parallel, series-parallel, and bridge systems are shown in Fig. 13. The updated system reliabilities of the phased-mission system are shown in Fig. 14.

Table 8. Multi-level observation data

5 Conclusions and Discussions

In this chapter, a holistic framework for MSS reliability modeling and assessment based on BNs and DBNs is presented. The basic characteristics of MSSs and BNs are presented. The detailed procedures of constructing the BN and DBN models of various MSSs are provided. The results show that BNs and DBNs can effectively represent and characterize dependencies among components in MSSs. A reliability assessment approach by aggregating multi-level observation data is developed, which can update the system reliability dynamically once an additional inspection is conducted. The reliability modeling and assessment results of five typical MSSs show that BNs and DBNs are effective for modeling and assessing the reliability of MSSs.

A crucial premise in this chapter is that the degradation process of each component in an MSS is characterized by a homogeneous Markov process. Nevertheless, in real-world situations, the degradation process of a component may follow a non-homogeneous Markov process or semi-Markov process. Under such a circumstance, one can calculate the transition probability matrix of a non-homogeneous Markov process or semi-Markov process between any two time instants [12, 13, 80]. By setting the transition probability matrix as the corresponding CPT between two time slices, the proposed DBN models can be further extended to the non-homogeneous Markov or semi-Markov case. Additionally, the CPTs of subsystem and system nodes in this chapter are all assumed to be deterministic. It is noted that probabilistic CPTs of subsystem and system nodes correspond to a generalized BN model which can reflect imperfect knowledge of system behaviors [28, 81].