1 Introduction

Probabilistic safety assessment (PSA) is performed by regulatory bodies to check whether the designs of nuclear power plants comply with regulatory requirements, and by industry to identify key vulnerabilities [1,2,3,4]. Traditional PSA methods, i.e., fault trees (FTs) and event trees (ETs), have critical limitations in practice. Their results may differ widely from the real values because of imprecise descriptions of component aging and maintenance, binary modeling of component behavior (only faulty and safe states are considered), and neglect of the system dynamics (i.e., the effect of the order and timing of failure events on the accident progression) [2, 5, 6]. In the PSA of some advanced nuclear systems, such as the China Lead-based Research Reactor (CLEAR) [7], the fusion-driven subcritical system [8], and the International Thermonuclear Experimental Reactor (ITER) [9], high accuracy is required and such factors cannot be neglected.

Monte Carlo (MC) simulation is a popular dynamic method developed to overcome these limitations [5, 10,11,12], but it is inefficient at simulating rare events in a complex system [13]. Although sampling algorithms such as antithetic variable sampling [14], dagger sampling [15], and stratified sampling [16] can be applied to MC simulation to reduce the estimation error and the computational cost, the results are still unsatisfactory.

Biasing techniques can address these problems effectively by forcing the events of interest to occur more frequently in the MC simulation. Some biasing techniques, such as importance sampling [17,18,19,20] and multi-canonical Monte Carlo (MMC) [21, 22], can reduce the estimation error and the computational cost.

To improve the simulation efficiency, in this paper we propose a new biasing method, the biasing transition rate method. Its idea is to bias the transition rates of components by adding virtual components to them in series, which increases the probability that the rare event occurs. Its performance is compared with that of the direct MC method alone.

2 The method

2.1 Assumptions

The biasing transition rate method is based on the following assumptions:

  1. The system consists of l components, 1, …, l;
  2. The components’ failure time and repair time follow an exponential distribution or a Weibull distribution, or linear aging based on these two distributions can be considered;
  3. Every system state is possible at any time;
  4. The system is s-coherent, and each of its working components is monotonously beneficial to the system; and
  5. The unexpected event is a rare event, and \( 0 < P_{\text{u}} < 1/2 \) is the probability of the unexpected event.

2.2 Biasing transition rate method

Let us consider a simple case in which a component transition occurs with a low transition rate λ. The idea of the biasing transition rate method is to add a virtual component with transition rate nλ in series with the real component, so that the transition rate of the integration of the real and virtual components increases to (n + 1)λ and, with Monte Carlo sampling, the transition is more easily achieved within the mission time T m in each trial. The sequence branches into two at the transition time T t of the integration. In the first branch, the real component changes its state (e.g., from safe to faulty), and the probability (weight) of this sequence equals the product of the contribution rate of the real component, 1/(n + 1), and the probability of the mother sequence, P m. In the second branch, the original state is kept, and the probability is the product of the contribution rate of the virtual component, n/(n + 1), and the probability of the mother sequence. The sequences are simulated continuously until the absorbing state or T m is reached. The transition of the integration thus occurs with a much greater probability in each trial. Figure 1 shows a schematic of this method for the simulation of a component. The red dashed lines denote the sequences contributed by the real component, and the probability of each sequence is shown on the right.

Fig. 1 (Color online) Schematics of biasing transition rate simulation for a component

The flowchart of this method is shown in Fig. 2. In a multi-component system simulation, the probability that the rare event occurs can be significantly increased by biasing the transition rates of some selected components. The probability (weight) of the jth sequence generated in the ith trial at mission time T m can be expressed as

$$ p_{ij} = \left\{ {\begin{array}{*{20}l} {\prod\nolimits_{s = 1}^{{q_{ij} }} \theta \left( s \right), } \hfill & {q_{ij} \ge 1} \hfill \\ {1,} \hfill & {q_{ij} = 0} \hfill \\ \end{array} } \right., $$
(1)

where \( q_{ij} \) is the number of branch points in the jth sequence generated in the ith trial (a sequence without branch points keeps the initial probability of 1), and

$$ \theta \left( s \right) = \left\{ {\begin{array}{*{20}l} {\frac{1}{{\left( {n + 1} \right)}},} & {{\text{the}}\;{\text{s-th}}\;{\text{transition}}\;{\text{caused}}\;{\text{by}}\; {\text{the}}\;{\text{real}}\;{\text{component}}} \\ {\frac{n}{{\left( {n + 1} \right)}}, } & {\text{otherwise}} \\ \end{array} } \right., $$
(2)

where n is the biasing factor, a nonnegative number. A greater n creates more branches and requires a longer computation time. Given this increase in computational cost, it is suggested that the sum of the failure probabilities of all the biased components, evaluated with the biased rates over the mission time, lies below 0.1.
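The branching rule of Eqs. (1) and (2) can be illustrated for a single non-repairable component with a constant rate. The following Python sketch is not the authors' implementation; the function name `biased_trial`, the random seed, and the choice of a single component are assumptions made for illustration. The transition time of the integration is sampled with rate (n + 1)λ; at each branch point a failed (absorbing) sequence with weight 1/(n + 1) times the mother weight is recorded, and the surviving branch continues with weight n/(n + 1) times the mother weight.

```python
import math
import random


def biased_trial(lam, n, t_mission, rng):
    """One biased trial for a single non-repairable component (illustrative).

    Returns a list of (weight, failed) pairs, one per sequence that exists
    at the mission time. The weights of one trial sum to 1.
    """
    sequences = []
    weight = 1.0                      # probability of the mother sequence
    t = 0.0
    biased_rate = (n + 1) * lam       # rate of the real + virtual integration
    while weight > 0.0:
        t += rng.expovariate(biased_rate)
        if t >= t_mission:
            # No further transition before the mission time: surviving sequence.
            sequences.append((weight, False))
            break
        # Branch point: the real component fails with contribution 1/(n + 1) ...
        sequences.append((weight / (n + 1), True))
        # ... otherwise the virtual component "fails" and the state is kept.
        weight *= n / (n + 1)
    return sequences


if __name__ == "__main__":
    rng = random.Random(1)            # arbitrary seed, for reproducibility only
    lam, n, t_mission, n_trials = 1e-4, 9, 1000.0, 20000
    total = 0.0
    for _ in range(n_trials):
        total += sum(w for w, failed in biased_trial(lam, n, t_mission, rng) if failed)
    print("biased estimate :", total / n_trials)
    print("analytic value  :", 1.0 - math.exp(-lam * t_mission))
```

In this single-component setting the weighted failure fraction should agree with 1 − exp(−λ T m) up to statistical noise, because the number of branch points is Poisson distributed with mean (n + 1)λ T m.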

Fig. 2 Flowchart of the biasing transition rate method

Four transition rates are considered in this paper:

  1. An exponential distribution for the component transition (\( \lambda = \lambda_{0} \), where \( \lambda_{0} \) is the design transition rate, a constant);
  2. A Weibull distribution for the component transition (\( \lambda = \alpha \lambda_{0} t^{\alpha - 1} \));
  3. A linear aging model based on an exponential distribution (\( \lambda = \lambda_{0} + kt \), where k is the aging factor); and
  4. A linear aging model based on a Weibull distribution (\( \lambda = \alpha \lambda_{0} t^{\alpha - 1} + kt \), where \( \alpha \) is the parameter factor in the Weibull distribution).
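Transition times for the four unbiased rate models above could be drawn as in the following Python sketch. The paper does not state its sampling scheme, so the inversion approach and all function names here are assumptions. A transition time t is obtained by solving H(t) = E, where H is the integral of λ from 0 to t and E is a unit-exponential random number. Models 1 to 3 have closed-form solutions; for model 4 a simple bisection is used.

```python
import math
import random


def exponential_time(e, lam0):
    """Model 1: H(t) = lam0 * t."""
    return e / lam0


def weibull_time(e, lam0, alpha):
    """Model 2: H(t) = lam0 * t**alpha."""
    return (e / lam0) ** (1.0 / alpha)


def linear_aging_exponential_time(e, lam0, k):
    """Model 3: H(t) = lam0 * t + k * t**2 / 2 (numerically stable root)."""
    return 2.0 * e / (lam0 + math.sqrt(lam0 * lam0 + 2.0 * k * e))


def linear_aging_weibull_time(e, lam0, alpha, k, tol=1e-9):
    """Model 4: H(t) = lam0 * t**alpha + k * t**2 / 2, solved by bisection."""
    def cumulative_hazard(t):
        return lam0 * t ** alpha + 0.5 * k * t * t

    lo, hi = 0.0, 1.0
    while cumulative_hazard(hi) < e:   # expand the bracket until it contains the root
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if cumulative_hazard(mid) < e:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    rng = random.Random(7)             # arbitrary seed
    e = rng.expovariate(1.0)           # unit-exponential random number
    print(exponential_time(e, 1e-4))
    print(weibull_time(e, 1e-4, 1.1))
    print(linear_aging_exponential_time(e, 1e-4, 1e-8))
    print(linear_aging_weibull_time(e, 1e-4, 1.1, 1e-8))
```

The parameter values in the usage lines are illustrative only.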

As an example, assuming that the transition time of a component follows an exponential distribution with failure rate λ, its probability of failure before time t can be expressed as

$$ F = 1 - e^{ - \lambda t} . $$
(3)

Assuming that the failure rate of the added virtual component is nλ and taking the two components as an integration, the probability of the integration failing before time t is equal to

$$ F = 1 - e^{{ - \left( {n + 1} \right)\lambda t}} . $$
(4)

The mean time to transition is reduced from 1/λ to 1/[(n + 1)λ].
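As a worked illustration of Eqs. (3) and (4), take λ = 10^{-4}, n = 9, and t = 1000 h, which are values used in the benchmark of Sect. 3; the unit of λ is assumed here to be h^{-1}, which the text does not state explicitly. The unbiased and biased probabilities of a transition before t are then

$$ 1 - e^{ - \lambda t} = 1 - e^{ - 0.1} \approx 0.095,\quad 1 - e^{ - \left( {n + 1} \right)\lambda t} = 1 - e^{ - 1} \approx 0.632 . $$

Under the same assumption, the mean time to transition drops from 1/λ = 10^{4} h to 1/[(n + 1)λ] = 10^{3} h, so most trials contain at least one branch point within the mission time.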

The biasing parameters for the four transition rates are given in Table 1.

Table 1 Parameter settings for the biasing transition rate method

To demonstrate the advantages of this method, the variance of the estimator is analyzed theoretically in Sect. 2.3. Performance measures of the method, such as the root mean square deviation (RMSD), the efficiency in collecting evidence of system failure, and the figure of merit, are evaluated for benchmark cases in Sect. 3.

2.3 Estimator based on the biasing transition rate method

Let z ij be the jth sequence generated in the ith trial of the biasing transition rate simulation. The failure probability of the system Q can be estimated by the estimator \( Q_{\text{b}} \) based on the biasing transition rate method,

$$ Q_{\text{b}} = \frac{1}{N}\mathop \sum \limits_{i = 1}^{N} \mathop \sum \limits_{j = 1}^{{M_{i} }} \psi \left( {z_{ij} } \right), $$
(5)

where N is the number of MC cycles; \( M_{i} \) is the number of sequences generated in the ith trial; and ψ is a discrete function expressing the unexpected event, which is given by Eq. (6) for the biasing transition rate method,

$$ \psi \left( {z_{ij} } \right) = \left\{ {\begin{array}{*{20}l} {p_{ij} , } \hfill & {\text{if the unexpected event occurs}} \hfill \\ {0,} \hfill & {\text{otherwise}} \hfill \\ \end{array} } \right.. $$
(6)
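Equations (5) and (6) amount to summing, within each trial, the weights of the sequences in which the unexpected event occurs, and averaging over the trials. A minimal Python sketch follows; the data layout (a list of trials, each a list of `(weight, occurred)` pairs) and the function name `estimate_qb` are assumptions, and the sample trials are hypothetical values chosen only to show the arithmetic.

```python
def estimate_qb(trials):
    """Estimate Q_b from recorded trials.

    trials: list of trials; each trial is a list of (p_ij, occurred) pairs,
    where p_ij is the sequence weight and occurred tells whether the
    unexpected event happened in that sequence.
    """
    if not trials:
        raise ValueError("at least one trial is required")
    total = 0.0
    for sequences in trials:
        # psi(z_ij) = p_ij if the unexpected event occurs, 0 otherwise
        total += sum(p for p, occurred in sequences if occurred)
    return total / len(trials)


if __name__ == "__main__":
    # Hypothetical trials for n = 9 (weights 1/10 and 9/10 at each branch point).
    trials = [
        [(0.1, True), (0.9, False)],
        [(1.0, False)],
        [(0.1, True), (0.09, True), (0.81, False)],
        [(1.0, False)],
    ]
    print(estimate_qb(trials))
```

With the direct MC method each trial would hold a single sequence of weight 1, so the same function reduces to the estimator that counts the fraction of trials in which the unexpected event occurs.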

Because the trials are independent, the variance of \( Q_{\text{b}} \) can be expressed by

$$ {\text{Var}}\left\{ {Q_{\text{b}} } \right\} = {\text{Var}}\left( {\frac{1}{N}\mathop \sum \limits_{i = 1}^{N} \mathop \sum \limits_{j = 1}^{{M_{i} }} \psi \left( {z_{ij} } \right)} \right) = \frac{1}{{N^{2} }}\left( {\mathop \sum \limits_{i = 1}^{N} {\text{Var}}\left( {\mathop \sum \limits_{j = 1}^{{M_{i} }} \psi \left( {z_{ij} } \right)} \right)} \right). $$
(7)

The failure probability of the system Q can be estimated by the direct MC estimator \( Q_{\text{d}} \),

$$ Q_{\text{d}} = \frac{1}{N}\mathop \sum \limits_{i = 1}^{N} \psi \left( {c_{i} } \right), $$
(8)

where \( c_{i} \) denotes the sequence in the ith trial. The variance of the estimator, \( {\text{Var}}\left\{ {Q_{\text{d}} } \right\} \), is

$$ {\text{Var}}\left\{ {Q_{\text{d}} } \right\} = {\text{Var}}\left( {\frac{1}{N}\mathop \sum \limits_{i = 1}^{N} \psi \left( {c_{i} } \right)} \right) = \frac{1}{{N^{2} }}\left( {\mathop \sum \limits_{i = 1}^{N} {\text{Var}}\left( {\psi \left( {c_{i} } \right)} \right)} \right). $$
(9)

For the direct MC method, the discrete function expressing the unexpected event can be expressed as

$$ \psi \left( {c_{i} } \right) = \left\{ {\begin{array}{*{20}l} {1,} \hfill & {\text{if the unexpected event occurs}} \hfill \\ {0,} \hfill & {\text{otherwise}} \hfill \\ \end{array} } \right.. $$
(10)

Since E(\( \sum\nolimits_{j = 1}^{{M_{i} }} \psi \left( {z_{ij} } \right) \)) = E(ψ(c i )), and 0 < P u < 1/2, we have

$$ {\text{Var}}\left( {\mathop \sum \limits_{j = 1}^{{M_{i} }} \psi \left( {z_{ij} } \right)} \right) \le {\text{Var}}\left( {\psi \left( {c_{i} } \right)} \right), $$
(11)

and

$$ {\text{Var}}\left\{ {Q_{\text{b}} } \right\} = \frac{1}{{N^{2} }}\left( {\mathop \sum \limits_{i = 1}^{N} {\text{Var}}\left( {\mathop \sum \limits_{j = 1}^{{M_{i} }} \psi \left( {z_{ij} } \right)} \right)} \right) \le \frac{1}{{N^{2} }}\left( {\mathop \sum \limits_{i = 1}^{N} {\text{Var}}(\psi \left( {c_{i} } \right))} \right) = {\text{Var}}\left\{ {Q_{\text{d}} } \right\}. $$
(12)

This proves that the biasing transition rate method can decrease the variance of the MC estimator.
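The step from the equality of the expectations to Eq. (11) can be sketched as follows. The sketch assumes that the weights of the sequences generated in one trial sum to at most 1, as the branching rule of Sect. 2.2 suggests, so that the per-trial sum lies in [0, 1]; the symbols \( X_{i} \) and \( \mu \) are introduced here for illustration only.

$$ X_{i} = \sum\limits_{j = 1}^{{M_{i} }} \psi \left( {z_{ij} } \right),\quad \mu = E\left( {X_{i} } \right) = E\left( {\psi \left( {c_{i} } \right)} \right), $$

$$ {\text{Var}}\left( {X_{i} } \right) = E\left( {X_{i}^{2} } \right) - \mu^{2} \le E\left( {X_{i} } \right) - \mu^{2} = \mu \left( {1 - \mu } \right) = {\text{Var}}\left( {\psi \left( {c_{i} } \right)} \right). $$

The inequality uses \( X_{i}^{2} \le X_{i} \) for \( 0 \le X_{i} \le 1 \), and the last equality holds because \( \psi \left( {c_{i} } \right) \) takes only the values 0 and 1.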

3 Benchmark cases

3.1 Description of the system

To illustrate the performance of the method, a system consisting of three components, as shown in Fig. 3, is used to benchmark the biasing transition rate method. The following four cases are considered:

  1. An exponential distribution for the component life;
  2. A Weibull distribution for the component life;
  3. Linear aging based on an exponential distribution; and
  4. An exponential distribution for the component life and for the component repair time.

Fig. 3 Diagram of the benchmark system

The mission time is T m = 1000 h. For all the cases, the design failure rates are \( \lambda_{1} = \lambda_{2} = 10^{-4} \) and \( \lambda_{3} = 10^{-5} \); the biasing factor is n = 9; and the number of MC cycles is N = 1000. For Case 2, the parameter factors are \( \alpha_{1} = \alpha_{2} = \alpha_{3} = 1.1 \). For Case 3, the aging factors are \( k_{1} = k_{2} = 10^{-8} \) and \( k_{3} = 10^{-9} \). For Case 4, the design repair rates are \( u_{1} = u_{2} = 10^{-3} \) and \( u_{3} = 0 \).

3.2 Results and discussion

Figure 4 shows the results of the biasing transition rate simulation and the direct MC simulation, compared with the reference results (i.e., the results of the minimal cut set method and the result of a direct MC simulation with a huge sample size). The results of the two kinds of models are almost identical, which shows that the biasing transition rate method is effective at estimating reliability in the four cases. Figure 4 also shows that the estimates provided by the biasing transition rate method are smoother than those provided by the direct MC simulation, because the former collects more evidence of system failure than the latter in the same number of MC cycles.

Fig. 4 (Color online) Cumulative probabilities for system failure at T m = 1000 h, N = 1000, and different distributions of the component failure time, repair time, and linear aging

Table 2 lists the RMSDs and the efficiency of collecting evidence of failure events for the two methods applied to the four cases. The RMSD measures the error between the estimated and reference values. To compute the RMSD for the two methods, the results of the minimal cut set models (Cases 1–3) and the result of a direct MC model with a huge number of MC cycles (Case 4) are taken as the reference values. The RMSD of the biasing transition rate model is much smaller than that of the direct MC model, i.e., its estimates are closer to the reference values. This is consistent with the analysis in Sect. 2.3. The biasing transition rate method is much more efficient than the direct MC method at collecting evidence of system failure, a rare event, because the probability of a transition of the integrated real and virtual components increases greatly, and so does the probability that system failure occurs in each trial of the biasing transition rate simulation. For example, in Case 1, the RMSD values of the direct MC model and the biasing transition rate model are 0.00647 and 0.00034, respectively, and the expected times to ‘system failure’ are 0.2031 and 0.0064, respectively.

Table 2 RMSD and efficiency of collecting evidence of failure events for the direct MC and biasing transition models

To quantify the efficiency of both methods in these cases, the figure of merit (FOM), a quantity to characterize the performance of a method, is introduced [23]:

$$ {\text{FOM}} = 1/\left( {\sigma^{2} T} \right), $$
(13)

where T is the computational cost and \( \sigma^{2} \) is the average of the squared deviation between the MC and the analytical values. The greater the FOM value, the better the method. The figures of merit of both methods for the four cases are listed in Table 3. It can be seen that the performance is improved by the biasing transition rate method.
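The RMSD and the FOM of Eq. (13) can be computed as in the following Python sketch. The function names and the sample numbers are hypothetical and are not taken from Tables 2 and 3; the sketch also assumes that \( \sigma^{2} \) is evaluated over the same points as the RMSD, so that \( \sigma^{2} \) equals the squared RMSD.

```python
import math


def rmsd(estimates, references):
    """Root mean square deviation between estimated and reference values."""
    if not estimates or len(estimates) != len(references):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(e - r) ** 2 for e, r in zip(estimates, references)]
    return math.sqrt(sum(squared) / len(squared))


def figure_of_merit(sigma2, cost):
    """FOM = 1 / (sigma^2 * T), with T the computational cost."""
    return 1.0 / (sigma2 * cost)


if __name__ == "__main__":
    # Hypothetical cumulative failure probabilities at three time points.
    reference = [0.010, 0.020, 0.030]
    estimate = [0.011, 0.019, 0.032]
    deviation = rmsd(estimate, reference)
    cost_seconds = 2.0                 # hypothetical computational cost
    print("RMSD:", deviation)
    print("FOM :", figure_of_merit(deviation ** 2, cost_seconds))
```

Because the FOM divides by both the squared deviation and the cost, a method that needs more time per trial can still score higher if it reduces the deviation by a larger factor.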

Table 3 FOMs (\( \times 10^{4} \)) for MC and BTR methods

4 Conclusion

A biasing transition rate method for the safety assessment of a complex system by MC simulation is proposed. The estimator of this method is given, and its variance is shown to be no greater than that of the direct MC estimator. Four cases are used to benchmark the method. It is effective for modeling system failure and is more efficient than the direct MC method at collecting evidence of rare events in the same number of MC cycles. The method may be applied to the rare event simulation of a complex system to save computational cost. Its performance advantage may be more prominent when an MC simulation is applied with a deterministic code in the safety assessment of a nuclear system. The performance may be further improved by coupling this method with another efficient method, the Monte Carlo dynamic event tree (MCDET).