
1 Introduction

Stability assessment has long been recognized as a fundamental requirement in power system planning, design, operation and control. Transient stability can be defined as the ability of a given power system to remain at a certain equilibrium point under normal conditions and to reach a satisfactory equilibrium point after large disturbances such as faults, loss of generation, line switching, etc. [1]. Transient stability, therefore, constitutes a key aspect of modern power system reliability, a fact increasingly recognized in the modern power systems literature [2]. Indeed, some methods based on reliability theory are used in this chapter to perform an efficient assessment of power system stability. The traditional approach to transient stability analysis is deterministic, being based on a “worst case” approach. More specifically, quantitative transient stability assessment is generally performed by simulating a three-phase fault on specific system buses and assuming that the load demand attains its peak value over a prefixed time interval.

The application of probabilistic techniques for transient stability analysis was introduced in a series of articles by Billinton and Kuruganty [3–7], motivated by the random nature of:

  • the system steady-state operating conditions;

  • the time of fault occurrence;

  • the fault type and location;

  • the fault clearing phenomenon.

In fact, the steady-state operating conditions that heavily affect stability strongly depend on the load, which is a random process due to its intrinsic nature. This is especially evident in planning studies where the load level is the major source of uncertainty.

The time to clear the fault (fault clearing time, FCT), a crucial parameter in stability investigations, is also not known in advance, and so it should be regarded as a random variable (RV). The probabilistic approach has also been explored in other significant articles such as [8], based on Monte Carlo simulations, and [9], based on the “conditional probability” approach. An exhaustive account of the topic and the relevant bibliography can be found in the book by Anders [10, Chap. 12], which clearly states that: “stability analysis is basically a probabilistic rather than a deterministic problem”.

The analytical computation of the probability distributions of the intermediate RVs is one of the most challenging aspects due to the complexity of the mathematical models, as also pointed out by Anders [10, p. 577].

In a few articles [11–15], some theoretical results from probability theory and statistics have been utilized to develop an analytical approach to the transient stability evaluation of electrical power systems, based on critical consideration of the underlying probability distributions. The instability probability (IP) over a certain period of time, with regard to a given fault, is defined and calculated by means of both the critical clearing time (CCT) distribution and the FCT distribution. This analytical approach, which overcomes the drawbacks of Monte Carlo simulations, is very useful in actual operation since it permits straightforward sensitivity analysis of the IP with regard to system parameters, thus highlighting those which most influence the system stability characteristics and providing a quantitative tool for performing proper preventive control actions. For similar reasons, the proposed analytical probabilistic approach is also a powerful tool for the practical estimation of the basic parameters relevant to transient stability assessment, such as the IP or other measures of the “stability margin” (SM). This topic, generally neglected or dealt with in approximate “sensitivity analyses” in the literature, was faced in [16], where an effort was made to tailor a simple, analytical, transient stability probability estimator which achieves the required characteristics of both efficiency and robustness, in the framework of classical estimation.

In this chapter, a new Bayesian approach is proposed to provide this estimation: such an approach appears to be the most suitable one for on-line transient stability assessment. The numerical applications performed confirm that the estimation technique is able to adequately “track” the transient stability in time, being far more efficient than the classical maximum likelihood (ML) estimation of the IP. This could be an interesting property in a modern liberalized market in which fast and large variations are expected to have a significant effect on transient stability probability.

In the final part of the chapter, some results on the robustness of the estimation procedure are also briefly discussed in order to illustrate that the efficiency of the methodology holds irrespective of the basic probabilistic assumptions made with regard to the “a priori” distributions of the various system parameters.

In the four Appendices to the chapter:

  1. a mathematical study of the IP versus the system parameters is illustrated in order to establish a proper “sensitivity analysis” of system stability which can be useful in the design stage;

  2. some basic properties of Bayesian estimation, relevant to the problem under study, are briefly mentioned, and some properties already derived by the authors in previous articles for the interval estimation of the IP are also included.

2 Probabilistic Modelling for Transient Stability Analyses

2.1 Definition and Evaluation of Transient IP

Many state variables of an electrical power system possess an intrinsically stochastic nature; consequently, a probabilistic description of transient stability enables interesting deductions to be drawn, also in terms of control actions for improving system robustness. For instance, the steady-state conditions, fault conditions and circuit breaker clearing times are not precisely known or predictable. The various uncertainties involved should be properly taken into account using suitable probabilistic models. In a probabilistic frame, both the network configuration and faults are described as random quantities. According to this approach, all faults potentially causing instability and all the possible network states at the instant of fault (e.g. in terms of the requested loads at load buses) have to be considered, together with their probability of occurrence. Once the cost of any consequence brought about by the loss of stability is known, IP assessment allows the instability risk to be evaluated. A risk value provides a quantitative measure for undertaking adequate preventive control actions for stability improvement, avoiding the need for conservative or “worst-case” criteria like those adopted in classical deterministic analyses.

Hence, in order to effectively apply a probabilistic approach, a preliminary identification of the relevant statistical parameters has to be performed. This is a crucial step since the candidate parameters are potentially infinite in number: the choice can be made according to the required degree of accuracy.

Formally, let a proper probability space (Ω, S, P) be defined, where Ω is the sample space of all possible outcomes, S a sigma-algebra of events and P an additive probability measure over S. For the purpose of stability investigation, the sample space Ω may be defined as the product space Ω = Ω1 × Ω2, in which Ω1 is the set of all possible disturbances which can (potentially) affect system stability and Ω2 is the set of all possible “state vector” trajectories after the disturbance. This requires the definition of a proper “state vector” as a vector whose components are all the system variables on whose values stability assessment is performed (see also Eq. 2 below).

Let the random event I be the event of instability (over a given time horizon H of power system operation) and let (C 1,…, C m ) be a finite set of random events constituting all the credible—and mutually exclusive—disturbances (“contingencies”) which can affect the system operation in H and potentially make system stability worse. Then, the IP in the interval H is provided, according to the total probability theorem, by:

$$ P(I) = \sum\limits_{j = 1}^{m} {P(C_{j} )P(I|C_{j} )} $$
(1)

where P(C j ) = probability of occurrence of the disturbance C j in the time horizon under consideration; P(I|C j ) = probability of instability once the disturbance C j occurred.

The above relation may also still hold, at least as an approximation, when the disturbances C k are not mutually exclusive random events, provided that the joint probability of two (or more) disturbances is negligible (Footnote 1): this typically happens in the very short time (VST) operation on which the second part of this chapter focusses.

In the following, IP strictly denotes a term like P(I|C) where C is the given fault.

As long as the fault statistics are known from available data of the system under consideration, the P(C j ) terms can be considered known terms; the P(I|C j ) terms are evaluated as shown in the following so that P(I) is readily obtained from the above relation.

A quite similar reasoning still applies if the fault location is modelled by an RV.

Basic RVs are load demand and the FCT. Aiming at the description of the system stability characteristics as a consequence of a given fault (for instance, a three-phase short-circuit), IP may be expressed as a function of these RVs.

For an assigned electrical power system, characterized by a state vector x 0 at time t 0 in which the fault is supposed to occur, let us denote the stability region of the post-fault equilibrium point with S. Naturally, x 0 is an RV (more precisely, a random vector), mainly due to the random nature of the load demand which, as previously mentioned, has a significant effect on the operation state. Let τ be the FCT: denoting by x(t, τ; x 0) the state vector trajectory at the time t after the fault clearing, the CCT for transient stability can be defined as follows:

$$ T_{\text{cr}} ({\mathbf{x}}_{0} ) = \sup \left\{ {\tau > 0:{\mathbf{x}}(t,\tau ,{\mathbf{x}}_{0} ) \in S,\forall t > t_{0} + \tau } \right\} $$
(2)

This relation clearly shows the dependence of the critical time on the random initial state x 0; thus T cr is also an RV. As discussed in [11–13], the FCT should also be regarded as an RV, which will be denoted by T cl.

By keeping in mind that the system maintains its stability conditions if and only if the FCT is smaller than the CCT, i.e. T cl < T cr, the IP for a given fault can be expressed as:

$$ q = P\left( {T_{\text{cl}} > T_{\text{cr}} } \right) $$
(3)

Formally, the model in the above relation is quite similar to the “Stress–Strength” model in reliability theory: indeed, if the failure of a certain device or system is caused by the occurrence of a “stress” T cl greater than the “strength” T cr of the device or system, then the above probability q represents the unreliability (failure probability) of the device or system.

Once the probability distributions of T cl and T cr are known, q = P(T cl > T cr) may be easily computed, as is well known in probability theory, as shown below. In fact, by describing both T cl and T cr as continuous non-negative RVs, with joint probability density function (pdf) f(t cr, t cl), the IP q is expressed as:

$$ q = P(T_{\text{cr}}< T_{\text{cl}} ) = \int\limits_{{t_{\text{cl}} = 0}}^{{t_{\text{cl}} = \infty }} {{\text{d}}t_{\text{cl}} } \int\limits_{{t_{\text{cr}} = 0}}^{{t_{\text{cr}} = t_{\text{cl}} }} {f(t_{\text{cl}} ,t_{\text{cr}} )} {\text{d}}t_{\text{cr}} $$
(4a)

In practice, the two RVs are always considered in the literature as being statistically independent of each other since they are related to independent phenomena (as discussed in Sect. 2.3): under this assumption, let f cl(t) and f cr(t) be the marginal pdf of the RVs T cl and T cr, respectively, and let F cl(t) and F cr(t) be the corresponding cumulative distribution functions (cdf). The above expression may then be rewritten as follows:

$$ q = \int\limits_{0}^{\infty } {f_{\text{cr}} (t)(1 - F_{\text{cl}} (t)){\text{d}}t} $$
(4b)

Alternatively, by conditioning the instability event on the values of the clearing time T cl, q can also be equivalently expressed as:

$$ q = \int\limits_{0}^{\infty } {f_{\text{cl}} (t)F_{\text{cr}} (t){\text{d}}t} $$
(4c)

In order to evaluate the probability q, the following preliminary steps have to be taken:

  • load demand and clearing time randomness have to be properly characterized in terms of distributions on the basis of realistic assumptions, by also taking the available data into account;

  • the CCT distribution has to be evaluated in terms of the load distribution since there is a conceptual and analytical relationship between them (this aspect will be adequately discussed in the following sections);

  • finally, the evaluation of the above integral has to be performed: often, this integration requires numerical computation, but the analytical approach of this method allows it to be evaluated in closed form (a numerical sketch of the general integral is given below).
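For concreteness, the integral in Eq. 4b can be evaluated numerically in a few lines; the following Python sketch assumes two hypothetical, independent Log-Normal laws for the CCT and the FCT (parameter values chosen only for illustration, not taken from the chapter’s applications):

```python
import numpy as np
from scipy import integrate
from scipy.stats import lognorm

# Hypothetical marginal laws of the CCT (T_cr) and FCT (T_cl); scipy's
# lognorm(s=beta, scale=exp(alpha)) matches the (alpha, beta) notation of Sect. 3.
f_cr = lognorm(s=0.10, scale=np.exp(-1.94))   # CCT
f_cl = lognorm(s=0.10, scale=np.exp(-2.31))   # FCT

# Eq. 4b: q = integral over (0, inf) of f_cr(t) * (1 - F_cl(t)) dt
q, _ = integrate.quad(lambda t: f_cr.pdf(t) * f_cl.sf(t), 0, np.inf)
print(f"q = {q:.5f}")
```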

This procedure is straightforward for the single-machine case (the so-called “one-machine infinite bus” system), as discussed in [11], since an analytical relation between T cr and the load demand, based on the well-known equal area criterion, can be derived. Moreover, in [12], the procedure was extended to a multi-machine system by resorting to the so-called “Extended Equal Area Criterion” [17]. This procedure allows the difficulties arising in the evaluation of the T cr distribution to be overcome since T cr can be analytically expressed as a function of the load demand.

2.2 Probabilistic Modelling of the CCT

In this section, the functional expression between the CCT and the load demand L at the instant of contingency is discussed. Due to the random nature of the load demand evolution and the unpredictability of the instant of fault, the load active power L has to be correctly regarded as an RV. This implies that T cr is also an RV and its probability distribution may be calculated in terms of the load distribution which is generally estimated by load forecasting. In a quite general way, the load demand over time can be efficiently described through a continuous random process, L(t).

With reference to a generic time instant t, the load probability cumulative distribution function is denoted by F L (l; t) and is defined over the non-negative real numbers as F L (l; t) = P(L(t) ≤ l), l ≥ 0.

On the basis of the central limit theorem, it is generally assumed that L(t) can be described by a Gaussian random process [18]. The functional dependence between T cr and L can be described in a compact way as T cr = g(L) where g(·) is a continuous non-negative function over the positive real axis. Moreover, it can be proved that g(·) is a decreasing function of the active power L. Since T cr is a function of the RV L and the function g(·) is continuous and non-negative, the CCT is also represented by a continuous non-negative RV. Once the distribution function of L is known, in principle, the distribution function of T cr can be calculated by means of well-known theorems with regard to RV transformations applied to the analytical relation. Nevertheless, since the function g(·) is not analytically invertible, a closed-form expression for the probability distribution function of the variable T cr cannot be obtained. In these cases, the problem of the distribution evaluation is often solved using a stochastic Monte Carlo simulation.

However, in [11], an approximate method for the analytical calculation of the probability distribution function of T cr has been presented. The first step for the analytical evaluation is the approximation of the true characteristic g(L) with a simpler, invertible, analytical function. In particular, a log-linear model has proved to be very adequate when expressing the above characteristic for any given set of electrical parameters:

$$ T_{\text{cr}} = \beta_{0} \exp ( - \beta_{1} L) $$
(5a)

or:

$$\ln \left({T_{\text{cr}}}\right)= a - b \,L\,(a = \ln (\beta_{0})\,;\,b = \beta_{1} ) $$
(5b)

The model coefficients (β0, β1) are positive constants (so that a is real and b positive), depending on the electrical parameters of the system. They can be efficiently determined by performing a linear regression of the natural logarithm of T cr with regard to the load; i.e. according to the least-squares method, a and b are chosen as the values minimizing the sum of the squared deviations:

$$ S^{2} (a,b) = \sum\limits_{i = 1}^{n} {(\ln T_{{{\text{cr}},i}} - a + bL_{i} )^{2} } $$
(6)

the points (T cr,i , L i ; i = 1,…, n) being chosen with a proper step in the interval (L 1, L n ) in which the load values will most probably occur. For instance, a μ ± 4σ interval may be chosen for load values generated by a Gaussian distribution with mean μ and standard deviation σ.
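A minimal sketch of this regression step, using synthetic (L i , T cr,i ) pairs in place of the stability-simulation results that would be used in practice (all numbers are assumptions for illustration):

```python
import numpy as np

# Synthetic (L_i, T_cr,i) pairs standing in for stability-simulation results,
# sampled over a mu +/- 4*sigma load grid (load in p.u., times in seconds).
L = np.linspace(0.8, 1.2, 9)
T_cr = 0.30 * np.exp(-0.9 * L) * (1.0 + 0.01 * np.sin(7.0 * L))

# Least-squares fit of ln(T_cr) = a - b*L (Eqs. 5b and 6)
slope, a = np.polyfit(L, np.log(T_cr), 1)
b = -slope
print(f"a = {a:.4f}, b = {b:.4f}  (beta0 = {np.exp(a):.4f}, beta1 = {b:.4f})")
```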

On the basis of the model (see Sect. 2.1), the evaluation of the cdf of T cr in terms of the probability distribution function of the load L is straightforward and, for non-negative values of the CCT, it is expressed by:

$$ F_{\text{cr}} (t_{\text{cr}} ) = 1 - F_{L} \left[ {{\frac{{a - \ln (t_{\text{cr}} )}}{b}}} \right],\quad t_{\text{cr}} \ge 0 $$
(7)

The above cdf is, of course, equal to zero for negative values of the argument. The above relation is quite general, i.e. independent of any particular assumption made about the load distribution. Moreover, the cdf of the load L(t)—i.e. the function F L (·) which appears on the right-hand side of the previous equation—depends, of course, on the time t (even if this is not explicitly expressed in the previous equation); therefore, the expression of the cdf of T cr also depends on time, and it is valid for any particular time instant t at which the fault occurs. The hypothesis of a Gaussian distribution, generally adopted to describe the load, implies a Log-Normal distribution for the CCT, and this model will be used as illustrated in the sequel. It should be stressed, however, that this distribution varies with time, although this fact may not be apparent at first sight. Indeed, time will not always be represented explicitly in the relevant equations, which often refer to given short time intervals in which the above RVs—the load, and thus the CCT—may be considered constant, together with the corresponding distributions. In the successive interval, however, this distribution is subject to changes. This should be quite clear in the framework of dynamic estimation.

2.3 Analytical Evaluation of IP: A General Methodology

As previously stated, the time interval needed for fault clearing (inclusive of the time for the fast reclosure of the faulted line) should also be regarded as an RV, here denoted by T cl. The arc extinction phenomenon, in fact, is intrinsically non-deterministic; the randomness of T cl may also be due to imperfect switching, which can depend on the wear conditions of the poles caused by previous faults. Moreover, the (random) environmental conditions (temperature, humidity) also influence the clearing time T cl. The RV T cl is assumed to be continuous, non-negative and independent of the time instant of fault occurrence since the above phenomena can be considered independent of those which cause the fault. According to the definition of the CCT, it is natural to define the probability of instability after a contingency occurring at a given time instant t as:

$$ q = P(T_{\text{cr}} < T_{\text{cl}} ) = P\left\{ {g[L(t)] < T_{\text{cl}} } \right\} $$
(8)

In (8), the relation T cr = g[L(t)] between the CCT and the load at (intended as “immediately before”) the instant t of the contingency is made explicit. In order to obtain the IP over a prefixed time horizon (0, h), the statistics of the random process of faults should be taken into account. This means that (8) must be integrated with the probability distribution of the number of faults in (0, h), which is indeed a random process. A Poisson stochastic process [10, 19] may generally be assumed for this purpose. However, a different and simpler approach is possible [11]. The stability event over (0, h) may be defined as the property whereby, in the whole interval, the stress T cl never exceeds the strength T cr, which depends on time t through the function g. Hence, the IP can be defined as the probability that T cr is exceeded by T cl for at least one time instant t in (0, h): this happens if and only if, as time t varies in (0, h), T cl exceeds the minimum value of T cr attainable in this interval.

Mathematically speaking, the IP, q, can be expressed as follows:

$$ q = P\left\{ {\inf g[L(t)] < T_{\text{cl}} ;0 < t < h} \right\} $$
(9)

Let the peak load value, Λ = sup (L(t); 0 < t < h), over the interval (0, h) be introduced: since L(t) is a continuous random process, Λ is a continuous RV whose probability distribution function is denoted by F Λ(λ). As previously mentioned, the function g[L(t)] is continuous and decreasing versus L. Therefore, the “minimum” CCT over (0, h), again denoted by T cr, can be expressed as follows:

$$ T_{\text{cr}} = \inf g[L(t)] = g[\sup L(t)] = g(\Uplambda ) $$
(10)

Hence, the IP is expressed by:

$$ P(T_{\text{cr}} < T_{\text{cl}} ) = P\left\{ {g(\Uplambda ) < T_{\text{cl}} } \right\} $$
(11)

Hence, by keeping in mind the expression in (7), T cr can be expressed in terms of the peak load Λ, so that its probability distribution function F cr(t cr) can be written as follows:

$$ F_{\text{cr}} (t_{\text{cr}} ) = 1 - F_{\Uplambda } \left[ {{\frac{{a - \ln (t_{\text{cr}} )}}{b}}} \right],\quad t_{\text{cr}} \ge 0 $$
(12)

where the constants a and b depend on the particular system, but are indeed constant with time (unless the system topology changes; this case is excluded here but can be dealt with using the same methodology, once it occurs, by simply computing the new values of a and b).

The problem can then be easily solved once the peak load and the clearing time distributions are known. The distribution function and the probability density function of the clearing time T cl are denoted by F(t) and f(t), respectively. Assuming, as is reasonable, that the variables T cl and T cr are statistically independent, the IP in Eq. 11 can then be calculated as in Eqs. 4b or 4c. This approach, taken from [13], expresses the IP over an arbitrarily large interval—once the pdf of the peak load is known—and is useful if a planning horizon is being studied. It was presented here for the sake of completeness: the application in this chapter is in fact devoted to on-line stability assessment for VST applications, related to time intervals typically lasting 1 h or less. In such intervals, the load L may be considered as a constant, albeit unknown (random), value, so that, in practice, it will be modelled through an RV instead of a stochastic process. This distinction, however, does not affect the methodology followed in the sequel: the Gaussian distribution which will be used here is widely employed for describing the load process, and it is also common practice to describe the peak load uncertainty by means of a Gaussian RV whose expected value is the forecasted peak load. For very large time horizons, the Extreme Value (EV) distribution is also a natural candidate for describing the peak load [20]: this model—as well as others—can be handled in practice with no particular problems, as shown in [13], using the same methodology illustrated here.

3 Analytical IP Evaluation for Gaussian Load and Log-Normal FCT

3.1 Analytical Expression of IP

In this section, the analytical expression of the statistical parameter q—the IP of the given system under a given fault occurrence—is discussed on the basis of reasonable assumptions for the distribution functions of the RVs L (and consequently T cr) and T cl. For the sake of notation simplicity, the RVs T cr and T cl will be, respectively, named T x and T y .

The load L is then assumed to be a N(μ L , σ L ) RV: then, letting l be a given possible load value, L is characterized by the following pdf over (−∞ < l < +∞):

$$ f_{L} (l) = {\frac{1}{{\sigma_{L} \sqrt {2\pi } }}}\exp \left[ {{\frac{{ - (l - \mu_{L} )^{2} }}{{2\sigma_{L}^{2} }}}} \right] $$

The interval of possible values (−∞ < l < +∞) is only theoretical, deriving from the Gaussian representation. In fact, the probability of negative values of L should, of course, be equal to zero; in practice, it is known that P(L < l) ≈ 0 if l < μ − 3σ.

The natural logarithm of T x is also normally distributed on the basis of the above-discussed relationship X = ln(T x ) = a − bL. Hence, the distribution of T x can be described by a Log-Normal distribution. The authors have shown [11–13], by means of extended numerical simulations and adequate statistical tests, that the proposed Log-Normal model for the distribution of the critical time, when the load is a Gaussian RV, is very adequate.

The Log-Normal pdf with parameters α (scale) and β (shape) is expressed by:

$$ f(t;\alpha ,\beta ) = {\frac{1}{{\sqrt {2\pi } t\beta }}}\;{\text{e}}^{{{\frac{{ - (\ln t - \alpha )^{2} }}{{2\beta^{2} }}}}} \quad t \ge 0 $$
(13)

and the density f(t) is zero for t < 0.

In expression (13) α and β represent the mean value and the standard deviation of the natural logarithm of the Log-Normal variable, respectively; the mean value μ and the standard deviation σ, corresponding to Eq. 13, are:

$$ \mu = {\text{e}}^{{\alpha + {\frac{{\beta^{2} }}{2}}}} ;\quad \sigma = \mu \sqrt {{\text{e}}^{{\beta^{2} }} - 1} $$
(14)

In this section, T x is thus assumed to follow a Log-Normal distribution, with parameters α x and β x .

From relationships (14), the parameters α x and β x can be obtained from the statistical parameters μ L , σ L and the regression coefficients a and b; denoting by X the natural logarithm of the CCT T x , they are expressed by:

$$ \begin{gathered} \alpha_{x} = E[X] = a - b\mu_{L} ; \hfill \\ \beta_{x} = \sqrt {{\text{Var}}[X]} = b\sigma_{L} \hfill \\ \end{gathered} $$
(15)
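A short numerical sketch of Eq. 15, with hypothetical regression coefficients (taken from the fit sketched above) and hypothetical load statistics:

```python
import numpy as np

def cct_lognormal_params(a, b, mu_L, sigma_L):
    """Eq. 15: Log-Normal parameters of the CCT implied by the regression
    ln(T_cr) = a - b*L and a Gaussian load L ~ N(mu_L, sigma_L)."""
    return a - b * mu_L, b * sigma_L

# Hypothetical values: a, b from the regression sketch above; load in p.u.
alpha_x, beta_x = cct_lognormal_params(a=-1.204, b=0.9, mu_L=1.0, sigma_L=0.05)
print(f"alpha_x = {alpha_x:.4f}, beta_x = {beta_x:.4f}")
```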

A proper probabilistic modelling for the clearing time T y must also be introduced.

If there is a lack of experimental data, a Gaussian distribution is often assumed. However, the Gaussian distribution does not appear to be a very adequate or flexible choice, and the Log-Normal model is used instead, as in [11–13]. The Log-Normal pdf is indeed very flexible, since it can assume a large variety of shapes with a positive “skewness index”, which allows for the typical long “right tail” [21], whereas the Gaussian model only allows a single shape for the distribution of the clearing time, i.e. a symmetrical (bell-shaped) distribution around the mean value, which is not likely to occur in real applications. The presence of a right tail in the Log-Normal density accounts for the possibility of relatively large clearing times compared with the expected value: thus, the Log-Normal assumption corresponds to a conservative approach which is appropriate when the exact distribution is unknown. The Log-Normal assumption for T y also permits a straightforward analytical calculation of the IP, without being restrictive, since other distributions may be adopted with the same methodology, as shown in [13], requiring only elementary numerical methods.

Furthermore, if the β coefficient of the Log-Normal pdf is small enough, the pdf tends to become symmetrical and may also satisfactorily approximate a Gaussian model.

In this section, the IP computation is performed under the previously discussed hypothesis that both the clearing time T y and the minimum CCT T x are described by Log-Normal, independent RVs. It is, therefore, assumed that T y has a Log-Normal distribution with parameters α y and β y , with density \( f_{{T_{y} }} (t_{y} ) \) expressed by (13). As a particular case, in VST applications the FCT may be considered as a known constant (as discussed in Sect. 4.3). With reference to the choice of α y and β y , they can be related to the values of \( \mu_{{T_{y} }} \) (mean value of T y ) and \( \sigma_{{T_{y} }} \) (standard deviation of T y ), on which some information may be available in practice.

Denoting by \( v_{y} = {\frac{{\sigma_{{T_{y} }} }}{{\mu_{{T_{y} }} }}} \) the coefficient of variation (CV) of T y , the relations specifying α y and β y as functions of \( \mu_{{T_{y} }} \) and \( \sigma_{{T_{y} }} \) are the following:

$$ \beta_{y} = \sqrt {\ln (1 + v_{y}^{2} )} ,\quad \alpha_{y} = \ln \mu_{{T_{y} }} - {\frac{{\beta_{y}^{2} }}{2}} $$
(16)

Different values can be considered for the parameters \( \mu_{{T_{y} }} \) and v y , in order to establish a sensitivity analysis. The IP variability versus the mean FCT \( \mu_{{T_{y} }} \) is particularly interesting since such a mean clearing time is a practical measure of the reliability level of the protection system.

The determination of the probability q, when T x and T y are Log-Normal and independent of each other, is now considered.

First, the following auxiliary RVs are introduced:

$$ X = \ln T_{x} ,\quad Y = \ln T_{y} $$
(17a)
$$ Z = X - Y $$
(17b)

Under the assumed hypotheses, the probability laws of the above RVs X and Y are, respectively, N(α x , β x ) and N(α y , β y ), where, using from now on the symbol α x instead of α cr, and generally the suffixes (x, y) instead of (cr, cl):

$$ \alpha_{x} = E\left[ X \right];\quad \alpha_{y} = E\left[ Y \right];\quad \beta_{x} = {\text{SD}}\left[ X \right];\quad \beta_{y} = {\text{SD}}\left[ Y \right] $$
(18a)

According to the well-known properties of the Gaussian distribution, the variable Z, being the difference between two independent Gaussian RVs, is also Gaussian with mean value and standard deviation given by:

$$ \mu_{Z} = E\left[ Z \right] = E\left[ X \right]-E\left[ Y \right] = \alpha_{x} - \alpha_{y} $$
(18b)
$$ \sigma_{Z} = {\text{SD}}\left[ Z \right] = \sqrt {{\text{Var}}[X] + {\text{Var}}[Y]} = \sqrt {\beta_{x}^{2} + \beta_{y}^{2} } $$
(18c)

It is opportune, although obvious, to remark that in practice μ Z is always positive (α x  > α y ), since the FCT must always be small enough compared with the CCT for the system to possess an acceptable level of stability, namely a very small IP value. Since IP = P(Z < 0), this can occur only if E[X] is sufficiently larger than E[Y]: this intuitive fact will be confirmed by the computations in the following.

By introducing the standard Gaussian distribution function:

$$ \Upphi (x) = \int\limits_{ - \infty }^{x} {{\frac{1}{{\sqrt {2\pi } }}}{\text{e}}^{{ - {\frac{{u^{2} }}{2}}}} {\text{d}}u} $$
(19)

the IP can be easily computed as:

$$ q = P\left( {T_{x} < T_{y} } \right) = P\left[ {\ln \left( {T_{x} } \right) < \ln \left( {T_{y} } \right)} \right] = P\left( {X < Y} \right) = P\left( {Z < 0} \right) = \Upphi \left( { - {\frac{{\mu_{Z} }}{{\sigma_{Z} }}}} \right) = 1 - \Upphi \left( {{\frac{{\mu_{Z} }}{{\sigma_{Z} }}}} \right) $$

by using the well-known property: Φ(−x) = 1 − Φ(x), valid for each real number x.

Alternatively, the “complementary standard Gaussian cumulative distribution function” (CSGCDF) can be used, as is done here:

$$ \Uppsi \left( x \right) = 1 - \Upphi \left( x \right) $$
(20)

Since the CSGCDF Ψ(x) is a strictly decreasing function of x from the value Ψ(−∞) = 1 to the value Ψ(∞) = 0, with Ψ(0) = 0.5 (see Appendix 1 for some curves), the IP can be expressed by the more compact expression:

$$ q = \Uppsi (u) = \int\limits_{u}^{\infty } {{\frac{1}{{\sqrt {2\pi } }}}} \exp \, \left( { - {\frac{{\xi^{2} }}{2}}} \right){\text{d}}\xi $$
(20a)

where

$$ u = {\frac{{\alpha_{x} - \alpha_{y} }}{{\sqrt {\beta_{x}^{2} + \beta_{y}^{2} } }}} = {\frac{E(X) - E(Y)}{{\sqrt {V(X) + V(Y)} }}} $$
(20b)

The above quantity u, which plays a key role in the statistical assessment of the IP, can be defined as the “SM” of the system (under the given fault) since the larger the value of u, the smaller the IP. Confirmation that the relation (α x  > α y ) must always be satisfied in practice (although it is not mandatory on theoretical grounds) comes from observing that, unless this happens, the IP is greater than 0.5 (if α x  = α y , then u = 0 and q = 0.5).

3.2 A Numerical Example

As a numerical example, typical values of the mean CCT and the mean FCT (which will be used in the applications in this chapter) are μ x  = 0.145 s and μ y  = 0.10 s, respectively.

It is, therefore, assumed that the CCT and FCT follow two independent LN distributions with the above mean values; moreover, a common CV value of 0.1 is assumed for both the CCT and the FCT (i.e. v x  = v y  = 0.10). The corresponding values of (α x , α y , β x , β y ) are:

$$ \alpha_{x} = - 1.9360,\quad \beta_{x} = 0.0998;\quad \alpha_{y} = - 2.3076,\quad \beta_{y} = 0.0998 $$

having used the relations already stated:

$$ \beta = \sqrt {\ln (1 + v^{2} )} ;\quad \alpha = \ln (\mu ) - {\frac{{\beta^{2} }}{2}} $$

It can be seen that α x  > α y , as expected. The above values β x and β y in the example are equal since they depend on the CV value only.

Finally, the SM value is \( u = {\frac{{\alpha_{x} - \alpha_{y} }}{{\sqrt {\beta_{x}^{2} + \beta_{y}^{2} } }}} = 2.634 \) and the IP is then evaluated as q = Ψ(u) = 0.00423.

For high values of the SM like the one above, it is worth noting (see also Appendix 1) that the IP is very sensitive to variations of system parameters such as the mean FCT μ y , as can also be seen by taking the derivative of q with respect to μ y . For instance, if the mean FCT increases from the above 0.10 to 0.11 s (a 10% increase), with the same CVs, then the IP increases to 0.0251 (a 493% increase!). The IP sensitivity to variations of the CV values is also very high. This is just an example of some analytical remarks, briefly discussed below, which may be useful in actual practice.
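These figures are easy to reproduce; the following sketch applies Eqs. 16 and 20a, 20b to the base case and to the perturbed mean FCT:

```python
import numpy as np
from scipy.stats import norm

def lognormal_params(mu, cv):
    """Eq. 16: Log-Normal (alpha, beta) from mean value and CV."""
    beta = np.sqrt(np.log(1.0 + cv**2))
    return np.log(mu) - beta**2 / 2.0, beta

def instability_probability(mu_x, cv_x, mu_y, cv_y):
    """Eqs. 20a, 20b: q = Psi(u) for independent Log-Normal CCT and FCT."""
    ax, bx = lognormal_params(mu_x, cv_x)   # CCT
    ay, by = lognormal_params(mu_y, cv_y)   # FCT
    u = (ax - ay) / np.hypot(bx, by)        # stability margin
    return norm.sf(u), u                    # Psi(u) = 1 - Phi(u)

q, u = instability_probability(0.145, 0.10, 0.10, 0.10)
print(f"u = {u:.3f}, q = {q:.5f}")          # u = 2.634, q ~ 0.0042
q2, _ = instability_probability(0.145, 0.10, 0.11, 0.10)
print(f"q with mu_y = 0.11 s: {q2:.4f}")    # ~ 0.0251
```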

3.3 Some Final Remarks on IP Sensitivity and Its Estimation

Deferring a more detailed illustration of the IP expression to Appendix 1, this section concludes by highlighting some basic facts which are easily deduced when observing the expression of q:

$$ q = \Uppsi \left( u \right) = \Uppsi \left( {{\frac{{\alpha_{x} - \alpha_{y} }}{{\sqrt {\beta_{x}^{2} + \beta_{y}^{2} } }}}} \right) $$
(21)

Since Ψ is a decreasing function of its argument, then, as is intuitive, q decreases (and stability improves) as the SM increases, e.g. as the mean CCT μ x increases or the mean FCT μ y decreases; for given values of μ x and μ y , it can be verified, also analytically (see Appendix 1), that the IP increases when the CV of the CCT and/or FCT increases.

In other words, IP increases as the uncertainty about the above times increases: this consideration has the practical implication that, if uncertainty in load values (which entails uncertainty in the CCT) and/or in FCT is neglected (i.e. their CV values are assumed as zero), the IP may be undesirably underestimated.

It can also be seen that q(u) decreases very quickly towards 0—as exemplified in the above numerical example—especially when the SM u is large enough. This and other mathematical aspects of the relation between the IP and its parameters are discussed and also illustrated graphically, with some details in Appendix 1, in which a sensitivity analysis of the IP is also illustrated.

The great advantage of the proposed analytical approach—compared with numerical methods or Monte Carlo simulation—consists indeed in the very easy way it enables this sensitivity analysis with regard to system parameters to be performed. This is clearly a very desirable property in view of an efficient system design (with regard to the protection system: taking decisions on how to improve its performance by lowering the mean value of the FCT, or improving data acquisition in order to reduce its SD; with regard to the network topology: devising appropriate actions in order to increase the mean value of the CCT, and similar actions).

It can be seen that the IP value obtained by the above methodology is only a (statistical) point estimate of the “true” IP since it is obtained from estimated values of the true parameters α y , α x , β y , β x (as far as the CCT parameters are concerned, they are “forecasted” since they are obtained on the basis of a load forecast; the FCT parameters are estimated from available field or laboratory data). The topic of estimation is discussed in a Bayesian framework in the following sections where ML estimation is also mentioned.

The problems of IP sensitivity described above and of IP estimation are closely related. The above results, and those in Appendix 1, show that, in view of the high sensitivity of the IP to system parameters, particular attention should be paid to developing an efficient estimation of the characteristic parameters of the CCT and FCT.

4 Bayesian Statistical Inference for Transient Stability

4.1 Introduction

Bayesian inference [22–24] is becoming more and more popular as a powerful tool in all engineering applications, including recent applications to power system analysis. This section and the following ones, including the last ones which focus on numerical applications, are devoted to a novel methodology for the Bayesian statistical estimation, or briefly “Bayesian estimation”, of the IP. In particular, we are interested here in developing a proper methodology for making inference about the transient IP, q, once prior information and experimental data are available regarding the pdf of its unknown parameters.

It has been seen that the analytical expression of the IP requires efficient statistical estimates of the true parameters α y , α x , β y , β x to be evaluated in actual practice (e.g. the CCT parameters, as pointed out before, depend on the load parameters, which are not known but estimated as a consequence of a load forecast). The extreme IP sensitivity in the region of the values of practical interest (i.e. those yielding IP values of the order of 10−3 or less) reinforces the need for an efficient estimation.

The aim of the inference is to establish both point and interval estimates of the unknown probability q = Ψ(u), given that the parameters (α x , β x , α y , β y ) of the two LN distributions must be estimated on the basis of the available random samples (T xk : k = 1,…, n) and (T yk : k = 1,…, m).

Bayesian inference [22–25] successfully provides a coherent and effective probabilistic framework for sequentially updating estimates of model parameters, as demonstrated by the ever-increasing number of publications addressing it in both theoretical and applied fields. Bayes estimation, therefore, appears to be quite adequate for the on-line sequential estimation of model parameters. For well-known reasons, moreover, it is particularly efficient (compared with traditional classical estimation, based on ML methods, briefly mentioned in the final part of this section) when rare events are of interest, as is the case here. This is so true that it is currently proposed even when there are no data at all (see, e.g. [26] for a recent application).

The core of the Bayesian approach is the description of all the uncertainties present in the problem by means of probability, and its philosophical roots lie in the subjective meaning of probability [25]. According to this philosophy, the unknown parameters to be estimated are considered RVs, characterized by given distributions whose meaning is not a description of their “variability” (parameters are indeed considered fixed but unknown quantities) but a description of the observer’s uncertainty about their true values. Let ω = (ω1, ω2,…, ω n ) be the n-dimensional vector of the parameters to be estimated. The first step in a Bayes estimation process is to introduce—in order to express the available knowledge about the parameters before observing the data—a “prior” probability distribution, characterized—in the continuous case considered here—by a joint (n-dimensional) pdf over the parameter space Ω:

$$ g({\varvec {\omega}}) = g_{1,2, \ldots n} (\omega_{1} ,\omega_{2} , \ldots ,\omega_{n} ),\quad {\varvec {\omega}} \in \Upomega $$
(22)

This prior pdf is often—but not always—chosen “subjectively”, which does not mean “arbitrarily” but rather “on the basis of the knowledge available to the analyst”, also using “objective” pieces of information which in most cases could not be used in classical (frequentist) statistical estimation [22–25], which does not admit the existence of a prior pdf.

Then, the data D are observed, according to a formal probability model which is assumed to represent the probabilistic mechanism that, for some (unknown) value of ω, generated the observed data D. This model gives rise to the “likelihood function” (LF), L(D|ω), i.e. the conditional probability of the data, given ω ∈ Ω. After observing the data D, all the new (updated) available knowledge is contained in the posterior distribution of ω. This is represented by a posterior joint probability density, g(ω|D), obtained from Bayes’ theorem:

$$ g(\omega |D) = {\frac{L(D\left| \omega \right.)g(\omega )}{{\int {\int \ldots \int\limits_{\Upomega } {L(D\left| \omega \right.)g(\omega ){\text{d}}\omega } } }}} $$
(23)

where the denominator is the n-fold integral over the whole parameter space Ω. Then, if a function τ = τ(ω) of the parameters in ω is the subject of estimation, according to the well-known “mean square error” (MSE) criterion, the best Bayes point estimate—denoted by τ°—is given by the posterior mean of τ, given the data D. This may be obtained by well-known rules related to the expectation of a function of an RV [19]:

$$ \tau^{ \circ } = E[\tau |D] = \int {\int \ldots \int\limits_{\Upomega } {\tau (\omega )g(\omega \left| D \right.){\text{d}}\omega } } $$
(24)

The particular case τ(ω) = ω k —for any given k = 1, 2,…, n—yields the Bayes estimate of the single parameter ω k .

Alternatively, by denoting the prior pdf of τ by h(·), i.e. the pdf induced—by a proper manipulation of the pdf g(ω)—on the space of τ values by the transformation τ = τ(ω), and introducing, analogously, the posterior pdf of τ, h(τ|D), the above expectation may be obtained equivalently by the following integral:

$$ \tau^{ \circ } = E[\tau |D] = \int\limits_{\Upxi } {\tau \cdot h(\tau \left| D \right.){\text{d}}\tau } $$
(25)

with Ξ being the space of τ values. In practice, and also in this application, it is very difficult, if not impossible, to deduce an analytical expression for the posterior pdf of τ, and the above expectation may be more easily obtained by the integral over Ω, even if, in most cases, it is evaluated numerically or by means of simulation.
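As a toy illustration of Eqs. 23–24, consider a one-parameter Gaussian model with known SD and a conjugate Gaussian prior (all numbers hypothetical); the posterior mean obtained by direct numerical integration can be checked against the well-known closed-form conjugate result:

```python
import numpy as np
from scipy.integrate import trapezoid

# Toy case for Eqs. 23-24: data X_k ~ N(omega, s^2) with known s,
# prior omega ~ N(m0, s0); all numerical values are hypothetical.
m0, s0, s = 0.0, 1.0, 0.5
data = np.array([0.8, 1.1, 0.9])

omega = np.linspace(-5.0, 5.0, 20_001)                  # parameter grid
prior = np.exp(-(omega - m0) ** 2 / (2.0 * s0**2))
like = np.prod([np.exp(-(x - omega) ** 2 / (2.0 * s**2)) for x in data], axis=0)
post = prior * like
post /= trapezoid(post, omega)                          # Bayes' theorem, Eq. 23
tau_hat = trapezoid(omega * post, omega)                # posterior mean, Eq. 24

n = len(data)
closed_form = (s**2 * m0 + n * s0**2 * data.mean()) / (s**2 + n * s0**2)
print(tau_hat, closed_form)                             # the two values agree
```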

Unlike classical estimation, which is inherently focussed on the point estimate, here this is only one particular piece of information: indeed, Bayesian inference aims to express all the available knowledge about the parameters not through a single value but by means of the complete posterior pdf of τ, h(τ|D). The point estimate is only a “synthesis” of this pdf. This pdf (which would have no meaning in classical inference, where τ is regarded as an unknown constant, not an RV) is the “key” information provided by Bayes estimation since it allows any probabilistic statement about the values of τ to be expressed. Typically, this pdf is used to form a “Bayesian confidence interval” (BCI), or “Bayesian credible interval”, of the unknown τ, defined as:

$$ {\text{BCI}}(\tau ;\pi ) = (\tau_{1} ,\tau_{2} ) $$
(26)

so that P(τ1 < τ < τ2) = π, where π is a given probability. The BCI is expressed in terms of the posterior pdf of τ as follows:

$$ P(\tau_{1} < \tau < \tau_{2} |D) = \int\limits_{{\tau_{1} }}^{{\tau_{2} }} {h(\tau |D){\text{d}}\tau = \pi } ,\quad 0 < \pi < 1 $$
(27)

In practice, the above relation alone is generally not sufficient to find the BCI, but further requirements, such as the search for the “Highest Posterior Density” regions [23], allow both the unknowns (τ1, τ2) above to be determined.

4.2 A General Methodology for Bayesian Inference on Transient IP

Let us transpose the above concepts of Bayesian inference to the estimation of the IP:

$$ q = \Uppsi \left( u \right),\quad u = {\frac{{\alpha_{x} - \alpha_{y} }}{{\sqrt {\beta_{x}^{2} + \, \beta_{y}^{2} } }}} $$
(28)

with u being the SM whereas Ψ(z) is the CSGCDF Ψ(z) = 1 − Φ(z):

$$ \Uppsi (z) = \int\limits_{z}^{\infty } {{\frac{1}{{\sqrt {2\pi } }}}} \exp \left( { - {\frac{{\xi^{2} }}{2}}} \right){\text{d}}\xi ,\quad z \in ( - \infty , + \infty ) $$
(29)

For the purpose of Bayes estimation, in the most general case, all the parameters shall be considered unknown. Therefore, in the Bayes approach, the four parameters (α x ,α y ,β x ,β y ) as well as the SM u and the IP q itself are regarded as realizations of RVs, which will be denoted by the following capital letters in the sequel (Footnote 2):

$$ \alpha_{x} \to M_{x} ;\quad \alpha_{y} \to M_{y} ;\quad \beta_{x}^{2} \to V_{x} ;\quad \beta_{y}^{2} \to V_{y} $$
(30)

The symbols M and V are also chosen for “mnemonic” reasons since, as specified below, they correspond to the mean values and variances of the logarithm of the CCT (T x ) and of the logarithm of the FCT (T y ):

$$ M_{x} = E\left[ {\ln T_{x} } \right],\quad V_{x} = {\text{Var}}\left[ {\ln T_{x} } \right],\quad M_{y} = E\left[ {\ln T_{y} } \right],\quad V_{y} = {\text{Var}}\left[ {\ln T_{y} } \right] $$
(31)

Bayes estimation, therefore, consists of assessing prior distributions to the above parameters (M x ,M y ,V x ,V y ), and then evaluating point and interval estimates of the unknown parameter IP, described by the RV Q, function of the RV U:

$$ Q = \Uppsi \left( U \right),\quad U = {\frac{{M_{x} - M_{y} }}{{\sqrt {V_{x} + V_{y} } }}} $$
(32)

The above relations specify the relation between the basic RVs (M x ,M y ,V x ,V y ) and the IP Q:

$$ Q = Q(M_{x} ,M_{y} ,V_{x} ,V_{y} ),{\text{ with}}:(M_{x} ,M_{y} ) \in \Re^{2} ;\quad \left( {V_{x} ,V_{y} } \right) \in \Re^{ + 2} $$
(33)

These relations appear to be quite complicated, in particular due to the presence of the special function Ψ(U), whatever the choice of the prior pdf of the basic RVs, a topic which is dealt with below. The same argument also applies, a fortiori, to the posterior pdf. In practice, it is impossible to evaluate the pdf of Q—be it prior or posterior—analytically. It can, however, be handled numerically by resorting to a reasonable “Beta approximation”, introduced in a different context by Martz et al. [27] and also used by the authors in the above-mentioned article [16] in the framework of ML estimation. This approximation is illustrated in Appendix 3 and has been shown to be very adequate, although it is not the only possible one, since every pdf over (0, 1) which is sufficiently smooth and flexible may be a good candidate (possible alternative choices studied by the authors are also mentioned in Appendix 2). In any case, once the numerical pdf of Q is obtained, its usefulness in this application consists, first of all, in establishing a proper “upper confidence bound” for the IP, i.e. a q UP value that makes the probability of the “desirable” event (Q < q UP) high enough, say 0.95 or 0.99. Therefore, denoting this high probability value by η, interest may be focussed on the determination of a q UP value such that:

$$ P\left( {Q < q_{\text{UP}} } \right) = \eta $$
(34)

With a sufficiently large probability value, we are, therefore, assured that the “true” IP, Q, is smaller than the “upper bound” q UP; this is the Bayes counterpart of a classical confidence level of 100η%.

An important characteristic, perhaps the most important one, of the Bayes inference methods is the one, already mentioned, of allowing any probabilistic statement on the values under investigation, here the IP, to be expressed, e.g. in terms of the above BCIs.

This is the core of “Bayes inference”, which is something more than pure estimation, and this is also the reason behind the heading of this section (“Bayesian inference” rather than “Bayesian estimation”). Also, in many practical cases (e.g. in order to control if some prefixed requirements or standards are met by system performances), an interval estimate may be more significant than the point estimate alone.

However, also in view of the analytical or numerical difficulties mentioned above associated with the establishment of the BCI, in actual practice, there is no doubt that the typical objective of the Bayes methods is to assess the point estimate of Q. This is the topic dealt with from now on. This point estimate of Q may be evaluated after the assessment or evaluation of the:

  • prior parameters’ pdf: \( g_{{m_{x} ,m_{y} ,v_{x} ,v_{y} }} \) (m x ,m y ,v x ,v y ), briefly denoted as g(m x ,m y ,v x ,v y );

  • the LF: L(D|m x ,m y ,v x ,v y ), which is given in this case by the conditional joint pdf of the observed data (CCT values, i.e. the times T xk , and FCT values, i.e. the times T yk , recorded in the interval of interest for the IP prediction). This joint pdf is conditional on the parameters (m x ,m y ,v x ,v y );

  • posterior parameters’ pdf: \( g_{{M_{x} ,M_{y} ,V_{x} ,V_{y} }} \) (m x ,m y ,v x ,v y |D), briefly denoted as g(m x ,m y ,v x ,v y |D), obtained by the prior pdf and the LF by means of the Bayes theorem as illustrated above.

Finally, the Bayes estimate, denoted by Q°, of the IP Q is given in principle by the four-dimensional integral:

$$ Q^{ \circ } = E[Q|{\user2{D}}] = \int {\int {\int {\int\limits_{\Upomega } {Q(m_{x} ,m_{y} ,v_{x} ,v_{y} )g(m_{x} ,m_{y} ,v_{x} ,v_{y} \left| {\user2{D}} \right.){\text{d}}m_{x} {\text{d}}m_{y} {\text{d}}v_{x} {\text{d}}v_{y} } } } } $$
(35)

with Ω the parameter space specified above for the four parameters (m x ,m y ,v x ,v y ), Q(m x ,m y ,v x ,v y ) = Ψ(u) (lowercase letters are used for the single determinations of the RVs being studied), \( u = {\frac{{m_{x} - m_{y} }}{{\sqrt {v_{x} + v_{y} } }}} \), and Ψ the above CSGCDF.

As far as the choice of the prior pdfs for the above parameters is concerned, it is well known that the simplest natural candidates are the so-called “conjugate prior pdfs” [23, 24]: this means adopting Gaussian prior pdfs for the mean values M x and M y and Inverted Gamma prior pdfs for the variances V x and V y (see Appendix 2). These are indeed the conjugate prior pdfs of the mean and variance for both the Normal and the Log-Normal sampling distributions [21–24].

The above integral may appear quite cumbersome, yet its evaluation can be made—at least in some cases—relatively simple by observing that the particular form of the random IP, Q(U), can be reformulated in terms of the RVs: M = M x  − M y , V = V x  + V y , \( S = \sqrt V . \) Hence Q = Ψ(M/S).

If prior and posterior information on the four parameters (M x ,M y ,V x ,V y ) is recast into prior and posterior information on the difference M = (M x  − M y ) and the sum V = (V x  + V y ), the above integral, hence, reduces to a double integral (with respect to the pdf of M and S) which can be solved using methods related to Bayes estimation of Gaussian probabilities [23, 24].

The above transformation between the pdfs of (M x ,M y ,V x ,V y ) and those of M and V may be effected by elementary RV transformations, taking advantage of the assumed s-independence between the CCT and the FCT, which logically implies the s-independence between their mean values and variances. For instance, adopting conjugate Gaussian prior pdfs for both M x and M y , assumed s-independent, the prior pdf of M = (M x  − M y ) is again Gaussian with obvious values of the parameters; the same holds, as is well known, for the posterior pdf. As far as the variances are concerned, the same reasoning does not apply to the above-mentioned conjugate Inverted Gamma pdf typically adopted as the prior pdf. However, if one is able to express information directly in terms of the sum of the variances V = (V x  + V y ), by using an Inverted Gamma pdf for V, the classical results of Bayes estimation for the Gaussian model (mentioned in Appendix 2) may still be applied. In general, however, if other prior models are chosen, the above estimation must be carried out numerically. This poses no particular problem nowadays since specific codes and algorithms have been devised for such purposes [22–24].
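When no closed form is available, the four-dimensional integral in Eq. 35 is readily evaluated by Monte Carlo sampling from the posterior laws. The following sketch assumes, purely for illustration, s-independent Gaussian posteriors for the means and Inverted Gamma posteriors for the variances; the same samples also yield the upper bound q UP of Eq. 34:

```python
import numpy as np
from scipy.stats import norm, invgamma

rng = np.random.default_rng(1)
N = 100_000

# Hypothetical posterior laws (illustration only), mutually s-independent:
Mx = norm(-1.936, 0.02).rvs(N, random_state=rng)         # mean of ln(CCT)
My = norm(-2.308, 0.02).rvs(N, random_state=rng)         # mean of ln(FCT)
Vx = invgamma(a=50, scale=0.5).rvs(N, random_state=rng)  # variance of ln(CCT)
Vy = invgamma(a=50, scale=0.5).rvs(N, random_state=rng)  # variance of ln(FCT)

# Eq. 35 by Monte Carlo: Q = Psi(U); the Bayes estimate is the posterior mean
Q = norm.sf((Mx - My) / np.sqrt(Vx + Vy))
print(f"Bayes point estimate of the IP: {Q.mean():.5f}")
print(f"95% upper bound q_UP (Eq. 34): {np.quantile(Q, 0.95):.5f}")
```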

A major simplification occurs in the particular case considered in the application of this contribution, i.e. in VST applications in which the FCT may be considered in practice as a known constant (as discussed in the following section).

4.3 A Simplified Method for Bayes Estimation of the IP in the Event of VST Stability Prediction

The general theoretical problem of the Bayes estimation of the IP, discussed above, will not be pursued here in view of the VST application of this contribution. In this case, indeed, an event of instability—in a very short interval, typically lasting 1 h—is very unlikely, as confirmed by the typical values of the IP illustrated in the previous section and observed in actual system practice; consequently, observing new FCT values is a very rare event.

For practical purposes, the FCT T y can, therefore, be considered a known constant instead of an RV. This constant value is the prior estimate of the FCT, since no new data are expected from which inference could be made; i.e. T y assumes a deterministic value, t y  = t*, estimated from previous experience.

Alternatively, the FCT may be considered as an RV, with the assumed law LN(α y y ), whose parameters are characterized by their prior pdf, since it is highly improbable that new data can change our information about the FCT in a VST interval (or the data are so rarely acquired that they do not change the prior pdf much).

The two cases are equivalent, as will be shown later, so that in the sequel reference will be made to the first one (i.e. a deterministic value, t*, of the FCT T y is assumed).

Let t be the FCT and T x the RV describing the CCT in the given interval under investigation. Two alternative hypotheses can be assumed for the RV describing the load: (1) the load has a constant value (i.e. it is unknown, but constant in time) due to the shortness of the interval; (2) the load variations with time are considered—adopting a more rigorous approach—not negligible: in this case, reference is made to the peak load in the interval. As previously shown, the two cases are formally equivalent.

Then, the IP value in that interval is given, by using the usual transformation from an LN cdf into a Gaussian one, by:

$$ Q = P\left( {T_{x} < t} \right) = P\left[ {X < \ln (t)} \right] = \Uppsi \left( {{\frac{{\alpha_{x} - \ln (t)}}{{\beta_{x} }}}} \right) $$
(36a)

always using

$$ \Uppsi (z) = \int\limits_{z}^{\infty } {{\frac{1}{{\sqrt {2\pi } }}}} \exp \, \left( { - {\frac{{\xi^{2} }}{2}}} \right){\text{d}}\xi $$

Obviously, the above relation could also be deduced from the general one:

$$ Q = \Uppsi \left( {{\frac{{\alpha_{x} - \alpha_{y} }}{{\sqrt {\beta_{x}^{2} + \, \beta_{y}^{2} } }}}} \right) $$

with α y  = ln(t); β y  = 0 (since T y is deterministic, as is its logarithm Y, so that its only assumed value, ln(t), coincides with its mean value whereas its SD β y is zero).

The consequent IP expression is, therefore, equal to:

$$ \Uppsi \left( {{\frac{{\alpha_{x} - \tau }}{{\beta_{x} }}}} \right);\quad \tau = \ln \left( t \right) $$
(36b)

which can be handled like that of a Gaussian cdf, as shown in the sequel, in order to perform a Bayes point estimation of the IP.

In the framework of Bayes estimation, let us assume that the mean value α x of X = ln(T x )—T x being the CCT—is an RV, denoted by M (in analogy with the previous section), whereas its SD β x  = s is known, as assumed in common practice (Footnote 3).

Let the prior information about the unknown parameter M be described by a conjugate prior Normal distribution with known parameters (m 0, s 0), i.e. M ∼ N(m 0, s 0), so that the prior pdf of M is the following (bear in mind that M, the mean value of the RV X = ln(T x ), can be negative (Footnote 4)):

$$ g(m) = {\frac{1}{{s_{0} \sqrt {2\pi } }}}\exp \left[ { - {\frac{{(m - m_{0} )^{2} }}{{2s_{0}^{2} }}}} \right],\quad m \in \Re \, $$
(37a)

By using results in Appendix 2, the posterior pdf of M, after observing data X, is again Gaussian:

$$ g(m|{\mathbf{X}}) = {\frac{1}{{s_{1} \sqrt {2\pi } }}}\exp \left[ { - {\frac{{(m - m_{1} )^{2} }}{{2s_{1}^{2} }}}} \right],\quad m \in \Re $$
(37b)

with posterior mean and variance given by:

$$ m_{1} = E[M|{\mathbf{X}}] = {\frac{{s^{2} m_{0} + ns_{0}^{2} M_{n} }}{{s^{2} + ns_{0}^{2} }}} $$
(38a)
$$ s_{1}^{2} = {\text{Var}}[M|{\mathbf{X}}] = {\frac{{s_{0}^{2} s^{2} }}{{ns_{0}^{2} + s^{2} }}} $$
(38b)

where

$$ M_{n} = (1/n)\sum\limits_{k = 1}^{n} {X_{k} } ;\quad s = \beta_{x} = {\text{SD}}\left[ X \right] $$
(38c)

X k being a generic log-CCT value of the sample X.

The Bayes point estimate of the IP is, therefore, given by:

$$ \begin{aligned} Q^{ \circ }&=\int\limits_{ - \infty }^{\infty } {Q(m)g(m|X){\text{d}}m} \\ & =\int\limits_{ - \infty }^{\infty } {\Uppsi \, \left( {{\frac{m - \tau }{s}}} \right){\frac{1}{{s_{1} \sqrt {2\pi } }}}} \exp \, \left( { - {\frac{{(m - m_{1} )^{2} }}{{2s_{1}^{2} }}}} \right){\text{d}}m \\ \end{aligned} $$
(39)

which, after changing the variable to: \( z = {\frac{m - \tau }{s}} \) and some manipulation (see [19], in chapter titled Transmission Expansion Planning: A Methodology to Include Security Criteria and Uncertainties Using Optimization Techniques), can be shown to be equal to:

$$ Q^{ \circ } = \Uppsi \left( {{\frac{{m_{1} - \tau }}{{s_{1} }}}} \right) = \int\limits_{z_{1} }^{\infty } {{\frac{1}{{\sqrt {2\pi } }}}} \exp \left( { - {\frac{{\xi^{2} }}{2}}} \right){\text{d}}\xi ,\quad z_{1} = {\frac{{m_{1} - \tau }}{{s_{1} }}} $$
(40)

This estimator will be used in the final numerical application related to VST stability prediction, in which, on the basis of an adequate dynamic model of the load (and the CCT), the posterior means and variances will be updated at each time step.

In these applications, typically only one CCT datum is observed at each step, so n = 1 will be used in the above relations: at the generic kth step, the datum is deduced from the measured or forecasted load value L k . This data scarcity renders the Bayes estimation more attractive, as discussed above.
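To make the computation concrete, the following minimal sketch (in Python with numpy/scipy, purely as an illustration; the numerical values are hypothetical, not taken from the chapter's case studies) implements the posterior update (38a)–(38c) and the point estimate (40) for the single-datum case n = 1 typical of VST operation.

```python
import numpy as np
from scipy.stats import norm

def bayes_ip_estimate(x_obs, m0, s0, s, tau):
    """Bayes point estimate of the IP, Eqs. (38a)-(40).

    x_obs : observed log-CCT value(s); (m0, s0) : prior mean/SD of M;
    s : known SD of X = ln(CCT); tau : natural log of the (deterministic) FCT.
    """
    x = np.atleast_1d(np.asarray(x_obs, dtype=float))
    n = x.size
    m_n = x.mean()                                           # Eq. (38c)
    m1 = (s**2 * m0 + n * s0**2 * m_n) / (s**2 + n * s0**2)  # Eq. (38a)
    s1 = np.sqrt(s0**2 * s**2 / (n * s0**2 + s**2))          # Eq. (38b)
    return norm.sf((m1 - tau) / s1)                          # Eq. (40): Psi = Gaussian upper tail

# Purely illustrative values (n = 1, FCT fixed at t* = 0.1 s):
q_est = bayes_ip_estimate([-1.95], m0=-1.93, s0=0.10, s=0.17, tau=np.log(0.1))
```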

Finally, let us briefly examine the second case, mentioned above, with regard to the knowledge of the pdf of the RV T y . Let us assume that it is an RV, and not a constant as above, letting the parameters of the RV T y , i.e. (α y , β y )—denoted as (α, β) in the sequel—be distributed according to their prior pdf, which remains unchanged after every interval, since no new FCT value is obtained. Let us assume, as above, that only the mean of Y, α = α y , is unknown, with a prior conjugate Gaussian distribution N(μ 0, β 0). Consequently, the pdf of Y (conditional on α y  = α) and the pdf of α are, respectively:

$$ Y|\alpha \sim N(\alpha ,\beta );\quad \alpha \sim N(\mu_{0} ,\beta_{0} ) $$

Then, using the total probability theorem for continuous RV [19] or known results from Bayesian estimation theory [2224] (see also Appendix 2), it can be seen that the marginal pdf is still a Gaussian pdf:

$$ Y \sim N(\mu_{0} ,\beta^{*} ),\quad {\text{with }}\beta^{*} = \sqrt {\beta_{0}^{2} + \beta^{2} } $$

In the light of this fact, it is not difficult to show that (40) still holds with properly re-arranged values of the constants (α x , β x ).

4.4 A Mention of the Classical Estimation of the IP

Here, only a brief account of classical (ML) estimation of Q is given, in order to compare it with the one adopted here; some details can be found in [21, 28]. As stated in Sect. 4.1, let us assume that the following data are available: X = (X 1,…, X n ), where X k  = ln(T xk ), k = 1,…, n; i.e. X is a random sample of n elements constituted by the natural logarithms of a CCT sample. Similarly, let Y = (Y 1,…, Y m ), where Y k  = ln(T yk ), k = 1,…, m; i.e. Y is a random sample of m elements constituted by the natural logarithms of an FCT sample, which can be obtained from field or laboratory data on the system protection components, with regard to the assumed kind of fault.

Referring, for easier notation, to estimated quantities with capital letters, the most widely adopted estimators (A x , B x , A y , B y ) of the above four parameters—by the well-known properties of ML estimation [21, 28]—are given, for the LN variables under study, by:

$$ A_{x} = \left( {\frac{1}{n}} \right)\sum\limits_{k = 1}^{n} {X_{k} ;} \quad B_{x} = \sqrt {\left( {\frac{1}{n}} \right)\sum\limits_{k = 1}^{n} {\left( {X_{k} - A_{x} } \right)^{2} } } $$
(41)
$$ A_{y} = \left( {\frac{1}{m}} \right)\sum\limits_{k = 1}^{m} {Y_{k} ;} \quad B_{y} = \sqrt {\left( {\frac{1}{m}} \right)\sum\limits_{k = 1}^{m} {\left( {Y_{k} - A_{y} } \right)^{2} } } $$
(42)

These estimators indeed maximize the LF L[(X, Y)|(α x , α y , β x , β y )] over all admissible values of the parameters.

In practice, these estimators coincide with the sample estimators of the mean values (A x and A y ) and standard deviations (B x and B y ) of the Normal RV X = ln(T x ) and Y = ln(T y ), and show some desirable properties such as consistency. Moreover, the log-mean estimators A x and A y are also unbiased estimators of α x and α y , respectively.

Then, the ML estimator Q* of Q is given by:

$$ Q^{\ast} = \Uppsi \left( {{\frac{{A_{x} - A_{y} }}{{\sqrt {B_{x}^{2} + \, B_{y}^{2} } }}}} \right) $$
(43)

In [16], the authors analysed the classical point and interval estimates of Q based on an estimator of this type whose properties are not easy to assess.

As a final remark, it should be clear that, when prior information is available—as in most engineering applications, and also in this case—the Bayes estimator definitely performs better than the ML estimator. This is especially evident in on-line estimation, as will be shown later, since very few data can be collected for inference. Typically, indeed, no data are available on the FCT if the fault does not occur—and this non-occurrence is of course very likely—and only one datum is available on the CCT, based on the forecasted load value for the time interval under investigation.
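For comparison, the classical estimator (41)–(43) admits an equally short sketch (again a hypothetical illustration, not the authors' code):

```python
import numpy as np
from scipy.stats import norm

def ml_ip_estimate(t_x, t_y):
    """ML estimate Q* of the IP, Eqs. (41)-(43), from CCT and FCT samples."""
    x, y = np.log(np.asarray(t_x)), np.log(np.asarray(t_y))
    a_x, a_y = x.mean(), y.mean()               # log-mean estimators, Eqs. (41)-(42)
    b_x, b_y = x.std(ddof=0), y.std(ddof=0)     # ML (1/n, 1/m) log-SD estimators
    return norm.sf((a_x - a_y) / np.hypot(b_x, b_y))   # Eq. (43)
```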

5 Dynamic Bayesian Estimation of Mean CCT and IP for VST Applications

5.1 Introduction

In this section, devoted to VST system operation, the principle of recursive Bayesian estimation is applied for a fast and efficient on-line evaluation of the mean CCT (actually, of its natural logarithm), and thus of the IP, in a dynamic framework. This evaluation exploits:

  1. the above-discussed relation between the CCT and the system load L;

  2. the probabilistic knowledge of the time evolution of the load, which is generally available in VST applications.

With regard to point (1), reference is made here for illustrative purposes to a single-machine system,Footnote 5 or to a system which is reducible to it. The above-discussed log-linear characteristic is therefore assumed to hold—at any given instant (for a given network topology)—between the CCT T x and the load L:

$$ T_{x} = \beta_{0} \exp \left( { - \beta_{1} L} \right) $$
(44)

with model coefficients (β0, β1) which are positive known constants, depending on the electrical parameters of the system. As mentioned above, they can be determined by performing a linear regression of the natural logarithms of T x with respect to the load. This is accomplished after computing the CCT values off-line, for the given network and fault, by means of an appropriate system model based on the classical Lyapunov direct methods for transient stability analysis and sensitivity. Therefore, by denoting—as before—the natural logarithm of T x by X and the values of X and L at a given time instant t k by (X k , L k ), respectively, the following relation is assumed:

$$ X_{k} = a-bL_{k} $$
(45)

with

$$ X_{k} = \ln \left( {T_{xk} } \right);\quad a = \ln \left( {\beta_{0} } \right)\left( {a \in \Re } \right);\quad b = \beta_{1} \left( {b \in \Re^{ + } } \right) $$

This linear relation between the logarithm of the CCT (LCCT in the following) and the system load is the basis for dynamic estimation. In particular, the proposed Bayes recursive estimation uses known results in dynamic estimation—such as Kalman filter theory—which are well established under the hypothesis that the series {X k } to be estimated is a Gaussian time series. This hypothesis holds if the load L(t) is a Gaussian process, as assumed above (generalizations to other kinds of load distribution are of course possible without particular problems).
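As an illustration of this off-line step, the following sketch (hypothetical data, with coefficients chosen merely to mirror the numerical example of Sect. 6) recovers (a, b) by least squares from CCT values computed over a grid of load levels.

```python
import numpy as np

# Hypothetical off-line study: CCT values computed, e.g. by a Lyapunov-based
# system model, over a grid of load levels (here generated from an assumed law).
loads = np.linspace(0.75, 1.00, 11)            # load levels, p.u.
cct = 5.608 * np.exp(-4.1774 * loads)          # assumed T_x = beta0 * exp(-beta1 * L)

# Least-squares fit of ln(T_x) = a - b*L, Eq. (45)
A = np.column_stack([np.ones_like(loads), -loads])
(a, b), *_ = np.linalg.lstsq(A, np.log(cct), rcond=None)
# a ~ ln(beta0) = 1.7242 and b ~ beta1 = 4.1774 are recovered (up to rounding)
```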

With regard to point (2), namely, the load evolution in time, an adequate load evolution model must be chosen like those adopted for VST load forecasting algorithms.

In particular, we must consider the given time instants t 1, t 2,…, t k ,… of interest for VST operation (typically, the successive hours of a certain time interval in which the network topology is assumed as fixed). Then, the following “dynamic linear model” (DLM) [29], or “autoregressive model”, is often satisfactorily adopted for the stochastic process (L k , k = 1,2…), which is supposed to generate the load values at times t k according to the “system equation”:

$$ L_{k + 1} = L_{k} + \lambda_{k} \quad \left( {k = 0,1,2 \ldots } \right) $$
(46a)

in which {λ k } is a “White Gaussian Noise” (WGN) sequence, i.e. a set of IID Gaussian RVs with mean 0 and known SD, denoted by W. This is formally expressed as:

$$ \lambda_{k} \sim {\text{WGN}}\left( {0,W} \right) $$
(46b)

The sequence is “initiated” by a value L 0 (the load value at time t = 0) which is (like all the L k values) an RV, as appropriate in a Bayes framework, with a known pdf representing our prior information. It is also assumed to be a Gaussian RV, with known mean \( \mu_{{L_{0} }} \) and known SD \( \sigma_{{L_{0} }} \), statistically independent of any finite set of the sequence {λ k }:

$$ L_{0} \sim N\left( {\mu_{{L_{0} }} ,\sigma_{{L_{0} }} } \right) $$
(46c)

The above model tries to capture a reasonable “Markovian” dependence between the successive random values L k+1 and L k in a simple way, suitable for VST applications. However, it may be extended without excessive difficulty to cover, e.g. more complex autoregressive models, such as ARIMA processes, or non-linear models [29].

Generally, the values of the load L k are not measurable with precision; their acquisition is subject to forecasting or measurement errors (also taking into account possible time delays, or even missing values, in the acquisition process). The following “observation equation” is typically adopted for the estimation of the DLM:

$$ Y_{k} = L_{k} + \nu_{k} \quad \left( {k = 1,2 \ldots } \right) $$
(47a)

where {ν k } is another WGN sequence, with mean 0 and known SD S, statistically independent of the sequence {λ k } and of all the other RVs in the model:

$$ \nu_{k} \sim {\text{WGN}}\left( {0,S} \right) $$
(47b)

The above assumptions assure that both L k and Y k are Gaussian sequences.

In analogy with “Kalman filtering” language [29], the basic DLM equations (46a) and (47a) can be regarded, respectively, as the “state equation” and the “measurement equation”.

In order to define a similar DLM for the LCCT values, and repeating for convenience equation (45), let us define the sequences:

$$ X_{k} = a-bL_{k} ,\quad Z_{k} = a - bY_{k} ,\quad \xi_{k} = - b\lambda_{k} ,\quad \eta_{k} = - b\nu_{k} $$
(48)

It is easy to see that these definitions—observing that {ξ k } and {η k } are still WGN sequences—allow the definition of a DLM for the sequence of the LCCT values as follows:

$$ X_{k + 1} = X_{k} + \xi_{k} , \quad Z_{k} = X_{k} + \eta_{k} $$
(49a)

where (ξ k , η k ) are, respectively, the system and measurement noise for the DLM of the LCCT.

The above assumptions for (X 0, ξ k , η k ) are formally expressed as follows:

$$ X_{0} \sim N(\mu_{0} ,\sigma_{0} ),\quad \xi_{k} \sim {\text{WGN}}(0,w),\quad \eta_{k} \sim {\text{WGN}}(0,s) $$
(49b)

The SD w of the model and the SD s of the measurements, appearing in the above relations, are clearly related to the above SDs (W, S) of λ k and ν k by the following, obvious, linear relationsFootnote 6:

$$ w = bW;\quad s = bS $$
(50)

Finally, the initial mean and SD of the LCCT sequence X k , i.e. those of X 0 (first equation of 49b), denoted simply by (μ 0 , σ 0 ), are obviously expressed in terms of the corresponding initial load L 0 parameters \( (\mu_{{L_{0} }} ,\sigma_{{L_{0} }} ) \) in (46c) as follows:

$$ \mu_{0} = a - b\mu_{{L_{0} }} ;\quad \sigma_{0} = b\sigma_{{L_{0} }} $$
(51)

It is apparent from the second equation of (48) and the above hypotheses, summarized in (49a) and (49b), that the observed LCCT Z k , being the sum of two independent Gaussian RVs, X k and η k , is still a Gaussian RV whose marginal pdf is easily deducible (it is sufficient to compute its mean value and variance, as shown below). Moreover, if X k were known, the conditional distribution of Z k —η k being a Gaussian RV with zero mean—would be a Gaussian one with mean equal to X k and known SD s. FormallyFootnote 7:

$$ \left( {Z_{k} \left| {X_{k} } \right.} \right)\sim N(X_{k} ,s) $$

Therefore, in the framework adopted here for the estimation process, X k is the unknown (unobservable) mean value of the observable Gaussian RV Z k , with known SD s. In other words, interest here is focused on the estimation of the mean value of the LCCT, so that the results mentioned above (and recalled in Appendix 2), related to the estimation of the unknown mean value of a Gaussian RV, may be adopted. For brevity, the term “LCCT” (both in the acronym form and in the complete one) will, however, still be used in the sequel instead of the more correct “mean LCCT”.

The X k sequence is “initiated” by a value X 0 which, based on prior information for the load L 0, is again assumed to be a Gaussian RV, as above reported, statistically independent of any finite set of the sequences {ξ k } and {η k }.

5.2 Estimation Methodology

Once the measurements (z 1, z 2,…, z k ) have been acquired up to time instant t k , the optimal dynamical state estimate \( \hat{X}_{k} \) of the “true” state X k at time t k —according to the Bayesian approach to estimation—is provided by a posteriori “MSE” minimization:

$$ {\text{MSE}} = E\left[ {\left( {\hat{X}_{k} - X_{k} } \right)^{2} \left| {z_{1} ,z_{2} , \ldots ,z_{k} } \right.} \right] $$
(52)

This can be accomplished, as will be shown, using recursive Bayesian estimation (Appendix 2; see also [28]), which is substantially summarized by the following recursive relationship:

$$ \hat{X}_{k} = E\left[ {X_{k} \left| {z_{1} ,z_{2} , \ldots ,z_{k} } \right.} \right] = {}^{ - }\hat{X}_{k} + G_{k} \left( {z_{k} - {}^{ - }\hat{X}_{k} } \right) $$
(53)

where \( {}^{ - }\hat{X}_{k} \) represents the state estimate at instant t k before z k is known, i.e. the a priori estimate at stage k, and G k is a constant which is obtained, as shown below, on the basis of the above “minimum MSE” criterion. The above relation is substantially equivalent to the Kalman filter, but is obtained using the Bayes estimation process, as discussed in [29]: this method has the advantage over the classic Kalman filter derivation of accounting for the random nature of the state X and of allowing the computation of any probabilistic statement about this state. The constant G k corresponds to the well-known “Kalman gain” [28].

The following stages to which the Bayes procedure is applied can be defined:

Stage “0”, or “a priori” Stage: “Stage 0” means the initial stage, before any observation is available. Therefore, in this stage the only available information is the a priori characterization of the RV X at time instant t 0:

$$ X_{0} \sim N(\mu_{0} ,\sigma_{0} ) $$
(54)

Thus, from a Bayesian point of view with quadratic “Loss function”, the initial optimal estimate is \( \hat{X}_{0} = \mu_{0} . \)

Stage 1: In this stage and in the following ones, according to the Bayes methodology, two kinds of information are available, before and after the measurement—which here is the first observation z 1—is acquired. The first (prior) information yields the prior estimate; the second (posterior) information yields the posterior estimate.

Before the first measurement z 1 is performed, the following a priori estimation can be given:

$$ X_{1} = X_{0} + \xi \Rightarrow X_{1} \sim N\left( {{}^{ - }\mu_{1} ,{}^{ - }\sigma_{1}^{{}} } \right) $$
(55)

where the prior mean value and varianceFootnote 8 are determined by:

$$ \begin{gathered} {}^{ - }\mu_{1} = E[X_{1} ] = E[X_{0} ] + E[\xi ] = \hat{X}_{0} = \mu_{0} \hfill \\ {}^{ - }\sigma_{1}^{2} = {\text{Var}}[X_{1} ] = {\text{Var}}[X_{0} ] + {\text{Var}}[\xi ] = \sigma_{0}^{2} + w^{2} \hfill \\ \end{gathered} $$
(56)

Once the measurement z 1 is acquired, the aim is the estimation of X 1 conditional on z 1, with z 1 denoting the observed realization of the RV Z 1. Z 1 is still a Gaussian RV, with conditional mean (given X 1) equal to X 1 and SD equal to that of η 1, i.e. s. Formally:

$$ Z_{1} = X_{1} + \eta_{1} \Rightarrow \left( {Z_{1} \left| {X_{1} } \right.} \right)\sim N(X_{1} ,s) $$
(57)

and since E[Z 1|X 1] = X 1, it can be deduced that the posterior mean (i.e. the Bayes estimate) of X 1 is:

$$ \hat{X}_{1} = E\left[ {X_{1} \left| {z_{1} } \right.} \right] = \mu_{0} + {\frac{{{}^{ - }\sigma_{1}^{2} }}{{s^{2} + {}^{ - }\sigma_{1}^{2} }}}(z_{1} - \mu_{0} ) = \mu_{0} + {\frac{{\sigma_{1}^{2} }}{{s^{2} }}}(z_{1} - \mu_{0} ) $$
(58)

where

$$ \sigma_{1}^{2} = {\frac{{{}^{ - }\sigma_{1}^{2} s^{2} }}{{{}^{ - }\sigma_{1}^{2} + s^{2} }}} = {\frac{{\left( {\sigma_{0}^{2} + w^{2} } \right)s^{2} }}{{\sigma_{0}^{2} + w^{2} + s^{2} }}} $$
(59)

The posterior estimate is used as the prior for the next stage according to recursive Bayesian estimation, as illustrated in Appendix 2. By applying this algorithm recursively, the following result at time t k can be obtained.

Generic Stage k: By applying recursive Bayesian estimation, we can immediately verify (e.g. by induction) that the following relation—clearly appearing as the general case of (59)—holds at time instant t k :

$$ \sigma_{k}^{2} = {\frac{{\left( {\sigma_{k - 1}^{2} + w^{2} } \right)s^{2} }}{{\sigma_{k - 1}^{2} + w^{2} + s^{2} }}} $$
(60)

Similarly, the following recursive formulation for the Bayesian estimate at time t k can be obtained:

$$ \hat{X}_{k} = E\left[ {X_{k} \left| {z_{1} ,z_{2} , \ldots ,z_{k} } \right.} \right] = \hat{X}_{k - 1} + G_{k} \left( {z_{k} - \hat{X}_{k - 1} } \right) $$
(61)

where

$$ G_{k} = {\frac{{\sigma_{k}^{2} }}{{s^{2} }}} = {\frac{{\left( {\sigma_{k - 1}^{2} + w^{2} } \right)}}{{\sigma_{k - 1}^{2} + w^{2} + s^{2} }}} $$
(62)

As is well known, this estimate exhibits the notable property of minimizing, at every time instant t k , the posterior MSE, expressed by:

$$ {\text{MSE}} = E\left[ {\left( {\hat{X}_{k} - X_{k} } \right)^{2} \left| {z_{1} ,z_{2} , \ldots ,z_{k} } \right.} \right] $$
(63)

Of course, the recursive procedure allows knowledge of the pdf of X k at each stage k (needless to say, unlike the “static” Bayes estimation, the posterior pdf of X changes with time), and also allows Bayes estimation of the IP. This is directly deduced using results derived in Sect. 4.3: since (61) is the posterior mean of the LCCT and σ k 2 the posterior variance, by re-arranging Eq. 40, the following recursive Bayesian estimate of the IP at time t k is obtained:

$$ \hat{Q}_{k} = \Uppsi \left( {{\frac{{\hat{X}_{k} - \tau }}{{\sigma_{k} }}}} \right) $$
(64)
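Collecting Eqs. (60)–(62) and (64), the whole recursive scheme fits in a few lines; the following sketch (our illustration, not the authors' implementation) returns the LCCT and IP estimates at every step.

```python
import numpy as np
from scipy.stats import norm

def recursive_bayes_lcct(z, mu0, sigma0, w, s, tau):
    """Recursive Bayes estimates of the LCCT and the IP, Eqs. (60)-(64).

    z : observed noisy LCCT values z_1..z_N; (mu0, sigma0) : prior on X_0;
    w, s : system and measurement SDs; tau : natural log of the FCT.
    """
    x_hat, var = mu0, sigma0 ** 2
    x_est, q_est = [], []
    for z_k in z:
        var = (var + w**2) * s**2 / (var + w**2 + s**2)   # posterior variance, Eq. (60)
        gain = var / s**2                                 # Kalman-type gain, Eq. (62)
        x_hat = x_hat + gain * (z_k - x_hat)              # posterior mean, Eq. (61)
        x_est.append(x_hat)
        q_est.append(norm.sf((x_hat - tau) / np.sqrt(var)))  # IP estimate, Eq. (64)
    return np.array(x_est), np.array(q_est)
```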

Confidence intervals, particularly the previously illustrated “upper confidence bounds”, may also be computed for both the LCCT and the IP. In the following numerical application, for the sake of brevity, the estimation procedure is illustrated only for the LCCT sequence, which is a Gaussian one, so that its results are more easily interpretable. Moreover, the confidence interval assessment is straightforward for the LCCT sequence, whereas for the IP sequence it can be performed by applying, at each step, the procedure illustrated in Appendix 3 with the above-mentioned Beta approximation, since no analytical result exists. Numerical simulation results were similar as regards parameter point estimation. A numerical example of the BCI computation is reported in Appendix 3, only for the IP, it being straightforward for the LCCT.

5.3 Concluding Remark

The proposed procedure is based on a relation (CCT–load) which can be analysed and computed off-line—for the given network—once and for all, so that the on-line estimation shown here does not require the solution of the system model at each iteration. This shortens the time intervals in which stability can be assessed and favours reliable and efficient security assessment. A distributed version of the proposed Kalman filtering approach can be applied in the case of large power systems [30].

6 Numerical Application of the Bayes Recursive Dynamic Model

In this section, a simple numerical application—based on typical load and CCT values and simulated patterns of the load process in time—is presented in order to illustrate the on-line estimation of the LCCT in VST operation.

The load process is assumed to follow the DLM model, with (k = 1,2…):

$$ L_{k + 1} = L_{k} + \lambda_{k} {\text{ and }}Y_{k} = L_{k} + \nu_{k} $$
(65)

in which

  • {λ k } is a “WGN” sequence, WGN(0, W);

  • {ν k } is another WGN sequence, WGN(0, S);

the two sequences are statistically independent of each other and of the other RVs in the model.

For simplicity of notation, the SDs in the above WGN sequences (λ k , ν k ) are, respectively, denoted as (W, S) instead of (σλ, σ ν ) as in the previous section. The lowercase letters (w, s) will be used for the LCCT sequence.

The evolution model of the LCCT corresponding to (65) is, as already deduced:

$$ X_{k + 1} = X_{k} + \xi_{k} {\text{ and }}Z_{k} = X_{k} + \eta_{k} $$
(66)

with the already discussed basic assumptions

$$ X_{0} \sim N(\mu_{0} ,\sigma_{0} ),\quad \xi_{k} \sim {\text{WGN}}(0,w),\quad \eta_{k} \sim {\text{WGN}}(0,s) $$
(67)

and with the SD of the WGN sequences (ξ k , η k ), respectively, equal to w = bW, s = bS.

For the sake of a numerical example, let us assume that the starting value of the system load, L 0, measured in p.u., is a Gaussian RV with mean \( \mu_{{L_{0} }} = 0.8750\;{\text{p}} . {\text{u}} . \) and SD \( \sigma_{{L_{0} }} = 0.0417\;{\text{p}} . {\text{u}} . \)

These values imply that L 0, with probability 0.9973, assumes values in the following interval (0.75–1 p.u.) of amplitude equal to \( 6\sigma_{{L_{0} }} \) around the mean value \( \mu_{{L_{0} }} . \)

Let us also assume that, in the log-linear model X = a − bL, the following values of the regression coefficients have been computed: a = 1.7242, b = 4.1774.

Consequently, the mean and SD of X 0, i.e. the parameters of the LN pdf of the initial CCT, are equal to \( \mu_{{X_{0} }} = - 1.9310 \) and \( \sigma_{{X_{0} }} = 0.1741. \) The mean value corresponds to a CCT of about 0.145 s, which was used for the numerical examples illustrated above.
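These values follow directly from (51), apart from rounding:

$$ \mu_{{X_{0} }} = a - b\mu_{{L_{0} }} = 1.7242 - 4.1774 \times 0.8750 = - 1.9310;\quad \sigma_{{X_{0} }} = b\sigma_{{L_{0} }} = 4.1774 \times 0.0417 \cong 0.1741 $$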

The numerical results, obtained by means of stochastic simulation of the above sequences, will be expressed in relation to the values of the “primary” SDs (S, W) of the load model and of the initial variance, V 0, i.e. the variance of the initial state X 0, a value which is chosen by the analyst in a Bayes methodology on the basis of her/his prior information.

More specifically, the values (0.05 and 0.1) will be used for S and W, and these values will be swapped with each other in the course of the application, in order to obtain at least some basic information on the sensitivity of the results to variations of the model parameters. The initial variance V 0 (i.e. the variance of X 0, a value which is subjectively chosen by the analyst in a Bayes methodology) will alternatively assume two values, 0 or 1, corresponding to different degrees of belief in the prior information (very strong in the first case, slight in the second one). An example of a possible sample path of the load sequence with these parameters (and V 0 = 0), simulated by means of the “normrnd” function of MATLAB®, is illustrated in Fig. 1 over a time interval covering 500 h of system operation (the corresponding series of LCCT values will be shown in Fig. 2, darker curve).

Fig. 1 Example of load pattern {L k }, generated by the DLM (65) with parameters: W = 0.05, S = 0.10, and initial variance V 0 = 0
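An equivalent simulation can be sketched with numpy instead of MATLAB® (a purely illustrative transcription under the stated assumptions, here with V 0 = 0 so that L 0 starts at its mean value):

```python
import numpy as np

rng = np.random.default_rng(seed=1)
N, W, S = 500, 0.05, 0.10                 # horizon and SDs as in Fig. 1
mu_L0, a, b = 0.8750, 1.7242, 4.1774

L = np.empty(N)
L[0] = mu_L0                              # V_0 = 0: deterministic start
for k in range(N - 1):
    L[k + 1] = L[k] + rng.normal(0.0, W)  # system equation (65)
Y = L + rng.normal(0.0, S, size=N)        # observation equation (65)

X, Z = a - b * L, a - b * Y               # true and observed LCCT, Eq. (48)
```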

Fig. 2 Example of LCCT pattern {X k } and its estimated values, generated by a DLM as in Sect. 1.2 in chapter titled Transmission Expansion Planning: A Methodology to Include Security Criteria and Uncertainties Using Optimization Techniques, with parameters: W = 0.05, S = 0.10, and initial variance V 0 = 0

To each load sequence generated by a DLM there corresponds, as discussed above, an LCCT sequence of X k values, also constituting a DLM, which is estimated by the recursive approach through the values X k °. In Fig. 2, for a sample path of N = 120 time values, the sequence of LCCT values and that of its estimated values are shown, obtained with the same values of W, S, V 0 as in Fig. 1.

The efficiency of the estimation method is evaluated using extensive Monte Carlo simulations [31]. In particular, the model performance has been summarized for any given set of time instants (t 1, t 2,…, t N ) by the average squared error (ASE)Footnote 9:

$$ {\text{ASE}} = \frac{1}{N}\sum\limits_{j = 1}^{N} {\left( {\zeta_{j}^{^\circ } - \, \zeta_{j} } \right)^{2} \, } $$
(68)

in which ζ j is the quantity to be estimated (in this case, the LCCT X j at any given instant t j ) and ζ° j is its estimate. In practice, given the length N of a sequence (here N = 120 is chosen), M simulated sequences have been generated by the same algorithm, and the average of the squared error values obtained has been reported as a sample estimate of the “true” squared error. Extensive simulations were based on a number M = 10⁴ of replications for each simulated trial; only a significant subset of the relevant results is reported in the following.

The ASE obtained using the traditional ML method has also been evaluated, since the ML estimator at time k is equal—as is well known from estimation theory for the mean of a normal RV—to the sample mean of the k values Z j (j = 1,…, k) observed so far. The precision (as measured by the relative bias and the maximum relative estimation error) of the dynamic Bayes estimator of the LCCT has also been verified. The basic statistics—estimated at the end of each simulation case study—which describe the efficiency of the proposed estimates and which will be reported below are:

  • ASEB: average squared error of the Bayes estimator;

  • ASEL: average squared error of the ML estimator;

  • ARE = ASEL/ASEB: average relative efficiency of the Bayes estimator compared with the ML estimator.

The ARE ratio—which is, in a sense, the dynamical counterpart of the classic “relative efficiency” of the Bayes estimator compared with the ML estimator used for a “static” parameter—is indeed a synthetic measure of the efficiency of the estimation method. The more the ARE value exceeds unity, the more efficient the Bayes estimate is when compared with the ML estimate.
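A minimal sketch of one replication of this comparison (assuming the true path X, the observations Z and the recursive Bayes estimates are produced as in the earlier sketches) is:

```python
import numpy as np

def are_one_replication(X, Z, X_hat):
    """ASEB, ASEL and ARE for one simulated sequence, Eq. (68).

    X : true LCCT path; Z : noisy observations; X_hat : recursive Bayes
    estimates. The ML estimate at step k is the sample mean of z_1..z_k.
    """
    ml_hat = np.cumsum(Z) / np.arange(1, len(Z) + 1)
    aseb = np.mean((np.asarray(X_hat) - X) ** 2)
    asel = np.mean((ml_hat - X) ** 2)
    return aseb, asel, asel / aseb
```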

Many different combinations of values of the model parameters (W, S, V 0) have been adopted to explore the estimation performance. In the following, results will be reported for the eight combinations of the triplet (W, S, V 0) indicated below:

  1. (0.05, 0.10, 0)

  2. (0.05, 0.10, 1)

  3. (0.10, 0.05, 0)

  4. (0.10, 0.05, 1)

  5. (0.025, 0.05, 0)

  6. (0.025, 0.05, 1)

  7. (0.05, 0.025, 0)

  8. (0.05, 0.025, 1)

It is recalled that the above quantities are the SDs describing uncertainty in the system load model (i.e. the equations in L k and Y k from which the ones for the LCCT are derived). The SDs of the LCCT dynamic model, s and w, are larger than the corresponding load model parameter values (W, S) indicated here, since s = bS and w = bW, with b = 4.1774, as reported above.

It is seen that in the first four cases the values (0.05, 0.10) or (0.10, 0.05) are used for S and W, and every combination is obtained from the previous one by changing the value of the initial variance from V 0 = 0 to V 0 = 1, or by swapping the values of S and W. For instance, in cases 3 and 4, the values of W and S are swapped in relation to cases 1 and 2. In cases 1 and 3, V 0 = 0 was chosen; in cases 2 and 4, V 0 = 1 was chosen.

An analogous method has been used to form combinations (5)–(8), by using the values (0.025, 0.05) for (W, S), i.e. half of the values (0.05, 0.10) used in the first four combinations. It is noticed that the SD values of the first four combinations may be too high (particularly because of the value 0.1 for S or W), especially for VST applications; indeed, some unrealistic values have been obtained in the course of the simulations for the LCCT (and, thus, for the IP). These combinations have been reported here only to show that the estimation procedure works quite well even in such unrealistic cases, in which high SD values could imply high estimation errors.

Indeed, it is observed that—as typically occurs—the different choices do not appreciably affect the performance of the methodology.

Out of the many numerical simulations which have been performed, the most significant have been reported in the two tables of this section, Table 1 being relevant to the first four combinations, Table 2 relevant to the other four combinations.

Table 1 Some results of the estimation performances with different combination of values of the (load) model parameters, with (W, S) = (0.05, 0.10) or (0.10, 0.05)
Table 2 Some results related to estimation performances with other different combination of values of the (load) model parameters with (W, S) = (0.025, 0.05) or (0.05, 0.025)

For each case, the results of three different simulations (proofs), amongst all those performed, are reported. In particular: proof #1 is—for any given sample size—the one with the “worst” results (i.e. the one in which the ARE attains the lowest observed value); proof #3 is the one with the “best” results (i.e. the one in which the ARE attains the highest observed value); proof #2 gives the average results for the ARE, thus being intermediate between proofs #1 and #3. So, a total number of 12 proofs is shown in each table. For example, in Table 1, case 1.1 (with ARE = 3.4485) precedes case 1.2 (with ARE = 7.2570). The three cases 1.1, 1.2 and 1.3 are all relevant to the same combination of values of (W, S, V 0), i.e. (0.05, 0.10, 0), the first of the eight combinations reported above.

One of the results in Table 1 (case 1.2) is also displayed in Fig. 2, already mentioned.

As a general comment on the above results, it is noticed (by looking at the ASEB values) that the Bayes estimation errors are per se reasonably limited. Moreover, the relative performance with respect to the ML estimate—as measured by the ARE index—is always much greater than 1. To evaluate the precision of the estimates, other significant quantities have also been evaluated, such as the average and maximum relative errors of the Bayes estimates, with similar results.

It must also be remarked that, even in the “worst” cases (e.g. cases 1.1, 2.1, etc. in both tables), the ARE index is always greater than 1. Indeed, the reported results point out the efficiency of the proposed Bayesian approach, even in the case of “unrealistically” high SD values such as the ones in Table 1.

Finally, it has already been reported that the SDs of the LCCT dynamic model, s and w, are relatively large in the application illustrated here. This could imply relatively large estimation errors, whereas the reported results show that these errors are very limited, a fact which confirms the efficiency of the estimation procedure.

For the sake of brevity, the procedure for obtaining the BCI is briefly illustrated in Appendix 3, with reference to the IP estimation.

The above good performance of the Bayes estimates with respect to the ML ones is consistent with Bayesian statistical theory (and, to a certain extent, to be expected on theoretical grounds), as long as the Bayes estimates are evaluated assuming the “right” a priori distribution of the load, i.e. the one actually used in performing the simulation of the random samples. It is therefore very opportune to assess, as mentioned in the introduction, the robustness of the proposed methodology when the “a priori” hypotheses about such a distribution are not valid. This is the object of the following section.

7 Some Numerical Robustness Analyses

Finally, a simple “robustness” analysis of the proposed methodology has also been performed: with respect to the prior distribution of the initial load (see Sect. 7.1), and with respect to the prior distribution of the system load random errors (see Sect. 7.2). For this purpose, many simulations (more than those shown here) have been carried out, assuming different prior pdfs (Extreme Value, Log-Normal, Uniform and others) for the initial load value L 0 or for the system equation errors, instead of the Gaussian ones assumed in the calculations.

7.1 A Numerical Robustness Analysis with Respect to the Initial Load Value Prior Distribution

First, some results relevant to a robustness analysis with respect to the prior distribution of the initial load value are reported. Since this kind of robustness is generally well established and accepted, for brevity only four cases are presented in each table, corresponding to the mean values of ASEB, ASEL and ARE. As in the previous section, the first table is relevant to the SD values (W, S) = (0.05, 0.10) or (0.10, 0.05); the second table is relevant to the SD values (W, S) = (0.025, 0.05) or (0.05, 0.025).

In Tables 3 and 4, some significant results relevant to the Uniform pdf as a prior pdf for L 0 are reported, with satisfactory results which were indeed expected, given the well-known robustness properties of the Kalman filter. In the previous section, it was assumed that the starting value of the system load, L 0, measured in p.u., is a Gaussian RV with mean \( \mu_{{L_{0} }} = 0.8750\;{\text{p}} . {\text{u}} . \) and SD \( \sigma_{{L_{0} }} = 0.0417\;{\text{p}} . {\text{u}} . \) In the present case, the values of L 0 were generated, in each simulation trial, according to a Uniform prior pdf on an interval (a L , b L ) with the same mean and SD, so that: a L  = 0.8028, b L  = 0.9472.
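These endpoints follow from matching the first two moments of the Uniform law—whose mean is (a L  + b L )/2 and whose SD is (b L  − a L )/(2√3)—to those of the Gaussian prior:

$$ a_{L} = \mu_{{L_{0} }} - \sqrt 3 \,\sigma_{{L_{0} }} = 0.8750 - \sqrt 3 \times 0.0417 \cong 0.8028;\quad b_{L} = \mu_{{L_{0} }} + \sqrt 3 \,\sigma_{{L_{0} }} \cong 0.9472 $$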

Table 3 Mean values of ASEB and ASEL, and the corresponding ARE values, related to a robustness analysis with respect to a Uniform prior pdf for the initial load value; table relevant to the SD values (W, S) = (0.05, 0.10) or (0.10, 0.05)
Table 4 Mean values of ASEB and ASEL, and the corresponding ARE values, related to a robustness analysis with respect to a Uniform prior pdf for the initial load value; table relevant to the SD values (W, S) = (0.025, 0.05) or (0.05, 0.025)

Tables 3 and 4 report some results (the mean values of ASEB and ASEL obtained over all the simulations, and the corresponding ARE values) related to a robustness analysis of the proposed methodology with respect to a Uniform prior pdf for the initial load value L 0, instead of the Gaussian one assumed in the calculations.

The calculations were performed as if the Gaussian model, which is a basic assumption of the procedure, were the “true” generating model, whilst in fact the Uniform model was the true one.

The results of this robustness analysis—and many other simulation results with different prior pdfs, not shown here—still confirm the adequacy of the estimation procedure.

7.2 A Numerical Robustness Analysis with Respect to System Equation Random Errors Distribution

In addition to the previous analyses, a similar robustness analysis has also been performed with respect to the pdf of the random errors, assumed this time to be a Uniform or an EV distribution instead of a Gaussian one, with the same mean and variance (it is recalled that these parameters are assumed known). Since the pdfs refer to error distributions, they all have zero mean.

By the term “Extreme Value” model, the “Largest Extreme Value” model is meant, characterized by the following cdf:

$$ F(t;\chi ,\delta ) = \exp \left\{ { - \exp \left[ { - (t - \chi )/\delta } \right]} \right\}\left( { - \infty < t < + \infty } \right) $$
(69)

with parameters: χ real, δ positive.

As already mentioned at the end of Sect. 2, the EV distribution is another natural candidate for the probabilistic description of the load, if interest is focussed on the peak load value [20]. It is indeed the most suitable for large time horizons, but it has already been shown (see Sect. 2.3) that a rigorous approach should refer to peak load values, independently of the width of the time horizon. Here, however, for the purpose of a robustness analysis, the EV model has been adopted principally as an interesting alternative, since its pdf has a very different shape from the Gaussian one.
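As a side remark (ours, not explicitly stated in the text), the zero-mean, matched-variance convention fixes the EV parameters: since the law (69) has mean χ + γδ—γ ≅ 0.5772 being the Euler–Mascheroni constant—and variance π²δ²/6, a system-error SD equal to W would imply:

$$ \delta = \frac{\sqrt 6 }{\pi }\,W;\quad \chi = - \gamma \,\delta $$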

These results too, shown in Tables 5 through 8—where they are reported in the same format as those in Sect. 6—confirm the robustness of the estimation. Figure 3 refers to a typical case among those in Table 5.

Table 5 Some results related to a robustness analysis of the proposed methodology with respect to a Uniform error pdf for the system model of load sequence L k instead of the Gaussian one assumed for the calculations
Table 6 Some results related to a robustness analysis of the proposed methodology with respect to a Uniform error pdf for the system model
Table 7 Some results related to a robustness analysis of the proposed methodology with respect to an EV error pdf for the system model of load sequence L k instead of the Gaussian one assumed for the calculations
Fig. 3 Curve of the LCCT and its estimates, relevant to a robustness analysis with respect to a Uniform error pdf for the system model of the load sequence L k , with parameters: W = 0.05, S = 0.10, V 0 = 0 (1st row of Table 5)

All the above results show that the performance of the Bayes estimates is always scarcely sensitive to the assumed prior distribution, or even to the model distribution. In practice, the ASEB values are not significantly changed with respect to the previous cases, and it must be remarked that, even in the “worst” cases, the “efficiency” index ARE is always greater than 1. Similar results were also obtained for different model parameter values—i.e. different values of (W, S, V 0)—and for different random error pdfs. So, the Bayes estimator appears to be robust with respect to “wrong” prior assumptions.

Table 8 Some results related to a robustness analysis of the proposed methodology with respect to an EV error pdf for the system

7.3 A Final Comment

The authors believe that, although deduced under simple hypotheses and/or system models, the above results—along with those already established in the field of power system or component reliability studies [32, 33]—could encourage new advanced applications of Bayesian inference in power system analysis. Its use is indeed not yet widespread in stability or security studies, although a Bayesian classifier has recently been proposed for power system probabilistic security assessment [34]. Further developments of Bayes applications in the field of stability will certainly require advanced computational tools, which are nowadays increasing in number and efficiency, as recently illustrated in [35].