1 Introduction

A neural network with basis functions that remain invariant under the Fourier transform is used for fault diagnosis of nonlinear systems. The proposed neural model follows the concept of wavelet networks (Zhang and Benveniste 1993; Addison 2002; Karimi and Lohmann 2006). By employing Gauss-Hermite activation functions which are localized both in space and frequency, the neural network allows better approximation of the multi-frequency characteristics of the monitored system (Cannon and Slotine 1995; Bernard and Slotine 1997; Krzyzak and Sasiadek 1991; Lin 2006; Sureshbabu and Farell 1999). Gauss-Hermite basis functions have some interesting properties (Refregier 2003; Rigatos 2006): (i) they remain almost unchanged by the Fourier transform and satisfy an orthogonality property, which means that the weights of the associated neural network demonstrate the energy which is distributed to the various eigenmodes of the nonlinear system’s dynamics, (ii) unlike wavelet basis functions the Gauss-Hermite basis functions have a clear physical meaning since they represent the solutions of differential equations describing stochastic oscillators (see Rigatos and Tzafestas 2006) and each neuron can be regarded as the frequency filter of the respective eigenfrequency.

The concept of the proposed Fault Detection and Isolation (FDI) method is as follows: a neural network with Gauss-Hermite polynomial activation functions is used to approximate the nonlinear system’s dynamics from a set of input-output data. The output of the neural network thus provides a series expansion that takes the form of a weighted sum of Gauss-Hermite basis functions. Because the Gauss-Hermite basis functions are orthogonal and remain unchanged under the Fourier transform, subject only to a change of scale, the considered neural network provides a spectral analysis of the output of the monitored system. Specifically, the squares of the output-layer weights describe how the energy of the monitored signal is distributed among the associated spectral components and, owing to orthogonality, the sum of the squared output-layer weights is a measure of the energy contained in the output of the monitored system. By observing changes in the amplitude of these spectral components one obtains an indication of malfunctioning of the monitored system and can detect the existence of failures. Moreover, since specific faults are associated with amplitude changes of specific spectral components of the system, fault isolation can also be performed.

The proposed Fault Detection and Isolation method can be applied to several electromechanical systems, e.g. vehicles (Ippolito et al. 2003), rotating machines (Basseville and Nikiforov 1993; Zhang et al. 1994, 1998), ac motors (Rigatos and Siano 2011a, b, 2012; Grelle et al. 2006), power generators and other components of the power grid (Galdi et al. 2008, 2009; Rigatos et al. 2009, 2012a). From the application point of view, the fault diagnosis approach presented in this paper complements the condition monitoring methods for components of the power grid which were developed and analyzed in (Rigatos et al. 2009, 2012a, b; Piccolo et al. 2010; Rigatos and Siano 2012). In this research work, as a first case study, the problem of fault diagnosis of electric power transformers is considered. Significant information about the thermal condition of oil-immersed power transformers and about their ageing and failure risks can be obtained through monitoring of the transformer’s Hot Spot Temperature (HST) (Rigatos et al. 2012b; Piccolo et al. 2010; Rigatos and Siano 2012; Galdi et al. 2000; Ippolito and Siano 2004). The HST can be measured by placing sensors at a specific point of the mineral oil volume, which serves as both the insulating and the cooling medium of the transformer. A deviation of the HST from the anticipated temperature profile is probably an indication of ageing of the transformer or, in some cases, of pre-failure situations. By modeling the HST variations with a neural network that contains Gauss-Hermite polynomial basis functions one obtains (i) a numerical model that associates the HST with other parameters of the power transformer such as ambient temperature, top oil temperature and load current, (ii) indications about the spectral characteristics of the HST signal and the distribution of its energy content to the various spectral components associated with the basis functions.
By recording changes in the amplitude of these spectral components one can detect the existence of failures in the power transformer and can identify which are the components of the transformer that are responsible for malfunctioning. As a second case study the problem of fault diagnosis of the doubly-fed induction generator has been examined (Galdi et al. 2008, 2009; Rigatos et al. 2009, 2012a, b; Piccolo et al. 2010; Rigatos and Siano 2012). The dynamics of the rotor current has been modeled with the use of a Gauss-Hermite neural network and the associated spectral components have been obtained. Variation in the energy spectrum of the rotor’s current provided again information about the existence of failures and about the association of faults with specific components of the turbine-generator system.

The structure of the paper is as follows: in Sect. 2 feed-forward neural networks are analyzed and their use in nonlinear systems modeling is explained. In Sect. 3 feed-forward neural networks with Gauss-Hermite activation functions are introduced and their distinctive properties are explained such as orthogonality of the basis functions and invariance to Fourier transform. In Sect. 4 basic principles of signals spectral analysis are presented and the use of Fourier transform in calculating a signal’s energy content and power spectral density is explained. In Sect. 5 the use of neural networks with Gauss-Hermite basis functions in modeling the thermal dynamics of electric power transformers is explained. It is shown how these neural networks enable spectral analysis of the HST signal in power transformers and how based on the spectral content of these signals one can perform fault detection and isolation. In Sect. 6 the use of neural networks with Gauss-Hermite basis functions in modeling the rotor current dynamics of doubly-fed induction generators is explained. It is shown how the Gauss-Hermite neural networks enable spectral analysis of the rotor’s current signal in electric power generators and how the associated spectral content can be used for fault diagnosis. Finally, in Sect. 7 concluding remarks are stated.

2 Feed-forward neural networks for nonlinear systems modelling

The proposed fault diagnosis approach for nonlinear dynamical systems can be implemented with the use of feed-forward neural networks. The idea of function approximation with feed-forward neural networks (FNN) comes from generalized Fourier series. It is known that any function ψ(x) in an \(L^2\) space can be expanded, using generalized Fourier series, in a given orthonormal basis, i.e.

$$ \psi(x)={\sum\limits_{k=1}^{\infty}}{c_k}{\psi_k(x)},\quad a \leq x \leq b $$
(1)

Truncation of the series yields the sum

$$ S_M(x)={\sum\limits_{k=1}^M}{a_k}{\psi_k(x)} $$
(2)

If the coefficients \(a_k\) are taken to be equal to the generalized Fourier coefficients, i.e. when \(a_k=c_k={\int_a^b}\psi(x){\psi_k(x)}dx,\) then Eq. (2) is a mean square optimal approximation of ψ(x).

Unlike generalized Fourier series, in FNN the basis functions are not necessarily orthogonal. The hidden units in a FNN usually have the same activation functions and are often selected as sigmoidal functions or Gaussians. A typical feed-forward neural network consists of n inputs \(x_i, i=1,2,\ldots,n,\) a hidden layer of m neurons with activation function \(h: R \rightarrow R\) and a single output unit (see Fig. 1a). The FNN’s output is given by

$$ \hat{\psi}(x)={\sum\limits_{j=1}^m}{c_j}h \left({\sum\limits_{i=1}^n}{w_{ji}}{x_i}+{b_j}\right) $$
(3)

The root mean square error in the approximation of function ψ(x) by the FNN is given by

$$ E_{RMS}=\sqrt{{1 \over N}{\sum_{k=1}^N}({\psi(x^k)}-\hat{\psi}(x^k))^2} $$
(4)

where \(x^k=[x_1^k,x_2^k,\ldots,x_n^k]\) is the k-th input vector of the neural network. The activation function is usually a sigmoidal function \(h(x)={1 \over {1+e^{-x}}}\) while in the case of radial basis functions networks it is a Gaussian (Haykin 1994). Several learning algorithms for neural networks have been studied. The objective of all these algorithms is to find numerical values for the network’s weights so as to minimize the root mean square error \(E_{RMS}\) of Eq. (4). The algorithms are usually based on first and second order gradient techniques. These algorithms belong to: (i) batch-mode learning, where the outputs of a large training set are accumulated and the mean square error is calculated before the parameters are updated (back-propagation algorithm, Gauss-Newton method, Levenberg-Marquardt method, etc.), (ii) pattern-mode learning, in which training examples are run in cycles and the parameters update is carried out each time a new datum appears (Extended Kalman Filter algorithm) (Rigatos and Zhang 2009).
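As an illustration of Eqs. (3) and (4), the following minimal sketch evaluates a single-hidden-layer FNN with sigmoidal activation and computes the root mean square error. The array shapes, the random weights and the function names `fnn_output` and `rmse` are assumptions made for this example only and are not taken from the paper.

```python
import numpy as np

def sigmoid(z):
    """Sigmoidal activation h(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def fnn_output(X, W, b, c):
    """FNN output in the form of Eq. (3), with the sum over the m hidden neurons.

    X: (N, n) input vectors, W: (m, n) hidden weights,
    b: (m,) biases, c: (m,) output-layer weights.
    """
    return sigmoid(X @ W.T + b) @ c

def rmse(y_true, y_pred):
    """Root mean square error of Eq. (4)."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Illustrative (hypothetical) dimensions and random parameters
rng = np.random.default_rng(0)
n, m, N = 3, 5, 8
X = rng.normal(size=(N, n))
W = rng.normal(size=(m, n))
b = rng.normal(size=m)
c = rng.normal(size=m)
y_hat = fnn_output(X, W, b, c)
```

In a training loop, `rmse` would be evaluated between the measured outputs and `y_hat` after each parameter update.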

Fig. 1 a Feed-forward neural network, b neural network with Gauss-Hermite basis functions

Unlike conventional FNN with sigmoidal or Gaussian basis functions, Hermite polynomial-based FNN remain closer to Fourier series expansions by employing activation functions which satisfy the property of orthogonality (Zuo et al. 2009). Other basis functions with the property of orthogonality are Hermite, Legendre, Chebyshev, and Volterra polynomials (Refregier 2003; Rigatos 2006; Yang and Cheng 1996).

3 Neural networks using Hermite activation functions

3.1 The Gauss-Hermite series expansion

Next, Gauss-Hermite activation functions are considered as orthogonal basis functions of the feed-forward neural network. These are the spatial components \(X_k(x)\) of the solution of Schrödinger’s differential equation and describe a stochastic oscillation:

$$ X_k(x)=H_k(x)e^{-{x^2} \over 2}, \quad k=0,1,2,\ldots $$
(5)

where \(H_k(x)\) are the Hermite polynomials (Fig. 2). The Gauss-Hermite functions \(X_k(x)\) are the eigenstates of the quantum harmonic oscillator. The general relation for the Hermite polynomials is

$$ H_k(x)=(-1)^k{e^{x^2}}{d^{(k)} \over dx^{(k)}}{e^{-x^2}} $$
(6)

According to Eq. (6) the first five Hermite polynomials are:

$$ H_0(x)=1,\quad H_1(x)=2x,\quad H_2(x)=4x^2-2,\quad H_3(x)=8x^3-12x,\quad H_4(x)=16x^4-48x^2+12 $$
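The polynomials listed above can be reproduced with the standard three-term recurrence \(H_{n+1}(x)=2xH_n(x)-2nH_{n-1}(x)\), which is equivalent to Eq. (6). The sketch below is only a check of the listed coefficients; the helper name `hermite_coeffs` is an assumption of this example, and the result is compared against NumPy's `herm2poly`.

```python
import numpy as np
from numpy.polynomial import hermite as H

def hermite_coeffs(k):
    """Power-series coefficients (lowest degree first) of H_k(x),
    built with the recurrence H_{n+1} = 2x H_n - 2n H_{n-1}."""
    prev, cur = np.array([1.0]), np.array([0.0, 2.0])  # H_0 and H_1
    if k == 0:
        return prev
    for n in range(1, k):
        nxt = np.zeros(n + 2)
        nxt[1:] += 2.0 * cur        # contribution of 2x * H_n
        nxt[:n] -= 2.0 * n * prev   # contribution of -2n * H_{n-1}
        prev, cur = cur, nxt
    return cur

# Coefficients of H_0 ... H_4, lowest degree first
first_five = [hermite_coeffs(k) for k in range(5)]
```

For example, `hermite_coeffs(4)` returns the coefficients of \(16x^4-48x^2+12\) in ascending order of degree.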

It is known that Hermite polynomials are orthogonal, i.e. it holds

$$ {\int\limits_{-\infty}^{+\infty}}e^{-x^2}{H_m(x)}{H_k(x)}dx= \left\{ \begin{array}{ll} {2^k}{k!}\sqrt{\pi} & if\, m=k \\ 0 & if\, m\,{\neq}\,k \end{array} \right. $$
(7)

Using Eq. (7), the following basis functions can be defined (Refregier 2003):

$$ \psi_k(x)=[{2^k}{\pi^{1 \over 2}}{k!}]^{-{1 \over 2}}{H_k(x)}e^{-{x^2 \over 2}} $$
(8)

where \(H_k(x)\) is the associated Hermite polynomial. From Eq. (7), the orthonormality of the basis functions of Eq. (8) follows, i.e.

$$ {\int\limits_{-\infty}^{+\infty}}{{\psi_m}(x)}{\psi_k(x)}dx= \left\{ \begin{array}{ll} 1 & if \, m=k \\ 0 & if \, m\,{\neq}\,k \end{array} \right. $$
(9)

Fig. 2 a First five one-dimensional Hermite basis functions, b analytical representation of the 1D Hermite basis function
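The orthonormality relation of Eq. (9) can be checked numerically. The sketch below is an illustration and not part of the original method: it uses Gauss-Hermite quadrature, which integrates \(e^{-x^2}p(x)\) exactly for polynomials p of sufficiently low degree, to build the Gram matrix of the first M basis functions of Eq. (8). The function name `gram_matrix` and the number of quadrature nodes are assumptions of this example.

```python
import math
import numpy as np
from numpy.polynomial.hermite import hermgauss, hermval

def gram_matrix(M, n_nodes=40):
    """Matrix of inner products <psi_m, psi_k>, m, k = 0..M-1, for Eq. (8).

    Since psi_m * psi_k = exp(-x^2) H_m H_k / (norm_m norm_k), the integral
    is a weighted polynomial integral handled by Gauss-Hermite quadrature.
    """
    x, w = hermgauss(n_nodes)  # nodes and weights for weight exp(-x^2)
    Hk = np.array([hermval(x, [0] * k + [1]) for k in range(M)])
    norms = np.array([math.sqrt(2.0 ** k * math.factorial(k) * math.sqrt(math.pi))
                      for k in range(M)])
    P = Hk / norms[:, None]
    return (P * w) @ P.T

G = gram_matrix(6)  # expected to be close to the 6x6 identity matrix
```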

Moreover, to achieve multi-resolution analysis, the Gauss-Hermite basis functions of Eq. (8) are rescaled using the scale coefficient α. Thus the following basis functions are derived (Refregier 2003)

$$ \beta_k(x,\alpha)=\alpha^{-{1 \over 2}}{\psi_k}(\alpha^{-1}x) $$
(10)

which also satisfy the orthogonality condition

$$ {\int\limits_{-\infty}^{+\infty}}{{\beta_m}(x,\alpha)}{\beta_k}{(x,\alpha)}dx= \left\{ \begin{array}{ll} 1 & if \, m=k \\ 0 & if\, m\,{\neq}\, k \end{array} \right. $$
(11)

Any square-integrable function \(f(x),\, x \in R\) can be written as a weighted sum of the above orthogonal basis functions, i.e.

$$ f(x)={\sum\limits_{k=0}^{\infty}}{c_k}{\beta_k}(x,\alpha) $$
(12)

where the coefficients \(c_k\) are calculated using the orthogonality condition

$$ c_k={\int\limits_{-\infty}^{+\infty}}f(x){\beta_k(x,\alpha)}dx $$
(13)
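Equations (10), (12) and (13) can be illustrated with a short numerical sketch. The test signal, the scale α = 0.8, the integration grid and the helper names (`psi`, `beta`, `gh_expand`) are assumptions chosen for this example; the integral of Eq. (13) is approximated by a simple Riemann sum on a uniform grid.

```python
import math
import numpy as np

def hermite_poly(k, x):
    """Hermite polynomial H_k(x) via H_{n+1} = 2x H_n - 2n H_{n-1}."""
    x = np.asarray(x, dtype=float)
    h_prev, h = np.ones_like(x), 2.0 * x
    if k == 0:
        return h_prev
    for n in range(1, k):
        h_prev, h = h, 2.0 * x * h - 2.0 * n * h_prev
    return h

def psi(k, x):
    """Orthonormal Gauss-Hermite basis function of Eq. (8)."""
    norm = math.sqrt(2.0 ** k * math.factorial(k) * math.sqrt(math.pi))
    return hermite_poly(k, x) * np.exp(-np.asarray(x, dtype=float) ** 2 / 2.0) / norm

def beta(k, x, alpha):
    """Scaled basis function of Eq. (10)."""
    return psi(k, np.asarray(x, dtype=float) / alpha) / math.sqrt(alpha)

def gh_expand(f_vals, x, alpha, M):
    """Coefficients c_0..c_{M-1} of Eq. (13) and the M-term sum of Eq. (12)."""
    dx = x[1] - x[0]
    basis = np.array([beta(k, x, alpha) for k in range(M)])  # shape (M, len(x))
    c = basis @ f_vals * dx
    return c, c @ basis

x = np.linspace(-12.0, 12.0, 4801)
f_vals = np.exp(-(x - 1.0) ** 2)  # illustrative test signal (assumption)
alpha = 0.8                       # illustrative scale coefficient (assumption)
_, f5 = gh_expand(f_vals, x, alpha, 5)
_, f20 = gh_expand(f_vals, x, alpha, 20)
err5 = float(np.max(np.abs(f_vals - f5)))
err20 = float(np.max(np.abs(f_vals - f20)))
```

Comparing `err5` with `err20` shows how the approximation error shrinks as more terms of the series are retained.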

Assuming now that, instead of the infinite number of terms in the expansion of Eq. (12), only M terms are retained, an approximation of f(x) is obtained. The expansion of f(x) using Eq. (12) is a Gauss-Hermite series. Equation (12) is a form of generalized Fourier expansion for f(x), and its Fourier transform has the same form as Eq. (12) itself, subject only to a scale change. Indeed, the Fourier transform of f(x) is given by

$$ F(s)={1 \over {2\pi}}{\int\limits_{-\infty}^{+\infty}}f(x)e^{-jsx}dx \Rightarrow f(x)={1 \over {2\pi}}{\int\limits_{-\infty}^{+\infty}}F(s)e^{jsx}ds $$
(14)

The Fourier transform of the basis function \(\psi_k(x)\) of Eq. (8) satisfies (Refregier 2003)

$$ \Uppsi_k(s)={j^k}{\psi_k}(s) $$
(15)

while for the basis functions \(\beta_k(x,\alpha)\) with scale coefficient α it holds that

$$ B_k(s,\alpha)={j^k}{\beta_k}(s,\alpha^{-1}) $$
(16)

Therefore, it holds

$$ f(x)={\sum\limits_{k=0}^{\infty}}{c_k}{\beta_k}(x,\alpha) \; \stackrel{F}{\longrightarrow} \; F(s)={\sum\limits_{k=0}^{\infty}}{c_k}{j^k}{\beta_k}(s,\alpha^{-1}) $$
(17)

which means that the Fourier transform of Eq. (12) is the same as the initial function, subject only to a change of scale. The structure of a feed-forward neural network with Hermite basis functions is depicted in Fig. 1b.
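The invariance relations of Eqs. (15) and (16) can be verified numerically. The sketch below assumes the unitary Fourier convention \(F(s)=(2\pi)^{-1/2}\int f(x)e^{jsx}dx\); this normalization and kernel sign are assumptions of the example, and other conventions change the result by a constant factor or by the sign of j. The integral is approximated by a Riemann sum on a uniform grid, and all helper names are hypothetical.

```python
import math
import numpy as np

def hermite_poly(k, x):
    """Hermite polynomial H_k(x) via H_{n+1} = 2x H_n - 2n H_{n-1}."""
    x = np.asarray(x, dtype=float)
    h_prev, h = np.ones_like(x), 2.0 * x
    if k == 0:
        return h_prev
    for n in range(1, k):
        h_prev, h = h, 2.0 * x * h - 2.0 * n * h_prev
    return h

def psi(k, x):
    """Orthonormal Gauss-Hermite basis function of Eq. (8)."""
    norm = math.sqrt(2.0 ** k * math.factorial(k) * math.sqrt(math.pi))
    return hermite_poly(k, x) * np.exp(-np.asarray(x, dtype=float) ** 2 / 2.0) / norm

def beta(k, x, alpha):
    """Scaled basis function of Eq. (10)."""
    return psi(k, np.asarray(x, dtype=float) / alpha) / math.sqrt(alpha)

x = np.linspace(-30.0, 30.0, 6001)
dx = x[1] - x[0]

def unitary_ft(f_vals, s):
    """Assumed convention: F(s) = (2*pi)^(-1/2) * integral of f(x) exp(+j s x) dx."""
    return np.sum(f_vals * np.exp(1j * s * x)) * dx / math.sqrt(2.0 * math.pi)

s_vals = np.linspace(-3.0, 3.0, 13)
alpha = 2.0  # illustrative scale coefficient (assumption)

# Eq. (15): the transform of psi_k equals j^k psi_k
err_unscaled = max(abs(unitary_ft(psi(k, x), s) - (1j ** k) * psi(k, s))
                   for k in range(5) for s in s_vals)
# Eq. (16): the transform of beta_k(., alpha) equals j^k beta_k(., 1/alpha)
err_scaled = max(abs(unitary_ft(beta(k, x, alpha), s) - (1j ** k) * beta(k, s, 1.0 / alpha))
                 for k in range(5) for s in s_vals)
```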

3.2 Neural networks using 2D Hermite activation functions

Two-dimensional Hermite polynomial-based neural networks can be constructed by taking products of the one dimensional basis functions \(\beta_k(x,\alpha)\) of Eq. (10) (Refregier 2003). Thus, setting \(x=[x_1,x_2]^T\) and \(k=(k_1,k_2)\) one can define the following basis functions (Refregier 2003)

$$ B_k(x,\alpha)=\beta_{k_1}(x_1,\alpha)\beta_{k_2}(x_2,\alpha)={1 \over \alpha}\psi_{k_1}(\alpha^{-1}x_1)\psi_{k_2}(\alpha^{-1}x_2) $$
(18)

These two dimensional basis functions are again orthonormal, i.e. it holds

$$ {\int}{d^2x}B_{n}(x,\alpha)B_{m}(x,\alpha)=\delta_{{n_1}{m_1}}\delta_{{n_2}{m_2}} $$
(19)

The basis functions \(B_k(x,\alpha)\) are the eigenstates of the two dimensional harmonic oscillator and form a complete basis for integrable functions of two variables. A two dimensional function f(x) can thus be written in the series expansion:

$$ f(x)={\sum\limits_{k_1,k_2=0}^{\infty}}{c_k}B_k(x,\alpha) $$
(20)

The choice of an appropriate scale coefficient α and maximum order \(k_{max}\) is of practical interest. The coefficients \(c_k\) are given by

$$ c_k={\int}{d^2x}f(x)B_k(x,\alpha) $$
(21)

Indicative basis functions \(B_2(x,\alpha)\), \(B_6(x,\alpha)\), \(B_9(x,\alpha)\), \(B_{11}(x,\alpha)\), \(B_{13}(x,\alpha)\) and \(B_{15}(x,\alpha)\) of a 2D feed-forward neural network with Hermite basis functions are depicted in Figs. 3, 4, and 5. Following the same method, N-dimensional Hermite polynomial-based neural networks (N > 2) can be constructed. The associated high-dimensional Gauss-Hermite activation functions preserve the properties of orthogonality and invariance to Fourier transform.
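The orthonormality of the two-dimensional basis, Eq. (19), can be illustrated with a small numerical check. The sketch assumes that each 2D function is the plain product of two scaled 1D functions of Eq. (10), which is the normalization under which Eq. (19) holds; the grid, the value of α and the helper names are assumptions of this example.

```python
import math
import numpy as np

def hermite_poly(k, x):
    """Hermite polynomial H_k(x) via H_{n+1} = 2x H_n - 2n H_{n-1}."""
    x = np.asarray(x, dtype=float)
    h_prev, h = np.ones_like(x), 2.0 * x
    if k == 0:
        return h_prev
    for n in range(1, k):
        h_prev, h = h, 2.0 * x * h - 2.0 * n * h_prev
    return h

def psi(k, x):
    """Orthonormal Gauss-Hermite basis function of Eq. (8)."""
    norm = math.sqrt(2.0 ** k * math.factorial(k) * math.sqrt(math.pi))
    return hermite_poly(k, x) * np.exp(-np.asarray(x, dtype=float) ** 2 / 2.0) / norm

def beta(k, x, alpha):
    """Scaled 1D basis function of Eq. (10)."""
    return psi(k, np.asarray(x, dtype=float) / alpha) / math.sqrt(alpha)

def basis_2d(k1, k2, x1, x2, alpha):
    """2D basis function taken as a product of two scaled 1D functions (assumption)."""
    return beta(k1, x1, alpha) * beta(k2, x2, alpha)

alpha = 1.5  # illustrative scale coefficient (assumption)
g = np.linspace(-15.0, 15.0, 301)
dx = g[1] - g[0]
X1, X2 = np.meshgrid(g, g, indexing="ij")

def inner(n, m):
    """Riemann-sum approximation of the 2D inner product in Eq. (19)."""
    return float(np.sum(basis_2d(*n, X1, X2, alpha) * basis_2d(*m, X1, X2, alpha)) * dx * dx)

same = inner((1, 2), (1, 2))       # expected close to 1
different = inner((1, 2), (2, 1))  # expected close to 0
```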

Fig. 3 2D Hermite polynomial activation functions: a basis function \(B_2(x,\alpha)\), b basis function \(B_6(x,\alpha)\)

Fig. 4 2D Hermite polynomial activation functions: a basis function \(B_9(x,\alpha)\), b basis function \(B_{11}(x,\alpha)\)

Fig. 5 2D Hermite polynomial activation functions: a basis function \(B_{13}(x,\alpha)\), b basis function \(B_{15}(x,\alpha)\)

4 Signals power spectrum and the Fourier transform

4.1 Parseval’s theorem

To find the spectral density of a signal ψ(t) with the use of its Fourier transform \(\Uppsi(j{\omega}),\) the following definition is used:

$$E_{\psi}={\int\limits_{-\infty}^{+\infty}}{(\psi(t))^2}dt={1 \over {2\pi}}{\int\limits_{-\infty}^{+\infty}}\psi(t)\left({\int\limits_{-\infty}^{+\infty}}\Uppsi(j\omega)e^{j{{\omega}t}}d{\omega}\right)dt \quad \hbox{i.e.}\, E_{\psi}={1 \over {2\pi}}{\int_{-\infty}^{+\infty}}\Uppsi(j\omega)\Uppsi(-j\omega)d{\omega}$$
(22)

Assuming that ψ(t) is a real signal, it holds that \(\Uppsi(-j\omega)=\Uppsi^{*}(j\omega)\), i.e. the complex conjugate of the signal’s Fourier transform. Using this in Eq. (22) one obtains

$$E_{\psi}={1 \over {2\pi}}{\int\limits_{-\infty}^{+\infty}}\Uppsi(j\omega)\Uppsi^{*}(j\omega)d\omega \;{\rm or}\; E_{\psi}={1 \over {2\pi}}{\int\limits_{-\infty}^{+\infty}}|\Uppsi(j\omega)|^2{d\omega}$$
(23)

This means that the energy of the signal is equal to \({1 \over {2\pi}}\) times the integral over frequency of the square of the magnitude of the signal’s Fourier transform. This is Parseval’s theorem. The integrated term \(|\Uppsi(j\omega)|^2\) is the energy density per unit of frequency and has units of magnitude squared per Hertz.
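Parseval’s theorem in the form of Eq. (23) can be checked on sampled data. In the sketch below the continuous Fourier transform is approximated by the FFT multiplied by the sampling step, and the frequency integral by a sum over the FFT bins; the test signal and the sampling parameters are assumptions of this example.

```python
import numpy as np

dt = 0.01                                # sampling step (illustrative)
t = np.arange(-20.0, 20.0, dt)
sig = np.exp(-t ** 2) * np.cos(5.0 * t)  # real test signal (assumption)

# Time-domain energy: integral of psi(t)^2 dt
E_time = float(np.sum(sig ** 2) * dt)

# Frequency-domain energy: (1 / 2 pi) * integral of |Psi(j omega)|^2 d omega
Psi = np.fft.fft(sig) * dt                  # approximates Psi(j omega) up to a phase factor
d_omega = 2.0 * np.pi / (t.size * dt)       # spacing of the angular-frequency grid
E_freq = float(np.sum(np.abs(Psi) ** 2) * d_omega / (2.0 * np.pi))
```

The two values agree to rounding error, since the discrete form of Parseval’s theorem holds exactly for the FFT.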

4.2 Power spectrum of the signal using the Gauss-Hermite expansion

As shown in Eqs. (9) and (19) the Gauss-Hermite basis functions satisfy the orthogonality property, i.e. for these functions it holds

$$ {\int\limits_{-\infty}^{+\infty}}{{\psi_m}(x)}{\psi_k(x)}dx= \left\{ \begin{array}{ll} 1& if\, m=k \\ 0 & if\, m \neq k \end{array} \right. $$

Therefore, using the definition of the signal’s energy one has

$$ E={\int\limits_{-\infty}^{+\infty}}{(\psi(t))^2}dt={\int\limits_{-\infty}^{+\infty}}\left[{\sum\limits_{k=1}^N}{c_k}\psi_k(t)\right]^2 {dt} $$
(24)

and exploiting the orthogonality property one obtains

$$ E={\sum\limits_{k=1}^N}{c_k^2}$$
(25)
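The step from Eq. (24) to Eq. (25) can be illustrated numerically: a signal is synthesized from a few Gauss-Hermite basis functions with known coefficients, and its energy computed by quadrature is compared with the sum of the squared coefficients. The coefficient values and the helper names are assumptions of this example, and the index in the code starts from 0.

```python
import math
import numpy as np

def hermite_poly(k, x):
    """Hermite polynomial H_k(x) via H_{n+1} = 2x H_n - 2n H_{n-1}."""
    x = np.asarray(x, dtype=float)
    h_prev, h = np.ones_like(x), 2.0 * x
    if k == 0:
        return h_prev
    for n in range(1, k):
        h_prev, h = h, 2.0 * x * h - 2.0 * n * h_prev
    return h

def psi(k, x):
    """Orthonormal Gauss-Hermite basis function of Eq. (8)."""
    norm = math.sqrt(2.0 ** k * math.factorial(k) * math.sqrt(math.pi))
    return hermite_poly(k, x) * np.exp(-np.asarray(x, dtype=float) ** 2 / 2.0) / norm

c = np.array([0.5, -1.0, 0.3, 0.8])  # illustrative coefficients (assumption)
x = np.linspace(-12.0, 12.0, 4801)
dx = x[1] - x[0]

signal = sum(c[k] * psi(k, x) for k in range(len(c)))
E_quad = float(np.sum(signal ** 2) * dx)  # energy as in Eq. (24), by Riemann sum
E_coeff = float(np.sum(c ** 2))           # energy as in Eq. (25)
```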

Therefore the squares of the coefficients \(c_k\) provide an indication of the distribution of the signal’s energy to the associated basis functions. One could arrive at the same result using the Fourier transformed description of the signal and Parseval’s theorem. It has been shown that the Gauss-Hermite basis functions remain invariant under the Fourier transform subject only to a change of scale. Denoting by \(\Uppsi(j\omega)\) the Fourier transform of ψ(t) and by \(\Uppsi_k(j\omega)\) the Fourier transform of the k-th Gauss-Hermite basis function one obtains

$$ \Uppsi(j\omega)={\sum\limits_{k=1}^N}{c_k}\Uppsi_k(j\omega) $$
(26)

and the energy of the signal is computed as

$$E_\psi={1 \over {2\pi}}{\int\limits_{-\infty}^{+\infty}}{|\Uppsi(j\omega)|^2}d{\omega}$$
(27)

Substituting Eq. (26) into Eq. (27) one obtains

$$E_\psi={1 \over {2\pi}}{\int\limits_{-\infty}^{+\infty}}{\left|{\sum\limits_{k=1}^N}{c_k}\Uppsi_k(j\omega) \right|^2}d{\omega} $$
(28)

and using the invariance of the Gauss-Hermite basis functions under Fourier transform one gets

$$ E_\psi={1 \over {2\pi}}{\int\limits_{-\infty}^{+\infty}}{\left|{\sum\limits_{k=1}^N}{c_k}{\alpha^{-{1 \over 2}}}\psi_k({\alpha^{-1}}j\omega) \right|^2}d{\omega}$$
(29)

while performing the change of variable \(\omega_1=\alpha^{-1}\omega\) it holds that

$$ E_\psi={1 \over {2\pi}}{\int\limits_{-\infty}^{+\infty}}{\left|{\sum\limits_{k=1}^N}{c_k}\psi_k(j\omega_1) \right|^2}d{\omega_1}$$
(30)

Next, by exploiting the orthogonality property of the Gauss-Hermite basis functions one gets that the signal’s energy is proportional to the sum of the squares of the coefficients \(c_k\) which are associated with the Gauss-Hermite basis functions, i.e. a relation of the form

$$E_\psi={\sum\limits_{k=1}^N}{c_k^2}$$
(31)

5 Gauss-Hermite neural modeling for power transformers

5.1 Thermal model of electric power transformers

The method aims at monitoring the evolution in time of the transformer’s HST, which can be an indication of the ageing and degradation of the windings or of operation of the transformer under overload conditions. Most faults cause a change in the thermal behavior of the transformer. Such abnormal conditions can be detected by analysing the HST. The most common abnormal condition of the transformer that can be detected with the use of thermal analysis is overload. Transformer life is severely affected if the HST remains above 110 °C for long time intervals.

The stages for obtaining an analytical model of the power transformer’s thermal behavior are as follows (Ippolito and Siano 2004):

  • Calculate at each time step the ultimate top oil temperature rise in the transformer from the load current at that instant, using:

    $$ {\Updelta}\Uptheta_{TO,U}={\Updelta}\Uptheta_{TO,R}[{{{I_L^2}R+1} \over {R+1}}]^q $$
    (32)

    where \({\Updelta}{\Uptheta}_{TO,U}\) is the ultimate top oil temperature (TOT) rise, (°C), \({\Updelta}{\Uptheta}_{TO,R}\) is the rated TOT rise over ambient, (°C), \(I_L\) is the load current normalised to rated current, (p.u.), q is an empirically derived exponent that approximately accounts for the effect of the change of resistance with load, and R is the ratio of rated-load loss to no-load loss at the applicable tap position.

  • Calculate the increment in the TOT from the ultimate top oil rise and the ambient temperature at each time step using the differential equation:

    $$ {\tau_{TO}}{{d\Uptheta_{TO}} \over {dt}}=[{\Updelta}\Uptheta_{TO,U}+\Uptheta_A]-\Uptheta_{TO} $$
    (33)

    where \(\Uptheta_{TO}\) is the TOT, (°C), \(\tau_{TO}\) is the top oil rise time constant, and \(\Uptheta_A\) is the ambient temperature, (°C).

  • Calculate the ultimate hot spot temperature rise using:

    $$ {\Updelta}\Uptheta_{HS,U}={\Updelta}\Uptheta_{HS,R}{I_L^{2\beta}}$$
    (34)

    where β is an empirically derived exponent, dependent on the cooling method, \({\Updelta}\Uptheta_{HS,U}\) is the ultimate HST rise over top oil (for a given load current), (°C), \({\Updelta}\Uptheta_{HS,R}\) is the rated HST rise over top oil (for rated load current), (°C).

  • Calculate the increment in the HST rise, using the differential equation:

    $$ \tau_{HS}\left\{{{d\Updelta\Uptheta_{HS}} \over {dt}}\right\}={\Updelta}\Uptheta_{HS,U}-{\Updelta}\Uptheta_{HS} $$
    (35)

    where \(\Uptheta_{HS}\) is the hot spot winding temperature, (°C), \({\Updelta}\Uptheta_{HS}\) is the HST rise above top oil, (°C), and \(\tau_{HS}\) is the hot spot rise time constant, (h).

  • Finally, add the TOT to the hot spot temperature rise to get the HST, using:

    $$ \Uptheta_{HS}=\Uptheta_{TO}+{\Updelta}\Uptheta_{HS} $$
    (36)
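The five steps above can be sketched as a forward-Euler simulation of Eqs. (32)–(36). The parameter values in the dictionary `p`, the time step and the constant load profile are hypothetical placeholders chosen only for illustration; they are not the ratings of the transformer studied in this paper.

```python
import numpy as np

def simulate_hst(load, ambient, dt_h, p):
    """Forward-Euler integration of the top-oil rise model, Eqs. (32)-(36).

    load: load current in p.u. at each step, ambient: ambient temperature (deg C),
    dt_h: time step in hours, p: dictionary of model parameters.
    """
    theta_to = float(ambient[0])  # assume top oil starts at ambient temperature
    d_hs = 0.0                    # assume zero initial hot spot rise
    hst = np.empty(len(load))
    for i, (i_l, th_a) in enumerate(zip(load, ambient)):
        # Eq. (32): ultimate top oil temperature rise
        d_to_u = p["d_to_r"] * ((i_l ** 2 * p["R"] + 1.0) / (p["R"] + 1.0)) ** p["q"]
        # Eq. (33): top oil temperature dynamics
        theta_to += dt_h / p["tau_to"] * (d_to_u + th_a - theta_to)
        # Eq. (34): ultimate hot spot temperature rise
        d_hs_u = p["d_hs_r"] * i_l ** (2.0 * p["beta"])
        # Eq. (35): hot spot rise dynamics
        d_hs += dt_h / p["tau_hs"] * (d_hs_u - d_hs)
        # Eq. (36): hot spot temperature
        hst[i] = theta_to + d_hs
    return hst

# Hypothetical parameters (not from the paper)
p = {"d_to_r": 55.0, "R": 6.0, "q": 0.8, "tau_to": 3.0,
     "d_hs_r": 25.0, "beta": 0.8, "tau_hs": 0.1}
steps = 6000
hst = simulate_hst(np.ones(steps), np.full(steps, 25.0), dt_h=0.01, p=p)
```

With rated load and constant ambient temperature the simulated HST settles at the ambient temperature plus the two rated rises, which is a convenient sanity check of an implementation.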

The model of Eqs. (32)–(36), named the top-oil rise model, is based on some simplifying assumptions and its accuracy can deteriorate due to parameter variations. As a result, in order to protect power transformers, conservative safety factors have been introduced that prevent the transformer’s overheating. Consequently, the calculated maximum power transfer may be 20–30 % or more below the real transformer capability.

5.2 Fault diagnosis for power transformers

Thermal analysis aims at monitoring the evolution in time of the transformer’s HST which can be an indication of the ageing and degradation of the windings or of operating the transformer under overload conditions (Catterson et al. 2010, 2011; Metwally 2011). Most of the faults (see Fig. 6) cause change in the thermal behavior of the transformer and can be detected by analysing the HST (usually measured at the top or in the center of the high or low voltage winding) (Piccolo et al. 2010; Abu-Elanien and Salama 2010; Velasquez-Contreras et al. 2011).

Fig. 6 Frequency of faults in components of electric power transformers

The transformer’s main characteristics are summarized in Table 1. A measurement station has been set up consisting of thermocouples that recorded (a) the HST of the medium and voltage windings and (b) the Top Oil Temperature. The HST could also have been measured with optical fiber sensors. The manufacturer’s specifications give the most probable hot-spot position. A Hall effect current transducer has been used to measure the load current.

Table 1 Ratings of the modeled power transformer

The transformer’s thermal model, i.e. the variation of the HST, has been identified using the previously analyzed neural network with Gauss-Hermite basis functions. The inputs/outputs configuration of the neural model of the transformer’s thermal dynamics is shown in Fig. 7 (Rigatos et al. 2012b; Piccolo et al. 2010; Rigatos and Siano 2012; Galdi et al. 2000; Ippolito and Siano 2004). To approximate the HST variations described in a data set consisting of 870 quadruplets of the form \([\Uptheta_{TO}(k-1), \Uptheta_{TO}(k-2), I_L(k-1) | HST(k)]\), a feed-forward neural network with 3-D Gauss-Hermite basis functions has been used, containing 64 nodes in its hidden layer. Neural models with the same output HST(k) and a larger number of inputs, i.e. including more past values of the top-oil temperature and of the load current, could also be considered. As shown in Figs. 8 and 9, thanks to the multi-frequency characteristics of the Gauss-Hermite basis functions, such a neural model can capture spikes and abrupt changes in the HST profile with increased accuracy (Zhang and Benveniste 1993; Rigatos and Tzafestas 2006; Rigatos and Zhang 2009). The RMSE (Root Mean Square Error) of training the Gauss-Hermite neural model was of the order of \(4 \times 10^{-3}\).

Fig. 7 Inputs/outputs configuration of the neural model of the power transformer thermal dynamics

Fig. 8 Approximation of the HST of the electric power transformer (red line) by a neural network with Hermite polynomial basis functions (blue line). a HST time variation, profile 1. b HST time variation, profile 2

Fig. 9 Approximation of the HST of the electric power transformer (red line) by a neural network with Hermite polynomial basis functions (blue line). a HST time variation, profile 3. b HST time variation, profile 4

The update of the output layer weights of the neural network is given by a gradient equation (LMS-type) of the form

$$ w^{i}(k+1)=w^{i}(k)-{\eta}e(k){\phi^T}(k) $$
(37)

where \(e(k)=y(k)-y_d(k)\) is the output estimation error at time instant k and \(\phi^T(k)\) is the regressor vector having as elements the values \(\phi(x(k))\) of the Gauss-Hermite basis functions for input vector x(k).
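The LMS-type update of Eq. (37) can be sketched on a one-dimensional toy problem. In this illustration the regressor contains the first six Gauss-Hermite basis functions of Eq. (8), the target output is generated from known weights, and the learning rate, the number of epochs and the training inputs are assumptions of the example rather than the settings used for the transformer model.

```python
import math
import numpy as np

def hermite_poly(k, x):
    """Hermite polynomial H_k(x) via H_{n+1} = 2x H_n - 2n H_{n-1}."""
    x = np.asarray(x, dtype=float)
    h_prev, h = np.ones_like(x), 2.0 * x
    if k == 0:
        return h_prev
    for n in range(1, k):
        h_prev, h = h, 2.0 * x * h - 2.0 * n * h_prev
    return h

def psi(k, x):
    """Orthonormal Gauss-Hermite basis function of Eq. (8)."""
    norm = math.sqrt(2.0 ** k * math.factorial(k) * math.sqrt(math.pi))
    return hermite_poly(k, x) * np.exp(-np.asarray(x, dtype=float) ** 2 / 2.0) / norm

rng = np.random.default_rng(0)
M = 6
w_true = np.array([0.5, 0.0, -1.0, 0.3, 0.0, 0.0])  # weights generating the toy data
x_train = rng.uniform(-4.0, 4.0, size=200)
Phi = np.stack([psi(k, x_train) for k in range(M)], axis=1)  # regressors, shape (200, M)
y_d = Phi @ w_true                                           # desired outputs

w = np.zeros(M)
eta = 0.2  # learning rate (assumption)
for epoch in range(200):
    for phi_k, yd_k in zip(Phi, y_d):
        e = w @ phi_k - yd_k      # e(k) = y(k) - y_d(k)
        w = w - eta * e * phi_k   # update of Eq. (37)

train_rmse = float(np.sqrt(np.mean((Phi @ w - y_d) ** 2)))
```

Because the toy target lies exactly in the span of the regressors and is noise-free, the learned weights approach `w_true`; with measured data a residual error would remain.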

The spectral components of the HST signal for both the fault-free and the under-fault operation of the power transformer are shown in Figs. 10, 11, 12 and 13. It can be noticed that after a fault has occurred the amplitude of these spectral components changes, which can be a strong indication of failure of the monitored transformer.

Fig. 10 HST time variation, profile 1: a amplitude of the spectral components of the HST signal measured from the electric power transformer in the fault-free case (red bar line) and when a fault had taken place (yellow bar line), b differences in the amplitudes of the spectral components between the fault-free and the faulty case (green bar line)

Fig. 11 HST time variation, profile 2: a amplitude of the spectral components of the HST signal measured from the electric power transformer in the fault-free case (red bar line) and when a fault had taken place (yellow bar line), b differences in the amplitudes of the spectral components between the fault-free and the faulty case (green bar line)

Fig. 12 HST time variation, profile 3: a amplitude of the spectral components of the HST signal measured from the electric power transformer in the fault-free case (red bar line) and when a fault had taken place (yellow bar line), b differences in the amplitudes of the spectral components between the fault-free and the faulty case (green bar line)

Fig. 13 HST time variation, profile 4: a amplitude of the spectral components of the HST signal measured from the electric power transformer in the fault-free case (red bar line) and when a fault had taken place (yellow bar line), b differences in the amplitudes of the spectral components between the fault-free and the faulty case (green bar line)

The proposed spectral decomposition of the monitored signal, through series expansion in Gauss-Hermite basis functions, can thus be used for fault detection tasks. As can be seen in Figs. 10, 11, 12 and 13, in case of failure the spectral components of the monitored signal differ from the ones obtained when the system is free of fault. Moreover, the fact that certain spectral components exhibit greater sensitivity to the fault and change value more abruptly is a feature which can be exploited for fault isolation. Specific failures can be associated with variations of specific spectral components. Therefore, these components can provide an indication of the appearance of specific types of failures and of specific malfunctioning components.
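A possible way of turning this comparison into a decision rule is sketched below. The rule itself (flagging a component when the change in its squared weight exceeds a fraction of the fault-free signal energy), the threshold value and the weight vectors are hypothetical and serve only as an illustration; the paper does not prescribe a specific threshold.

```python
import numpy as np

def spectral_residual(c_ref, c_new, rel_threshold=0.2):
    """Compare squared output-layer weights of a fault-free and a new model.

    Returns the per-component energy difference, the relative change of the
    total energy, and the indices of components whose change exceeds
    rel_threshold times the fault-free energy (hypothetical rule).
    """
    e_ref, e_new = c_ref ** 2, c_new ** 2
    diff = e_new - e_ref
    total_change = abs(e_new.sum() - e_ref.sum()) / e_ref.sum()
    flagged = np.flatnonzero(np.abs(diff) > rel_threshold * e_ref.sum())
    return diff, float(total_change), flagged

# Illustrative weight vectors (assumptions, not measured data)
c_ref = np.array([1.0, 0.5, 0.2, 0.1])
c_fault = np.array([1.0, 0.5, 0.9, 0.1])
diff, total_change, flagged = spectral_residual(c_ref, c_fault)
```

The indices in `flagged` point to the spectral components that changed most, which is the information that fault isolation would build on.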

6 Gauss-Hermite modeling of electric power generators

6.1 Model of the doubly-fed induction generator

The doubly-fed induction generator (DFIG) is not only the most widely used technology in wind turbines due to its good performance, but is also used in many other fields such as hydro-power generation, pumped storage plants and flywheel energy storage systems. The DFIG model is derived from the voltage equations of the stator and rotor. It is assumed that the stator and rotor windings are symmetrical and symmetrically fed. The saturation of the inductances, iron losses, skin effect, and bearing friction are neglected. The winding resistance is considered to be constant. A model of the doubly-fed induction generator is as follows:

Dynamic equations:

$$ {J}{\dot{\omega}}={T_m}-{K_f}{\omega}-{T_e} $$
(38)

where J is the moment of inertia of the rotor, \(T_m\) is the externally applied mechanical torque that makes the turbine rotate, \(T_e\) is the electrical torque which is associated with the generated active power, and the term \(K_f\omega\) expresses friction, with \(K_f\) being the friction coefficient. The wind generated mechanical torque is given by

$$ T_m={1 \over 2}{\rho}{\pi}{R^3}{C_q(\lambda,\beta)}v^2 $$
(39)

where v is the wind’s speed (Boukhezzar and Siguerdidjane 2009), \(C_q\) is a torque coefficient which depends on the blade pitch angle β and the tip-speed ratio \(\lambda={{{\omega_r}R} \over v},\) \(\omega_r\) is the rotor’s angular velocity, R is the rotor radius and ρ is the air density.
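Equations (38) and (39) can be sketched in a few lines. In the example below the torque coefficient is passed as a plain number, whereas in practice it depends on the tip-speed ratio and the pitch angle; the numerical values, the forward-Euler scheme and the function names are assumptions made only for illustration.

```python
import math

def tip_speed_ratio(omega_r, R, v):
    """Tip-speed ratio lambda = omega_r * R / v."""
    return omega_r * R / v

def mechanical_torque(rho, R, c_q, v):
    """Wind generated torque of Eq. (39); c_q is supplied as a given value."""
    return 0.5 * rho * math.pi * R ** 3 * c_q * v ** 2

def simulate_speed(T_m, T_e, J, K_f, omega0, dt, steps):
    """Forward-Euler integration of Eq. (38) with constant torques."""
    omega = omega0
    for _ in range(steps):
        omega += dt * (T_m - K_f * omega - T_e) / J
    return omega

# Hypothetical values (not from the paper)
lam = tip_speed_ratio(omega_r=2.0, R=40.0, v=10.0)
omega_final = simulate_speed(T_m=10.0, T_e=4.0, J=1.0, K_f=2.0,
                             omega0=0.0, dt=0.01, steps=2000)
```

For constant torques the speed converges to \((T_m-T_e)/K_f\), which offers a simple check of the integration.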

This type of wound-rotor machine is connected to the grid by both the rotor and stator side. The DFIG stator can be directly connected to the electric power grid while the rotor is interfaced through back-to-back converters (see Fig. 14). By decoupling the power system electrical frequency and the rotor mechanical frequency the converter allows a variable speed operation of the wind turbine.

Fig. 14 Configuration of a doubly-fed induction generator unit in the power grid

The doubly-fed induction generator is analogous to the induction motor. In an induction motor the stator voltage plays the role of an input variable, while the rotor voltage is a constant (it is usually zero). In the doubly-fed induction generator the roles are reversed, and a dual analogy holds between the stator and rotor parameters of the generator and those of the motor. This means that the rotor voltage now acts as an input, while the stator voltage depends on the voltage at the bus to which the DFIG is connected and is a constant parameter (Rigatos and Siano 2012; Boukhezzar and Siguerdidjane 2009; Calderaro et al. 2008).

Electrical equations: Using the Park transform the DFIG is described in the dq reference frame by the following set of equations:

$$ v_{s_d}={R_s}{i_{s_d}}+{{d\psi_{s_d}} \over {dt}}-{\omega_{dq}}{\psi_{s_q}} $$
(40)
$$ v_{s_q}={R_s}{i_{s_q}}+{{d\psi_{s_q}} \over {dt}}+{\omega_{dq}}{\psi_{s_d}} $$
(41)
$$ v_{r_d}={R_r}{i_{r_d}}+{{d\psi_{r_d}} \over {dt}}-{\omega_r}{\psi_{r_q}} $$
(42)
$$ v_{r_q}={R_r}{i_{r_q}}+{{d\psi_{r_q}} \over {dt}}+{\omega_r}{\psi_{r_d}} $$
(43)

where \(\omega_{dq}\) is the synchronous frequency, \(\omega_r\) is the rotation frequency of the rotor, \(\psi_{s_d}\) is the stator flux component along the d-axis, \(\psi_{s_q}\) is the stator flux component along the q-axis and, equivalently, \(\psi_{r_d}\) is the rotor flux component along the d-axis, while \(\psi_{r_q}\) is the rotor flux component along the q-axis (see Fig. 15).

Fig. 15

The ab stator reference frame and the dq rotor reference frame of the induction generator

Moreover, \(v_{s_d}\) and \(i_{s_d}\) are the stator’s voltage and current in the d reference frame, \(v_{s_q}\) and \(i_{s_q}\) are the stator’s voltage and current in the q reference frame and, equivalently, \(v_{r_d}\) and \(i_{r_d}\) are the rotor’s voltage and current in the d reference frame, while \(v_{r_q}\) and \(i_{r_q}\) are the rotor’s voltage and current in the q reference frame.

As the d and q axes are magnetically decoupled, the flux components are described by the following equations:

$$ {\psi_{s_d}}={L_s}{i_{s_d}}+M{i_{r_d}}$$
(44)
$$ {\psi_{s_q}}={L_s}{i_{s_q}}+M{i_{r_q}} $$
(45)
$$ {\psi_{r_d}}={L_r}{i_{r_d}}+M{i_{s_d}}$$
(46)
$${\psi_{r_q}}={L_r}{i_{r_q}}+M{i_{s_q}} $$
(47)

Moreover, the electromagnetic torque that is developed is given by

$$ T_e={\eta}(i_{s_q}\psi_{s_d}-i_{s_d}\psi_{s_q}) $$
(48)

where η is a coefficient that is associated with the number of poles and with the mutual inductance M. Additionally, the active and reactive power delivered by the DFIG stator are the real and imaginary parts of the apparent power at the stator’s terminals, i.e.

$$ P_s=Re\{{U_s}{I_s^{*}}\}={v_{s_d}}{i_{s_d}}+{v_{s_q}}{i_{s_q}} $$
(49)
$$ Q_s=Im\{{U_s}{I_s^{*}}\}={v_{s_d}}{i_{s_q}}-{v_{s_q}}{i_{s_d}} $$
(50)
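The algebraic relations (44)-(50) can be evaluated directly once the dq currents and voltages are known. The sketch below is only illustrative: the function names and all numerical values are assumptions, and the reactive power follows the sign convention written in Eq. (50).

```python
def flux_linkages(i_sd, i_sq, i_rd, i_rq, L_s, L_r, M):
    """Stator and rotor flux components, Eqs. (44)-(47)."""
    psi_sd = L_s * i_sd + M * i_rd
    psi_sq = L_s * i_sq + M * i_rq
    psi_rd = L_r * i_rd + M * i_sd
    psi_rq = L_r * i_rq + M * i_sq
    return psi_sd, psi_sq, psi_rd, psi_rq


def electromagnetic_torque(eta, i_sd, i_sq, psi_sd, psi_sq):
    """Electromagnetic torque T_e = eta * (i_sq*psi_sd - i_sd*psi_sq), Eq. (48)."""
    return eta * (i_sq * psi_sd - i_sd * psi_sq)


def stator_power(v_sd, v_sq, i_sd, i_sq):
    """Active and reactive stator power as written in Eqs. (49)-(50)."""
    p_s = v_sd * i_sd + v_sq * i_sq
    q_s = v_sd * i_sq - v_sq * i_sd
    return p_s, q_s


# Hypothetical currents, voltages and inductances (not the ratings of Table 2).
psi_sd, psi_sq, psi_rd, psi_rq = flux_linkages(1.0, 2.0, 3.0, 4.0,
                                               L_s=0.1, L_r=0.2, M=0.05)
T_e = electromagnetic_torque(2.0, 1.0, 2.0, psi_sd, psi_sq)
P_s, Q_s = stator_power(10.0, 5.0, 1.0, 2.0)
print(psi_sd, psi_sq, psi_rd, psi_rq, T_e, P_s, Q_s)
```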

The angle of the stator magnetic flux vector, which has components \(\psi_{s_a}\) and \(\psi_{s_b}\) in the ab frame, is first defined, i.e.

$$ \rho=tan^{-1} \left({{\psi_{s_b}} \over {\psi_{s_a}}} \right) $$
(51)

The angle between the inertial reference frame and the rotating reference frame is taken to be equal to ρ.

Moreover, it holds that \(\cos(\rho)={{\psi_{s_a}} \over {||\psi||}},\) \(\sin(\rho)={{\psi_{s_b}} \over {||\psi||}},\) and \(||\psi||=\sqrt{{\psi_{s_a}^2}+{\psi_{s_b}^2}}.\) Therefore, in the rotating dq frame of the generator the magnetic flux has only one non-zero component, \(\psi_{s_d}\), while the component \(\psi_{s_q}\) equals 0.
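The effect of this choice of angle can be checked numerically. The following sketch rotates a stator flux vector from the ab frame into the dq frame by the angle ρ of Eq. (51); the function names and the flux values are hypothetical. It uses `math.atan2` rather than a plain arctangent so that the quadrant of the flux vector is resolved, which is an implementation choice of this sketch.

```python
import math


def flux_angle(psi_sa, psi_sb):
    """Angle rho of the stator flux vector in the ab frame, Eq. (51)."""
    return math.atan2(psi_sb, psi_sa)


def ab_to_dq(x_a, x_b, rho):
    """Rotate an ab-frame vector by rho into the dq frame."""
    x_d = math.cos(rho) * x_a + math.sin(rho) * x_b
    x_q = -math.sin(rho) * x_a + math.cos(rho) * x_b
    return x_d, x_q


# Hypothetical stator flux components in the ab frame.
psi_sa, psi_sb = 3.0, 4.0
rho = flux_angle(psi_sa, psi_sb)
psi_sd, psi_sq = ab_to_dq(psi_sa, psi_sb, rho)
print(psi_sd, psi_sq)  # psi_sd is the flux norm, psi_sq is numerically zero
```

The same rotation can be applied to the measured rotor currents to obtain their d and q components.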

In compact form, the doubly-fed induction generator can be described by the following set of equations in the d − q reference frame that rotates at an arbitrary speed denoted as \(\omega_{dq}\) (Forchetti et al. 2009)

$$ {{d\psi_{s_q}} \over {dt}}=-{1 \over \tau_s}{\psi_{s_q}}-{\omega_{dq}}{\psi_{s_d}}+{M \over \tau_s}i_{r_q}+v_{s_q} $$
(52)
$$ {{d\psi_{s_d}} \over {dt}}=\omega_{dq}{\psi_{s_q}}-{1 \over {\tau_s}}{\psi_{s_d}}+{M \over \tau_s}i_{r_d}+v_{s_d} $$
(53)
$$ {{d{i_{r_q}}} \over {dt}}={\beta \over \tau_s}{\psi_{s_q}}+{\beta}{\omega_r}{\psi_{s_d}}-{\gamma_2}{i_{r_q}}-({\omega_{dq}-\omega_r}){i_{r_d}}-\beta{v_{s_q}}+{1 \over {{\sigma}L_r}}v_{r_q} $$
(54)
$$ {{d{i_{r_d}}} \over{dt}}=-\beta{\omega_r}{\psi_{s_q}}+{\beta \over\tau_s}{\psi_{s_d}}+(\omega_{dq}-\omega_r){i_{r_q}}-{\gamma_2}{i_{r_d}}-{\beta}{v_{s_d}}+{1 \over {{\sigma}L_r}}v_{r_d} $$
(55)

where \(\psi_{s_q}, \psi_{s_d}\) are the stator flux components and \(i_{r_q}, i_{r_d}\) are the rotor currents, \(v_{s_q}, v_{s_d}, v_{r_q}, v_{r_d}\) are the stator and rotor voltages, \(L_s\) and \(L_r\) are the stator and rotor inductances, \(\omega_r\) is the rotor’s angular velocity and M is the magnetizing inductance. Moreover, denoting as \(R_s\) and \(R_r\) the stator and rotor resistances, the following parameters are defined

$$ \begin{aligned}\sigma&=1-{\frac{M^2}{{L_r}{L_s}}}\quad \beta={\frac{1-\sigma} {M\sigma}}\quad \tau_s={\frac{L_s}{R_s}}\\ \tau_r&={\frac{L_r}{R_r}}\quad {\gamma_2}=\left({\frac{1-\sigma}{{\sigma}{\tau_s}}}\right)\end{aligned} $$
(56)
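To make the structure of Eqs. (52)-(56) concrete, the sketch below computes the derived parameters and evaluates the right-hand sides of the four electrical equations. It is a minimal illustration under stated assumptions: the function names, the argument layout and the numerical values are hypothetical and do not correspond to the ratings of Table 2.

```python
def dfig_parameters(L_s, L_r, M, R_s, R_r):
    """Derived parameters of Eq. (56)."""
    sigma = 1.0 - M ** 2 / (L_r * L_s)
    tau_s = L_s / R_s
    return {
        "sigma": sigma,
        "beta": (1.0 - sigma) / (M * sigma),
        "tau_s": tau_s,
        "tau_r": L_r / R_r,
        "gamma_2": (1.0 - sigma) / (sigma * tau_s),
        "L_r": L_r,
        "M": M,
    }


def electrical_derivatives(state, v_s, v_r, omega_dq, omega_r, p):
    """Right-hand sides of Eqs. (52)-(55).

    state = (psi_sq, psi_sd, i_rq, i_rd), v_s = (v_sq, v_sd), v_r = (v_rq, v_rd).
    """
    psi_sq, psi_sd, i_rq, i_rd = state
    v_sq, v_sd = v_s
    v_rq, v_rd = v_r
    beta, tau_s, gamma_2 = p["beta"], p["tau_s"], p["gamma_2"]
    gain = 1.0 / (p["sigma"] * p["L_r"])   # input gain 1 / (sigma * L_r)
    slip = omega_dq - omega_r              # (omega_dq - omega_r)
    d_psi_sq = -psi_sq / tau_s - omega_dq * psi_sd + p["M"] / tau_s * i_rq + v_sq
    d_psi_sd = omega_dq * psi_sq - psi_sd / tau_s + p["M"] / tau_s * i_rd + v_sd
    d_i_rq = (beta / tau_s * psi_sq + beta * omega_r * psi_sd - gamma_2 * i_rq
              - slip * i_rd - beta * v_sq + gain * v_rq)
    d_i_rd = (-beta * omega_r * psi_sq + beta / tau_s * psi_sd + slip * i_rq
              - gamma_2 * i_rd - beta * v_sd + gain * v_rd)
    return d_psi_sq, d_psi_sd, d_i_rq, d_i_rd


# Hypothetical machine data (illustrative only).
params = dfig_parameters(L_s=0.1, L_r=0.1, M=0.08, R_s=0.5, R_r=0.4)
rates = electrical_derivatives((0.0, 1.0, 0.0, 0.5), (0.0, 1.0), (0.0, 0.2),
                               omega_dq=314.0, omega_r=300.0, p=params)
print(params["sigma"], rates)
```

Such a function can be passed to any fixed-step integrator to simulate the electrical subsystem for given stator and rotor voltages.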

The dynamic model of the doubly-fed induction generator can also be written in state-space form by defining the following state variables: \(x_1=\theta\), \(x_2=\omega_r\), \({x_3}=\psi_{s_d}, {x_4}=\psi_{s_q}, {x_5}=i_{r_d} \,{\hbox {and}}\,{x_6}=i_{r_q}.\) It holds that (Rigatos and Siano 2012; Boukhezzar and Siguerdidjane 2009)

$$ \dot{x}=f(x)+{g_a(x)}v_{r_d}+{g_b(x)}v_{r_q} $$
(57)

where \(x=[x_1, x_2, x_3, x_4, x_5, x_6]^T\) and

$$ f(x)= \left\{ \begin{array}{c} x_2 \\ -{K_m \over J}{x_2}-{{T_m} \over J}+{\eta \over J}(i_{s_q}{x_3}-i_{s_d}{x_4}) \\ -{1 \over {\tau_s}}{x_3}+{\omega_{dq}}{x_4}+{M \over \tau_s}x_5+v_{s_d}\\ -{\omega_{dq}}{x_3}-{1 \over {\tau_s}}{x_4}+{M \over \tau_s}x_6+v_{s_q}\\ -{\beta}{x_2}{x_4}+{\beta \over \tau_s}{x_3}+(\omega_{dq}-x_2){x_6}-{\gamma_2}{x_5}-{\beta}v_{s_d}\\ {\beta \over\tau_s}{x_4}+{\beta}{x_2}{x_3}-(\omega_{dq}-x_2){x_5}-{\gamma_2}{x_6}-{\beta}v_{s_q} \end{array} \right\} $$
(58)
$$ g_{a} (x) = \left( {\begin{array}{*{20}c} 0 & 0 & 0 & 0 & {\frac{1}{{\sigma L_{r} }}} & 0 \\ \end{array} } \right) $$
(59)
$$ g_{b} (x) = \left( {\begin{array}{*{20}c} 0 & 0 & 0 & 0 & 0 & {\frac{1}{{\sigma L_{r} }}} \\ \end{array} } \right) $$
(60)

Indicative numerical values for the parameters of the considered doubly-fed induction generator model are given in Table 2.

Table 2 Ratings of the modeled DFIG

6.2 Fault diagnosis for doubly-fed induction generators

The components of the turbine-power generator system are exposed to harsh operating conditions and exhibit failures (see Fig. 16). The generator’s dynamic model, i.e. the variation of the rotor current on the d-axis, has been identified using the previously analyzed neural network with Gauss-Hermite basis functions. The input/output configuration of the neural model of the power generator dynamics is shown in Fig. 17. Real-time measurements of the rotor current in the a − b reference frame are available; after a rotation transformation they provide the associated rotor current measurements in the d − q reference frame. To approximate the variations of the rotor current \(i_{r_d}\), described by a data set of 2,000 quadruplets of the form \([i_{r_d}(k-1),\, i_{r_d}(k-2),\, \omega(k-1)\,|\,i_{r_d}(k)]\), a feed-forward neural network with 3-D Gauss-Hermite basis functions has been used, containing 64 nodes in its hidden layer. Neural models with the same output \(i_{r_d}(k)\) and a larger number of inputs, i.e. including more past values of the rotor’s current and of the rotation speed, could also be considered.
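The construction of such a training set from sampled signals can be sketched as follows. This is an illustration only: the function name and the short example series are hypothetical, and the regressors are assumed to be the two most recent values of the d-axis rotor current together with the most recent rotation speed.

```python
def build_dataset(i_rd, omega):
    """Pair the regressors (i_rd(k-1), i_rd(k-2), omega(k-1)) with the target i_rd(k)."""
    if len(i_rd) != len(omega):
        raise ValueError("i_rd and omega must have the same number of samples")
    samples = []
    for k in range(2, len(i_rd)):
        inputs = (i_rd[k - 1], i_rd[k - 2], omega[k - 1])
        samples.append((inputs, i_rd[k]))
    return samples


# Hypothetical short records of the rotor current and of the rotation speed.
i_rd = [0.0, 0.1, 0.2, 0.3]
omega = [1.0, 1.1, 1.2, 1.3]
for inputs, target in build_dataset(i_rd, omega):
    print(inputs, target)
```

A record of N samples yields N − 2 input/output pairs, because the first two samples have no complete history.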

Fig. 16

Frequency of faults in components of the turbine-power generator system

Fault cases 1–2 were associated with a change in the value of the stator’s resistance \(R_s\) under two different set-points for the rotor’s angular speed. Fault cases 3–4 were associated with a change in the value of the rotor’s inductance \(L_r\) under two different set-points for the rotor’s angular speed. As shown in Figs. 18 and 19, thanks to the multi-frequency characteristics of the Gauss-Hermite basis functions, such a neural model can capture spikes and abrupt changes in the rotor’s current with increased accuracy (Zhang and Benveniste 1993; Rigatos and Tzafestas 2006; Rigatos and Zhang 2009). The RMSE of training the Gauss-Hermite neural model was of the order of \(4 \times 10^{-3}\).

Fig. 17

Inputs/outputs configuration of the neural model of the power generator dynamics

Fig. 18

Approximation of the rotor’s current \(i_{r_d}\) of the electric power generator (red line) by a neural network with Hermite polynomial basis functions (blue line). (a) d-axis rotor current under a fault in the stator’s resistance, case 1. (b) d-axis rotor current under a fault in the stator’s resistance, case 2

Fig. 19

Approximation of the rotor’s current \(i_{r_d}\) of the electric power generator (red line) by a neural network with Hermite polynomial basis functions (blue line). (a) d-axis rotor current under a fault in the rotor’s inductance, case 1. (b) d-axis rotor current under a fault in the rotor’s inductance, case 2

The output layer weights of the neural network are updated by a gradient rule of the LMS (Least Mean Squares) type:

$$ w^{i}(k+1)=w^{i}(k)-{\eta}e(k){\phi^T}(k) $$
(61)

where \(e(k)=y(k)-y_d(k)\) is the output estimation error at time instant k and \(\phi^T(k)\) is the regressor vector having as elements the values \(\phi(x(k))\) of the Gauss-Hermite basis functions for input vector x(k).
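The update rule of Eq. (61) can be illustrated with a small one-dimensional example. The sketch below is not the 3-D, 64-node network used in the experiments: it assumes four orthonormal Hermite functions of a single input, generated with the standard three-term recurrence, and a synthetic noise-free target built from hypothetical weights. The function names, the learning rate and the sampling grid are likewise assumptions.

```python
import math


def gauss_hermite_basis(x, n_basis):
    """Values of the first n_basis orthonormal Hermite functions at x.

    psi_0(x) = pi^(-1/4) * exp(-x^2 / 2) and
    psi_{k+1}(x) = sqrt(2/(k+1)) * x * psi_k(x) - sqrt(k/(k+1)) * psi_{k-1}(x).
    """
    psi = [math.pi ** (-0.25) * math.exp(-0.5 * x * x)]
    for k in range(n_basis - 1):
        prev = psi[k - 1] if k > 0 else 0.0
        psi.append(math.sqrt(2.0 / (k + 1)) * x * psi[k]
                   - math.sqrt(k / (k + 1)) * prev)
    return psi


def lms_update(w, phi, y_d, eta):
    """One LMS step: w <- w - eta * e * phi, with e = y - y_d as in Eq. (61)."""
    y = sum(w_i * phi_i for w_i, phi_i in zip(w, phi))
    e = y - y_d
    w_new = [w_i - eta * e * phi_i for w_i, phi_i in zip(w, phi)]
    return w_new, e


# Synthetic target generated by hypothetical "true" output weights.
true_w = [0.5, 0.0, -0.3, 0.0]
grid = [-2.0 + 0.1 * j for j in range(41)]
w = [0.0] * 4
for _ in range(200):                      # repeated passes over the samples
    for x in grid:
        phi = gauss_hermite_basis(x, 4)
        y_d = sum(a * b for a, b in zip(true_w, phi))
        w, _err = lms_update(w, phi, y_d, eta=0.5)
print(w)
```

In this noise-free setting the learned weights approach the weights that generated the target, so each weight can be read as the amplitude of the corresponding basis function in the expansion.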


The spectral components of the \(i_{r_d}\) signal for both the fault-free and the faulty operation of the power generator are shown in Figs. 20, 21, 22 and 23. After a fault has occurred, the amplitude of these spectral components changes, which is a clear indication of a failure of the monitored generator.

Fig. 20

d-axis rotor current \(i_{r_d}\) under a fault in the stator’s resistance, case 1: (a) amplitude of the spectral components of the rotor’s current \(i_{r_d}\) measured from the electric power generator in the fault-free case (red bar line) and when a fault had taken place (yellow bar line), (b) differences in the amplitudes of the spectral components between the fault-free and the faulty case (green bar line)

Fig. 21

d-axis rotor current \(i_{r_d}\) under a fault in the stator’s resistance, case 2: (a) amplitude of the spectral components of the rotor’s current \(i_{r_d}\) measured from the electric power generator in the fault-free case (red bar line) and when a fault had taken place (yellow bar line), (b) differences in the amplitudes of the spectral components between the fault-free and the faulty case (green bar line)

Fig. 22

d-axis rotor current \(i_{r_d}\) under a fault in the rotor’s inductance, case 1: (a) amplitude of the spectral components of the rotor’s current \(i_{r_d}\) measured from the electric power generator in the fault-free case (red bar line) and when a fault had taken place (yellow bar line), (b) differences in the amplitudes of the spectral components between the fault-free and the faulty case (green bar line)

Fig. 23

d-axis rotor current \(i_{r_d}\) under a fault in the rotor’s inductance, case 2: (a) amplitude of the spectral components of the rotor’s current \(i_{r_d}\) measured from the electric power generator in the fault-free case (red bar line) and when a fault had taken place (yellow bar line), (b) differences in the amplitudes of the spectral components between the fault-free and the faulty case (green bar line)

Again, the proposed spectral decomposition of the monitored signal, based on a series expansion in Gauss-Hermite basis functions, can be used for fault detection tasks. As can be seen in Figs. 20, 21, 22 and 23, in case of failure the spectral components of the monitored signal differ from those obtained when the system is free of faults. Moreover, the fact that certain spectral components are more sensitive to the fault and change value more abruptly is a feature that can be exploited for fault isolation. Specific failures can be associated with variations of specific spectral components, which therefore indicate the appearance of specific types of failures and the malfunctioning of specific components.
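A simple way to turn this comparison into a decision rule is sketched below. It is only an illustration under stated assumptions: the output-layer weights are taken as the spectral amplitudes, the difference is computed between their absolute values, and the fixed threshold is a hypothetical design parameter that is not specified in the paper.

```python
def spectral_energy(weights):
    """Sum of squared output-layer weights, used as a measure of signal energy."""
    return sum(w * w for w in weights)


def compare_spectra(w_ref, w_test, threshold):
    """Amplitude differences per component and indices exceeding the threshold."""
    diffs = [abs(t) - abs(r) for r, t in zip(w_ref, w_test)]
    flagged = [i for i, d in enumerate(diffs) if abs(d) > threshold]
    return diffs, flagged


# Hypothetical weights of a fault-free model and of a model trained on new data.
w_fault_free = [1.0, 0.5, 0.25]
w_monitored = [1.0, 1.0, 0.25]
diffs, flagged = compare_spectra(w_fault_free, w_monitored, threshold=0.1)
fault_detected = len(flagged) > 0
print(diffs, flagged, fault_detected)
print(spectral_energy(w_fault_free), spectral_energy(w_monitored))
```

The list of flagged indices is the part relevant to isolation: if a given fault is known to affect particular components, the pattern of flagged indices points to that fault.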

7 Conclusions

A new method for fault diagnosis of nonlinear systems has been proposed, based on modeling the system’s dynamics with feed-forward neural networks that use orthogonal basis functions which are invariant under the Fourier transform. A neural network with Gauss-Hermite polynomial activation functions has been used to approximate the nonlinear system’s dynamics from a set of input-output data. The output of the neural network thus provides a series expansion in the form of a weighted sum of Gauss-Hermite basis functions. Using the Fourier transform property of the Gauss-Hermite basis functions, it was shown that the considered neural network provides a spectral analysis of the output of the monitored system. The weights of the output layer of the neural network stand for the amplitudes of the spectral components of the nonlinear system’s dynamics. Moreover, using the orthogonality property of the Gauss-Hermite basis functions, it was shown that the sum of the squares of the output layer weights is a measure of the energy contained in the output of the monitored system.

Monitoring changes in the amplitude of these spectral components provides an indication of malfunctioning of the monitored system and a tool for detecting the existence of failures. Additionally, since specific faults are associated with amplitude changes of specific spectral components of the system, fault isolation can also be performed. The proposed FDI method can be applied to several electromechanical systems, e.g. vehicles, electric motors and power generators. In this paper, as a first case study, the problem of fault diagnosis of electric power transformers has been examined. The considered neural network with Gauss-Hermite polynomial activation functions made it possible to obtain information about the thermal condition of oil-immersed power transformers, and about their ageing and failure risks, through the approximation of a critical variable of the transformer known as HST. Evaluation tests have confirmed the efficiency of the proposed fault diagnosis method. As a second case study, the problem of fault diagnosis of the doubly-fed induction generator has been examined. The dynamics of the rotor current has been modeled with a Gauss-Hermite neural network and the associated spectral components have been obtained. Variation in the energy spectrum of the rotor’s current again provided information about the existence of failures and about the association of faults with specific components of the turbine-generator system.