
An important piece of information that can be extracted from the parameters of empirical models is a set of quantitative characteristics of couplings between the processes under study. The problem of coupling detection is encountered in multiple fields including physics (Bezruchko et al., 2003), geophysics (Maraun and Kurths, 2005; Mokhov and Smirnov, 2006, 2008; Mosedale et al., 2006; Palus and Novotna, 2006; Verdes, 2005; Wang et al., 2004), cardiology (Rosenblum et al., 2002; Palus and Stefanovska, 2003) and neurophysiology (Arnhold et al., 1999; Brea et al., 2006; Faes et al., 2008; Friston et al., 2003; Kreuz et al., 2007; Kiemel et al., 2003; Le Van Quyen et al., 1999; Mormann et al., 2000; Osterhage et al., 2007; Pereda et al., 2005; Prusseit and Lehnertz, 2008; Smirnov et al., 2005; Romano et al., 2007; Schelter et al., 2006; Schiff et al., 1996; Sitnikova et al., 2008; Smirnov et al., 2008; Staniek and Lehnertz, 2008; Tass, 1999; Tass et al., 2003). Numerous investigations are devoted to synchronisation, which is an effect of interaction between non-linear oscillatory systems (see, e.g., Balanov et al., 2008; Boccaletti et al., 2002; Hramov and Koronovskii, 2004; Kreuz et al., 2007; Maraun and Kurths, 2005; Mormann et al., 2000; Mosekilde et al., 2002; Osipov et al., 2007; Palus and Novotna, 2006; Pikovsky et al., 2001; Prokhorov et al., 2003; Tass et al., 2003). In the last decade, closer attention has been paid to directional coupling analysis. Characteristics of directional coupling might help, e.g., to localise an epileptic focus (a pathologic area) in the brain from electroencephalogram (EEG) or magnetoencephalogram (MEG) recordings: hypothetically, an increasing influence of an epileptic focus on adjacent areas leads to the seizure onset for some kinds of epilepsy.

The most appropriate and direct approaches to the detection of causal influences are based on the construction of empirical models. These approaches include Granger causality (Sect. 12.1) and phase dynamics modelling (Sect. 12.2). Below, we present our results showing their fruitful applications to the problems of neurophysiology (Sects. 12.3 and 12.4) and climatology (Sects. 12.5 and 12.6).

1 Granger Causality

The problem is formally posed as follows. There are time series from M processes \(\left\{{x_k (t)} \right\}_{t=1}^N\!, \;k=1,\ldots,M\). One needs to detect and characterise couplings between them, i.e. to find out how the processes influence each other. In the case of two linear processes (Granger, 1969; Pereda et al., 2005), one first constructs univariate autoregression models (Sect. 4.4)

$${x_k}(t)={A_{k,0}}+\sum\limits_{i=1}^d{{A_{k,i}}{x_k}(t-i)}+{\xi_k}(t),$$
(12.1)

where \(k=1,2\), \(d\) is the model order and \(\xi_k\) are Gaussian white noises with variances \(\sigma_{\xi_k}^2 \). Let us denote the vector of coefficients \(\left\{{A_{k,i},i=0,\ldots,d} \right\}\) as \({{\mathbf{A}}_k}\), the sum of squared residual errors as

$$\Sigma_k^2=\sum\limits_{t=d+1}^N{{{\left({{x_k}(t)-{A_{k,0}}-\sum\limits_{i=1}^d{{A_{k,i}}{x_k}(t-i)}}\right)}^2}},$$

and its minimal value as \(s_k^2 = \mathop {{\mathrm{min}}}\limits_{{{\mathbf{A}}_k}} \Sigma _k^2\). The model coefficients are estimated via the ordinary least-squares technique (Sect. 7.1.1), i.e. one gets \({{\hat{\textbf{A}}}_k} = {\mathrm{arg}}\mathop {{\mathrm{min}}}\limits_{{{\mathbf{A}}_k}} \Sigma _k^2\). An unbiased estimator for \(\sigma _{{\xi _k}}^2\) would represent the mean-squared prediction error of the univariate model. Such an estimator is given by

$$\hat\sigma_k^2=\frac{{s_k^2}} {{N-d-(d+1)}},$$

where \(d + 1\) is the number of estimated coefficients in Eq. (12.1). The model order d is selected large enough to provide delta correlatedness of the residual errors. For automatic choice of d, one often uses criteria of Akaike (1974) or Schwarz (1978).
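The least-squares fit of Eq. (12.1) and the unbiased variance estimate can be sketched as follows. This is a minimal illustration assuming NumPy is available; the function name `fit_ar` and its return convention are our own choices and are not taken from the cited works.

```python
import numpy as np

def fit_ar(x, d):
    """Least-squares fit of x(t) = A_0 + sum_i A_i x(t-i) + xi(t), Eq. (12.1).

    Returns (coef, s2, sigma2_hat): the d+1 estimated coefficients,
    the minimal sum of squared residuals s_k^2 and the unbiased
    estimate of the noise variance.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Design matrix: a column of ones followed by lags 1..d of x.
    cols = [np.ones(n - d)] + [x[d - i:n - i] for i in range(1, d + 1)]
    design = np.column_stack(cols)
    target = x[d:]                       # the N - d predicted values
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    resid = target - design @ coef
    s2 = float(resid @ resid)
    # N - d data points minus d + 1 estimated coefficients.
    sigma2_hat = s2 / (n - d - (d + 1))
    return coef, s2, sigma2_hat
```

Calling `fit_ar(x, d)` for several trial values of `d` and inspecting the residual autocorrelation is one simple way to check that the chosen order is large enough.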

Then, one similarly constructs a bivariate AR model:

$$\begin{array}{l} {x_1}(t)={a_{1,0}}+\sum\limits_{i=1}^d{{a_{1,i}}{x_1}(t-i)}+\sum\limits_{i=1}^d{{b_{1,i}}{x_2}(t-i)}+{\eta_1}(t), \\ {x_2}(t)={a_{2,0}}+\sum\limits_{i=1}^d{{a_{2,i}}{x_2}(t-i)}+\sum\limits_{i=1}^d{{b_{2,i}}{x_1}(t-i)}+{\eta_2}(t). \\ \end{array}$$
(12.2)

where \(\eta_k\) are Gaussian white noises. Minimal values of the sums of squared residual errors are denoted \(s_{\left. 1 \right|2}^2\) and \(s_{\left. 2 \right|1}^2\) for the first and second processes, respectively. Unbiased estimators for the residual error variances are denoted \(\hat \sigma _{\left. 1 \right|2}^2\) and \(\hat \sigma _{\left. 2 \right|1}^2\). The prediction improvement for \(x_k\), i.e. the quantity \({\mathrm{P}}{{\mathrm{I}}_{j \to k}} = \hat \sigma _k^2 - \hat \sigma _{\left. k \right|j}^2\), characterises the influence of the process \(x_j\) on \(x_k\) (denoted further as \(j \to k\)).

Note that \({\mathrm{P}}{{\mathrm{I}}_{j \to k}}\) is an estimate obtained from a time series. To define a theoretical (true) prediction improvement \({\mathrm{PI}}_{j \to k}^{{\mathrm{true}}} = \sigma _k^2 - \sigma _{\left. k \right|j}^2\), one should minimise the expectations of the squared prediction errors instead of the empirical sums \(\Sigma _k^2\) and \(\Sigma _{k\left| j \right.}^2\) to get model coefficients, i.e. one should use an ensemble averaging or an averaging over an infinitely long-time realisation instead of the averaging over a finite time series. For uncoupled processes, one has \({\mathrm{PI}}_{j \to k}^{{\mathrm{true}}} = 0\), but the estimator \({\mathrm{P}}{{\mathrm{I}}_{j \to k}}\) can take positive values due to random fluctuations. Therefore, one needs a criterion to decide whether an obtained positive value of \({\mathrm{P}}{{\mathrm{I}}_{j \to k}}\) implies the presence of the influence \(j \to k\). It can be shown that the quantity

$${F_{j\to k}}=\frac{{(N-3d-1) \left(s_k^2-s_{\left. k\right|j}^2\right)}} {{s_{\left. k\right|j}^2d}}$$

is distributed according to Fisher’s F-law with \((d,N - 3d - 1)\) degrees of freedom. Hence, one can conclude that \({\mathrm{PI}}_{j \to k}^{{\mathrm{true}}} > 0\) and the influence \(j \to k\) exists at the significance level p (i.e. with the probability of random error not greater than p) if the value of \({F_{j \to k}}\) exceeds the \((1-p)\) quantile of the respective F-distribution. This is called the F-test or Granger–Sargent test (see, e.g., Hlavackova-Schindler et al., 2007).
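The estimation of \({\mathrm{P}}{{\mathrm{I}}_{j \to k}}\) and \({F_{j \to k}}\) for a common order d can be sketched as follows. This is a minimal illustration assuming NumPy; the helper names `lagged`, `min_sq_error` and `granger_f` are hypothetical. The quantile of the F-distribution is not computed here; it could be obtained, for instance, from `scipy.stats.f.ppf(1 - p, d, N - 3*d - 1)` if SciPy is installed.

```python
import numpy as np

def lagged(x, d):
    """Columns x(t-1), ..., x(t-d) for t = d, ..., len(x)-1 (0-based)."""
    n = len(x)
    return np.column_stack([x[d - i:n - i] for i in range(1, d + 1)])

def min_sq_error(design, target):
    """Minimal sum of squared residuals of an ordinary least-squares fit."""
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    resid = target - design @ coef
    return float(resid @ resid)

def granger_f(x_k, x_j, d):
    """Return (PI_{j->k}, F_{j->k}) for the linear models (12.1) and (12.2)."""
    x_k = np.asarray(x_k, dtype=float)
    x_j = np.asarray(x_j, dtype=float)
    n = len(x_k)
    target = x_k[d:]
    ones = np.ones((n - d, 1))
    uni = np.hstack([ones, lagged(x_k, d)])      # univariate model (12.1)
    biv = np.hstack([uni, lagged(x_j, d)])       # bivariate model (12.2)
    s2_uni = min_sq_error(uni, target)
    s2_biv = min_sq_error(biv, target)
    # Unbiased variance estimates: N - d points, d + 1 or 2d + 1 coefficients.
    pi = s2_uni / (n - 2 * d - 1) - s2_biv / (n - 3 * d - 1)
    f_stat = (n - 3 * d - 1) * (s2_uni - s2_biv) / (s2_biv * d)
    return pi, f_stat
```

The influence \(j \to k\) is then inferred if the returned F value exceeds the chosen quantile; swapping the two arguments gives the test for the opposite direction.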

If a time series is short, it is problematic to use high values of d, since the number of the estimated coefficients is then large, which often leads to insignificant conclusions even in cases of really existing couplings. The difficulty can be overcome in part if one constructs a bivariate model in the form

$${x_k}(t)={a_{k,0}}+\sum\limits_{i=1}^{{d_k}}{{a_{k,i}}{x_k}(t-i)}+\sum\limits_{i=1}^{{d_{j \to k}}}{{b_{k,i}}{x_j}(t-i-{\Delta_{j \to k}})},$$
(12.3)

where \(j,k = 1,2,\;j \neq k\), and selects a separate univariate model order \(d_k\) for each process instead of the common d in Eq. (12.2), a separate value of \({d_{j \to k}}\) and a separate trial delay time \({\Delta _{j \to k}}\). If at least some of the values \(d_k\) and \({d_{j \to k}}\) can be made small, then the number of the estimated coefficients is reduced.

If one needs non-linear models, the difficulty becomes even more severe due to the curse of dimensionality. In a non-linear case, the procedure of coupling estimation remains the same, but the AR models must involve non-linear functions. Thus, univariate AR models take the form

$${x_k}(t)={f_k}({x_k}(t-1),{x_k}(t-2),\ldots,{x_k}(t-{d_k}),{{\mathbf{A}}_k})+{\xi_k}(t),$$
(12.4)

where it is important to choose properly the kind of the non-linear functions \(f_k\). Algebraic polynomials (Mokhov and Smirnov, 2006), radial basis functions (Ancona et al., 2004) and locally constant predictors (Feldmann and Bhattacharya, 2004) have been used. For relatively short time series, it is reasonable to use polynomials \(f_k\) of low orders \(P_k\). Bivariate models are then constructed in the form (12.2), where the linear functions are replaced with polynomials of the order \(P_k\). Yet, there is no regular procedure assuring an appropriate choice of the non-linear functions.

If the number of processes \(M > 2\), then estimation of the influence \(j \to k\) can be performed in two ways:

(i) Bivariate analysis of \(x_j\) and \(x_k\) results in an estimator, which reflects both a “direct” influence \(j \to k\) and that mediated by other observed processes.

(ii) Multivariate analysis takes into account all the M processes and allows one to distinguish between the influences on \(x_k\) from different processes \(x_j\). Namely, one computes a squared prediction error for \(x_k\) when a multivariate AR model containing all the processes except for \(x_j\) is used. Then, one computes such an error for a multivariate AR model containing all the M processes including \(x_j\). If the predictions are more accurate in the latter case, one infers the presence of the direct influence \(j \to k\).

To express prediction improvements in relative units, one normalises \({\mathrm{P}}{{\mathrm{I}}_{j \to k}}\) by the variance \({\mathrm{var}}[{x_k}]\) of the process \(x_k\) or by the variance \(\hat \sigma _k^2\) of the prediction error of the univariate model (12.1). The quantity \({{{\mathrm{P}}{{\mathrm{I}}_{j \to k}}} / {\hat \sigma _k^2}}\) is used more often than \({{{\mathrm{P}}{{\mathrm{I}}_{j \to k}}} / {{\mathrm{var}}[{x_k}]}}\). Neither quantity exceeds one, and one may hope to give them a vivid interpretation. Thus, \({{{\mathrm{P}}{{\mathrm{I}}_{j \to k}}} / {\hat \sigma _k^2}}\) is close to unity if the influence \(j \to k\) describes almost all “external factors” \(\xi_k\) unexplained by the univariate model of \(x_k\). The quantity \({{{\mathrm{P}}{{\mathrm{I}}_{j \to k}}} / {{\mathrm{var}}[{x_k}]}}\) is close to unity if in addition the univariate model (12.1) explains a negligible part of the variance \({\mathrm{var}}[{x_k}]\), i.e. \(\hat \sigma _k^2 \approx \operatorname{var} [{x_k}]\). These interpretations are often appropriate, even though they may be insufficient to characterise the importance of the influence \(j \to k\) from the viewpoint of long-term changes in the dynamics.

2 Phase Dynamics Modelling

A general idea of the approach is that such a characteristic as “intensity of coupling” between two oscillatory processes shows how strongly a future evolution of an oscillator phase depends on the current value of the other oscillator phase (Rosenblum and Pikovsky, 2001). In fact, it is similar to the Granger causality, since bivariate models for the phases are constructed to characterise couplings. It makes sense to model such variables as phases, since they are often especially sensitive to weak perturbations as known from the synchronisation theory (see, e.g., Pikovsky et al., 2001).

Phase dynamics of weakly coupled (deterministic) limit-cycle oscillators with close natural frequencies can be to a good approximation described with a set of ordinary differential equations (Kuramoto, 1984):

$$\begin{gathered} {\mathrm{d}}{\phi_1}/{\mathrm{d}}t={\omega_1}+{H_1}({\phi_2}-{\phi_1}), \\ {\mathrm{d}}{\phi_2}/{\mathrm{d}}t={\omega_2}+{H_2}({\phi_1}-{\phi_2}), \\ \end{gathered}$$
(12.5)

where \(\phi_k\) are phases of the oscillators, \(\omega_k\) are their natural frequencies and \(H_k\) are coupling functions. Model (12.5) does not apply if the phase dynamics of the oscillators is perturbed by noise (a typical situation in practice) or if the coupling functions depend on the phases in a more complicated manner than only via the phase difference, due to strong non-linearities in the systems and their interactions. Yet, if the noise level is low, the model can be generalised in a straightforward manner. One comes to stochastic differential equations (Kuramoto, 1984; Rosenblum et al., 2001)

$$\begin{gathered} {\mathrm{d}}{\phi_1}/{\mathrm{d}}t={\omega_1}+{G_1}({\phi_1},{\phi_2})+{\xi_1}(t), \\ {\mathrm{d}}{\phi_2}/{\mathrm{d}}t={\omega_2}+{G_2}({\phi_2},{\phi_1})+{\xi_2}(t), \\ \end{gathered}$$
(12.6)

where \(\omega_k\) are not necessarily close to each other, \(\xi_k\) are independent zero-mean white noises with autocorrelation functions \(\left\langle {{\xi _k}(t){\xi _k}(t^{\prime})} \right\rangle = \sigma _{{\xi _k}}^2\delta(t - t^{\prime})\), \(\delta\) is the Dirac delta function and \(\sigma _{{\xi _k}}^2\) characterise noise intensities. The functions \(G_k\) are \(2\pi\)-periodic with respect to both arguments and describe both couplings between the oscillators and their individual phase non-linearity.

Let \(\sigma _{{\xi _k}}^2\) and \(\left| {{G_k}} \right|\) be reasonably small so that the contribution of the respective terms in Eq. (12.6) to the phase increment \({\phi _k}(t + \tau ) - {\phi _k}(t)\) is small in comparison with the “linear increment” \({\omega _k}\tau\), where the finite time interval τ is of the order of the basic oscillation period. Then, by integrating Eq. (12.6) over the interval τ, one converts to difference equations and gets

$${\phi_k}(t+\tau)-{\phi_k}(t)={F_k}({\phi_k}(t),{\phi_j}(t),{{\mathbf{a}}_k})+{\varepsilon_k}(t),$$
(12.7)

where \(k,j = 1,2,\ j \neq k\), \({\varepsilon _k}\) are zero-mean noises and \(F_k\) are trigonometric polynomials

$${F_k}({\phi_k},{\phi_j},{{\mathbf{a}}_k})={w_k}+\sum\limits_{(m,n) \in{\Omega_k}}{\left({{\alpha_{k,m,n}}\,{\mathrm{cos}}(m{\phi_k}-n{\phi_j})+{\beta_{k,m,n}}\,{\mathrm{sin}}(m{\phi_k}-n{\phi_j})} \right),}$$
(12.8)

\({{\mathbf{a}}_k} = ({w_k},{\{ {\alpha _{k,m,n}},{\beta _{k,m,n}}\} _{(m,n) \in {\Omega _k}}})\) are vectors of their coefficients and \(\Omega_k\) are summation ranges, i.e. sets of pairs \((m,n)\) defining which monomials are contained in \(F_k\). The terms with \(m = n = 1\) can be induced by a linear coupling of the form \(k x_j\) or \(k({x_j} - {x_k})\) in some “original equations” for the oscillators. The terms with \(n = 2\) can be due to a driving force, which is quadratic with respect to the coordinate of the driving oscillator, e.g. \(k x_j^2\). Various combinations are also possible so that the couplings in the phase dynamics equations (12.7) can be described with a set of monomials of different orders with \(n \neq 0\). The strongest influence arises from the so-called “resonant terms”, which correspond to the ratios \({m / n} \approx {{{\omega _j}} / {{\omega _k}}}\) in the equation for the kth oscillator phase. However, non-resonant terms can also be significant.

Intensity of the influence \(j \to k\) can be reasonably defined via the mean-squared value of the partial derivative \({{\partial {F_k}({\phi _k},{\phi _j},{{\mathbf{a}}_k})} / {\partial {\phi _j}}}\) (Rosenblum and Pikovsky, 2001; Smirnov and Bezruchko, 2003):

$$c_{j \to k}^2=\frac{1}{2\pi^2}\int\limits_0^{2\pi}\int\limits_0^{2\pi}\left(\partial F_k(\phi_k,\phi_j,\mathbf{a}_k)/\partial\phi_j\right)^2\,\mathrm{d}\phi_j\,\mathrm{d}\phi_k.$$
(12.9)

Indeed, the value of \(c_{j\to k}^{2}\) depends only on the terms with \(n \neq 0\) and reads as (Smirnov and Bezruchko, 2003)

$$c_{j \to k}^2=\sum\limits_{(m,n) \in{\Omega_k}}{{n^2}\left({\alpha_{k,m,n}^2+\beta_{k,m,n}^2} \right)}.$$
(12.10)

This is a theoretical coupling characteristic which can be computed if the polynomial coefficients in Eq. (12.8) are known.
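The equivalence of the definition (12.9) and the closed form (12.10) can be checked numerically. The sketch below uses only the Python standard library; the dictionary layout of the coefficients and both function names are our own illustrative assumptions.

```python
import math

def coupling_strength(coeffs):
    """c^2_{j->k} from Eq. (12.10); coeffs maps (m, n) -> (alpha, beta)."""
    return sum(n * n * (a * a + b * b) for (m, n), (a, b) in coeffs.items())

def coupling_strength_numeric(coeffs, grid=64):
    """Midpoint-rule evaluation of the double integral in Eq. (12.9)."""
    h = 2.0 * math.pi / grid
    total = 0.0
    for p in range(grid):
        phi_k = (p + 0.5) * h
        for q in range(grid):
            phi_j = (q + 0.5) * h
            # Partial derivative of F_k in Eq. (12.8) with respect to phi_j.
            deriv = sum(
                n * (a * math.sin(m * phi_k - n * phi_j)
                     - b * math.cos(m * phi_k - n * phi_j))
                for (m, n), (a, b) in coeffs.items()
            )
            total += deriv * deriv
    return total * h * h / (2.0 * math.pi ** 2)
```

For example, with the coefficients `{(1, 1): (0.1, 0.2), (1, -1): (0.05, 0.0), (1, 0): (0.3, 0.1)}` both functions give approximately 0.0525: the term with \(n = 0\) does not contribute, as stated above.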

In practice, one has only a time series of observed quantities \(x_1\) and \(x_2\) representing two oscillatory processes. So, one first extracts a time series of the phases \({\phi _1}(t)\) and \({\phi _2}(t)\) from the observed data with any of the existing techniques (Sect. 6.4.3). Then, one estimates the coupling characteristics by fitting the phase dynamics equations (12.7) with the functions (12.8) to the time series of the phases. For that, one can use the ordinary least-squares technique (Sect. 7.1.1) to get the estimates \({{\hat{\textbf{a}}}_k}\) of the coefficients (Rosenblum and Pikovsky, 2001), i.e. one minimises the values of

$$\hat\sigma_k^2(\mathbf{a}_k)=\frac{1}{N-\tau/\Delta t}\sum\limits_{n=\tau/\Delta t+1}^{N}\left(\phi_k(n\Delta t+\tau)-\phi_k(n\Delta t)-F_k(\phi_k(n\Delta t),\phi_j(n\Delta t),\mathbf{a}_k)\right)^2,$$

where \(k,j = 1,\;2,\;j \neq k\). The estimates can be written as \({{\hat{\textbf{a}}}_k} = {\mathrm{arg}}\mathop {{\mathrm{min}}}\limits_{{{\mathbf{a}}_k}} \hat \sigma _k^2({{\mathbf{a}}_k})\). The minimal value \(\hat \sigma _k^2 = \mathop {{\mathrm{min}}}\limits_{{{\mathbf{a}}_k}} \hat \sigma _k^2({{\mathbf{a}}_k})\) characterises the noise level. The most direct way to estimate the coupling strengths \(c_{j\to k}\) is to use the expression (12.10) and replace the true values \({{\mathbf{a}}_k}\) with the estimates \({{\hat{\textbf{a}}}_k}\). Thereby, one gets the estimator \(\hat{c}_{j\to k}^{2}=\sum\limits_{(m,n)\in\Omega_{k}}n^{2}\left(\hat{\alpha}_{k,m,n}^{2}+\hat{\beta}_{k,m,n}^{2}\right)\).
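The fitting step can be sketched as follows, assuming NumPy, unwrapped phase series sampled with a constant step and an integer ratio \(\tau / \Delta t\). The function name `fit_phase_model` and its argument layout are hypothetical; the sketch is not the implementation used in the cited works.

```python
import numpy as np

def fit_phase_model(phi_k, phi_j, tau_steps, terms):
    """Least-squares fit of model (12.7) with the polynomial (12.8).

    phi_k, phi_j : unwrapped phase series with sampling step dt
    tau_steps    : tau / dt as a positive integer
    terms        : list of (m, n) pairs forming the set Omega_k
    Returns (w_hat, coeffs, noise_level, c2_hat), where coeffs maps
    (m, n) -> (alpha_hat, beta_hat).
    """
    phi_k = np.asarray(phi_k, dtype=float)
    phi_j = np.asarray(phi_j, dtype=float)
    incr = phi_k[tau_steps:] - phi_k[:-tau_steps]   # phi_k(t + tau) - phi_k(t)
    pk, pj = phi_k[:-tau_steps], phi_j[:-tau_steps]
    cols = [np.ones_like(pk)]
    for m, n in terms:
        arg = m * pk - n * pj
        cols += [np.cos(arg), np.sin(arg)]
    design = np.column_stack(cols)
    a_hat, *_ = np.linalg.lstsq(design, incr, rcond=None)
    resid = incr - design @ a_hat
    noise_level = float(np.mean(resid ** 2))        # minimal mean-squared residual
    coeffs = {mn: (float(a_hat[1 + 2 * i]), float(a_hat[2 + 2 * i]))
              for i, mn in enumerate(terms)}
    # Plug-in estimator of c^2_{j->k}, i.e. Eq. (12.10) with estimated coefficients.
    c2_hat = sum(n * n * (a * a + b * b) for (m, n), (a, b) in coeffs.items())
    return float(a_hat[0]), coeffs, noise_level, c2_hat
```

Calling the function twice with the roles of the two phase series exchanged gives the estimates for both directions, \(2 \to 1\) and \(1 \to 2\).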

Sensitivity of the technique to weak couplings was demonstrated numerically (Rosenblum and Pikovsky, 2001). The estimator \(\hat{c}_{j\to k}\) appears “good” for long and stationary signals, whose length should be about several hundreds of basic periods under moderate noise level. The technique has already given interesting results for a complex real-world process, where such data are available, namely for the interaction between human respiratory and cardio-vascular systems (Rosenblum et al., 2002). It appears that the character of the interaction in infants changes with their age from an almost symmetric coupling to a predominant influence of the respiratory system on the cardio-vascular one.

Application of the technique in practice encounters essential difficulties when time series are non-stationary. For instance, it is important to characterise an interaction between different brain areas from EEG recordings. However, their quasi-stationary intervals last for about a dozen seconds, i.e. comprise not more than 100 basic periods for pathological (epileptic or Parkinsonian) oscillatory behaviour. One could then divide a long time series into quasi-stationary segments and compute coupling characteristics from each segment separately. However, for a time series of such a moderate length, the estimators \(\hat{c}_{j\to k}\) typically turn out to be biased. The reasons are described in Smirnov and Bezruchko (2003), where corrected estimators \({\gamma _{j \to k}}\) for the quantities \(c_{j \to k}^2\) are suggested:

$${\gamma_{j\to k}}=\sum\limits_{(m,n)\in{\Omega_k}}{{n^2}\left({\hat\alpha_{k,m,n}^2+\hat\beta_{k,m,n}^2-2\hat\sigma_{{{\hat\alpha}_{k,m,n}}}^2}\right)},$$

where the estimates of the variances \(\hat \sigma _{{{\hat \alpha }_{k,m,n}}}^2\) of the coefficient estimates \({\hat \alpha _{k,m,n}}\) are derived in the form

$$\hat\sigma_{\hat\alpha_{k,m,n}}^2=\frac{2\hat\sigma_k^2}{N-\tau/\Delta t}\left\{1+2\sum\limits_{l=1}^{\tau/\Delta t}\left(1-\frac{l}{\tau/\Delta t}\right)\cos\left[\frac{l\left(m\hat w_k+n\hat w_j\right)}{\tau/\Delta t}\right]\exp\left[-\frac{l\left(m^2\hat\sigma_k^2+n^2\hat\sigma_j^2\right)}{2\tau/\Delta t}\right]\right\}.$$

95% confidence bands for the coupling strengths \(c_{j\to k}^{2}\) are derived in Smirnov and Bezruchko (2003) for the case of trigonometric polynomials \(F_k\) of the third order (namely, for the set \(\Omega_k\), which includes the pairs of indices \(m=n=1\); \(m=1,n=-1\); \(m=1,n=0\); \(m=2,n=0\) and \(m = 3,n = 0\)) in the form \([{\gamma _{j \to k}} - 1.6{\hat \sigma _{{\gamma _{j \to k}}}},{\gamma _{j \to k}} + 1.8{\hat \sigma _{{\gamma _{j \to k}}}}]\), where the estimates of the standard deviations \({\hat \sigma _{{\gamma _{j \to k}}}}\) are computed from the same short time series as

$$\hat\sigma_{\gamma_{j\to k}}^2=\begin{cases} 2\sum\limits_{m,n}n^4\hat\sigma_{\hat\alpha_{k,m,n}^2}^2, & \gamma_{j\to k}\geqslant 5\sqrt{\sum\limits_{m,n}2n^4\hat\sigma_{\hat\alpha_{k,m,n}^2}^2}, \\ \sum\limits_{m,n}n^4\hat\sigma_{\hat\alpha_{k,m,n}^2}^2, & {\mathrm{otherwise,}} \end{cases}$$

and the estimate of the variance of the squared coefficient estimate is given as

$$\hat\sigma_{\hat\alpha_{k,m,n}^2}^2=\begin{cases} 2\hat\sigma_{\hat\alpha_{k,m,n}}^4+4\left(\hat\alpha_{k,m,n}^2-\hat\sigma_{\hat\alpha_{k,m,n}}^2\right)\hat\sigma_{\hat\alpha_{k,m,n}}^2, & \hat\alpha_{k,m,n}^2-\hat\sigma_{\hat\alpha_{k,m,n}}^2\geqslant 0, \\ 2\hat\sigma_{\hat\alpha_{k,m,n}}^4, & {\mathrm{otherwise.}} \end{cases}$$

The value of \({\gamma _{j \to k,c}} = 1.6{\hat \sigma _{{\gamma _{j \to k}}}}\) represents a 0.975 quantile for the distribution of the estimator \({\gamma _{j \to k}}\) in the case of uncoupled processes. Hence, the presence of the influence can be inferred at the significance level 0.025 (i.e. with a probability of random error not more than 0.025) if it appears that \({\gamma _{j \to k}} > {\gamma _{j \to k,c}}\). The technique has been compared to other non-linear coupling analysis techniques in Smirnov and Andrzejak (2005) and Smirnov et al. (2007), where its superiority is shown for sufficiently regular oscillatory processes.
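The chain of formulas for \({\gamma _{j \to k}}\), \({\hat \sigma _{{\gamma _{j \to k}}}}\) and the threshold \(1.6{\hat \sigma _{{\gamma _{j \to k}}}}\) can be transcribed into code as shown below. This is only a literal sketch of the expressions as they are printed above, using the Python standard library. The function names and the argument layout are hypothetical, and the variance of the squared coefficient is evaluated for \(\hat\alpha_{k,m,n}\) only, exactly as in the printed formula. The original derivation in Smirnov and Bezruchko (2003) should be consulted before any serious use.

```python
import math

def coeff_variance(m, n, w_k, w_j, s2_k, s2_j, n_samples, tau_steps):
    """Variance estimate of a coefficient estimate (formula printed above)."""
    acc = 1.0
    for l in range(1, tau_steps + 1):
        frac = l / tau_steps
        acc += (2.0 * (1.0 - frac)
                * math.cos(frac * (m * w_k + n * w_j))
                * math.exp(-0.5 * frac * (m * m * s2_k + n * n * s2_j)))
    return 2.0 * s2_k / (n_samples - tau_steps) * acc

def corrected_coupling(coeffs, w_k, w_j, s2_k, s2_j, n_samples, tau_steps):
    """Return (gamma, sigma_gamma, threshold) for the influence j -> k.

    coeffs maps (m, n) -> (alpha_hat, beta_hat); w_k, w_j are the estimated
    constant terms and s2_k, s2_j the minimal mean-squared residuals of the
    two phase models; tau_steps is tau / dt as an integer.
    """
    gamma = 0.0
    base = 0.0   # sum over (m, n) of n^4 * var(alpha_hat^2)
    for (m, n), (alpha, beta) in coeffs.items():
        var_a = coeff_variance(m, n, w_k, w_j, s2_k, s2_j, n_samples, tau_steps)
        gamma += n * n * (alpha ** 2 + beta ** 2 - 2.0 * var_a)
        excess = alpha ** 2 - var_a
        var_a2 = 2.0 * var_a ** 2
        if excess >= 0:
            var_a2 += 4.0 * excess * var_a
        base += n ** 4 * var_a2
    var_gamma = 2.0 * base if gamma >= 5.0 * math.sqrt(2.0 * base) else base
    sigma_gamma = math.sqrt(var_gamma)
    return gamma, sigma_gamma, 1.6 * sigma_gamma
```

The influence \(j \to k\) would then be inferred when the returned `gamma` exceeds the returned threshold.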

If directional couplings between processes are expected to be time-delayed, the technique can be readily generalised (Cimponeriu et al., 2004). Namely, one constructs the phase dynamics model in the form

$${\phi_k}(t+\tau)-{\phi_k}(t)={F_k}({\phi_k}(t),{\phi_j}(t-{\Delta_{j \to k}}),{{\mathbf{a}}_k})+{\varepsilon_k}(t),$$
(12.11)

where \(k,j = 1,2,\;j \neq k\), and \({\Delta _{j \to k}}\) is a trial delay time in the influence \(j \to k\). One gets coupling estimates and their standard deviations depending on the trial delay: \({\gamma _{j \to k}}({\Delta _{j \to k}})\) and \({\hat \sigma _{{\gamma _{j \to k}}}}({\Delta _{j \to k}})\). Then, one selects the trial delay corresponding to the largest value of \({\gamma _{j \to k}}\), which significantly exceeds zero (if such a value of \({\gamma _{j \to k}}\) exists), i.e. exceeds \(\gamma_{j\to k,c}(\Delta_{j\to k})\). Thereby, one also gets an estimate of the delay time.
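The selection of the trial delay can be sketched as a simple scan, assuming that the values \({\gamma _{j \to k}}({\Delta _{j \to k}})\) and \({\hat \sigma _{{\gamma _{j \to k}}}}({\Delta _{j \to k}})\) have already been computed for each trial delay. The function name `select_delay` and the default factor 1.6 (the threshold used above) are our own illustrative choices.

```python
def select_delay(delays, gammas, sigmas, factor=1.6):
    """Return (delay, gamma) for the largest significant gamma, or None.

    A value of gamma is treated as significant if it exceeds
    factor * sigma for the same trial delay.
    """
    best = None
    for delay, gamma, sigma in zip(delays, gammas, sigmas):
        if gamma > factor * sigma and (best is None or gamma > best[1]):
            best = (delay, gamma)
    return best
```

A return value of `None` corresponds to the case where no trial delay gives a coupling estimate that significantly exceeds zero.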

The phase dynamics modelling technique is applicable if couplings are not very strong so that the degree of synchrony between the oscillators is low. This condition can be checked, e.g., via the estimation of the phase synchronisation index (Sect. 6.4.5), also called mean phase coherence (Mormann et al., 2000): \(\rho (\Delta ) = \left| {{{\left\langle {{\mathrm{exp}}\left( {{\mathrm{i}}({\phi _1}(t) - {\phi _2}(t + \Delta ))} \right)} \right\rangle }_t}} \right|\). This quantity ranges from zero to one. The estimators \({\gamma _{2 \to 1}}({\Delta _{2 \to 1}})\) and \({\gamma _{1 \to 2}}({\Delta _{1 \to 2}})\) with their confidence bands can be considered reliable if the values of \(\rho ( - {\Delta _{2 \to 1}})\) and \(\rho ({\Delta _{1 \to 2}})\) are less than 0.45 (Mokhov and Smirnov, 2006). The second condition of applicability is a sufficient length of the time series: not less than 40 basic periods (Mokhov and Smirnov, 2006; Smirnov and Bezruchko, 2003). Finally, the autocorrelation function of the residual errors for the models (12.7) or (12.11) should decrease to zero over the interval of time lags (0, τ) to confirm the appropriateness of the basic model (12.6) with white noises.
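The mean phase coherence \(\rho(\Delta)\) can be estimated from two phase series as sketched below, using only the Python standard library. The function name and the convention that the shift is given in samples (and may be negative) are assumptions made for this illustration; both series are assumed to have the same length.

```python
import cmath

def mean_phase_coherence(phi_1, phi_2, shift=0):
    """|<exp(i(phi_1(t) - phi_2(t + shift)))>_t| with the shift in samples."""
    n = len(phi_1)
    # Keep only the times t for which both t and t + shift are valid indices.
    start, stop = max(0, -shift), min(n, n - shift)
    if stop <= start:
        raise ValueError("shift is too large for the series length")
    total = sum(cmath.exp(1j * (phi_1[t] - phi_2[t + shift]))
                for t in range(start, stop))
    return abs(total) / (stop - start)
```

A value close to one indicates a nearly constant phase difference, whereas a phase difference that drifts uniformly over many cycles gives a value close to zero.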

The corrected estimators \({\gamma _{j \to k}}\) are used for the analysis of two-channel EEG in a patient with temporal lobe epilepsy in Smirnov et al. (2005). Their further real-world applications are described in Sects. 12.3, 12.5 and 13.2.

3 Brain–Limb Couplings in Parkinsonian Resting Tremor

Many neurological diseases including epilepsy and Parkinson’s disease are related to pathological synchronisation of large groups of neurons in the brain. Synchronisation of neurons in nuclei of the thalamus and basal ganglia is a hallmark of Parkinson’s disease (Nini et al., 1995). However, its functional role in the generation of Parkinsonian tremor (involuntary regular oscillations of limbs at a frequency ranging from 3 to 6 Hz) is still a matter of debate (Rivlin-Etzion et al., 2006). In particular, the hypothesis that the neural synchronisation drives the tremor has not yet received convincing empirical confirmation (Rivlin-Etzion et al., 2006). The standard therapy for medically refractory Parkinson’s disease is permanent electrical deep brain stimulation (DBS) at high frequencies (greater than 100 Hz) (Benabid et al., 1991). Standard DBS has been developed empirically; its mechanism of action is unclear (Benabid et al., 2005) and it has relevant limitations, e.g. side effects (Tass et al., 2003; Tass and Majtanik, 2006). It has been suggested to specifically counteract the pathological cerebral synchrony by desynchronising DBS (Tass, 1999), e.g. with coordinated reset stimulation (Tass, 2003). Verifying that the tremor is generated by synchronised neural activity in the thalamus and the basal ganglia would further justify and strengthen the desynchronisation approach (Tass, 1999; 2003) and help to develop therapies, which may presumably be milder and lead to fewer side effects. Therefore, detecting couplings between limb oscillations and the activity of different brain areas in Parkinsonian patients is a topical problem.

We have analysed more than 40 epochs of spontaneous Parkinsonian tremor recorded in three patients with Parkinson’s disease (Bezruchko et al., 2008; Smirnov et al., 2008). Limb oscillations are represented by accelerometer signals recorded at sampling frequencies 200 Hz or 1 kHz. Information about the brain activity is represented by the recordings of local field potentials (LFPs) from a depth electrode implanted into the thalamus or the basal ganglia. The data are obtained at the Department of Stereotaxic and Functional Neurosurgery, University of Cologne, and at the Institute of Neuroscience and Biophysics – 3, Research Centre Juelich, Germany.

Accelerometer and LFP signals during an interval of strong Parkinsonian tremor are presented in Fig. 12.1 along with their power spectra. One can see oscillations in the accelerometer signal, which correspond to a peak in the power spectrum at the frequency of 5 Hz. The peak at the tremor frequency is seen in the LFP spectrum as well, even though it is wider. The phases of both signals can be unambiguously defined in the frequency band around the tremor frequency (e.g. 3–7 Hz). As a result of the phase dynamics modelling (Sect. 12.2), we have found a statistically significant influence of the limb oscillations on the brain activity with a delay time of not more than several tens of milliseconds. The influence of the brain activity on the limb oscillations is present as well and is characterised by a delay time of 200–400 ms, i.e. one to two basic tremor periods (Fig. 12.2). The results are well reproduced, both qualitatively and quantitatively, for all three patients (Fig. 12.3). Some details are given in Sect. 13.2.

Fig. 12.1 An interval of spontaneous Parkinsonian tremor (total duration of 36 s, only the starting 5 s are shown): (a, c) an accelerometer signal in arbitrary units and its power spectrum estimate; (b, d) an LFP recording from one of the electrodes and its power spectrum estimate

Fig. 12.2 Coupling estimates for the tremor epoch shown in Fig. 12.1 (dimensionless) versus a trial delay time: (a) brain → hand; (b) hand → brain. The phases are defined in the frequency band 3–7 Hz. Thin lines show the threshold values \(\gamma _{j \to k}^{\ast} = 1.6{\hat \sigma _{{\gamma _{j \to k}}}}\). The values of \({\gamma _{j \to k}}\) exceeding \(\gamma _{j \to k}^{\ast}\) differ from zero statistically significantly (at an overall significance level of \(p < 0.05\)). Thus, one observes an approximately zero delay time for the hand → brain influence and a delay time of about 335 ms for the brain → hand driving

Fig. 12.3 Estimates of coupling strengths in both directions: brain → hand (the left column) and vice versa (the right column) for the three patients (three rows) versus a trial delay time. Coupling estimates, averaged over ensembles of 10–15 intervals of strong tremor, are shown along with their averaged 95% confidence bands (Smirnov et al., 2008)

Surrogate data tests (Dolan and Neiman, 2002; Schreiber and Schmitz, 1996) confirm statistical significance of our conclusions as well. Moreover, they show that linear techniques cannot reveal the influence of the thalamus and basal ganglia activity on the limbs.

The influence of the limb on the brain has been detected earlier with the linear Granger causality (Eichler, 2006; Wang et al., 2007). However, the phase dynamics modelling provides a new result: the brain → hand influence is detected and its delay time is estimated. This delay is quite large compared to the conduction time of the neural pulses from the brain to the muscles. Therefore, it is interpreted (Smirnov et al., 2008) as a sign of an indirect influence (after processing of the signals in the cortex) of the thalamus or the basal ganglia activity on the limb oscillations. Besides, it means that nuclei of the thalamus and the basal ganglia are elements of “feedback loops”, which determine limb oscillations, rather than being just passive receivers of cerebral or muscle signals.

We have also estimated non-linear Granger causality for broadband accelerometer and LFP signals, rather than for the band-pass-filtered versions. It detects bidirectional couplings as well but does not give reliable estimates of the delay times. One reason can be that different time delays may correspond to different frequency bands leading to unclear results of the combined analysis.

An important area of further possible applications of the presented directionality analysis might be functional target point localisation diagnosis for an improvement of the depth electrode placement.

4 Couplings Between Brain Areas in Epileptic Rats

Over the years, electroencephalography has been widely used in clinical practice for the investigation, classification and diagnosis of epileptic disorders. The EEG provides valuable information in patients with typical and atypical epileptic syndromes and offers important prognostic information. Absence epilepsy, previously known as petit mal, is classically considered as non-convulsive generalised epilepsy of unknown aetiology. Clinically, absence seizures occur abruptly, last from several seconds up to a minute and are accompanied by a brief decrease of consciousness that interrupts normal behaviour. Absences may or may not be accompanied by facial automatisms, e.g. minimal jerks and twitches of facial muscles, and eye blinks. In humans, EEGs during typical absence seizures are characterised by the occurrence of generalised 3–5-Hz spike-and-wave complexes which have an abrupt onset and offset (Panayiotopoulos, 1997). Similar EEG paroxysms, spike-and-wave discharges (SWDs), appear in rat strains with a genetic predisposition to absence epilepsy, such as WAG/Rij (Wistar Albino Glaxo from Rijswijk) (Coenen and van Luijtelaar, 2003). The EEG waveform and duration (1–30 s, mean 5 s) of SWD in rats and in humans are comparable, but the frequency of SWD in rats is higher, 8–11 Hz (Midzianovskaia et al., 2001; van Luijtelaar and Coenen, 1986).

EEG coherence was used previously to measure neuronal synchrony between populations of thalamic and cortical neurons (Sitnikova and van Luijtelaar, 2006). The onset of SWD was characterised by an area-specific increase of coherence, which supported the idea that the cortico-thalamo-cortical circuitry is primarily involved in the initiation and propagation of SWD (Meeren et al., 2005; Steriade, 2005). However, the exact mechanism is unknown. Characteristics of directional couplings between different brain areas would provide useful information for uncovering it. Below, we describe our results on the estimation of interdependencies between local field potentials recorded simultaneously from the specific thalamus and the frontal cortex before, during and after SWD (Sitnikova et al., 2008).

Experiments were performed in five male 11–12-month-old WAG/Rij rats. The recordings were made at the Department of Biological Psychology, Radboud University of Nijmegen. EEGs were recorded from brain areas in which seizure activity is known to be the most robust: in the frontal cortex and in the ventroposteromedial (VPM) thalamic nucleus (Vergnes et al., 1987). EEG recordings were made in freely moving rats in a Faraday cage. Each recording session lasted from 5 to 7 h during the dark period of the day-night cycle. SWDs appeared in EEG as trains of stereotypic repetitive 7–10-Hz spikes and waves with high amplitude exceeding the background by more than three times. SWDs lasted longer than 1 s (Midzianovskaia et al., 2001; van Luijtelaar and Coenen, 1986). In total, 53, 111, 34, 33 and 63 epileptic discharges in five rats were detected and analysed.

As mentioned in Sect. 11.1, non-stationarity is an intrinsic feature of the EEG signal. Since the above coupling estimation techniques require stationary data, we divided the EEG recordings into relatively short epochs in which the EEG signal revealed quasi-stationary behaviour. A time window of 0.5 s seems to be a good choice. This duration corresponds to four spike-wave cycles. We report only results of the Granger causality estimation, since phase dynamics modelling gave no significant conclusions due to the shortness of quasi-stationary segments. Introduction of non-linearity (such as polynomials of the second and the third order) has no significant influence on the prediction quality of AR models before and after SWD. This suggests a predominance of linear causal relations in non-seizure EEG. In contrast, the seizure activity (SWD) exhibits a non-linear character. However, the construction of non-linear models for seizure-related processes is quite non-trivial. Thus, we present only the results of the linear analysis.

Prediction improvements are computed using EEG data from the frontal cortex \((x_{1})\) and from the VPM \((x_{2})\). The linear AR models (12.1) and (12.3) are used to calculate the coupling characteristics \(\mathit{PI}_{1\to 2}\) (FC → VPM) and \(\mathit{PI}_{2\to 1}\) (VPM → FC). EEG recordings during a typical SWD are shown in Fig. 12.4. Figure 12.5 shows a typical dependence of the prediction error \(\sigma_{1|2}^{2}\) on the dimensions \(d_{1}\) and \(d_{2\to 1}\) at \(\Delta_{2\to 1}=0\) for a 0.5-s interval of SWD. The error decreases when \(d_{1}\) and \(d_{2\to 1}\) rise from 1 to 5. It saturates at \(d_{1}=d_{2\to 1}=5\), which are taken as optimal values. The same dependence is observed for the error \(\sigma_{2|1}^{2}\). Introduction of non-zero delays \(\Delta_{j\to k}\) makes predictions worse; therefore, only zero delay times are used in the further analysis.
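The computation of a prediction improvement can be illustrated with a minimal sketch. It assumes that models (12.1) and (12.3) have the usual linear AR form (a constant, \(d_{1}\) own past values and \(d_{2\to 1}\) past values of the other signal) and that the prediction improvement is the difference between the residual variances of the individual and the joint model. The function names and the degrees-of-freedom correction are our illustrative choices, not taken from the original study.

```python
import numpy as np

def lagged_design(x1, x2, d1, d21, delay=0):
    """Regressors [1, x1(t-1..t-d1), x2(t-delay-1..t-delay-d21)] and targets x1(t)."""
    x1 = np.asarray(x1, float)
    x2 = np.asarray(x2, float)
    start = max(d1, d21 + delay)
    t = np.arange(start, len(x1))
    cols = [np.ones(len(t))]
    cols += [x1[t - i] for i in range(1, d1 + 1)]
    cols += [x2[t - delay - i] for i in range(1, d21 + 1)]
    return np.column_stack(cols), x1[t]

def residual_variance(X, y):
    """Least-squares fit; unbiased estimate of the residual variance."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return resid @ resid / (len(y) - X.shape[1])

def prediction_improvement(x1, x2, d1, d21, delay=0):
    """Return (PI_{2->1}, sigma_1^2) estimated on the same set of time points."""
    X_joint, y = lagged_design(x1, x2, d1, d21, delay)
    X_self = X_joint[:, : d1 + 1]          # constant and own past only
    s_self = residual_variance(X_self, y)
    s_joint = residual_variance(X_joint, y)
    return s_self - s_joint, s_self
```

For a 0.5-s epoch one would call `prediction_improvement(fc, vpm, 5, 5)` and `prediction_improvement(vpm, fc, 5, 5)` for the two directions, where `fc` and `vpm` are hypothetical arrays holding the two EEG segments.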

Fig. 12.4

EEG recordings of a spike-and-wave discharge in the frontal cortex (a, b) and in the specific ventroposteromedial thalamic nucleus (c, d). The panels (b) and (d) are magnified segments of the panels (a) and (c), respectively

Fig. 12.5

The prediction error of model (12.3) for the frontal cortex EEG recording (fitted to the middle half a second in Fig. 12.4b) versus \(d_{1}\) and \(d_{2\to 1}\)

The first and the last spike in spike-and-wave sequences are used to mark the onset and the offset of seizure activity. Estimation of the thalamus-to-cortex and cortex-to-thalamus influences is performed for the EEG epochs covering a seizure (SWD), 5 s before a seizure (pre-SWD) and 5 s after a seizure (post-SWD). The averaged results are illustrated in Fig. 12.6.

Fig. 12.6

Normalised prediction improvements averaged over all accessible SWDs for each animal versus the starting time instant of the moving window (0.5 s length). The onset and the offset of SWD are shown by vertical lines. The presence of SWD is associated with significant (and reversible) changes in the Granger causality in both directions. Surrogate data tests (dotted lines) are performed for each animal and confirm statistical significance of the Granger causality estimation results

Before the onset of SWD, couplings are weak and remain constant until SWD begins. The first SWD-related disturbances of \(\mathit{PI}_{j\to k}\) are observed about half a second before SWD onset. This effect is provoked by the seizure itself because the 0.5-s time window starts to capture the seizure activity. Still, the obtained values of \(\mathit{PI}_{1\to 2}\) and \(\mathit{PI}_{2\to 1}\) are statistically significantly greater than zero for the majority of analysed epochs both before and during SWD, at least at the level of \(p=0.05\), according to both the F-test and the surrogate data test (Schreiber and Schmitz, 1996). No changes in \(\mathit{PI}_{j\to k}\) are found earlier than 0.5 s before the SWD onset, suggesting that the quantities \(\mathit{PI}_{j\to k}\) are not capable of seizure prediction. The immediate onset of SWD is associated with a rapid growth in \(\mathit{PI}_{1\to 2}\) and \(\mathit{PI}_{2\to 1}\). The Granger causality characteristics reach their maximum within half a second after a seizure onset and remain high during the first 5 s of a seizure. The increase in couplings in both directions during an SWD as compared to pre-SWD epochs is significant. The ascending influence thalamus → cortex tends to be stronger in terms of the PI values than the descending influence cortex → thalamus. Moreover, the occurrence of an SWD is associated with a tendency for a larger increase in the thalamus → cortex influence than in the cortex → thalamus influence. The results are similar for all five rats analysed. Thus, bidirectional couplings between FC and VPM are always present, but the cortico-thalamo-cortical associations are reinforced during SWD.

Our results suggest that a reinforcement of predominant thalamus → cortex coupling accompanies the occurrence of an SWD, which can be interpreted as follows. In the described study, the EEG records were made in the areas where seizure activity is known to be the most robust (the frontal cortex and the VPM). It is important that direct anatomic connections between these structures are virtually absent, but both structures densely interconnect with the somatosensory cortex (Jones, 1985). As discussed in Meeren et al. (2002), the somatosensory cortex (the peri-oral region) contains an “epileptic focus” that triggers an SWD in WAG/Rij rats. The frontal EEGs are recorded rather far away from the “epileptic focus”. Several groups report similar results of their investigations with other methods: the cortex, indeed, does not lead the thalamus when the cortical electrode is relatively far from the peri-oral area of the somatosensory cortex (Inoue et al., 1993; Polack et al., 2007; Seidenbecher et al., 1998).

5 El Niño – Southern Oscillation and North Atlantic Oscillation

El Niño – Southern Oscillation (ENSO) and North Atlantic Oscillation (NAO) represent the leading modes of interannual climate variability for the globe and the Northern Hemisphere (NH), respectively (CLIVAR, 1998; Houghton et al., 2001). Different tools have been used for the analysis of their interaction, in particular, cross-correlation function (CCF) and Fourier and wavelet coherence for the sea surface temperature (SST) and sea-level pressure (SLP) indices (Jevrejeva et al., 2003; Pozo-Vazquez et al., 2001; Rogers, 1984; Wallace and Gutzler, 1981).

One often considers a NAO index defined as the normalised SLP difference between Azores and Iceland (Rogers, 1984; http://www.cru.uea.ac.uk). It is further denoted as \({\mathrm{NAOI_{cru}}}\). Alternatively, in http://www.ncep.noaa.gov, NAO is characterised as the leading decomposition mode of the field of 500 hPa geopotential height in the NH based on the “rotated principal component analysis” (Barnston and Livezey, 1987). It is denoted further as \({\mathrm{NAOI_{ncep}}}\). Hence, \({\mathrm{NAOI_{ncep}}}\) is a more global characteristic than \({\mathrm{NAOI_{cru}}}\). ENSO indices T(Niño-3), T(Niño-3,4), T(Niño-4) and T(Niño-1+2) characterise SST in the corresponding equatorial regions of the Pacific Ocean (see, e.g., Mokhov et al., 2004). Southern oscillation index (SOI) is defined as the normalised SLP difference between Tahiti and Darwin. All the signals are rather short, which makes confident inference about the character of interaction difficult. We have investigated interaction between ENSO and NAO in Mokhov (2006) with non-linear Granger causality and phase dynamics modelling. The results are described below.

Mainly, the period 1950–2004 (660 monthly values) is analysed. The indices \({\mathrm{NAOI_{cru}}}\) and \({\mathrm{NAOI_{ncep}}}\) for NAO and T(Niño-3), T(Niño-3,4), T(Niño-4), T(Niño-1+2) and SOI for ENSO are used. Longer time series for \({\mathrm{NAOI_{cru}}}\) (1821–2004), T(Niño-3) (1871–1997) and SOI (1866–2004) are also considered.

5.1 Phase Dynamics Modelling

Figure 12.7 demonstrates individual characteristics of the indices \({\mathrm{NAOI_{ncep}}}\) (Fig. 12.7a) and T(Niño-3,4) (Fig. 12.7d). Wavelet analysis of each signal \(x(t)\) is based on the wavelet transform

$$W(s,t)=\frac{1}{\sqrt{s}}\int\limits_{-\infty}^{\infty}x(t^{\prime})\,\Phi^{\ast}\left(\frac{t-t^{\prime}}{s}\right){\mathrm{d}}t^{\prime},$$
(12.12)

where \(\Phi (\eta ) = {\pi ^{{{ - 1} / 4}}}\left[{\mathrm{exp}}( - {\mathrm{i}}{\omega _0}\eta ) - {\mathrm{exp}}\left({{ - \omega _0^2} / 2}\right)\right]{\mathrm{exp}}( - {{{\eta ^2}} / 2})\) is the Morlet wavelet (see also Eq. (6.23) in Sect. 6.4.3), an asterisk means complex conjugate and s is the timescale. Global wavelet spectra S of the climatic signals, obtained by integration of Eq. (12.12) over time t at each fixed s, exhibit several peaks (Fig. 12.7b, e). One can assume that the peaks correspond to oscillatory processes for which the phases can be adequately introduced. To get the phases of “different rhythms” in NAO and ENSO, we try several values of s in Eq. (12.12) corresponding to different spectral peaks. The phase is defined as an argument of the respective complex signal \(W(s,t)\) at fixed s. For \(\omega_{0}=6\) used below, this is tantamount to band-pass filtering of a signal x around the frequency \(f = {1 / s}\) with a relative bandwidth \({1 / 4}\) and subsequent use of the Hilbert transform (see Sect. 6.4.3). Then, we estimate couplings between all the “rhythms” pairwise. The only case when substantial conclusions about the presence of coupling are inferred is the “rhythm” with \(s=32\) months for both signals (Fig. 12.7a, d, dashed lines). The phases are sufficiently well defined for both signals, since clear rotation around the origin takes place on the complex plane (Fig. 12.7c, f).
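The phase definition above can be sketched numerically as follows. The sketch assumes a uniformly sampled signal, approximates the integral in Eq. (12.12) by a discrete convolution and truncates the wavelet at four timescales on each side. The truncation length and the function names are our illustrative choices.

```python
import numpy as np

def morlet(eta, omega0=6.0):
    """Morlet wavelet Phi(eta) in the form given in the text."""
    return (np.pi ** -0.25
            * (np.exp(-1j * omega0 * eta) - np.exp(-omega0 ** 2 / 2))
            * np.exp(-eta ** 2 / 2))

def wavelet_transform(x, s, dt=1.0, omega0=6.0):
    """Discrete approximation of W(s, t) from Eq. (12.12) at a fixed timescale s."""
    half = int(np.ceil(4 * s / dt))              # truncate the kernel at |t - t'| = 4 s
    u = np.arange(-half, half + 1) * dt          # u = t - t'
    kernel = np.conj(morlet(u / s, omega0)) / np.sqrt(s)
    # 'same' keeps the output aligned with x (requires len(x) >= len(kernel))
    return np.convolve(np.asarray(x, float), kernel, mode="same") * dt

def wavelet_phase(x, s, dt=1.0, omega0=6.0):
    """Unwrapped phase, defined as the argument of W(s, t)."""
    return np.unwrap(np.angle(wavelet_transform(x, s, dt, omega0)))
```

Values within about `4 * s` of either end of the series are affected by the edges and are best discarded before any coupling analysis. For monthly data and \(s=32\) months this removes roughly 10 years at each end.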

Fig. 12.7

Characteristics of \({\mathrm{NAOI_{ncep}}}\) and T(Niño-3,4). (a) \({\mathrm{NAOI_{ncep}}}\) (thin line) and \({\mathrm{Re}}\,W\) for \(s=32\) months (dashed line); (b) global wavelet spectrum of \({\mathrm{NAOI_{ncep}}}\) \((f = 1/s)\); (c) an orbit \(W(t)\) for \({\mathrm{NAOI_{ncep}}}\), \(s=32\) months; (d–f) the same as (a–c) for the index T(Niño-3,4)

The results of the phase dynamics modelling are shown in Fig. 12.8 for \(s=32\) months and model (12.11) with \(\tau=32\) months, where \(\varphi_{1}\) stands for the phase of NAO and \(\varphi_{2}\) for that of ENSO. Figure 12.8a shows that the technique is applicable only for \(\Delta_{2\to 1} <30\), where \(\rho(-\Delta_{2\to 1}) <0.4\). The influence ENSO → NAO is pointwise significant for \(0<\Delta_{2\to 1} <30\) and maximal for \(\Delta_{2\to 1}=24\) months (Fig. 12.8b). From here, we infer the presence of the influence ENSO → NAO at an overall significance level \(p=0.05\) as discussed in Mokhov and Smirnov (2006). Most probably, the influence ENSO → NAO is delayed by 24 months; however, this conclusion is less reliable. No signs of the influence NAO → ENSO are detected (Fig. 12.8c).

Fig. 12.8

Coupling between \({\mathrm{NAOI_{ncep}}}\) and T(Niño-3,4) (over the period 1950–2004) in terms of the phase dynamics modelling: (a) mean phase coherence; (b, c) strengths of the influences ENSO → NAO and NAO → ENSO, respectively, with their 95% pointwise confidence bands

Large ρ for \(\Delta_{2\to 1}>30\) does not imply strong coupling. For such a short time series and close basic frequencies of the oscillators, the probability to get \(\rho(\Delta)>0.4\) for uncoupled processes is greater than 0.5 as observed in numerical experiments with exemplary oscillators.
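For orientation, a mean phase coherence with a time shift can be estimated as in the sketch below. We assume the common definition \(\rho(\Delta)=\left|\left\langle \exp\left({\mathrm{i}}\left(\varphi_{1}(t)-\varphi_{2}(t+\Delta)\right)\right)\right\rangle\right|\); the sign convention for the shift is our assumption and may differ from that of Sect. 12.2.

```python
import numpy as np

def mean_phase_coherence(phi1, phi2, shift=0):
    """|<exp(i(phi1(t) - phi2(t + shift)))>| over the overlapping samples."""
    phi1 = np.asarray(phi1, float)
    phi2 = np.asarray(phi2, float)
    n = len(phi1)
    if shift >= 0:
        a, b = phi1[: n - shift], phi2[shift:]
    else:
        a, b = phi1[-shift:], phi2[: n + shift]
    return np.abs(np.mean(np.exp(1j * (a - b))))
```

The value is 1 for a constant phase difference and tends to 0 for long records of oscillators with different frequencies. For short records of uncoupled oscillators with close frequencies it can remain large, which is the situation described above.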

All the reported results remain the same for any s in the range 29–34 months and relative bandwidths 0.2–0.4. Phase calculation based directly on a band-pass filtering and Hilbert transform leads to similar results, e.g., for the second-order Butterworth filter (Hamming, 1983) with the same bandwidths. The use of the other ENSO indices instead of T(Niño-3,4) gives almost the same results as in Fig. 12.8. Coupling is not pronounced only for T(Niño-1+2). Analysis of the other rhythms in \({\mathrm{NAOI_{ncep}}}\) and T(Niño-3,4) does not lead to significant conclusions about the presence of interaction. For \({\mathrm{NAOI_{cru}}}\) the width of the peak corresponding to \(s=32\) months is greater than that for \({\mathrm{NAOI_{ncep}}}\). It leads to stronger phase diffusion of the 32-month rhythm as quantified by the mean-squared residual errors of the model (12.11) (Smirnov and Andrzejak, 2005). As a result, we have not observed significant coupling between \({\mathrm{NAOI_{cru}}}\) and any of the ENSO indices for the period 1950–2004 as well as for the longer recordings (1871–1997 and 1866–2004).

5.2 Granger Causality Analysis

Cross-correlations between \({\mathrm{NAOI_{ncep}}}\ (x_{1})\) and T(Niño-3,4) \((x_{2})\) are not significant at \(p <0.05\). More interesting results are obtained from the non-linear Granger causality analysis based on the polynomial AR models like Eq. (12.4). Figure 12.9a shows the normalised quantity \({{{\mathrm{P}}{{\mathrm{I}}_{2 \to 1}}} / {{\mathrm{var}}[{x_1}]}}\) for the parameters \(d_{1}=0,\ d_{2\to 1}=1\) and \(P_{1}=2\). It is about 0.015 for the time delays \(19\leq\Delta_{2\to 1}\leq 21\) or \(80\leq\Delta_{2\to 1}\leq 83\) months. Each of these PI values is pointwise significant at \(p=0.01\). Taking into account strong correlations of \(\mathit{PI}_{2\to 1}\) separated by \(\Delta_{2\to 1}\) less than 4 months, one can infer that the influence ENSO → NAO is present at the overall level \(p <0.05\) (Mokhov and Smirnov, 2006). Analogously, Fig. 12.9b shows the quantity \({\mathrm{P}}{{\mathrm{I}}_{1 \to 2}} / {{\mathrm{var}}[{x_2}]}\) for \(d_{2}=0\), \(d_{1\to 2}=1\) and \(P_{2}=2\). Its pointwise significant values at \(48\leq\Delta_{1\to 2}\leq 49\) months do not allow confident detection of the influence NAO → ENSO.
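A scan of the prediction improvement over time delays for the simplest polynomial model can be sketched as follows. We assume that with \(d_{1}=0\), \(d_{2\to 1}=1\) and \(P_{1}=2\) the individual model reduces to a constant and the joint model is a second-order polynomial of a single delayed value of \(x_{2}\). The argument `lag` is the total shift in samples; how it maps onto \(\Delta_{2\to 1}\) depends on the convention of Eq. (12.4), which is not reproduced in this section. The function name is hypothetical.

```python
import numpy as np

def poly_pi_vs_lag(x1, x2, lags, order=2):
    """For each lag, fit x1(t) = a0 + sum_k b_k x2(t - lag)^k and return
    tuples (lag, PI / var[x1], F statistic)."""
    x1 = np.asarray(x1, float)
    x2 = np.asarray(x2, float)
    results = []
    for lag in lags:
        y = x1[lag:]
        z = x2[: len(x2) - lag]
        X = np.vander(z, order + 1, increasing=True)   # columns 1, z, z^2, ...
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        rss_joint = np.sum((y - X @ coef) ** 2)
        rss_self = np.sum((y - y.mean()) ** 2)          # constant-only model (d1 = 0)
        n, k = len(y), order + 1
        pi = rss_self / (n - 1) - rss_joint / (n - k)
        f_stat = ((rss_self - rss_joint) / (k - 1)) / (rss_joint / (n - k))
        results.append((lag, pi / np.var(x1), f_stat))
    return results
```

The F statistic would be compared with the F distribution with `order` and `n - order - 1` degrees of freedom to obtain a pointwise significance level. An overall level over many delays requires a correction for multiple testing, as discussed in the text.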

Fig. 12.9

Coupling between \({\mathrm{NAO}}{{\mathrm{I}}_{{\mathrm{ncep}}}}\) and T(Niño-3,4) (1950–2004) in terms of the non-linear Granger causality. Prediction improvements are normalised by variances of the signals: (a) \({\mathrm{P}}{{\mathrm{I}}_{2 \to 1}}/{\mathrm{var}}[{x_1}]\), (b) \({{{\mathrm{P}}{{\mathrm{I}}_{1 \to 2}}} / {{\mathrm{var}}[{x_2}]}}\). Pointwise significance level p estimated via F-test is shown below each panel

If \(d_{1}\) and \({d_{2 \to 1}}\) are increased up to 2, no changes in PI values presented in Fig. 12.9a are observed. So, the reported PI is not achieved via complication of the individual model. Simultaneous increase in \(d_{1}\) up to 3, \(P_{1}\) up to 3 and \({d_{2 \to 1}}\) up to 2 leads to the absence of any confident conclusions due to large variance of the estimators.

Similar results are observed if T(Niño-3,4) is replaced with T(Niño-3), T(Niño-4) or SOI. However, the conclusion about the presence of the influence ENSO → NAO becomes less confident: \(p \approx 0.1\). The use of T(Niño-1+2) leads to even less significant results. As with the phase dynamics modelling, replacement of \({\mathrm{NAO}}{{\mathrm{I}}_{{\mathrm{ncep}}}}\) with \({\mathrm{NAO}}{{\mathrm{I}}_{{\mathrm{cru}}}}\) does not lead to reliable coupling detection either for the period 1950–2004 or for longer periods.

Finally, to reveal trends in coupling during the last decade, couplings between \({\mathrm{NAO}}{{\mathrm{I}}_{{\mathrm{ncep}}}}\) and T(Niño-3,4) are estimated in a moving window of the length of 47 years. Namely, we start with the interval 1950–1996 and finish with the interval 1958–2004. PI values reveal an increase in the strength of the influence ENSO → NAO. The value of \({\mathrm{P}}{{\mathrm{I}}_{2 \to 1}}\) for \(19 \leqslant {\Delta _{2 \to 1}} \leqslant 20\) months rises almost monotonically by 150% (Fig. 12.10). Although it is difficult to assess statistical significance of the conclusion, the monotonic character of the increase indicates that it can hardly be an effect of random fluctuations. To a certain extent, it can be attributed to the strong 1997–1998 ENSO event.

Fig. 12.10

Influence ENSO → NAO for \({\mathrm{NAO}}{{\mathrm{I}}_{{\mathrm{ncep}}}}\) and T(Niño-3,4) in a 47-year moving window; the value \({{{\mathrm{max}}\{ {\mathrm{P}}{{\mathrm{I}}_{2 \to 1}}({\Delta _{2 \to 1}} = 19),{\mathrm{P}}{{\mathrm{I}}_{2 \to 1}}({\Delta _{2 \to 1}} = 20)\} } / {{\mathrm{var}}[{x_1}]}}\) is shown versus the last year of the moving window

Thus, the presence of coupling between ENSO and NAO is revealed by the use of two non-linear techniques and different climatic indices. Consistent results are observed in all cases. The influence ENSO → NAO is detected with confidence probability of 0.95 from the data for \({\mathrm{NAO}}{{\mathrm{I}}_{{\mathrm{ncep}}}}\) (1950–2004). Estimate of its delay time ranges from several months up to 3 years with the most probable value of 20–24 months. Besides, an increase in the strength of the influence during the last decade is observed. Possible physical mechanisms underlying oscillations and interactions as slow and even slower than those reported here are considered, e.g., in Jevrejeva et al. (2003); Latif (2001); Pozo-Vazquez et al. (2001). The influence ENSO → NAO is not detected with the index \({\mathrm{NAO}}{{\mathrm{I}}_{{\mathrm{cru}}}}\), which is a “more local” characteristic than the index \({\mathrm{NAO}}{{\mathrm{I}}_{{\mathrm{ncep}}}}\). The influence NAO → ENSO is not detected with confidence for any indices.

6 Causes of Global Warming

A key global problem is related to the determination of the relative role of natural and anthropogenic factors in climate variations. Forecasts of the future climate change due to anthropogenic forcing depend on the present estimates of the impact of different factors on the climate. Thus, an impact of solar activity variations is quantified in Mokhov and Smirnov (2008); Mokhov et al. (2006); Moore et al. (2006) via the analysis of different reconstructions and measurement data for solar irradiance and global surface temperature (GST) of the Earth. A variable character of the solar activity impact in connection with its overall increase in the second half of the twentieth century is noted. Moreover, the use of a three-dimensional global climate model has led to the conclusion that solar activity influence can determine only a relatively small portion of the global warming observed in the last decades. A significant influence of the anthropogenic factor on the GST is noted in Verdes (2005). However, the question about the relative role of different factors is still not answered convincingly on the basis of the observation data analysis.

Here, we report our estimates of the influences of different factors on the GST (Mokhov and Smirnov, 2009) based on the analysis of the following data: annual values T of the mean GST anomaly in 1856–2005 (http://www.cru.uea.ac.uk), reconstructions and measurements of the annual solar irradiance variations I in 1856–2005 (http://www.cru.uea.ac.uk), volcanic activity V in 1856–1999 (Sato et al., 1993) and carbon dioxide atmospheric content n in 1856–2004 (Conway et al., 1994) (Fig. 12.11).

Fig. 12.11

The data: (a) mean GST (anomaly from the base period 1961–1990); (b) solar constant (irradiance in the range from infrared to ultraviolet wavelengths inclusively); (c) volcanic activity (optical depth of volcanic aerosol, dimensionless); (d) carbon dioxide atmospheric content in ppm (parts per million)

Firstly, we construct univariate AR models for the GST and then analyse the influences of different factors with bivariate and multivariate AR models. Since the main question is about the causes of the GST rise, we compute two characteristics for the different models: (i) the expectation of the value of T in 2005 denoted as \({T_{2005}}\) and (ii) the expectation of the angular coefficient \({\alpha _{1985\text{–}2005}}\) of a straight line approximating the time profile \(T(t)\) over the interval 1985–2005 in the least-squares sense (i.e. a characteristic of the recent trend). These two quantities for the original GST data take the values \({T_{2005}} = 0.502\,K\) and \({\hat \alpha _{1985\text{–}2005}} = 0.02\,{K / {{\mathrm{year}}}}\).
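The trend characteristic can be computed with an ordinary least-squares line fit. The following sketch assumes annual values stored in arrays `years` and `values` (hypothetical names) and returns the angular coefficient over the chosen interval.

```python
import numpy as np

def trend_slope(years, values, first=1985, last=2005):
    """Slope of the least-squares straight line through values(t) for first <= t <= last."""
    years = np.asarray(years, float)
    values = np.asarray(values, float)
    mask = (years >= first) & (years <= last)
    # Shift the time origin to keep the fit well conditioned
    slope, _intercept = np.polyfit(years[mask] - first, values[mask], 1)
    return slope
```

Applied to each simulated realisation of a model and averaged over the ensemble, the same function gives an estimate of the expected angular coefficient for that model.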

The AR models are fitted to the intervals [1856–L] for different L, rather than only for the largest possible \(L = 2005\). Checking different L allows one to select a time interval, where each influence is most pronounced, and to determine a minimal value of L for which an influence can be revealed.

6.1 Univariate Models of the GST Variations

The mean-squared prediction error of a linear model (12.1) obtained from the interval [1856–2005] saturates at \({d_T} = 4\) (Fig. 12.12). Incorporation of any non-linear terms does not lead to statistically significant improvements (not shown). Thus, an optimal model reads

$$T(t)={a_0}+\sum\limits_{i=1}^{{d_T}}{{a_i}T(t-i)}+\xi (t),$$
(12.13)

where \({d_T} = 4,\ {a_0} = - 0.01 \pm 0.10\,K\), \({a_1} = 0.58 \pm 0.08,\ {a_2} = 0.03 \pm 0.09\), \({a_3} = 0.11 \pm 0.09,\ {a_4} = 0.29 \pm 0.08\). The intervals present standard deviation estimates coming from the least-squares routine (Sect. 7.4.1). The model prediction error is \(\sigma _T^2 = 0.01\,{K^2}\), while the sample variance of the GST over the interval [1856–2005] is equal to \({\mathrm{var}}[T] = 0.06{K^2}\). In relative units \({{\sigma _T^2} / {\operatorname{var} [T]}} = 0.17\), i.e. 17% of the GST variance is not explained by the univariate AR model. Residual errors for the AR model with \({d_T} = 4\) look stationary (Fig. 12.13a) and their histogram exhibits maximum around zero (Fig. 12.13b). Their delta correlatedness holds true (Fig. 12.13c). The latter is the main condition for the F-test applicability to the further Granger causality estimation.
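A least-squares fit of a model of the form (12.13), together with standard deviation estimates of the coefficients and the residual variance, can be sketched as follows. The sketch uses the textbook ordinary least-squares formulas; whether they coincide in every detail with the routine of Sect. 7.4.1 is an assumption. The function name is hypothetical.

```python
import numpy as np

def fit_ar(x, order):
    """Fit x(t) = a0 + sum_i a_i x(t-i) + noise by ordinary least squares.

    Returns (coefficients [a0, a1, ..., a_order], their standard deviation
    estimates, residual variance)."""
    x = np.asarray(x, float)
    t = np.arange(order, len(x))
    X = np.column_stack([np.ones(len(t))] + [x[t - i] for i in range(1, order + 1)])
    y = x[t]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)      # covariance matrix of the estimates
    return coef, np.sqrt(np.diag(cov)), sigma2
```

Repeating the fit for increasing `order` and plotting `sigma2 / np.var(x)` against the order gives a curve of the kind used to choose the model dimension.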

Fig. 12.12

Univariate AR models of the GST: the normalised prediction error variance versus the model order

Fig. 12.13

Residual errors for the univariate AR model (12.13) at \({d_T} = 4\): (a) the time realisation; (b) the histogram; (c) the autocorrelation function with the 95% confidence interval estimates

Time realisations of the obtained model (12.13) over 150 years at fixed initial conditions (equal to the original GST values in 1856–1859) look very similar to the original time series (Fig. 12.14a). For a quantitative comparison, Fig. 12.14b shows mean values and 95% intervals for the distributions of model values \(T(t)\) computed from an ensemble of 100 simulated model time realisations. The original time series does not come out of the intervals most of the time, i.e. the model quality is sufficiently high. However, this is violated for the GST values in 2001–2005. Thus, one may suspect that model (12.13) with constant parameters and constant \(\sigma _T^2 = 0.01\,{K^2}\) is not completely adequate, e.g. it may not take into account some factors determining the essential GST rise over the last years.
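An ensemble of realisations and the corresponding 95% intervals can be generated along the following lines. The sketch assumes Gaussian white noise with constant variance, which the text does not state explicitly, and uses percentiles of the ensemble at each time instant as the interval bounds. Function names are hypothetical.

```python
import numpy as np

def simulate_ar_ensemble(coef, sigma2, init, n_steps, n_runs=100, seed=0):
    """Simulate x(t) = a0 + sum_i a_i x(t-i) + noise from fixed initial values.

    coef = [a0, a1, ..., a_d]; init holds the first d values of every run."""
    rng = np.random.default_rng(seed)
    coef = np.asarray(coef, float)
    order = len(coef) - 1
    runs = np.empty((n_runs, n_steps))
    runs[:, :order] = init
    for t in range(order, n_steps):
        past = runs[:, t - order : t][:, ::-1]   # columns x(t-1), ..., x(t-d)
        noise = rng.normal(0.0, np.sqrt(sigma2), n_runs)
        runs[:, t] = coef[0] + past @ coef[1:] + noise
    return runs

def ensemble_band(runs, level=0.95):
    """Ensemble mean and lower/upper percentile bounds at each time instant."""
    tail = 50 * (1 - level)
    lo, hi = np.percentile(runs, [tail, 100 - tail], axis=0)
    return runs.mean(axis=0), lo, hi
```

Counting the time instants at which an observed series leaves the interval `[lo, hi]` gives a simple check of model adequacy of the kind described above.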

Fig. 12.14

Behaviour of the model (12.13) fitted to the interval [1856–2005]: (a) three time realisations taken randomly from an ensemble of 100 realisations; (b) mean values over the ensemble (thin line) and the 95% intervals of the distributions (error bars) with the superimposed original data for GST (thick line)

The hypothesis finds a further confirmation under a more strict test. We check whether the univariate model (12.13) fitted to the interval [1856–1985] can predict the GST rise over the interval [1985–2005]. The results of model fitting are similar to those for the interval [1856–2005]. Coefficient estimates differ to some extent: \({a_0} = - 0.01 \pm 0.16K,\ {a_1} = 0.56 \pm 0.09\), \({a_2} = 0.05 \pm 0.10,\ {a_3} = 0.02 \pm 0.10,\ {a_4} = 0.29 \pm 0.09\). Prediction error is again \(\sigma _T^2 = 0.01\,{K^2}\). However, the original GST values over the last 16 years do not fall within the 95% intervals (Fig. 12.15). Thus, one may assume that something has changed in the GST dynamics over the last decades, e.g., as a result of external influences.

Fig. 12.15

The original GST (thick line) and the 95% “corridor” for the model (12.13) fitted to the interval [1856–1985]

This is further analysed with bi- and multivariate AR models for the GST. We take \({d_T} = 4\) and select \({d_{I \to T}},{d_{n \to T}},{d_{V \to T}}\) and \({\Delta _{I \to T}},{\Delta _{n \to T}},{\Delta _{V \to T}}\) so as to provide the greatest GST prediction improvement and qualitative similarity between the model behaviour and the original GST time profile.

Fig. 12.16

Bivariate modelling of the GST from different time windows [1856–L]. PI-values and significance levels for (a) the models taking into account solar activity; (b) the models taking into account volcanic activity; (c) the models taking into account \({\mathrm{CO}}_{2}\) atmospheric content. The numerical values of PI (thick lines) are indicated on the left y-axes and significance levels (thin lines) on the right y-axes. The dashed lines show the level of \(p = 0.05\)

Fig. 12.17

The original GST values (thick line) and 95% “corridors” for the bivariate models of the GST: (a) model (12.14) with solar activity fitted to the interval [1856–1985]; (b) model (12.15) with volcanic activity fitted to the interval [1856–1999]; (c) model (12.16) with \({\mathrm{CO}}_{2}\) atmospheric content fitted to the interval [1856–2005]

6.2 GST Models Including Solar Activity

An optimal choice of parameters is \({d_{I \to T}} = 1\) and \({\Delta _{I \to T}} = 0\). The influence \(I\to T\) is most clearly seen when the interval [1856–1985] is used for model fitting (Fig. 12.16a). The model reads

$${T_t}={a_0}+{a_1}{T_{t-1}}+{a_4}{T_{t-4}}+{b_I}{I_{t-1}}+{\eta_t},$$
(12.14)

where \({a_0} = - 93.7 \pm 44.4K,\ {a_1} = 0.52\pm 0.09,\ {a_4} = 0.27\pm 0.09\) and \({b_I} = 0.07\pm 0.03K / {{({{\mathrm{W}} / {{{\mathrm{m}}^{\mathrm{2}}}}})}}\). The prediction improvement is \({{{\mathrm{PI}}}_{I \to T}}/ {\sigma _T^2} = 0.028\) and its positivity is statistically significant at \(p < 0.035\). The model fitted to the interval [1856–2005] detects no influence \(I\to T\) significant at \(p < 0.05\). This may indicate that the impact of other factors, not related to solar activity, has increased during the interval [1985–2005]. Simulations with model (12.14) indirectly confirm this assumption. Figure 12.17a shows an ensemble of simulated realisations when an original time series \(I(t)\) is used as input. The 95% intervals are narrower than those for the univariate model (cf. Fig. 12.14b), i.e. the incorporation of solar activity into the model allows better description of the GST in 1856–1985. However, the GST rise in 1985–2005 is not predicted by the bivariate model either.

To assess the long-term effect of the solar activity trend on the GST rise, we simulate an ensemble of time realisations of model (12.14) when a detrended signal \(I(t)\) (Lean et al., 2005) is used as input. The result is visually indistinguishable from the plot in Fig. 12.17a (not shown). Thus, the removal of the solar activity trend does not affect the model GST values. Quantitatively, we get \(\left\langle{{T_{2005}}}\right\rangle = 0.0\pm 0.02K\) and angular coefficients \(\left\langle {{\alpha_{1985\text{–}2005}}}\right\rangle \leq 0.002{K/ {{\mathrm{year}}}}\) in both cases. The original trend \({\hat\alpha_{1985\text{–}2005}} = 0.02{K / {{\mathrm{year}}}}\) is not explained by any of the bivariate models (12.14). Thus, although the solar activity variations are found to affect the GST, the long-term analysis suggests that they are not the cause of the GST rise in the last years.

Introduction of non-linearity into the models does not improve their predictions so that the linear models seem optimal. This is the case for all models below as well. Therefore, all the results are presented only for the linear models.

6.3 GST Models Including Volcanic Activity

The influence of the volcanic activity appears of the same order of magnitude as that of the solar activity. An optimal choice is \({d_{V \to T}} = 1\) and \({\Delta _{V \to T}} = - 1\), i.e. a model

$${T_t}={a_0} + {a_1}{T_{t-1}} + {a_4}{T_{t-4}} + {b_V}{V_t} + {\eta _t}.$$
(12.15)

The influence is detected most clearly from the entire interval [1856–1999] of the available data for \(V(t)\) (Fig. 12.16b). For that interval \({{{\mathrm{P}}{{\mathrm{I}}_{V \to T}}} / {\sigma _T^2}} = 0.029\) and positivity of \({{\mathrm{P}}{{\mathrm{I}}_{V \to T}}}\) is statistically significant at \(p < 0.03\). Model coefficients are \({a_0} = 0.25 \pm {0.14K,\ {a_1} = 0.55} \pm 0.08,\ {a_4} = 0.29 \pm 0.08,\ {b_V} = -0.92 \pm 0.41K\).

However, even if the original data for \(V(t)\) are used as input, the model predicts only strong fluctuations of the GST around the mean value, e.g. in 1999 around the value of \(\left\langle{{T_{1999}}}\right\rangle = 0.7 \pm 0.14K\) (Fig. 12.17b), rather than the rise in the GST during the last years. According to model (12.15), there is no trend in the GST on average: \(\left\langle{{\alpha_{1985\text{–}2005}}}\right\rangle\leq 0.001{K / {{\mathrm{year}}}}\). If the signal \(V(t)=0\) is used as input, then the model predicts even greater values of the GST: \(\left\langle{{T_{1999}}}\right\rangle = 1.5 \pm 0.16\,K\). Indeed, the long-term effect of volcanic eruptions is to limit the GST values. Volcanic activity is relatively high in 1965–1995 (Fig. 12.11c), which should contribute to a decrease in the GST. Therefore, explaining the GST rise during the last decades by the volcanic activity influence is also impossible.

6.4 GST Models Including \({\mathrm{CO}}_{2}\) Concentration

An optimal choice is \({d_{n \to T}} = 1\) and \({\Delta _{n \to T}} = 0\). Apart from highly significant prediction improvement, it provides a model which behaves qualitatively similar to the original data (in contrast to the models with \({d_{n \to T}} > 1\)). The model reads as

$${T_t}={a_0}+{a_1}{T_{t-1}}+{a_4}{T_{t-4}}+{b_n}{n_{t-1}}+{\eta_t}.$$
(12.16)

The influence of \({\mathrm{CO}}_{2}\) appears much more considerable than that of the other factors. It is detected most clearly from the entire available interval [1856–2005] (Fig. 12.16c), where \({{{\mathrm{P}}{{\mathrm{I}}_{n \to T}}} / {\sigma _T^2}} = 0.087\) and its positivity is significant at \(p < 0.0002\). The coefficients of this model are \(a_{0}=-1.10\pm 0.29\,K,\ {a_1} = 0.46\pm 0.08,\ {a_4} = 0.20\pm 0.08,\ b_{n}=0.003\pm 0.001\,K/{\mathrm{ppm}}\).

An ensemble of time realisations (Fig. 12.17c) shows that the model (12.16) with the original data \(n(t)\) used as input describes the original data \(T(t)\) much more accurately than do the models taking into account the solar or the volcanic activity. Moreover, the model (12.16) fitted to a narrower interval, e.g. [1856–1960], exhibits practically the same time realisations as in Fig. 12.17c, i.e. it correctly predicts the GST rise even though the data over the interval [1960–2004] are not used for the model fitting. The model (12.16) fitted to any interval [1856 – L] with \(L > 1935\) gives almost the same results.

If the artificial signal \(n(t) = {\mathrm{const}} = n(1856)\) is used as input for the model (12.16) fitted to the interval [1856–1985], then one observes only fluctuations of \(T\) about the level of \({T_{1856}}\) (Fig. 12.18) and no trend, i.e. \(\left\langle{{\alpha_{\text{1985–2005}}}}\right\rangle = 0\). If the original data for \(n(t)\) are used as input, one gets the model characteristics \(\left\langle {{T_{2005}}}\right\rangle \approx 0.5\,K\) and \(\left\langle {{\alpha_{\text{1985–2005}}}} \right\rangle = 0.17\,{K / {{\mathrm{year}}}}\), which are close to the observed ones. Thus, according to the model (12.16), the rise in the atmospheric \({\mathrm{CO}}_{2}\) content explains a major part of the recent rise in the GST.
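The comparison of the two inputs can be sketched numerically as follows. The example simulates an ensemble of realisations of a model of the form (12.16), builds a 95% "corridor" from ensemble percentiles and estimates the ensemble-mean linear trend over 1985–2005 for a rising input and for an input frozen at its initial value. The coefficients are the rounded values quoted above, but the input curve `n_rising`, the noise level `sigma`, the ensemble size and the function names are assumptions, so the printed numbers are not expected to reproduce the figures.

```python
import numpy as np

def simulate_ensemble(coef, n, T_init, sigma, n_runs=500, seed=0):
    """Realisations of T_t = a0 + a1*T_{t-1} + a4*T_{t-4} + b*n_{t-1} + eta_t."""
    a0, a1, a4, b = coef
    rng = np.random.default_rng(seed)
    T = np.empty((n_runs, len(n)))
    T[:, :4] = T_init                      # same initial values in every run
    for t in range(4, len(n)):
        eta = sigma * rng.standard_normal(n_runs)
        T[:, t] = a0 + a1 * T[:, t - 1] + a4 * T[:, t - 4] + b * n[t - 1] + eta
    return T

def mean_trend(T, years, y0, y1):
    """Ensemble mean of the straight-line slope fitted over [y0, y1]."""
    m = (years >= y0) & (years <= y1)
    slopes = np.polyfit(years[m] - y0, T[:, m].T, 1)[0]   # one slope per run
    return slopes.mean()

years = np.arange(1856, 2006)
u = (years - years[0]) / (years[-1] - years[0])
n_rising = 285.0 + 95.0 * u ** 2           # hypothetical smooth rise, in ppm
n_const = np.full_like(n_rising, n_rising[0])

coef = (-1.10, 0.46, 0.20, 0.003)          # rounded values quoted in the text
sigma = 0.1                                # assumed noise standard deviation, K
T0 = (coef[0] + coef[3] * n_rising[0]) / (1 - coef[1] - coef[2])  # stationary level

T_rising = simulate_ensemble(coef, n_rising, T0, sigma)
T_const = simulate_ensemble(coef, n_const, T0, sigma)

corridor = np.percentile(T_const, [2.5, 97.5], axis=0)   # 95% corridor per year
trend_rising = mean_trend(T_rising, years, 1985, 2005)
trend_const = mean_trend(T_const, years, 1985, 2005)
print(f"trend with rising input:   {trend_rising:.4f} K/year")
print(f"trend with constant input: {trend_const:.4f} K/year")
```

Both ensembles use the same random seed, so they share the same noise realisations and the difference between the two trends reflects the input alone.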

Fig. 12.18

The original GST values (thick line) and the 95% “corridor” for the bivariate model (12.16) if the signal \(n(t) = {\mathrm{const}} = n(1856)\) is used as input

The results of the multivariate AR modelling confirm the above results of the bivariate analysis (the corresponding plots are not shown).

Thus, Granger causality estimation and investigation of the long-term behaviour of the AR models allow one to assess the effects of solar activity, volcanic activity and atmospheric carbon dioxide content on the global surface temperature. Granger causality shows that the three factors determine about 10% of the quantity \(\sigma _T^2\), which is the variance of the short-term GST fluctuations unexplained by the univariate AR model. The impact of \({\mathrm{CO}}_{2}\) is the strongest, while the effect of the other two factors is several times weaker. The long-term behaviour of the models reveals that the \({\mathrm{CO}}_{2}\) content is the determining factor of the GST rise. According to the empirical AR models, the rise in the \({\mathrm{CO}}_{2}\) concentration determines at least 75% of the GST trend over 1985–2005, while the other two factors are not causes of the global warming. In particular, if the \({\mathrm{CO}}_{2}\) concentration had remained at its 1856 level, the GST would not have risen at all during the last century. In contrast, variations of the solar and volcanic activity in the models do not lead to significant changes in the GST trend.