
It is difficult even to list all the fields of knowledge and practice where modelling from data series is applied: one can say that they range from astrophysics to medicine. The purposes of modelling are just as diverse. Therefore, we confine ourselves to several examples demonstrating the practical usefulness of empirical modelling.

The most well-known application is, of course, forecasting (see Sects. 7.4.1, 10.2.1 and 10.3). Indeed, it is perhaps the most intriguing problem of data analysis (Casdagli and Eubank, 1992; Gerschenfeld and Weigend, 1993; Kravtsov, 1997; Soofi and Cao, 2002): one often hears about weather and climate forecasts, prediction of earthquakes and financial indices, etc. Still, empirical models obtained as described above only rarely predict such complicated processes accurately. The main obstacles are the curse of dimensionality, the deficit of experimental data, considerable noise levels and non-stationarity. However, the chances of accurate prediction rise in simpler and better defined situations.

Another useful possibility is the validation of physical ideas about an object under study. Modelling from time series allows one to improve the understanding of the “mechanisms” of an object’s functioning. High model quality provides evidence for the validity of the underlying ideas (see Sects. 8.3 and 9.3). Such a good model can then be used for practical purposes.

Less known are applications of empirical modelling to the analysis of non-stationary data. Non-stationarity is ubiquitous in nature and technology, and in empirical modelling it is important to cope with non-stationary signals properly. Quite often, non-stationarity is just a harmful obstacle to modelling. However, there are many situations where this is not the case. On the one hand, the natural non-stationarity of a real-world signal can be of interest in itself (Sect. 11.1). On the other hand, artificially introduced non-stationarity of a technical signal can serve as a way of transmitting information (Sect. 11.2).

A very promising and widely demanded application is the characterisation of directional (causal) couplings between observed processes. It is discussed in Chap. 12 in more detail than the other problems mentioned here. Several other opportunities, such as the prediction of bifurcations and signal classification, are briefly described in Sect. 11.3.

1 Segmentation of Non-stationary Time Series

In the theory of random processes, non-stationarity of a process implies temporal changes in its multidimensional distribution functions. The majority of real-world processes, especially in biology, geophysics or economics, look non-stationary, since their characteristics are not constant over an observation interval. Non-stationarity leads to significant difficulties in modelling, comparable with the curse of dimensionality. However, the character of non-stationarity may carry important information about the properties of an object under study.

The term “dynamical non-stationarity” refers to a situation where the original process can be described with differential or difference equations whose parameters vary in time (Schreiber, 1997, 1999). Proper characteristics of dynamical non-stationarity are useful, e.g., to detect the time instant of a parameter change more accurately than can be done with characteristics of probability distribution functions. Indeed, if a dynamical regime loses its stability after a parameter change, a phase orbit may remain in the same area of the phase space for a certain time interval. Hence, the statistical properties of an observed time series do not change rapidly. Yet, another regime will be established in the future, and it may be important to anticipate the upcoming changes as early as possible.

According to a basic procedure for the investigation of a possibly non-stationary process, one divides the original time series into M segments of length \(L\leq N_{\mathrm{st}}\), over which the process is regarded as stationary. Then, one performs a reconstruction of model equations or computes some other statistics (sample moments, power spectrum, etc.) for each segment \(N_{j}\) separately. After that, the segments are compared to each other according to the closeness of their model characteristics or other statistics. Namely, one introduces a distance d between segments and obtains the matrix of distances \(d_{i,j}=d(N_{i}, N_{j})\). The values of the distances allow one to judge the stationarity of the process. The following concrete techniques are used.

  1. (i)

    Comparison of the sample probability distributions according to the \(\chi^{2}\) criterion (Hively et al., 1999). One divides the range of the observable values into H bins and counts the number of data points from each segment falling into each bin. A distance between two segments is computed as

    $$d_{i,j}^{2}=\sum\limits_{k=1}^{H}\frac{\left(n_{k,i}-n_{k,j}\right)^{2}}{n_{k,i}+n_{k,j}},$$

    where \(n_{k,i}\) and \(n_{k,j}\) are the numbers of data points in the kth bin from the ith and jth segment, respectively. This quantity can reveal non-stationarity in the sense of a transient process for a system with a constant evolution operator.

  2. (ii)

    Comparison of empirical models. One constructs global models and gets estimates of their parameter vectors \(\hat{\textbf{c}}_{i},\hat{\textbf{c}}_{j}\) for the ith and jth segments, respectively. A distance between the segments can be defined, e.g., as a Euclidean distance between those estimates: \(d_{i,j}^{2}=\sum\nolimits_{k=1}^{P}(\hat{c}_{k,i}-\hat{c}_{k,j})^2\) (Gribkov and Gribkova, 2000).
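A minimal Python sketch (numpy only) of these two distances is given below. The bin number, the helper names and the use of a simple polynomial map \(x_{n+1}=f(x_{n},\textbf{c})\) as the global model are illustrative choices made here, not prescriptions of the cited works.

```python
import numpy as np

def chi2_distance_sq(seg_i, seg_j, n_bins=16, value_range=None):
    """Squared chi-square distance between the sample distributions of two
    segments (technique (i)); value_range should cover the whole observable range."""
    if value_range is None:
        value_range = (min(seg_i.min(), seg_j.min()), max(seg_i.max(), seg_j.max()))
    n_i, _ = np.histogram(seg_i, bins=n_bins, range=value_range)
    n_j, _ = np.histogram(seg_j, bins=n_bins, range=value_range)
    mask = (n_i + n_j) > 0                      # ignore bins empty in both segments
    return np.sum((n_i[mask] - n_j[mask]) ** 2 / (n_i[mask] + n_j[mask]))

def fit_poly_map(segment, order):
    """Least-squares estimate of the parameter vector c of a polynomial model
    x_{n+1} = f(x_n, c) fitted to one segment (technique (ii))."""
    return np.polyfit(segment[:-1], segment[1:], order)

def model_distance_sq(c_i, c_j):
    """Squared Euclidean distance between the parameter estimates of two segments."""
    return np.sum((c_i - c_j) ** 2)
```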

The results of such an analysis of an exemplary time series (Fig. 11.1a) are conveniently presented in a kind of “recurrence diagram” (Fig. 11.1b), where the starting time instants of the ith and jth segments are shown along the axes. The distances between the segments are indicated in greyscale: white corresponds to strongly different segments (large distances) and black to “almost identical” segments (zero distances). To illustrate the possibilities of the approach, let us consider an exemplary one-dimensional map

$$x_{n+1}=c_{0}(n)\cos x_{n}$$
((11.1))

with an observable \(\eta=x\) and a time series of 2000 data points.

Fig. 11.1 Non-stationary data analysis: (a) a time realisation of the system (11.1); (b) a recurrence diagram where the distances between time series segments are computed with the \(\chi^{2}\) criterion for the sample probability densities; (c), (d) the distances are the Euclidean distances between parameter vectors of the models \(x_{n+1}=f(x_{n},\textbf{c})\) with a polynomial f of order \(K=2\) (c) and \(K=6\) (d)

The parameter \(c_{0}\) changes its value at the time instant \(n=1000\) from \(c_{0}=2.1\), corresponding to a chaotic regime, to \(c_{0}=2.11735\), corresponding to a period-7 cycle. However, the latter regime becomes clearly established only after a sufficiently long transient process (about 400 iterations), so that the majority of statistical properties, such as the sample mean and the sample variance in a relatively short moving window, become clearly different only at a time instant of about \(n=1400\) (Fig. 11.1a). Figure 11.1b–d shows the recurrence diagrams for time series segments of length 200 data points. A big dark square on the diagonal corresponds to a quasi-stationary segment, while a boundary between dark areas marks a time instant at which the process properties, as reflected by the distance between the segments, change. Figure 11.1b shows a diagram based on the \(\chi^{2}\) criterion. Figure 11.1c, d is obtained via the construction of the models \(x_{n+1}=f(x_{n},\textbf{c})\) with algebraic polynomials of orders \(K=2\) (a less adequate model, Fig. 11.1c) and \(K=6\) (a more adequate model, Fig. 11.1d). One can see that the \(\chi^{2}\) statistic detects the changes in the system much later (Fig. 11.1b) than does a good dynamical model (Fig. 11.1d). Figure 11.1c, d also demonstrates that dynamical non-stationarity is detected correctly if the empirical models are of high quality. Examples of an analogous segmentation of intracranial electroencephalogram recordings from patients with temporal lobe epilepsy are described in Dikanev et al. (2005).
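This numerical experiment is easy to reproduce. The sketch below (numpy only) iterates the map (11.1) with the parameter switch at \(n=1000\), fits a polynomial model to each non-overlapping segment of 200 points and assembles the distance matrix; the initial condition, the use of non-overlapping noise-free segments and the plotting step left to the reader are our own simplifying assumptions.

```python
import numpy as np

def simulate_map(n_total=2000, n_switch=1000, c_before=2.1, c_after=2.11735, x0=0.5):
    """Iterate x_{n+1} = c0(n) * cos(x_n) with the parameter switch of the example."""
    x = np.empty(n_total)
    x[0] = x0
    for n in range(n_total - 1):
        c0 = c_before if n < n_switch else c_after
        x[n + 1] = c0 * np.cos(x[n])
    return x

def model_distance_matrix(series, seg_len=200, order=6):
    """Matrix d_{i,j} of Euclidean distances between the parameter vectors of
    polynomial models x_{n+1} = f(x_n, c) fitted to non-overlapping segments."""
    starts = range(0, len(series) - seg_len + 1, seg_len)
    coeffs = [np.polyfit(series[i:i + seg_len - 1], series[i + 1:i + seg_len], order)
              for i in starts]
    m = len(coeffs)
    d = np.zeros((m, m))
    for i in range(m):
        for j in range(m):
            d[i, j] = np.sqrt(np.sum((coeffs[i] - coeffs[j]) ** 2))
    return d

d = model_distance_matrix(simulate_map(), seg_len=200, order=6)
# displaying d in greyscale (e.g. with matplotlib's imshow) should give a diagram
# qualitatively similar to Fig. 11.1d
```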

The selection of quasi-stationary segments can also be important for obtaining a valid empirical model. Generally, in empirical modelling it is desirable to use the entire time series to get more accurate parameter estimates. However, a model with constant parameters is inappropriate in the case of dynamical non-stationarity. One should then fit such a model to a quasi-stationary segment, which should be as long as possible. To find the maximal length of a continuous quasi-stationary segment, one looks for the widest black square on the diagonal of the above recurrence diagram. Then, one adds similar shorter segments to that segment and repeats the model fitting for the resulting maximally long quasi-stationary piece of data. This approach is realised and applied to electroencephalogram analysis in Gribkov (2000). A similar problem concerning the localisation of time instants of abrupt changes is considered in Anosov et al. (1997).
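The first step, finding the widest “black square” on the diagonal, amounts to locating the longest run of consecutive segments that are pairwise close. A possible sketch is given below; the closeness threshold is a user-chosen value, and the subsequent step of appending similar non-adjacent segments is not shown.

```python
import numpy as np

def longest_quasistationary_block(d, threshold):
    """Longest run of consecutive segments whose pairwise distances in the matrix d
    do not exceed the threshold, i.e. the widest dark square on the diagonal.
    Returns (index of the first segment, number of segments in the run)."""
    m = d.shape[0]
    best_start, best_len = 0, 1
    for start in range(m):
        end = start
        while end + 1 < m and np.all(d[start:end + 2, start:end + 2] <= threshold):
            end += 1
        if end - start + 1 > best_len:
            best_start, best_len = start, end - start + 1
    return best_start, best_len
```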

2 Confidential Information Transmission

The development of communication systems which use a chaotic carrier is a topical problem in communication technology (Dmitriev, 2002). In relation to this field, we note an interesting possibility of using non-stationary time series modelling for multichannel confidential information transmission (Anishchenko and Pavlov, 1998). Let us consider a non-linear dynamical system \({\mathrm{d}}\textbf{x}/{\mathrm{d}}t=\textbf{f}(\textbf{x, c}_{0})\), whose parameter vector \(\textbf{c}_{0}\) slowly changes in time as \(\textbf{c}_{0}=\textbf{c}_{0}(t)\). Let a chaotic time realisation of this system, e.g., \(\eta(t)=x_{1}(t)\), be the transmitted signal and the parameter variations \(\textbf{c}_{0}(t)\) be the information signals, which are not transmitted directly. The conditions necessary to extract the information signals from the observed chaotic signal are as follows:

  1. (i)

    One completely knows the structure of the dynamical system used as a transmitter, i.e. as a generator of a chaotic time series with changing parameters;

  2. (ii)

    The characteristic time of variation of its parameters \(\textbf{c}_{0}\) is much greater than its characteristic oscillation time.

Then, one can restore the time realisations \(\textbf{c}_{0}(t)\) from a single observable \(\eta(t)\) by estimating the respective parameters in the model \({\mathrm{d}}\textbf{x}/{\mathrm{d}}t=\textbf{f}(\textbf{x, c})\) from subsequent quasi-stationary segments of the observed time series \(\eta(t)\) with some of the techniques described in Chaps. 8, 9 and 10.

A numerical toy example of transmitting a greyscale graphical image is considered in Anishchenko and Pavlov (1998), where the system producing the chaotic signal is a modified generator with inertial non-linearity (also called the Anishchenko–Astakhov oscillator) given by the equations

$$\begin{aligned}&{\mathrm{d}}x/{\mathrm{d}}t=mx+y-xz,\nonumber \\ &{\mathrm{d}}y/{\mathrm{d}}t=-x,\nonumber \\ &{\mathrm{d}}z/{\mathrm{d}}t=-gz+0.5g\left(x+\left|x\right|\right)x.\end{aligned}$$
((11.2))

The information signal represents the greyscale intensity (256 possible values) of subsequent pixels of a portrait of Einstein (Fig. 11.2a). This signal modulates the value of the parameter g within the interval [0.15, 0.25] at \(m=1.5\). The transmitted signal is \(\eta(t)=y(t)\) with superimposed weak observational noise. The signal in the communication channel looks like “pure” noise (Fig. 11.2b).

Fig. 11.2 Information transmission with the use of the reconstruction from a time series (Anishchenko and Pavlov, 1998): (a) an original image (\(500\times 464\) pixels); (b) a signal y in a communication channel; (c) a restored image

If the structure of the generating dynamical system is unknown, one cannot restore the transmitted information (or, at least, it is very difficult). The result of extracting the information signal in the “receiver” via the estimation of the parameter g in Eq. (11.2) is shown in Fig. 11.2c. A single scalar information signal \(g(t)\) is transmitted in this example, though one could change several parameters of the generator simultaneously. In practice, the number of transmitted signals \(c_{0,k}(t)\) which can be successfully restored from an observed realisation \(\eta(t)\) is limited by the intensity of noise in the communication channel (Anishchenko and Pavlov, 1998). Further developments in this research direction are given in Ponomarenko (2004, 2002).
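To give a feel for the parameter estimation step, the sketch below (numpy and scipy) exploits the fact that the last equation of (11.2) is linear in g. For brevity it assumes that the full state (x, y, z) is available in the receiver, whereas in the actual scheme only \(\eta(t)=y(t)\) is transmitted and the hidden variables must first be reconstructed with the techniques of Chaps. 8, 9 and 10; the integration time, step and tolerances are arbitrary choices.

```python
import numpy as np
from scipy.integrate import solve_ivp

def rhs(t, state, m, g):
    """Right-hand side of the Anishchenko-Astakhov oscillator, Eq. (11.2)."""
    x, y, z = state
    return [m * x + y - x * z, -x, -g * z + 0.5 * g * (x + abs(x)) * x]

def estimate_g(x, z, dt):
    """Least-squares estimate of g over one quasi-stationary segment, using
    dz/dt = g * (0.5*(x + |x|)*x - z) with dz/dt taken by finite differences."""
    dzdt = np.gradient(z, dt)
    phi = 0.5 * (x + np.abs(x)) * x - z
    return np.sum(dzdt * phi) / np.sum(phi ** 2)

# self-check on one quasi-stationary segment with a constant parameter value
dt, g_true = 0.01, 0.2
t = np.arange(0.0, 100.0, dt)
sol = solve_ivp(rhs, (t[0], t[-1]), [0.1, 0.1, 0.1], args=(1.5, g_true),
                t_eval=t, rtol=1e-8, atol=1e-10)
x, _, z = sol.y
print(estimate_g(x, z, dt))   # should be close to g_true
```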

3 Other Applications

Several more specific applications of empirical modelling are listed below and commented on very briefly:

  1. (1)

    Prediction of bifurcations in weakly non-autonomous systems (Casdagli, 1989; Feigin et al., 1998, 2001; Mol’kov and Feigin, 2003). When an object parameter changes slowly, one fits an autonomous model with a fixed structure to subsequent time series segments. The estimates of its parameters vary between the segments. Thus, one obtains a time dependence of the parameter estimates \(\hat{\textbf{c}}_{j},j=1,\ldots,M\), where j is the number of a time series segment. From such a new time series, one can construct a model predicting future parameter changes, e.g. a model in the form of an explicit function of time (Sect. 7.4.1). Thereby, one gets predicted parameter values for each future time instant j and checks which dynamical regime of the autonomous model they correspond to. Thus, one can predict a change in the observed dynamical regime, i.e. a bifurcation in the autonomous system, which occurs when the value of c crosses a boundary of the domain where the current regime is stable. Such a bifurcation forecast is possible under the strict condition that the autonomous model adequately describes the object dynamics in a wide area of the parameter space, including the “post-bifurcation” dynamical regime.
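    The extrapolation step can be sketched in a few lines; the low-order polynomial trend and the hypothetical stability boundary c_critical below are illustrative assumptions.

    ```python
    import numpy as np

    def predict_parameter(c_hat, j_future, trend_order=1):
        """Extrapolate segment-wise estimates c_hat[j] of a single parameter to a
        future segment index j_future with a low-order polynomial trend in j."""
        j = np.arange(len(c_hat))
        trend = np.polyfit(j, c_hat, trend_order)
        return np.polyval(trend, j_future)

    # a bifurcation is anticipated once the predicted value crosses the stability
    # boundary of the current regime of the autonomous model (hypothetical c_critical):
    # if predict_parameter(c_hat, j_future) > c_critical: ...
    ```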

  2. (2)

    Restoration of an external driving from a chaotic time realisation of a single dynamical variable of a non-autonomous system. This is useful if the driving signal carrying important information cannot be measured directly and one observes only the result of its influence on some non-linear system. The principal possibility of such a restoration is illustrated in Gribkov et al. (1995), even for cases where the external driving is not slowly varying. The conditions necessary for the restoration are as follows: (i) the structure of the driven system and the way in which the driving enters the equations are known a priori; (ii) a time series from the autonomous system is also available. Then, one first estimates the parameters of the autonomous system. Second, the external driving is restored from the realisation of the non-autonomous system, taking the obtained parameter estimates into account.
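    For the simplest case of a one-dimensional map with additive driving, \(x_{n+1}=f(x_{n})+u_{n}\), the two steps reduce to a few lines; the polynomial form of f and the additive way the driving enters are illustrative assumptions.

    ```python
    import numpy as np

    def restore_driving(x_auto, x_driven, order=3):
        """Toy restoration for x_{n+1} = f(x_n) + u_n: f is estimated (as a polynomial)
        from the autonomous record, and the driving is recovered as the residual of
        the driven record; f must be valid over the range visited by the driven orbit."""
        c_hat = np.polyfit(x_auto[:-1], x_auto[1:], order)       # step (i): autonomous fit
        u_hat = x_driven[1:] - np.polyval(c_hat, x_driven[:-1])  # step (ii): residuals
        return u_hat
    ```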

  3. (3)

    Signal classification is another important task, where one divides a set of objects into groups (classes) of similar objects based on an analysis of the experimental data. A general approach involves the definition of a distance between two signals, the computation of the distances for each pair of signals and the division of the signals into groups (clusters) with any of the existing clustering approaches. The distance between signals can be defined, in particular, by estimating a difference between empirical models constructed from the signals (Kadtke, 1997). Different clustering algorithms are studied in cluster analysis (Kendall, 1979). We note that the problem of quasi-stationary segment selection (Sect. 11.1) can be formulated in the same terms: different segments of a signal can be considered as different signals, which are united into clusters (quasi-stationary intervals) based on the kind of recurrence diagram shown in Fig. 11.1c, d.
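    Given a matrix of pairwise model-based distances, such as the one of Fig. 11.1d, a standard clustering routine finishes the job; the choice of average-linkage hierarchical clustering and of the cut distance below is ours, purely for illustration.

    ```python
    from scipy.cluster.hierarchy import fcluster, linkage
    from scipy.spatial.distance import squareform

    def cluster_signals(d, cut_distance):
        """Group signals (or segments) by hierarchical clustering of the symmetric,
        zero-diagonal distance matrix d; returns a cluster label for each signal."""
        condensed = squareform(d, checks=False)   # condensed form required by linkage
        tree = linkage(condensed, method="average")
        return fcluster(tree, t=cut_distance, criterion="distance")
    ```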

  4. (4)

    Automatic control of a technical object is realised via regulation of the available parameters. The search for optimal parameter values can be organised in practice as follows: (i) one performs measurements at different values of the object parameters; (ii) one constructs an empirical model with a given structure from each set of data; (iii) one reveals the model parameters whose variations correspond to variations in the governing parameters of the object; (iv) via investigation of the model, one reveals the model parameters which strongly influence the character of the model dynamics; (v) via investigation of the model, one finds the values of its most influential parameters which correspond to “the best regime of the object’s functioning”; (vi) finally, one specifies the values of the object parameters corresponding to the model parameters found at the previous step. Such an approach is suggested in Gribkov et al. (1994b) and partly realised in relation to a system of resonance frequency and temperature stabilisation in a section of a linear electron accelerator.

  5. (5)

    Estimation of attractor characteristics from a short time series (Pavlov et al., 1997). One of the important problems in non-linear time series analysis is the computation of such characteristics of an attractor as Lyapunov exponents and fractal dimensions (see, e.g., Eckmann, 1985, 1992; Ershov and Potapov, 1998; Kantz, 1995; Theiler, 1990; Wolf et al., 1985). However, reliable estimates can be obtained directly only from a time series which is very long (so that an orbit returns to the vicinity of any of its points many times) and sufficiently “clean” (noise-free). Such data are typically unavailable in practice. A global dynamical model can often be constructed from a much shorter time series if it involves a small number of free parameters. After obtaining a model, one can compute the Lyapunov exponents and fractal dimensions of its attractor from a numerically simulated, arbitrarily long time realisation (Sect. 2.1.4) and take them as the sought estimates.
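    As a toy illustration for a one-dimensional map model, the largest Lyapunov exponent can be computed from the fitted map itself; the polynomial model form and the assumption that the fitted map remains bounded along the simulated orbit are ours.

    ```python
    import numpy as np

    def lyapunov_from_model(short_series, order=6, n_iter=100_000):
        """Fit x_{n+1} = f(x_n) with a polynomial f to a short record and estimate the
        largest Lyapunov exponent from an arbitrarily long simulated orbit of the model."""
        c = np.polyfit(short_series[:-1], short_series[1:], order)
        dc = np.polyder(c)                      # derivative f'(x) of the fitted map
        x, lyap = short_series[-1], 0.0
        for _ in range(n_iter):
            lyap += np.log(abs(np.polyval(dc, x)) + 1e-300)   # guard against log(0)
            x = np.polyval(c, x)
        return lyap / n_iter
    ```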

  6. (6)

    Testing for non-linearity and determinism (Gerschenfeld and Weigend, 1993; Small et al., 2001). When investigating the dynamics of a complex object, one often does not manage to get a valid model, to estimate a dimension reliably, etc. Then, one poses “more modest” questions, which can also be quite meaningful. One of them is whether the dynamics of an object is non-linear. An answer can be obtained with the use of empirical models. A possible technique is as follows.

    One constructs local linear models with different numbers of neighbours k (Eq. (10.11) in Sect. 10.2.1). The root-mean-squared one-step-ahead prediction error ε, computed from a test time series or via the cross-validation technique (Sect. 7.2.3), is plotted versus k. A value of k close to the length of the training time series corresponds to a global linear model. At small values of k, one gets different sets of model parameter values for different small domains of the phase space. If the original process is linear, then the prediction error ε decreases monotonically with k, since the parameter estimates become more accurate (due to the use of a greater amount of data) without violation of the model validity. If the process is non-linear, then the error ε reaches its minimum at some intermediate value of k, which is big enough to reduce the noise influence and small enough to provide a good accuracy of the local linear approximation. Thereby, one can infer the presence of non-linearity, and estimate the scale at which it manifests itself, from the plot \(\varepsilon(k)\) (Gerschenfeld and Weigend, 1993).
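    A minimal version of this test for a scalar series (embedding dimension 1, whereas Eq. (10.11) generally uses delay vectors) might look as follows; the set of k values to scan is an arbitrary choice.

    ```python
    import numpy as np

    def local_linear_error(train, test, k):
        """RMS one-step prediction error of local linear models x_{n+1} ~ a*x_n + b,
        each built from the k nearest neighbours of the current test point."""
        x_tr, y_tr = train[:-1], train[1:]
        errors = []
        for x_now, x_next in zip(test[:-1], test[1:]):
            idx = np.argsort(np.abs(x_tr - x_now))[:k]   # k nearest neighbours
            a, b = np.polyfit(x_tr[idx], y_tr[idx], 1)   # local linear fit
            errors.append(a * x_now + b - x_next)
        return np.sqrt(np.mean(np.square(errors)))

    # eps_of_k = {k: local_linear_error(train, test, k)
    #             for k in (10, 30, 100, 300, len(train) - 1)}
    # a minimum at an intermediate k indicates non-linearity
    ```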

  7. (7)

    Adaptive noise reduction (Davies, 1994; Farmer and Sidorowich, 1991; Kostelich and Schreiber, 1993). An observed signal \(\eta(t)\) often represents a sum of a “useful” signal \(X(t)\) and a “harmful” signal \(\xi(t)\), which is called “noise”:

    $$\eta (t)=X(t)+\xi (t).$$
    ((11.3))

    Then, the problem is to extract the signal \(X(t)\) from \(\eta(t)\), i.e. to get a signal \(\hat{X}(t)\) which differs from \(X(t)\) less strongly than the observed signal \(\eta(t)\) does (Farmer and Sidorowich, 1991). The expression (11.3) describes measurement noise, which is especially harmful in numerical differentiation, where it is usually reduced with a Savitzky–Golay filter (Sect. 7.4.2). The latter is a kind of linear filtering (Hamming, 1983; Rabiner and Gold, 1975). However, all linear filters are based on the assumption that the “interesting” dynamics \(X(t)\) and the noise \(\xi(t)\) have different characteristic timescales, i.e. their powers are concentrated in different frequency bands (Sect. 6.4.2). As a rule, the noise is assumed to be a very high-frequency signal (this is implied when the Savitzky–Golay filter is used in differentiation) or a very low-frequency signal (slow drifts of the mean and so forth).

    However, the noise may have the same timescale as the useful signal. Then, linear filtering cannot help, but non-linear noise reduction based on the construction of non-linear empirical models may prove efficient. The principal idea is to fit a non-linear model to an observed time series taking measurement noise into account (see, e.g., Sects. 8.1.2 and 8.2). The residual errors \(\varepsilon(t)\) of the model fitting are considered as estimates of the noise realisation \(\xi(t)\). Then, one estimates the useful signal by subtracting \(\varepsilon(t)\) from the observed signal \(\eta(t)\):

    $$\hat X(t) = \eta (t) - \varepsilon (t).$$
    ((11.4))

    There are many implementations of the approach. One of the first works (Farmer and Sidorowich, 1991) exploited local linear models. In any case, the selection of the model size and the choice of the approximating function form are important, since an inadequate model would give biased estimates \(\hat{X}(t)\) which may differ strongly from the true signal \(X(t)\), i.e. a further distortion of the signal would occur rather than a noise reduction.
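    A deliberately crude global-model sketch of the bookkeeping in Eqs. (11.3) and (11.4) is given below; it fits a simple polynomial map by ordinary least squares and ignores the error-in-variables aspect discussed in Sects. 8.1.2 and 8.2, so it only illustrates the idea rather than the actual noise-reduction schemes of the cited works.

    ```python
    import numpy as np

    def model_noise_reduction(eta, order=9):
        """Fit x_{n+1} = f(x_n) to the observed signal, treat the one-step residuals
        as the noise estimate eps(t) and subtract them, cf. Eqs. (11.3) and (11.4)."""
        eta = np.asarray(eta, dtype=float)
        c = np.polyfit(eta[:-1], eta[1:], order)
        eps = np.zeros_like(eta)                       # no residual for the first point
        eps[1:] = eta[1:] - np.polyval(c, eta[:-1])    # residuals ~ noise realisation
        return eta - eps                               # estimate of X(t), Eq. (11.4)
    ```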

  8. (8)

    Finally, let us mention again such a promising possibility as the restoration of equivalent characteristics of non-linear elements in electric circuits and other systems (Sect. 9.3). Such characteristics can be reconstructed via empirical modelling even in regimes of large oscillation amplitudes and chaos, where ordinary tools may be inapplicable. This approach has been successfully used to investigate dynamical characteristics of a ferroelectric capacitor (Hegger et al., 1998), semiconductor elements (Sysoev et al., 2004; Timmer et al., 2000) and fibre-optical systems (Voss et al., 1999).