1 Introduction

It is well known that the Fourier transform provides a frequency-domain representation for signals whose spectral characteristics do not change with time. When the signal is nonstationary, however, the Fourier transform needs to be replaced with a time-frequency representation (TFR) to obtain information about the spectral characteristics at different times. TFRs characterize signals over a time-frequency plane by combining time-domain and frequency-domain analyses, and they reveal the temporal localization of spectral components. We can obtain the spectral characteristics at different times via the short-time Fourier transform (STFT). The bilinear distributions and the evolutionary spectrum (ES) are other approaches to obtaining spectral characteristics Boashash (2003). Priestley’s ES theory generalizes the definition of spectra for nonstationary signals Priestly (1967). Accordingly, the Wold-Cramér ES considers a nonstationary signal as the output of a linear time-varying (LTV) system driven by stationary white noise Kayhan et al. (1992). There have been different approaches to estimating the ES, such as the evolutionary periodogram (EP) Kayhan et al. (1994). The EP is based on projections of the spectrum onto the time and frequency domains using an orthonormal basis set Kayhan et al. (1994). In the case of deterministic nonstationary signals, the discrete evolutionary transform (DET) allows the computation of a kernel and the corresponding ES Suleesathira et al. (1998). Similarly, in array signal processing, the signal received by each sensor of the array can be modeled as a sum of complex sinusoids with time-varying complex amplitudes Kayhan and Moeness (2000). As shown in Kayhan and Moeness (2000), the time-varying amplitudes can be estimated using linear estimators obtained via minimum mean-squared error criteria. These estimates are then used for the estimation of time-varying cross-power distributions of the data across the array.

In this paper we propose a representation for the evolutionary spectrum of nonstationary signals that can be applied to the Blind Source Separation (BSS) problem. In its simplest form, the BSS problem can be defined as recovering n mutually independent unknown sources from m linear observations (mixtures) of them Johnson and Dudgeon (1993). BSS methods are used to extract important information from mixtures of sources in applications such as speech processing, image processing, biomedical signal processing, and telecommunications. For example, in Electroencephalography (EEG) recordings, voltage fluctuations resulting from ionic currents within the neurons of the brain are measured non-invasively. As all the electrodes are placed along the scalp, what we actually observe in EEG data is a mixture of all the active sources. Since the electrical signals must travel through human tissue to reach the electrodes, each measured signal can be assumed to be a linear mixture of source signals Makeig et al. (1996). In addition, scalp-recorded EEG signals include non-brain sources such as electrooculographic (EOG) and electromyographic (EMG) activities. It was shown in Delorme et al. (2007) that BSS methods are very useful for extracting these sources from EEG recordings. Separation of these sources from a mixture of observations is crucial in the analysis of the recordings. The inverse of the unmixing matrix can also be used to provide a spatial illustration of the scalp location associated with each BSS-extracted signal Delorme et al. (2012). If the number of sources is unknown, it can be estimated using methods such as those in Fujita (2012); Yuan et al. (2008).

BSS algorithms can be classified into those based on statistical information available about the source signals Coivunen et al. (2001); Everson and Roberts (2018) and those that exploit differences in the time-frequency signatures of the sources to be separated Pal et al. (2013). An example for the overdetermined case, i.e., when the number of observations is greater than or equal to the number of sources (\(n\le m\)), based on second-order statistics and joint diagonalization of a set of covariance matrices, can be found in Belouchrani et al. (1997). Another example, based on spatial time-frequency distributions (TFDs) as a generalization of bilinear TFDs for nonstationary signals, is given in Belouchrani and Amin (1998); Sekihara et al. (1999). Although bilinear TFDs have good localization properties, they display cross-terms, and the positivity of the spectral estimates is not guaranteed Cohen (1995). Alternatively, the EP as an estimator of the Wold-Cramér ES was extended to array processing in semi-homogeneous random fields Bohme (1979).

The evolutionary spectral representation we propose is based on the discrete prolate spheroidal sequences (DPSS) Slepian (1962). The DPSS are defined to be the sequences with maximum spectral concentration for a given duration and bandwidth. We use the spatial evolutionary Slepian spectrum in combination with a whitening technique to estimate the mixing matrix and separate the source signals. The paper is organized as follows. In the next section, we review the evolutionary spectrum and provide the fundamental equations of signal representation. In Sect. 3, we present the proposed evolutionary Slepian transform. We review the BSS problem and the related formulation in Sect. 4. In Sect. 5, we present experimental results. Conclusions follow.

2 Review of evolutionary spectrum and periodogram estimator

Introduced by Priestley (1981), the evolutionary spectrum (ES) theory describes the local power-frequency distribution at each instant of time Priestly (1967). In particular, the Wold-Cramér ES considers a nonstationary signal as the output of a linear time-varying (LTV) system driven by stationary white noise Melard and Schutter (1989), and the evolutionary periodogram (EP) was proposed for the estimation of the Wold-Cramér ES Kayhan et al. (1994). To review the Wold-Cramér ES, we start with the representation of a discrete-time nonstationary process as the output of a causal LTV system with impulse response h[n,m] as

$$\begin{aligned} x[n]=\sum \limits _{m = - \infty }^{n}{h[n,m]\varepsilon [m]}, \end{aligned}$$
(1)

where \(\{\varepsilon [m]\}\) is a stationary, zero-mean, unit-variance white noise process. The representation in (1) is known as the Wold-Cramér decomposition Priestley (1981). The process \(\{\varepsilon [m]\}\) can be expressed as a sum of sinusoids with random amplitudes and phases:

$$\begin{aligned} \varepsilon [m]= \int \limits _{ - \pi }^{\pi } e^{j\omega m}dZ(\omega ). \end{aligned}$$
(2)

Accordingly, the nonstationary process \(\{x[n]\}\) can be expressed as

$$\begin{aligned} x[n] = \int \limits _{ - \pi }^{\pi } {H(n,\omega )e^{ j\omega n} { d}Z(\omega )}, \end{aligned}$$
(3)

where

$$\begin{aligned} H(n,\omega ) = \sum \limits _{m = - \infty }^{n} {h[n,m]e^ { - j\omega (n - m)}}, \end{aligned}$$
(4)

and \(Z(\omega )\) is a process with orthogonal increments. The variance of x[n]

$$\begin{aligned} E\{|x[n]|^2\}=\frac{1}{2\pi } \int \limits _{ - \pi }^{\pi }|H(n,\omega )|^2 d\omega , \end{aligned}$$
(5)

provides the power distribution of the nonstationary process \(\{x[n]\}\) at each time n as a function of the frequency parameter \(\omega\). The Wold-Cramér ES is defined as \(S(n,\omega )=|H(n,\omega )|^{2}\), and the cross-power ES for two processes \(\{x[n]\}\) and \(\{y[n]\}\) is given as \(S_{xy}(n,\omega )=H_x(n,\omega ) H^*_y(n,\omega )\). This definition was also proposed in Melard and Schutter (1989) as a special case of Priestley’s ES, obtained if one restricts the function \(H(n,\omega )\) to the class of oscillatory functions that are slowly varying in time. In Kayhan et al. (1994), a similar condition was applied to model the component of x[n] at a particular frequency of interest, \(\omega _0\), as

$$\begin{aligned} x_0[n]=H(n,\omega _0)e^{ j\omega _0 n} {d}Z(\omega _0), \end{aligned}$$
(6)

such that

$$\begin{aligned} x[n]=x_0[n]+y[n]=A(n,\omega _0)e^{ j\omega _0 n}+y[n], \end{aligned}$$
(7)

where \(A(n,\omega _0)=H(n,\omega _0){ d}Z(\omega _0)\) represents the time-varying complex amplitude and y[n] is the zero-mean modeling error, which includes the components of x[n] at frequencies different from \(\omega _0\). It can be derived that

$$\begin{aligned} E\{|A(n,\omega _0)|^2\}=S(n,\omega _0)\frac{d\omega _0}{2\pi }, \end{aligned}$$
(8)

and, using x[n] and \(A(n,\omega _0)\), \(S(n,\omega _0)\) can be estimated Kayhan et al. (1994). Repeating this process for all frequencies \(\omega\) yields an estimate of the time-dependent spectral density \(S(n,\omega )\) Kayhan et al. (1994). In this case, assuming that \(A(n,\omega _0)\) also varies with time, it can be represented as an expansion in the orthonormal functions \(\{ \beta _i[n]\}\) over \(0\le n \le N-1\),

$$\begin{aligned} A(n,\omega _0)=\sum _{i=0}^{M(\omega _0)-1} \beta _i^{*}[n]a_i=\mathbf b[n]^H\mathbf a. \end{aligned}$$
(9)

The vectors \(\mathbf a=[a_0, \ldots ,a_{M-1}]^T\) and \(\mathbf b[n]=[\beta _0[n],\ldots ,\beta _{M-1}[n]]^T\) represent a vector of random expansion coefficients and a vector of orthonormal functions at time n, respectively. The number of expansion functions \(M\le N\) depends on the frequency \(\omega _0\) and indicates the degree to which \(A(n,\omega _0)\) varies with time. For small M, \(A(n,\omega _0)\) is slowly varying, and for large values of M, \(A(n,\omega _0)\) is rapidly varying. Thus, any time behavior of \(A(n,\omega _0)\) can be approximated by changing M. The order of the expansion must be kept at a minimum to improve frequency resolution Kayhan et al. (1994). The minimum MSE estimate of \(A(n,\omega _0)\) is given as

$$\begin{aligned} \hat{A}(n,\omega _0)=\sum _{i=0}^{M-1} \beta _i^{*}[n]\sum _{k=0}^{N-1} \beta _i [k]x[k]e^{ -j\omega _0 k}, \end{aligned}$$
(10)

and evaluating it for all possible frequencies yields the time-varying spectral density estimate called the EP Kayhan et al. (1994). The relation between the estimator and the time-varying spectral density is

$$\begin{aligned} \hat{S}(n,\omega )=\frac{2\pi }{d\omega }|\hat{A}(n,\omega )|^2=\frac{N}{M}|\sum _{i=0}^{M-1} \beta _i^{*}[n]\sum _{k=0}^{N-1} \beta _i [k]x[k]e^{ -j\omega k}|^2. \end{aligned}$$
(11)

Rewriting

$$\begin{aligned} \hat{S}(n,\omega )=\frac{N}{M}\big |\sum _{k=0}^{N-1} v[n,k] x[k]e^{ -j\omega k}\big |^2, \end{aligned}$$
(12)

where \(\hat{S}(n,\omega )\) can be interpreted as the magnitude square of the Fourier transform of x[k] windowed by the sequence \(v[n,k]= \sum _{i=0}^{M-1} \beta _i^{*}[n] \beta _i [k]\). Using the model in (7) at frequency \(\omega _0\), the derivations above can be extended to array processing as in Kayhan and Moeness (2000). For example, considering signals \(\{x_l[n]\}\), \(1\le l\le L\), \(0\le n \le N-1\), where L is the number of sensors and N is the number of data snapshots, \(\{A_l (n,\omega _0)\}\) can be represented as an expansion of M orthogonal basis functions for the sensor data \(x_l[n]\) as

$$\begin{aligned} A_l(n, \omega _0)=\sum _{i=0}^{M(\omega _0)-1} \beta _i^{*}[n]a_{l,i}, \end{aligned}$$
(13)

and \(x_l[n]\) can be expressed over the observation interval in vector form

$$\begin{aligned} \mathbf{{x}}_l=\mathbf{{F}}(\omega _0)\mathbf{{a}}_l(\omega _0)+\mathbf{{y}}_l(\omega _0), \end{aligned}$$
(14)

where \(\mathbf{F}(\omega _0)\) is a matrix with entries \(\mathbf{F}_{n+1,i+1}=\beta _i^{*}[n]e^{j\omega _0 n}\) Kayhan and Moeness (2000). A precise representation can be obtained in the joint TF domain if we have enough knowledge of the spectral characteristics of the signals; otherwise, bandwidth estimation techniques such as Tsiakoulis et al. (2013); Wang and Yong (2016); Marques (2006); Liebeherr et al. (2016, 2007) can be used to obtain M. Letting \(\mathbf{a}[n]=\mathbf{b}[n]^H\mathbf{A}\) be the vector of amplitudes at time n, the estimates of the time-varying amplitudes are obtained as \(\mathbf{\hat{a}}[n]=\mathbf{b}[n]^H\mathbf{F}^H \mathbf{x}\) via the MSE estimator. Then, in array signal processing, the cross-power evolutionary spectral density estimator can be computed as

$$\begin{aligned} \mathbf{{\hat{S}}}_{xx}(n,\omega )=E\{\mathbf{{\hat{a}}}[n]^H\mathbf{{\hat{a}}}[n]\}, \end{aligned}$$
(15)

which is also

$$\begin{aligned} \mathbf{{\hat{S}}}_{xx}(n,\omega )=(\mathbf{{b}}[n]^H\mathbf{{F}}^H)\otimes _l\mathbf{{R}}\otimes _r(\mathbf{{Fb}}[n]), \end{aligned}$$
(16)

where \(\otimes _l\) and \(\otimes _r\) are the left and right block Kronecker products, respectively, and \(\mathbf{R}=E\{\mathbf{x}\mathbf{x}^H\}\), with E denoting the expectation operator. The cross-power evolutionary spectral density estimator at time n and frequency \(\omega\) between the data at sensors \(\ell\) and m is obtained as \(\hat{\mathbf{S}}_{x_\ell x_m}(n,\omega )\) Kayhan and Moeness (2000).
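
Before turning to the proposed transform, the estimator in (12) can be summarized in a short numerical sketch. The sketch below is illustrative only: it uses a frequency-independent orthonormal polynomial basis for \(\beta _i[n]\), whereas in Kayhan et al. (1994) the basis and its order M may depend on the frequency of interest.

```python
import numpy as np

def evolutionary_periodogram(x, M=4, n_freqs=256):
    """Evolutionary periodogram of (12): S(n, w) = (N/M)|sum_k v[n,k] x[k] e^{-jwk}|^2."""
    N = len(x)
    # Orthonormal basis beta_i[n], i = 0..M-1, built here by orthonormalizing a
    # polynomial (Vandermonde) matrix; any orthonormal set over 0..N-1 can be used.
    V = np.vander(np.linspace(-1.0, 1.0, N), M, increasing=True)
    beta, _ = np.linalg.qr(V)                      # beta[:, i] = beta_i[n]
    v = beta @ beta.conj().T                       # time-varying window v[n, k]
    w = 2.0 * np.pi * np.arange(n_freqs) / n_freqs
    E = np.exp(-1j * np.outer(np.arange(N), w))    # e^{-j w k}, shape (N, n_freqs)
    windowed = v * x[None, :]                      # row n holds v[n, k] x[k]
    return (N / M) * np.abs(windowed @ E) ** 2     # shape (N, n_freqs)
```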

3 Proposed evolutionary spectrum

3.1 Discrete evolutionary transform

In this section, we briefly review the discrete evolutionary transform (DET). In Suleesathira et al. (1998), the DET was defined to represent a nonstationary signal and its spectrum. The DET can be thought of as a generalization of the short-time Fourier transform (STFT) and can be connected to the EP Suleesathira et al. (1998). Using the Gabor or the Malvar representation together with the Wold-Cramér representation, an evolutionary kernel can be obtained, and the ES is the magnitude square of this evolutionary kernel Suleesathira et al. (1998).

The Wold-Cramér representation, similar to (1), can be written as

$$\begin{aligned} x[n]= \sum _{k=0}^{K-1} X(n,\omega _k) e^{ j\omega _k n}, \end{aligned}$$
(17)

where \(\omega _k=2\pi k/ K\), \(0 \le n\le N-1\), and \(X(n,\omega _k)\) is called the evolutionary kernel Suleesathira et al. (1998). The DET can be obtained by expressing the kernel directly in terms of the signal by considering the Gabor and the Malvar representations of x[n]. In this case, for the sinusoidal representation in (1),

$$\begin{aligned} X(n,\omega _k)= \sum _{\ell =0}^{N-1} x(\ell ) W_k(n,\ell ) e^{ -j\omega _k \ell }, \end{aligned}$$
(18)

is an inverse discrete transformation that provides the evolutionary kernel \(X(n,\omega _k)\) in terms of the signal. Here \(W_k(n,\ell )\) is, in general, a time- and frequency-dependent window Suleesathira et al. (1998). The ES is then defined as \(S_E(n,\omega _k)=|X(n,\omega _k)|^2\). It becomes clear that the DET is a generalization of the STFT and that \(S_E(n,\omega _k)\) is a generalization of the spectrogram. A similar representation for the kernel was obtained in Kayhan et al. (1994) when developing the EP, by expressing the time-varying window in terms of a set of orthogonal functions.

3.2 Evolutionary Slepian transform and spectrum

Stationary and nonstationary random processes can be represented by general orthogonal expansions as proposed by Priestley (1981). The discrete form of the prolate spheroidal wave functions (PSWF) Slepian (1962), called the discrete prolate spheroidal sequences (DPSS), can be used efficiently for signal decomposition Oh et al. (2010). They are also known as Slepian sequences. The PSWF have been used in time series analysis Moghtaderi et al. (2009). Indeed, the PSWF have been used in many applications: one example is in communication theory Moore and Cada (2004), their wavelet-like properties are discussed in Simons et al. (2018), and their mathematical properties and computation are presented in Walter and Shen (2003). The DPSS resulted from Slepian’s work on the problem of concentrating a signal jointly in the temporal and spectral domains Slepian (1962).

Given N and \(0< \Omega <1/2\), the DPSS are a collection of N real-valued, strictly bandlimited (\(|f|\le \Omega\)) discrete-time sequences \({\phi }_{N, \Omega }=\left[ \phi ^{(1)}_{N,\Omega }, \phi ^{(2)}_{N,\Omega }, \cdots , \phi ^{(N)}_{N,\Omega }\right]\) with corresponding eigenvalues \(1>\lambda ^{(1)}_{N,\Omega }>\lambda ^{(2)}_{N,\Omega }>\cdots>\lambda ^{(N)}_{N,\Omega }>0\). The first Slepian sequence maximizes the energy concentration ratio, i.e., the fraction of its energy contained in the index range \(0\le n\le N-1\); the eigenvalue \(\lambda ^{(k)}_{N,\Omega }\) equals the concentration ratio of the k-th sequence. The second Slepian sequence maximizes the same ratio subject to being orthogonal to the first, the third maximizes it subject to being orthogonal to the first two, and, continuing in this way, the Slepian sequences form an orthogonal set of bandlimited sequences. Approximately \(2N\Omega -1\) Slepian sequences have energy concentration ratios close to one; for the rest, the concentration ratios rapidly approach zero (see Fig. 1). For a given integer \(K\le N\), we can form an \(N\times K\) matrix by taking the first K columns of \(\phi_{N,\Omega}\). When \(K\approx 2N\Omega\), this is a highly efficient basis that captures most of the signal energy. The performance of the ES depends on how well the signals are represented using the DPSS. Details on how many DPSS are needed for an optimum representation can be found in Oh et al. (2010); Moore and Cada (2004); Walter and Shen (2003).
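
For illustration, the Slepian sequences in Fig. 1 can be generated numerically with SciPy’s dpss routine; the short sketch below (our own, with the same N=512 and N\(\Omega\)=3) also prints the concentration ratios and verifies the orthonormality of the retained sequences.

```python
import numpy as np
from scipy.signal.windows import dpss

N, NW = 512, 3                      # length and time-halfbandwidth product N*Omega
K = 2 * NW                          # keep roughly 2*N*Omega well-concentrated sequences
tapers, ratios = dpss(N, NW, Kmax=K, return_ratios=True)

print(tapers.shape)                 # (K, N): row k is the k-th Slepian sequence
print(np.round(ratios, 6))          # concentration ratios (eigenvalues), close to one
print(np.allclose(tapers @ tapers.T, np.eye(K), atol=1e-6))  # orthonormal set
```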

Fig. 1 Left: first four Slepian sequences for N=512 and N\(\Omega\)=3; right: energy concentrations, i.e., eigenvalues

In general, a signal x[n] can be represented in terms of an orthogonal basis \(\{\phi _{k}[n]\}\) as,

$$\begin{aligned} x[n]=\sum _{k=0}^{K-1}d_k \phi _{k}[n], ~~~0\le n\le N-1,~~ d_k=\sum _{n=0}^{N-1}x[n]\phi ^*_{k}[n], ~~~ 0\le k\le K-1. \end{aligned}$$
(19)

We showed in Oh et al. (2010) that, by rewriting x[n] as

$$\begin{aligned} x[n]=\sum _{k=0}^{K-1}\underbrace{\bigg [d_k \phi _{k}[n]e^{-j\omega _k n}\bigg ]}_{X(n,\omega _k)}e^{j\omega _k n}, \end{aligned}$$
(20)

where \(\omega _k=2\pi \frac{k}{N}\), the evolutionary kernel \(X(n,\omega _k)\) can be expressed in terms of x[n] as

$$\begin{aligned} X(n,\omega _k)=d_k \phi _{k}[n]e^{-j\omega _k n} =\sum _{m=0}^{N-1}x[m]W_k(n,m)e^{-j\omega _k m}, \end{aligned}$$
(21)

where \(W_k(n,m)=\phi _{k}[n]\phi ^*_{k}[m]e^{-j\omega _k (n-m)}.\) To obtain the evolutionary kernel, specifically the window \(W_k(n,m)\), we considered the DPSS \(\{\phi _{k}[n]\}\) as the basis functions of the representation in Oh et al. (2010). Accordingly, by taking the magnitude square \(|X(n,\omega _k)|^2\), we obtain the evolutionary Slepian spectrum.
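
A minimal numerical sketch of this construction is given below; it follows our reading of (19)-(21), with the number of retained sequences \(K\approx 2N\Omega\) chosen purely for illustration.

```python
import numpy as np
from scipy.signal.windows import dpss

def slepian_evolutionary_spectrum(x, NW=3, K=None):
    """Evolutionary Slepian spectrum |X(n, w_k)|^2 with X as in (21)."""
    N = len(x)
    K = K if K is not None else 2 * NW
    phi = dpss(N, NW, Kmax=K, norm=2)            # phi[k, n]: k-th Slepian sequence
    n = np.arange(N)
    w_k = 2.0 * np.pi * np.arange(K) / N         # omega_k = 2*pi*k/N
    d = phi @ x                                  # d_k = sum_n x[n] phi_k[n] (phi real)
    X = d[:, None] * phi * np.exp(-1j * np.outer(w_k, n))   # evolutionary kernel
    return np.abs(X) ** 2                        # shape (K, N): frequency x time
```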

3.3 Windowed evolutionary Slepian transform and spectrum

Starting from the general definition of the DET and multiplying by a Gaussian window h(n), we define the windowed evolutionary Slepian transform as follows:

$$\begin{aligned} x[n]=\sum _{m=0}^{M-1}\sum _{k=0}^{K-1} d_{m,k} \phi _k[n] h(n-mL), \end{aligned}$$
(22)

Modifying (22) by multiplying by both \(e^{-j \omega _kn }\) and \(e^{j \omega _kn }\) (which has no effect, since \(e^{-j \omega _kn }e^{j \omega _kn }=1\)), we can identify the evolutionary kernel \(X(n,\omega _k)\) as follows:

$$\begin{aligned} x[n]= \sum _{k=0}^{K-1}\underbrace{\bigg [ \sum _{m=0}^{M-1} d_{m,k} \phi _k[n]h(n-mL)e^{-j\omega _kn}\bigg ] }_{X(n, \omega _k)}e^{j\omega _k n}. \end{aligned}$$
(23)

The coefficients \(d_{m,k}\) can be calculated as

$$\begin{aligned} d_{m,k}= \sum _{\ell =0}^{N-1} x[\ell ] \phi _{k}^*[\ell ]\gamma ^*(\ell -mL). \end{aligned}$$
(24)

Rewriting \(X(n, \omega _k)\), we obtain

$$\begin{aligned} X(n, \omega _k)=\sum _{\ell =0}^{N-1}\sum _{m=0}^{M-1}x(\ell ) \phi _{k}^*[\ell ] \phi _{k}[n]\varphi [n, \ell ]e^{-j\omega _kn}, \end{aligned}$$
(25)

where \(\varphi [n, \ell ]=h(n-mL) \gamma ^*(\ell -mL)\), with \(h(n-mL)\) and \(\gamma ^*(\ell -mL)\) being Gaussian functions. Arranging the terms by multiplying by \(e^{-j \omega _k \ell }\) and \(e^{j \omega _k\ell }\) and rearranging,

$$\begin{aligned} X(n, \omega _k)=\sum _{\ell =0}^{N-1}x(\ell ) W(n, \ell ) e^{-j \omega _k \ell }, \end{aligned}$$
(26)

gives an expression for the evolutionary kernel \(X(n, \omega _k)\) that is similar to the STFT, and also to the EP, with the time-frequency-dependent window \(W(n, \ell )\) given by

$$\begin{aligned} W(n, \ell )= \sum _{m=0}^{M-1} \phi _{k}^*[\ell ] \gamma ^*(\ell -mL) e^{j\omega _k \ell } \phi _{k}[n]h(n-mL)e^{-j\omega _k n}. \end{aligned}$$
(27)

Accordingly, by taking the magnitude square \(S(n, \omega _k)=|X(n, \omega _k)|^2\), we obtain the evolutionary Slepian spectrum (ESS) in windowed form, which we call the windowed evolutionary Slepian spectrum (WESS) and illustrate in the simulations.
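
For completeness, a sketch of the WESS computation is given below. The Gaussian windows h and \(\gamma\), their length, and the hop size L are not specified further in the derivation above, so the choices in the sketch are purely illustrative.

```python
import numpy as np
from scipy.signal.windows import dpss, gaussian

def wess(x, NW=3, K=None, L=32, win_len=128, std=None):
    """Windowed evolutionary Slepian spectrum |X(n, w_k)|^2 with X as in (23)."""
    N = len(x)
    K = K if K is not None else 2 * NW
    std = std if std is not None else win_len / 6.0
    phi = dpss(N, NW, Kmax=K, norm=2)                     # phi[k, n]
    n = np.arange(N)
    w_k = 2.0 * np.pi * np.arange(K) / N
    # Shifted Gaussian windows h(n - mL) = gamma(n - mL) on the analysis grid
    M = N // L
    g = gaussian(win_len, std)
    H = np.zeros((M, N))
    for m in range(M):
        seg = g[: min(win_len, N - m * L)]
        H[m, m * L : m * L + len(seg)] = seg
    d = (H[None, :, :] * phi[:, None, :]) @ x             # d_{m,k} as in (24), (K, M)
    X = (d[:, :, None] * H[None, :, :]).sum(axis=1)       # sum over m, shape (K, N)
    X = X * phi * np.exp(-1j * np.outer(w_k, n))          # kernel X(n, w_k) of (23)
    return np.abs(X) ** 2
```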

Fig. 2 TVARMA sources and their corresponding WESS

Fig. 3 Separation of sources from noiseless observations and comparison with the actual sources using WESS

Fig. 4 Separation of sources from noisy observations (SNR 20 dB) and comparison with the actual sources using WESS

4 Blind source separation problem

4.1 Problem formulation

Blind source separation (BSS) covers a wide range of applications in diverse fields such as digital communications, pattern recognition, biomedical engineering, and financial data analysis, among others. Separation of unknown signals that have been mixed in an unknown way has been a topic of great interest in the signal processing community, as well. In general, the available BSS methods use the following data model for each signal received at each sensor Belouchrani et al. (1997):

$$\begin{aligned} \mathbf{{x}}[n]=\mathbf{{C}}{} \mathbf{{s}}[n]+\mathbf{{\mu }}[n], \end{aligned}$$
(28)

such that

  • \(\mathbf{{x}}[n]=[x_1[n], \ldots , x_p[n]]^T\) is a \(p\times 1\) vector of observations,

  • \(\mathbf{{s}}[n]=[s_1[n], \ldots , s_q[n]]^T\) is a \(q\times 1\) vector of unknown sources,

  • \(\mathbf{{C}}\) is a \(p\times q\) mixing or array matrix,

  • \(\mathbf{{\mu }}[n]\) is a zero-mean white noise vector with variance \(\sigma ^2\).

The objective is to obtain an estimate \(\mathbf {\hat{C}}\) of \(\mathbf {C}\) and to recover the sources as

$$\begin{aligned} \mathbf{{\hat{s}}}[n]=\mathbf{{\hat{C}}}^\# \mathbf{{x}}[n]\approx \mathbf{{G}}{} \mathbf{{s}}[n]+\mathbf{{\hat{C}}}^\# \mathbf{{\mu }}[n] \end{aligned}$$
(29)

where \(\#\) denotes the pseudoinverse and \(\mathbf{{G}}\) is a matrix with only one nonzero entry per row and column Belouchrani et al. (1997). In particular, the approaches using time-frequency signal representations for BSS involve the following steps Fevotte and Doncarli (2004):

  • Estimation of the spatial time-frequency spectra,

  • Estimation of the whitening matrix and the noise variance,

  • Joint diagonalization of the noise-compensated and whitened spatial time-frequency spectral matrices.

The details of these steps and full implementation of the BSS can be found in Belouchrani et al. (1997); Belouchrani and Amin (1998); Fevotte and Doncarli (2004); Cardoso and Souloumiac (1993).

4.2 Spatial evolutionary transform and BSS

In the time-frequency approach to BSS, using the data model for the signal received at each sensor, the cross-power spectral estimate can be written as Kayhan and Moeness (2000),

$$\begin{aligned} \mathbf{{\hat{S}}}_{xx}(n,\omega )=\mathbf{{C}}{} \mathbf{{\hat{S}}}_{ss}(n,\omega )\mathbf{{C}}^H+\sigma ^2 \mathbf{{b}}[n]^H\mathbf{{b}}[n]\mathbf {I}. \end{aligned}$$
(30)

In this paper, \(\mathbf{\hat{S}}_{xx}(n,\omega )\) in the equation above is the spatial evolutionary Slepian spectrum representation. Denoting the \(q\times p\) whitening matrix by \(\mathbf {W}\) and letting \(\mathbf {U}=\mathbf {WC}\) be unitary, the whitened and noise-compensated spectral matrices are

$$\begin{aligned} \mathbf{{\tilde{S}}}_{xx}(n,\omega ) = \mathbf{{W}}(\mathbf{{\hat{S}}}_{xx}(n,\omega )-\sigma ^2\mathbf {I})\mathbf{{W}}^H=\mathbf{{U}}{} \mathbf{{{S}}}_{ss}(n,\omega )\mathbf{{U}}^H \end{aligned}$$
(31)

where \(\mathbf {U}\) is unitary and diagonalizes the cross-power spectral estimate \(\mathbf{\tilde{S}}_{xx}(n,\omega )\) for any \((n,\omega )\) Tong et al. (1991); Fevotte and Doncarli (2004); Cardoso and Souloumiac (1993). The unitary matrix can be estimated from the eigenvectors of any \(\mathbf{\tilde{S}}_{xx}(n,\omega )\) with distinct eigenvalues, and the mixing matrix is obtained using \(\mathbf {C}=\mathbf {W}^\#\mathbf {U}\). The source signals are then estimated as in (29) Fevotte and Doncarli (2004).
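
A schematic sketch of these steps is shown below. It operates on a single precomputed \(p\times p\) spatial spectral matrix with distinct eigenvalues and a noise-variance estimate, whereas in practice several \((n,\omega )\) points are jointly diagonalized Fevotte and Doncarli (2004); the function and variable names are illustrative, not part of a released implementation.

```python
import numpy as np

def separate(X, S_hat_xx, sigma2, q):
    """X: p x T observations; S_hat_xx: p x p spatial spectral matrix at one (n, w)."""
    p, T = X.shape
    R = (X @ X.conj().T) / T                       # sample covariance of observations
    vals, vecs = np.linalg.eigh(R)                 # ascending eigenvalues
    vals, vecs = vals[::-1][:q], vecs[:, ::-1][:, :q]
    W = np.diag(1.0 / np.sqrt(np.maximum(vals - sigma2, 1e-12))) @ vecs.conj().T
    # Whitened, noise-compensated spectral matrix as in (31) and its eigenvectors
    S_tilde = W @ (S_hat_xx - sigma2 * np.eye(p)) @ W.conj().T
    _, U = np.linalg.eigh(S_tilde)                 # unitary when eigenvalues are distinct
    C_hat = np.linalg.pinv(W) @ U                  # mixing matrix estimate C = W^# U
    s_hat = np.linalg.pinv(C_hat) @ X              # source estimates as in (29)
    return C_hat, s_hat
```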

Fig. 5 a Ground-truth simulated EEG signals; b observations of the ground-truth simulated EEG signals using a random mixing matrix for \(n=3\), \(m=4\), and SNR = 20 dB

Fig. 6 Separated sources using a MST-based BSS, b WESS-based BSS, c SWVD-based BSS

Fig. 7 a Ground-truth simulated EEG signals; b observations of the ground-truth simulated EEG signals using a random mixing matrix for \(n=4\) and \(m=5\)

5 Experimental results

In our experiments, we test the applicability of the proposed evolutionary Slepian spectrum (ESS) in the BSS problem. We first apply our method to time-varying autoregressive moving average (TVARMA) processes (see Fig. 2). Based on the time-frequency representation based BSS algorithm in Fevotte and Doncarli (2004), we simulate an overdetermined case (i.e., \(n\le m\)) with three sources and four observations. The overdetermined case is representative of speech processing or telecommunications applications, where there are typically more sensors than sources. We chose random matrices as the mixing matrices to generate the observations and simulated both noise-free and noisy (20 dB SNR) observations. The separation and estimation of the three sources from the noise-free observations using the proposed method is presented in Fig. 3, and the estimation of the sources from the noisy observations (SNR 20 dB) is shown in Fig. 4.
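
For reference, a minimal sketch of how such overdetermined mixtures can be generated under the model in (28) is shown below; the white placeholder sources merely stand in for the TVARMA processes of Fig. 2.

```python
import numpy as np

rng = np.random.default_rng(0)
n_src, n_obs, N = 3, 4, 512
S = rng.standard_normal((n_src, N))        # placeholder sources (TVARMA in the paper)
C = rng.standard_normal((n_obs, n_src))    # random mixing matrix
X_clean = C @ S                            # noise-free observations
snr_db = 20.0
noise_power = np.mean(X_clean ** 2) / (10.0 ** (snr_db / 10.0))
X_noisy = X_clean + np.sqrt(noise_power) * rng.standard_normal(X_clean.shape)
```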

In the second set of experiments, we used signals provided by ICALAB Cichocki et al. (2007), which are simulations of typical biosignals observed in EEG recordings, such as eye blinks and muscle movements of the limbs and heart, to test the performance of the evolutionary Slepian spectrum in the BSS problem. Using random mixing matrices, we obtained mixed signals for the overdetermined cases (\(n=3, m=4\) and \(n=4, m=5\)), shown in Figs. 5a–b and 7a–b. The corresponding sources separated via the proposed evolutionary Slepian spectrum are shown in Figs. 6b and 8b. It can be seen from the results that the sources can be separated from the mixtures.

We evaluated the performance of the WESS by comparing it to the modified S-transform (MST) and the smoothed Wigner-Ville distribution (SWVD) (see Figs. 6 and 8). We chose the MST because it is an adaptive form of the STFT and the continuous wavelet transform (CWT), and the WVD because it is a high-resolution TF representation Stankovic (1994); Boashash (1991). All time-frequency distributions (TFDs) can be derived from the WVD by convolving it with a signal-dependent kernel. Although the WVD offers high energy concentration for mono-component, linearly frequency-modulated signals, any nonlinear modulation or multiple components make the WVD suffer from cross-terms. The cross-terms can be suppressed by smoothing with a two-dimensional kernel. In the S-transform (ST), the mother-wavelet-like term is separated into a slowly varying term (the Gaussian function) and an oscillatory exponential kernel. The Gaussian function localizes the amplitude-modulated (AM) component in time, while the oscillatory exponential kernel selects the frequency-modulated (FM) component being localized. The MST is obtained by changing a scaling parameter linearly with frequency to gain better control of the window width. In our simulations, we observed from the separated sources in Figs. 6 and 8 that the evolutionary Slepian spectrum performs better than the SWVD and the MST in the BSS problem.

Fig. 8 Separated sources using a MST-based BSS, b WESS-based BSS, c SWVD-based BSS

6 Conclusions

In this paper, we defined a spectral representation method using Slepian sequences, similar to the evolutionary periodogram, and showed that the Slepian evolutionary spectrum can be used for the blind source separation problem for nonstationary signals, and in particular for simulated biosignals. The evolutionary spectrum provides a novel and useful approach for the separation of individual signals (biosignals and/or non-biosignals) from electrophysiological recordings and can be used for the removal of artifacts by subtracting the corresponding sources from the recordings. Some advantages of the proposed method are: (1) the algorithm is computationally efficient and easy to implement; (2) after the source separation, post-processing methods, e.g., filtering, wavelet transforms, or time-frequency methods, can be applied for precise analysis of the individual source signals; (3) in addition to artifact removal, evolutionary spectrum based BSS can be useful for analyzing changes in biological activities in a way similar to nonstationary processes, without making a stationarity assumption for the sources. As future work, we will expand our experiments to larger data sets for the BSS problem and also explore methods that can determine the number of sources in the observation mixtures.