1 Introduction

Blind Source Separation (BSS) consists of decomposing several observed signals into a set of source signals and their mixing parameters, with almost no prior knowledge about them (Comon and Jutten 2010; Deville 2016). Most research on BSS methods assumes that the mixing model is linear, i.e. that the observed signals result from linear combinations of the source signals. Nevertheless, for some applications the linear mixing model is not valid and must be replaced by a nonlinear one, which provides a better description of the mixing process and of the interactions between sources. Due to the complexity of nonlinear models, nonlinear BSS methods are more involved and remain less studied (Taleb 2002; Deville and Hosseini 2009; Hosseini and Deville 2013). This complexity may be reduced by constraining the structure of the mixing models: the addition of simplifying assumptions has allowed the development of tractable nonlinear models. Among these models, the linear-quadratic (LQ) model has drawn significant attention [see e.g. the survey in Deville and Duarte (2015)]. Research focused on the LQ model has demonstrated its relevance in several applications such as remote sensing (Meganem et al. 2014a, b; Eches and Guillaume 2014; Jarboui et al. 2014, 2016), analysis of gas sensor array data (Bedoya 2006; Ando et al. 2015), and scanned document processing (Merrikh-Bayat et al. 2011; Duarte et al. 2011; Almeida and Almeida 2012; Liu and Wang 2013). Compared with the linear model, the particularity of the LQ model is the presence of second-order terms. Considering K observations resulting from an LQ mixture of L sources, the relationship between the observed and the source signals is characterized by the following equation

$$\begin{aligned} x_i(n)=\sum \limits _{j=1}^{L} a_j (i) s_j(n)+ \sum \limits _{j=1}^{L}\sum \limits _{k=j}^{L} a_{j,k} (i) s_j(n)s_k(n), \end{aligned}$$
(1)

where \(x_i(n)\) is the ith observed signal at time n, \(s_j(n)\) is the jth unknown source signal, \( a_{j}(i) \) is the linear coefficient associated with the jth source signal and the ith observed signal, and \( a_{j,k}(i) \) is the quadratic mixing coefficient associated with the ith observed signal, resulting from the interaction between the jth and kth sources. The actual sources \(s_j\) together with the pseudo-sources \(s_j \times s_k\) are called extended sources, as in Deville and Duarte (2015) and Meganem et al. (2014b). A particular case of the LQ model is the bilinear (BL) model, in which the squared-term coefficients \( a_{j,j}(i) \) are null. The model (1) then becomes

$$\begin{aligned} x_i(n)=\sum \limits _{j=1}^{L} a_j (i) s_j(n)+ \sum \limits _{j=1}^{L-1}\sum \limits _{k=j+1}^{L} a_{j,k} (i) s_j(n)s_k(n). \end{aligned}$$
(2)
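To make the model concrete, the BL mixture (2) is easy to simulate numerically. The following Python sketch (the function names are ours, not from the paper) stacks the actual sources and the pseudo-sources into the extended source vector and mixes them with random coefficients:

```python
import numpy as np

def extended_sources(S):
    """Stack the L actual sources with the L(L-1)/2 pseudo-sources s_j*s_k, j<k."""
    L, _ = S.shape
    prods = [S[j] * S[k] for j in range(L) for k in range(j + 1, L)]
    return np.vstack([S] + prods)  # shape: (L(L+1)/2, N)

def bl_mixture(S, K, rng):
    """K observations from the bilinear model (2), coefficients uniform on [0, 1]."""
    Se = extended_sources(S)
    A = rng.uniform(0.0, 1.0, size=(K, Se.shape[0]))  # linear + quadratic coefficients
    return A @ Se, A

rng = np.random.default_rng(0)
S = rng.uniform(0.0, 1.0, size=(2, 1000))   # L = 2 sources
X, A = bl_mixture(S, 3, rng)                # K = 3 observations
```

Note that for L = 2 the extended source vector is \((s_1, s_2, s_1 s_2)\), so a 3-observation mixture is exactly determined once the model is viewed as linear in the extended sources.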

Various methods applicable to the LQ model have been proposed (Deville and Duarte 2015). While some of them are only devoted to Blind Mixture Identification (BMI), i.e. to identifying only the mixing parameters (Krob and Benidir 1993; Abed-Meraim et al. 1996), others are dedicated to BSS, which also aims at estimating the source signals. They include Sparse Component Analysis (SCA) methods (Jarboui et al. 2014; Deville and Hosseini 2007), which are only applicable to sparse sources, Non-negative Matrix Factorization (NMF) methods (Meganem et al. 2014b; Eches and Guillaume 2014; Jarboui et al. 2016), which may be used only when sources and mixing coefficients are non-negative, and Independent Component Analysis (ICA) methods (Deville and Hosseini 2009; Hosseini and Deville 2013; Castella 2008), which are based on the assumption that the source signals are statistically independent. Nevertheless, the use of LQ ICA-based methods, whether for BMI or BSS, is usually constrained by other properties of the sources and/or the mixture. For example, some of them can be used only when the sources are complex-valued and circular (Krob and Benidir 1993; Abed-Meraim et al. 1996) or binary (Castella 2008). Others are suited only to determined mixtures (Deville and Hosseini 2009; Hosseini and Deville 2013; Almeida and Almeida 2012), which generally means overlooking a useful part of the available observations. Moreover, most of the existing LQ ICA-based methods are time-consuming.

In this paper, we study the bilinear model in an over-determined configuration. In addition, we suppose that the sources are real-valued, stochastic, auto-correlated, mutually independent and jointly strict-sense stationary signals. We then propose a new and fast BSS method, called Bilinear Second-Order Blind Source Separation (B-SO-BSS), based on Second-Order Statistics (SOS) and on the joint diagonalization of correlation matrices of the whitened centred observed signals. Such SOS methods have already been proposed in the framework of linear BSS (Tong et al. 1990; Belouchrani et al. 1997). We first study the correlation between the different extended sources in Sect. 2, then present the proposed method, developed from the results of this study, in Sect. 3, and finally provide in Sect. 4 some simulation results using artificial mixtures of synthetic and real-world sources.

2 Mutual correlation of the extended sources

In this section, supposing that the actual sources \(s_j\) are real-valued, stochastic, auto-correlated, mutually independent and jointly strict-sense stationary, we investigate whether all the extended sources are mutually uncorrelated in two different cases: when the actual sources are zero-mean and when they are not.

2.1 Case of zero-mean actual sources

We here detail the study of the correlation between the different extended sources when the actual sources are zero-mean. The pseudo-sources are then zero-mean, as will now be shown. Indeed, these pseudo-sources are defined as \( s_j(n)s_k(n) \), with \( j \ne k \). Their factors \( s_j(n) \) and \( s_k(n) \) are independent and thus uncorrelated, which yields

$$\begin{aligned} E\{ s_j(n)s_k(n) \}= & {} E\{ s_j(n) \} E\{ s_k(n) \} \nonumber \\= & {} 0 \end{aligned}$$
(3)

where \( E\{ . \} \) stands for expectation.

All extended sources are thus zero-mean in the considered case. Therefore, their cross-covariance functions, which should be used to measure their correlation, are here equal to their cross-correlation functions. The latter functions are derived hereafter.

2.1.1 Correlation function of \(s_i(n)\) and \(s_j(n)\)

Two different actual sources \(s_i(n)\) and \(s_j(n)\) being independent and zero-mean, their cross-correlation function is equal to zero:

$$\begin{aligned} R_{s_i,s_j}(\tau )=E\{s_i(n+\tau ) s_j(n)\}=0 \ . \end{aligned}$$
(4)

2.1.2 Correlation function of \(s_i(n)\) and \(s_j(n)\times s_k(n)\)

If \(i \ne j\) and \(i \ne k\), since \(s_i(n)\), \(s_j(n)\) and \(s_k(n)\) are independent, we can write

$$\begin{aligned} R_{s_i,(s_j s_k)}(\tau )= & {} E\{s_i(n+\tau )(s_j(n)s_k(n))\} \nonumber \\= & {} E\{s_i(n+\tau )\} E\{s_j(n)\}E\{s_k(n)\}\nonumber \\= & {} 0 . \end{aligned}$$
(5)

If \(i = j\) or \(i = k\), the reasoning is similar in both cases; e.g. considering \(i = j\), we get

$$\begin{aligned} R_{s_i,(s_i s_k)}(\tau )= & {} E\{s_i(n+\tau )(s_i(n) s_k(n))\} \nonumber \\= & {} E\{s_i(n+\tau ) s_i(n)\} E\{s_k(n)\} \nonumber \\= & {} 0 \ . \end{aligned}$$
(6)

2.1.3 Correlation function of \(s_i(n)\times s_j(n)\) and \(s_k(n)\times s_l(n)\)

If \(i \ne k\), \(i \ne l\), \(j \ne k\) and \(j \ne l\), the independence of \(s_i(n)\), \(s_j(n)\), \(s_k(n)\) and \(s_l(n)\) yields

$$\begin{aligned} R_{(s_i s_j),(s_k s_l)}(\tau )= & {} E\{(s_i(n+\tau ) s_j(n+\tau ))(s_k(n) s_l(n))\} \nonumber \\= & {} E\{s_i(n+\tau )\} E\{s_j(n+\tau )\} E\{s_k(n)\} E\{s_l(n)\} \nonumber \\= & {} 0 \ . \end{aligned}$$
(7)

If \((i = (k \ \mathrm {or} \ l)) \ \mathrm {xor} \ (j = (k \ \mathrm {or} \ l))\), the reasoning is similar for all cases; e.g. considering \((i=k)\), and therefore \(j \ne k\) and \(j \ne l\), the sources \(s_i\), \(s_j\) and \(s_l\) are independent, so that

$$\begin{aligned} R_{(s_is_j),(s_i s_l)}(\tau )= & {} E\{(s_i(n+\tau ) s_j(n+\tau ))(s_i(n)s_l(n))\} \nonumber \\= & {} E\{s_i(n+\tau )s_i(n)\} E\{s_j(n+\tau )\} E\{s_l(n)\} \nonumber \\= & {} 0 \ . \end{aligned}$$
(8)

Thus, in the case of zero-mean actual sources, all the extended sources are mutually uncorrelated.
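This pairwise uncorrelatedness is easy to check empirically. The sketch below (our own illustration, not part of the method) generates two zero-mean independent AR(1) sources and verifies that the sample versions of the cross-correlations (4)-(8), for all distinct pairs of extended sources, are close to zero, while the auto-correlations remain non-zero:

```python
import numpy as np

rng = np.random.default_rng(1)
N, tau, rhos = 200_000, 3, np.array([0.7, 0.5])

# Two independent zero-mean AR(1) sources (auto-correlated and stationary)
e = rng.uniform(-0.5, 0.5, size=(2, N))
s = np.zeros_like(e)
for n in range(1, N):
    s[:, n] = e[:, n] + rhos * s[:, n - 1]

ext = np.vstack([s, s[0] * s[1]])  # extended sources: s1, s2, s1*s2

def xcorr(p, q, tau):
    """Sample estimate of R_{p,q}(tau) = E{p(n+tau) q(n)}."""
    return np.mean(p[tau:] * q[:-tau])

cross = [abs(xcorr(ext[a], ext[b], tau)) for a in range(3) for b in range(3) if a != b]
auto = xcorr(ext[0], ext[0], tau)
```

The non-zero auto-correlations (here `auto` is close to \(\rho_1^{\tau}\,\mathrm{Var}\{s_1\}\)) are precisely what the identifiability condition of Sect. 3.3 exploits.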

2.2 Case of non-zero-mean actual sources

In this case, the first equality of (3) shows that the pseudo-sources are non-zero-mean, so that the correlation of the extended sources must be measured using their cross-covariance functions. Assuming that the actual sources are auto-correlated, i.e. that the covariance function \(C_{s_i,s_i}(\tau ) \ne 0\) for some lag \(\tau \), we now exhibit two types of extended sources which are mutually correlated.

2.2.1 Covariance function of \(s_i(n)\) and \(s_j(n)\times s_k(n)\) when \((i = j) \ \mathrm {or} \ (i = k)\)

The reasoning is similar for the two cases, e.g. considering \((i=j)\), calculating the covariance function yields

$$\begin{aligned} C_{s_i,(s_i s_k)}(\tau )= & {} E\{s_i(n+\tau )(s_i(n) s_k(n))\} - E\{s_i(n+\tau )\} E\{s_i(n) s_k(n)\} \nonumber \\= & {} E\{s_i(n+\tau ) s_i(n)\} E\{s_k(n)\}- E\{s_i(n+\tau )\} E\{s_i(n)\} E\{s_k(n)\} \nonumber \\= & {} \Big ( E\{s_i(n+\tau ) s_i(n)\}- E\{s_i(n+\tau )\} E\{s_i(n)\}\Big ) \ E\{s_k(n)\} \nonumber \\= & {} C_{s_i,s_i}(\tau ) E\{s_k(n)\} \nonumber \\\ne & {} 0 \end{aligned}$$
(9)

Therefore, \(s_i(n)\) and \(s_i(n)\times s_k(n)\) are correlated.

2.2.2 Covariance function of \(s_i(n)\times s_j(n)\) and \(s_k(n)\times s_l(n)\) when \((i = (k \ \mathrm {or} \ l)) \ \mathrm {xor} \ (j = (k \ \mathrm {or} \ l))\)

The reasoning is similar for all cases; e.g. considering \((i=k)\), and therefore \(j \ne k\) and \(j \ne l\), the covariance function reads

$$\begin{aligned} C_{(s_i s_j),(s_i s_l)}(\tau )= & {} E\{(s_i(n+\tau ) s_j(n+\tau ))(s_i(n) s_l(n))\} \nonumber \\&- E\{s_i(n+\tau ) s_j(n+\tau )\} E\{s_i(n) s_l(n)\} \nonumber \\= & {} \Big ( E\{s_i(n+\tau ) s_i(n)\} E\{s_j(n+\tau )\} E\{s_l(n)\}\Big )\nonumber \\&-\Big ( E\{s_i(n+\tau ) \} E\{s_j(n+\tau )\} E\{s_i(n)\} E\{s_l(n)\} \Big ) \nonumber \\= & {} C_{s_i,s_i}(\tau ) E\{s_j(n+\tau )\} E\{s_l(n)\} \nonumber \\\ne & {} 0 \end{aligned}$$
(10)

Therefore, \(s_i(n)\times s_j(n)\) and \(s_i(n)\times s_l(n)\) are correlated.
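The factorization in (9) can also be checked numerically. The sketch below (our own illustration; parameter values are arbitrary) generates two independent non-zero-mean AR(1) sources and compares a sample estimate of \(C_{s_i,(s_i s_k)}(\tau)\) with \(C_{s_i,s_i}(\tau)\,E\{s_k\}\):

```python
import numpy as np

rng = np.random.default_rng(2)
N, tau, rhos = 300_000, 2, np.array([0.7, 0.5])

# Non-zero-mean auto-correlated sources: positive innovations give non-zero means
e = rng.uniform(0.0, 1.0, size=(2, N))
s = np.zeros_like(e)
for n in range(1, N):
    s[:, n] = e[:, n] + rhos * s[:, n - 1]
si, sk = s[0], s[1]

def xcov(p, q, tau):
    """Sample estimate of the cross-covariance C_{p,q}(tau)."""
    return np.mean(p[tau:] * q[:-tau]) - p[tau:].mean() * q[:-tau].mean()

lhs = xcov(si, si * sk, tau)               # C_{s_i,(s_i s_k)}(tau)
rhs = xcov(si, si, tau) * sk.mean()        # C_{s_i,s_i}(tau) * E{s_k}, cf. (9)
```

Up to sampling error, `lhs` and `rhs` coincide and are clearly non-zero, confirming that \(s_i(n)\) and \(s_i(n) s_k(n)\) are correlated in the non-zero-mean case.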

3 Proposed BSS method

In this section, we propose a new BSS method, called Bilinear Second-Order Blind Source Separation (B-SO-BSS), first for zero-mean and then for non-zero-mean actual sources.

3.1 Case of zero-mean actual sources

The bilinear mixing model (2) can be written in the following matrix form

$$\begin{aligned} \mathbf{x}(n)=\mathbf{As}(n), \end{aligned}$$
(11)

where \(\mathbf{x}(n)=[x_1(n),\ldots ,x_K(n)]^T\) is the vector of K observed signals at time n, \(\mathbf{s}(n)=[s_1(n),\ldots ,s_L(n),s_1(n)s_2(n),\ldots ,s_{L-1}(n)s_L(n)]^T\) is the vector of all the extended sources at time n, and the mixing matrix \(\mathbf{A}\), which contains both linear and quadratic mixing parameters, reads

$$\begin{aligned} \mathbf{A}=\left( \begin{array}{cccccc} a_1(1) &{} \quad \cdots &{} \quad a_L(1) &{} \quad a_{1,2}(1) &{} \quad \cdots &{} \quad a_{L-1,L}(1) \\ \vdots &{} \quad \ddots &{} \quad \vdots &{} \quad \vdots &{} \quad \ddots &{} \quad \vdots \\ a_1(K) &{} \quad \cdots &{} \quad a_L(K) &{} \quad a_{1,2}(K) &{} \quad \cdots &{} \quad a_{L-1,L}(K) \end{array} \right) . \end{aligned}$$
(12)

The bilinear mixture can thus be considered as a linear mixture of the \(L(L+1)/2\) extended sources. In the following, we assume that \(K \ge L(L+1)/2\), so that this reformulated linear mixture is not under-determined. As shown in Sect. 2.1, all the extended sources are mutually uncorrelated in the case considered here. If they are also auto-correlated, with different auto-correlation functions, source separation may be achieved by jointly diagonalizing the correlation matrices of the whitened centred observations at different lags, as detailed in Sect. 3.4.

3.2 Case of non-zero-mean actual sources

As shown in Sect. 2.2, in this case some extended sources are mutually correlated. Here, we show how the original bilinear mixing model may be used to derive a new mixing model with new mutually uncorrelated extended sources. The mean \( E\{ s_j(n) \} \) of the actual source \( s_j(n) \) does not depend on the considered time n, since the actual sources are assumed to be strict-sense stationary. The expectation of \({s}_j\) will be denoted by \(\bar{s}_j\) hereafter. The centred version of \( s_j(n) \) is thus \(\widetilde{s}_j(n)=s_j(n)-\bar{s}_j\). The bilinear model (2) can then be written as

$$\begin{aligned} x_i(n)= & {} \sum \limits _{j=1}^{L} a_j (i) (\widetilde{s}_j(n)+\bar{s}_j) + \sum \limits _{j=1}^{L-1}\sum \limits _{k=j+1}^{L} a_{j,k} (i) (\widetilde{s}_j(n)+\bar{s}_j)(\widetilde{s}_k(n)+\bar{s}_k)\nonumber \\= & {} \sum \limits _{j=1}^{L} a_j (i) \widetilde{s}_j(n) + \sum \limits _{j=1}^{L} a_j (i) \bar{s}_j+ \sum \limits _{j=1}^{L-1}\sum \limits _{k=j+1}^{L} a_{j,k} (i) \widetilde{s}_j(n) \widetilde{s}_k(n) \nonumber \\&+ \sum \limits _{j=1}^{L-1}\sum \limits _{k=j+1}^{L} a_{j,k} (i) (\bar{s}_k \widetilde{s}_j(n)+\bar{s}_j \widetilde{s}_k(n)) + \sum \limits _{j=1}^{L-1}\sum \limits _{k=j+1}^{L} a_{j,k} (i) \bar{s}_j \bar{s}_k. \end{aligned}$$
(13)

The fourth term on the right hand side of (13), denoted as F in the following, can be rewritten as

$$\begin{aligned} F= & {} \sum \limits _{j=1}^{L-1}\sum \limits _{k=j+1}^{L} a_{j,k} (i) \bar{s}_k \widetilde{s}_j(n) + \sum \limits _{j=1}^{L-1}\sum \limits _{k=j+1}^{L} a_{j,k}(i) \bar{s}_j \widetilde{s}_k(n) \nonumber \\= & {} \sum \limits _{j=1}^{L-1}\sum \limits _{k=j+1}^{L} a_{j,k} (i) \bar{s}_k \widetilde{s}_j(n) + \sum \limits _{k=1}^{L-1}\sum \limits _{j=k+1}^{L} a_{k,j}(i) \bar{s}_k \widetilde{s}_j(n), \end{aligned}$$
(14)

where the last term above is obtained just by inverting the roles of symbols j and k.

Then, we introduce the coefficients \(a_{j,k}(i)\) with \(j>k\), defined with respect to the actual coefficients of (2) as \(a_{j,k}(i)=a_{k,j}(i)\).

This yields

$$\begin{aligned} F= & {} \sum \limits _{j=1}^{L-1}\sum \limits _{k=j+1}^{L} a_{j,k} (i) \bar{s}_k \widetilde{s}_j(n) + \sum \limits _{k=1}^{L-1}\sum \limits _{j=k+1}^{L} a_{j,k}(i) \bar{s}_k \widetilde{s}_j(n) \nonumber \\= & {} \bigg (\sum \limits _{j=1}^{L-1}\sum \limits _{k=j+1}^{L} \bigcup \sum \limits _{k=1}^{L-1}\sum \limits _{j=k+1}^{L}\bigg ) \ a_{j,k}(i) \bar{s}_k \widetilde{s}_j(n). \end{aligned}$$
(15)

The above sum contains all possible combinations of \(j\in \left[ 1,L\right] \), \(k\in \left[ 1,L\right] \) such that \(j\ne k\). It can then be rewritten as

$$\begin{aligned} F=\sum \limits _{j=1}^{L}\sum \limits _{k=1,k\ne j}^{L} a_{j,k}(i) \bar{s}_k \widetilde{s}_j(n). \end{aligned}$$
(16)

Replacing (16) in (13) leads to

$$\begin{aligned} x_i(n)= & {} \sum \limits _{j=1}^{L} \big (a_j (i)+ \sum \limits _{k=1,k\ne j}^{L} a_{j,k}(i) \bar{s}_k\big )\widetilde{s}_j(n) + \sum \limits _{j=1}^{L-1}\sum \limits _{k=j+1}^{L} a_{j,k} (i) \widetilde{s}_j(n) \widetilde{s}_k(n) \nonumber \\&+ \sum \limits _{j=1}^{L} a_j (i) \bar{s}_j+ \sum \limits _{j=1}^{L-1}\sum \limits _{k=j+1}^{L} a_{j,k} (i) \bar{s}_j \bar{s}_k, \end{aligned}$$
(17)

which yields

$$\begin{aligned} x_i(n)=\sum \limits _{j=1}^{L} \widetilde{a}_j (i) \widetilde{s}_j(n)+ \sum \limits _{j=1}^{L-1}\sum \limits _{k=j+1}^{L} a_{j,k} (i) \widetilde{s}_j(n) \widetilde{s}_k(n) +C_i, \end{aligned}$$
(18)

where \( \widetilde{a}_{j}(i) \) are the linear coefficients of the new model which are defined as

$$\begin{aligned} \widetilde{a}_{j}(i) ={a}_{j}(i)+\sum \limits _{k=1, k\ne j }^{L} a_{j,k}(i)\bar{s}_k, \end{aligned}$$
(19)

and \(C_i\) is a constant defined as

$$\begin{aligned} C_i=\sum \limits _{j=1}^{L} a_j (i) \bar{s}_j+ \sum \limits _{j=1}^{L-1}\sum \limits _{k=j+1}^{L} a_{j,k} (i) \bar{s}_j\bar{s}_k. \end{aligned}$$
(20)

Since the actual centred sources \(\widetilde{s}_j(n)\) and \(\widetilde{s}_k(n)\) are zero-mean and independent, (18) shows that the mean of the observation \(x_i(n)\) is equal to \(\bar{x}_i = C_i\). Its centred version can thus be written as follows:

$$\begin{aligned} \widetilde{x}_i(n)= & {} x_i(n)-\bar{x}_i \nonumber \\= & {} \sum \limits _{j=1}^{L} \widetilde{a}_j (i) \widetilde{s}_j(n)+ \sum \limits _{j=1}^{L-1}\sum \limits _{k=j+1}^{L} a_{j,k} (i) \widetilde{s}_j(n) \widetilde{s}_k(n). \end{aligned}$$
(21)

As can be seen, the centred observations form a new bilinear mixture of the actual centred sources, although the mixing parameters of the linear part in this new model are not the same as those in the original mixture. According to the results provided in Sect. 2.1, the new extended sources \(\widetilde{s}_j(n)\) and \( \widetilde{s}_j(n) \widetilde{s}_k(n) \) are all mutually uncorrelated. We can then rewrite this new bilinear model in the matrix form (11) just by replacing \(\mathbf{s}\) and \(\mathbf{x}\) by \(\widetilde{\mathbf{s}}\) and \(\widetilde{\mathbf{x}}\), and the parameters \(a_j(i)\) by \(\widetilde{a}_j (i)\) in the expression (12) of the matrix \(\mathbf{A}\) to obtain the matrix \(\mathbf{\widetilde{A}}\).

The approach developed at this stage therefore yields a modified set of observations, namely the centred observations \( \widetilde{x}_i(n) \), which form a determined (or over-determined) linear mixture of a modified set of mutually uncorrelated source signals, namely the extended centred sources derived from the actual centred sources. We hereafter consider the case where these modified source signals are auto-correlated with different auto-correlation functions. With respect to these modified observations and source signals, the configuration thus obtained meets the same main assumptions as those previously used in the literature, for plain linear mixtures, to derive second-order BSS methods such as the Algorithm for Multiple Unknown Signals Extraction (AMUSE) (Tong et al. 1990) or its improved version, the Second-Order Blind Identification (SOBI) method (Belouchrani et al. 1997). This allows us to derive extended versions of these standard methods, initially intended for linear mixtures, in order to handle bilinear mixtures. In particular, we hereafter propose an extension of SOBI.
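As an illustration of how the reformulated problem can be attacked with standard second-order machinery, the sketch below implements a single-lag (AMUSE-like) variant in Python; the actual method jointly diagonalizes several lagged correlation matrices, as SOBI does, but the single-lag case already conveys the idea. All names are ours.

```python
import numpy as np

def separate_one_lag(X, tau=1):
    """AMUSE-style second-order separation of observations X (K x N).

    Returns estimates of the extended centred sources, up to permutation
    and scale, assuming K equals the number of extended sources and that
    the lagged correlations of the extended sources are all distinct.
    """
    Xc = X - X.mean(axis=1, keepdims=True)        # centring, cf. (21)
    K, N = Xc.shape
    U, d, _ = np.linalg.svd(Xc @ Xc.T / N)        # covariance eigen-structure
    W = (U / np.sqrt(d)).T                        # whitening matrix
    Z = W @ Xc                                    # whitened centred observations
    R = Z[:, tau:] @ Z[:, :-tau].T / (N - tau)    # lagged correlation R_z(tau), cf. (22)
    R = (R + R.T) / 2                             # symmetrize the sample estimate
    _, V = np.linalg.eigh(R)                      # orthogonal diagonalizer
    return V.T @ Z                                # estimated extended centred sources
```

With several lags, the single eigendecomposition is replaced by an approximate joint diagonalization of all the matrices \(R_{\mathbf{z}}(\tau_i)\), using Jacobi rotations as in Belouchrani et al. (1997).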

3.3 Identifiability condition

A necessary step in the proposed method is to check the mixture identifiability condition. Similarly to the SOBI method, the proposed method uses several correlation matrices of the whitened centred observations for a fixed set of different non-zero lags \(\tau _i\in \{\tau _1,\ldots ,\tau _m\}\). For a lag \(\tau _i\), the correlation matrix of the \( L(L+1)/2 \) whitened centred observations \(\mathbf{z}(n)=\mathbf{W}\widetilde{\mathbf{x}}(n)\) (where \(\mathbf{W}\) is a whitening matrix) is given by:

$$\begin{aligned} R_{\mathbf{z}}(\tau _i)=\mathbf{U} R_{\widetilde{\mathbf{s}}}(\tau _i)\mathbf{U}^T \end{aligned}$$
(22)

where \(\mathbf{U}\) denotes an orthogonal matrix, \(^T\) stands for transposition, and \(R_{\widetilde{\mathbf{s}}}(\tau _i)\) denotes the correlation matrix of the extended centred sources associated with the lag \(\tau _i\), which is a diagonal matrix since the extended centred sources are mutually uncorrelated.

Let us consider the following theorem:

Theorem

Let \({\tau }_1, \ldots ,{\tau }_m\) be \(m\) non-zero lags and let \(\mathbf{V}\) be an orthogonal matrix such that:

$$\begin{aligned}&\forall \ 1 \le i\le m: \quad \mathbf{V}^T R_{\mathbf{z}}(\tau _i) \mathbf{V}=\mathrm {diag}[d_1(i), \ldots , d_{L(L+1)/2}(i)] \end{aligned}$$
(23)
$$\begin{aligned}&\forall \ 1 \le j\ne k \le L(L+1)/2, \ \exists \ i, \ 1 \le i\le m: \quad d_j(i) \ne d_k(i). \end{aligned}$$
(24)

Then, \(\mathbf{U}\) and \(\mathbf{V}\) are essentially equal, i.e. they are equal up to multiplication by a matrix \(\mathbf{P}\), such that \(\mathbf{U}=\mathbf{VP}\), where \(\mathbf{P}\) has exactly one nonzero entry, equal to \(\pm 1\), in each row and each column.

This theorem provides a uniqueness condition for the matrix \(\mathbf{U}\), and consequently for the mixing matrix \(\widetilde{\mathbf{A}}\). Note that the mixing matrix cannot be identified when the extended centred sources have identical normalized spectra; if their normalized spectra differ, however, it is possible to find a set of lags \(\tau _i\) satisfying the theorem condition. More details are provided in Belouchrani et al. (1997).

It should in particular be noted that if one of the actual centred sources \(\widetilde{s}_i(n)\) is temporally uncorrelated, then all the pseudo-sources related to it, i.e. \(\widetilde{s}_i(n)\widetilde{s}_j(n)\) with \(i \ne j\), are temporally uncorrelated too, so that all these extended centred sources have identical (constant) normalized spectra. Thus, a necessary condition for identifiability is that all the actual centred sources be auto-correlated.

3.4 Proposed algorithm

Our proposed algorithm (B-SO-BSS), which provides estimates of centred actual sources up to a permutation and scale factors, is summarized in Algorithm 1.

Algorithm 1 B-SO-BSS

In the particular case in which the actual sources are zero-mean, the same algorithm may be used just by choosing \(\widetilde{\mathbf{s}}=\mathbf{s}\), \(\widetilde{\mathbf{A}}=\mathbf{A}\), and \(\widetilde{\mathbf{x}}=\mathbf{x}\).

The last step of the proposed algorithm, which consists in identifying the estimated actual centred sources among all the estimated extended centred sources, is detailed below.

3.5 Identifying the estimated actual centred sources

The first steps of the proposed method yield a set of signals \(\hat{\mathbf {s}}(n)\) composed of estimates of the \(L(L+1)/2\) unordered extended centred sources, up to a permutation and scale factors. Thus, if e.g. \(\hat{s}_j(n)\) and \(\hat{s}_k(n)\) correspond to two centred actual sources and \(\hat{s}_i(n)\) corresponds to their product, then \(\hat{s}_i(n)\) must ideally be proportional to \(\hat{s}_j(n)\times \hat{s}_k(n)\). As a result, the absolute value of the correlation coefficient between \(\hat{s}_i(n)\) and \(\hat{s}_j(n)\times \hat{s}_k(n)\) must be close to one. By computing this correlation coefficient for all possible triplets \(\{i,j,k\}\) of distinct indices, we can thus identify the estimated actual centred sources among all the estimated extended centred sources.
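This identification step can be sketched as follows (a naive implementation of the idea above; the function name and threshold value are ours). For each candidate index i it tests whether some pair (j, k) of other estimates yields a product almost perfectly correlated with \(\hat{s}_i\):

```python
import numpy as np

def actual_source_indices(S_hat, threshold=0.9):
    """Return indices of estimates identified as actual centred sources.

    S_hat: (M, N) estimated extended centred sources (permuted and scaled).
    An estimate i is flagged as a pseudo-source if, for some pair (j, k)
    of other estimates, |corr(S_hat[i], S_hat[j]*S_hat[k])| is close to one.
    """
    M = S_hat.shape[0]
    pseudo = set()
    for i in range(M):
        for j in range(M):
            for k in range(j + 1, M):
                if i in (j, k):
                    continue
                prod = S_hat[j] * S_hat[k]
                r = np.corrcoef(S_hat[i], prod - prod.mean())[0, 1]
                if abs(r) > threshold:
                    pseudo.add(i)
    return sorted(set(range(M)) - pseudo)
```

The threshold must be chosen well below one but above the moderate spurious correlations that "wrong" triplets can produce for independent sources (e.g. \(\hat{s}_1\) against \(\hat{s}_2 \cdot (\hat{s}_1\hat{s}_2)\)); 0.9 is our illustrative choice, not a value prescribed by the paper.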

3.6 Estimation of actual mixing coefficients

Many BSS applications only aim at estimating the actual source waveforms, which are provided by our Algorithm 1. In some applications (such as hyperspectral image unmixing), however, the mixing parameters must also be estimated. In the case of non-zero-mean actual sources, our algorithm provides an estimate of the matrix \(\widetilde{\mathbf{A}}\) (up to a permutation and a diagonal matrix), not of \(\mathbf{A}\). As mentioned in Sect. 3.2, the matrix \(\widetilde{\mathbf{A}}\) consists of:

  • columns containing quadratic coefficients \(a_{j,k}(i)\), like in matrix \(\mathbf{A}\),

  • columns containing modified linear coefficients \(\widetilde{a}_{j}(i)\), which are different from the actual coefficients \({a}_{j}(i)\) included in matrix \(\mathbf{A}\), and are defined by (19).

In the following, we propose a method to recover an estimate of the actual matrix \(\mathbf{A}\) (up to classical indeterminacies) from the estimate of matrix \(\widetilde{\mathbf{A}}\) provided by Algorithm 1.

From (19), we have

$$\begin{aligned} {a}_{j}(i)= \widetilde{a}_{j}(i) -\sum \limits _{k=1, k\ne j }^{L} a_{j,k}(i)\bar{s}_k. \end{aligned}$$
(25)

Inserting (25) in (20) yields

$$\begin{aligned} C_i= & {} \bar{x_i}\nonumber \\= & {} \sum \limits _{j=1}^{L} [\widetilde{a}_{j}(i)-\sum \limits _{k=1, k\ne j }^{L} a_{j,k}(i)\bar{s}_k ] \bar{s}_j+ \sum \limits _{j=1}^{L-1}\sum \limits _{k=j+1}^{L} a_{j,k} (i) \bar{s}_j\bar{s}_k\nonumber \\= & {} \sum \limits _{j=1}^{L} \widetilde{a}_{j}(i)\bar{s}_j -\sum \limits _{j=1}^{L} \sum \limits _{k=1, k\ne j }^{L} a_{j,k}(i)\bar{s}_k \bar{s}_j+ \sum \limits _{j=1}^{L-1}\sum \limits _{k=j+1}^{L} a_{j,k} (i) \bar{s}_j\bar{s}_k\nonumber \\= & {} \sum \limits _{j=1}^{L} \widetilde{a}_{j}(i)\bar{s}_j -2 \sum \limits _{j=1}^{L-1}\sum \limits _{k=j+1}^{L} a_{j,k} (i) \bar{s}_j\bar{s}_k + \sum \limits _{j=1}^{L-1}\sum \limits _{k=j+1}^{L} a_{j,k} (i) \bar{s}_j\bar{s}_k\nonumber \\= & {} \sum \limits _{j=1}^{L} \widetilde{a}_{j}(i)\bar{s}_j- \sum \limits _{j=1}^{L-1}\sum \limits _{k=j+1}^{L} a_{j,k} (i) \bar{s}_j\bar{s}_k\nonumber \\= & {} \sum \limits _{j=1}^{L} \frac{\widetilde{a}_{j}(i)}{d_j} (d_j \bar{s}_j)+ \sum \limits _{j=1}^{L-1}\sum \limits _{k=j+1}^{L} \frac{a_{j,k} (i)}{d_{j,k}} (-d_{j,k} \bar{s}_j\bar{s}_k), \end{aligned}$$
(26)

where \(d_j\) and \(d_{j,k}\) are unknown arbitrary scale factors, up to which the extended centred sources and the columns of matrix \(\widetilde{\mathbf{A}}\) have been estimated by Algorithm 1. The above result can be written in the following matrix form

$$\begin{aligned} \mathbf{c}=\widetilde{\mathbf{A}}_1 \mathbf{e}, \end{aligned}$$
(27)

where \(\mathbf{c}=[C_1,\ldots ,C_K]^T\), \(\mathbf{e}=[d_1\bar{s}_1, \ldots , d_L \bar{s}_L, (-d_{1,2}\bar{s}_1\bar{s}_2), \ldots , (-d_{L-1,L}\bar{s}_{L-1}\bar{s}_L) ]^T\), and \(\widetilde{\mathbf{A}}_1\) is the result of dividing the columns of matrix \(\widetilde{\mathbf{A}}\) by unknown scale factors \(d_j\) and \(d_{j,k}\). In other words, \(\widetilde{\mathbf{A}}_1\) is the matrix \(\hat{\mathbf{A}}\) provided by Algorithm 1 up to estimation errors.

Note that \(\mathbf c \) can easily be obtained by estimating the means of the observations. As a result, \(\mathbf{e}\) can be obtained using

$$\begin{aligned} \mathbf{e}=\widetilde{\mathbf{A}}_1^{\dagger } \mathbf{c}, \end{aligned}$$
(28)

where \(\dagger \) stands for pseudo-inverse.

Furthermore, (25) can be rewritten as

$$\begin{aligned} {a}_{j}(i)\bar{s}_j=\frac{\widetilde{a}_{j}(i)}{d_j}(d_j\bar{s}_j)+\sum \limits _{k=1, k\ne j }^{L} \frac{a_{j,k}(i)}{d_{j,k}}(-d_{j,k}\bar{s}_j \bar{s}_k). \end{aligned}$$
(29)

As mentioned above, \(\frac{\widetilde{a}_{j}(i)}{d_j}\) and \(\frac{a_{j,k}(i)}{d_{j,k}}\) (\(\forall i=1,\ldots , K\)) correspond to the columns of \(\widetilde{\mathbf{A}}_1\) estimated by Algorithm 1, and \((d_j\bar{s}_j)\) and \((-d_{j,k}\bar{s}_j \bar{s}_k)\) are the entries of \(\mathbf{e}\), estimated using (28). Consequently, the coefficients \({a}_{j}(i)\), \(i=1,\ldots , K\), can be estimated up to the unknown factors \(\bar{s}_j\) according to (29). In other words, this approach allows one to estimate, up to scale factors, the columns of the actual matrix \(\mathbf{A}\) containing the linear coefficients. Note that the columns of this matrix containing the quadratic coefficients are directly provided by Algorithm 1 (up to scale factors too).
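The recovery procedure of (26)-(29) amounts to one pseudo-inverse followed by a per-column accumulation. A minimal sketch (our own, assuming the quadratic columns are ordered by pairs (j, k) with j < k, as in (12)):

```python
import numpy as np

def recover_linear_columns(A1_hat, c, L):
    """Estimate the products a_j(i) * s_bar_j from A1_hat and c, cf. (28)-(29).

    A1_hat : (K, L(L+1)/2) estimate of A-tilde with rescaled columns.
    c      : (K,) vector of observation means C_1..C_K.
    Returns a (K, L) array whose jth column holds a_j(.) * s_bar_j.
    """
    e = np.linalg.pinv(A1_hat) @ c        # (28): entries d_j*s_bar_j and -d_jk*s_bar_j*s_bar_k
    out = A1_hat[:, :L] * e[:L]           # first term of (29)
    idx = L
    for j in range(L):
        for k in range(j + 1, L):
            # the pair (j, k) contributes to both columns j and k, since a_{k,j} = a_{j,k}
            out[:, j] += A1_hat[:, idx] * e[idx]
            out[:, k] += A1_hat[:, idx] * e[idx]
            idx += 1
    return out
```

The sketch assumes exact estimates (all \(d_j = d_{j,k} = 1\) and no estimation noise); in practice \(\widetilde{\mathbf{A}}_1\) is replaced by \(\hat{\mathbf{A}}\) from Algorithm 1.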

4 Simulation results

In this section, we present and discuss the results obtained by the proposed B-SO-BSS method of Algorithm 1 for unmixing bilinear mixtures, together with the step described in Sect. 3.6 for estimating the actual mixing coefficients. We only report the results obtained with non-zero-mean sources, since we found nearly the same performance in the zero-mean and non-zero-mean cases. In our simulations the processed data are non-negative, which makes it possible to compare the obtained results with those of the NMF-Grd-LQ algorithm presented in Meganem et al. (2014b), an NMF-based method adapted to LQ mixtures that exploits the non-negativity of the data involved in the mixtures. Note that the physical constraints of the NMF-Grd-LQ algorithm, originally designed for remote sensing applications (i.e. the linear coefficients summing to 1 and the quadratic mixing coefficients lower than 0.5), have been omitted in our simulations, and the NMF-Grd-LQ method has been modified accordingly.

4.1 Performance criteria

In order to evaluate the performance of the methods, we calculate the Signal-to-Interference Ratio (SIR) and the Normalized Mean Square Error (NMSE) related to each actual centred source according to the following equations

$$\begin{aligned} SIR_{s_i}= & {} 10\log _{10}\frac{\sum _{n=1}^{N}\widetilde{s}_i(n)^2}{\sum _{n=1}^{N}(\widetilde{s}_i(n)-\hat{s}_i(n))^2} \end{aligned}$$
(30)
$$\begin{aligned} NMSE_{s_i}= & {} \frac{\sum _{n=1}^{N}(\widetilde{s}_i(n)-\hat{s}_i(n))^2}{\sum _{n=1}^{N}\widetilde{s}_i(n)^2} \end{aligned}$$
(31)

where N is the number of available samples of each signal and the notation ‘\(\hat{}\)’ denotes the values estimated after removing the permutation and scale-factor indeterminacies. In the same way, we compute \(SIR_a\) and \(NMSE_a\) for all the mixing parameters, \(a_j\) and \(a_{j,k}\), estimated according to Sect. 3.6.
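Criteria (30) and (31) translate directly into code; a minimal sketch (function names are ours):

```python
import numpy as np

def sir_db(s_true, s_hat):
    """Signal-to-Interference Ratio (30), in dB."""
    err = s_true - s_hat
    return 10.0 * np.log10(np.sum(s_true ** 2) / np.sum(err ** 2))

def nmse(s_true, s_hat):
    """Normalized Mean Square Error (31); note that NMSE = 10**(-SIR/10)."""
    return np.sum((s_true - s_hat) ** 2) / np.sum(s_true ** 2)
```

Both criteria presuppose that the permutation and scale indeterminacies have already been resolved, e.g. by matching each estimate to the source it correlates best with and rescaling it by a least-squares fit.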

4.2 Tests

We performed the following four experiments:

Experiment 1

We considered artificial mixtures of synthetic sources. The mixing parameters \(a_j(i)\) and \(a_{j,k}(i)\) were generated randomly, with values uniformly distributed between 0 and 1. Two sources were generated as follows: we first generated two independent and identically distributed (i.i.d.) signals \(e_1(n)\) and \(e_2(n)\), uniformly distributed over [0, 1], and then filtered them by two first-order auto-regressive filters in order to obtain two auto-correlated source signals according to the model \(s_i(n)=e_i(n)+ \rho _i s_i(n-1)\). The chosen parameters were \(\rho _1=0.7\) and \(\rho _2=0.5\). The tests were repeated using different numbers of source samples N: 10,000, 1000, and 100. Finally, three observed signals were generated using the BL model (2).
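The source-generation scheme of Experiment 1 can be reproduced as follows (a sketch; the helper name and the seed are ours):

```python
import numpy as np

def ar1_sources(N, rhos, rng):
    """Auto-correlated sources s_i(n) = e_i(n) + rho_i * s_i(n-1),
    with innovations e_i i.i.d. uniform over [0, 1]."""
    rhos = np.asarray(rhos)
    e = rng.uniform(0.0, 1.0, size=(len(rhos), N))
    s = np.zeros_like(e)
    s[:, 0] = e[:, 0]
    for n in range(1, N):
        s[:, n] = e[:, n] + rhos * s[:, n - 1]
    return s

rng = np.random.default_rng(42)
S = ar1_sources(10_000, [0.7, 0.5], rng)
```

The distinct values \(\rho_1 \ne \rho_2\) give the two sources (and their product) distinct normalized auto-correlation functions, as required by the identifiability condition of Sect. 3.3.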

Experiment 2

We generated artificial mixtures of two real-world sources. These sources, shown in Fig. 1 and described in Duarte et al. (2014), correspond to the activities (which can be seen as effective ionic concentrations) of Na\(^+\) and K\(^+\) ions measured for 41 samples. As in Experiment 1, the mixing parameters \(a_j(i)\) and \(a_{j,k}(i)\) were generated with random values uniformly distributed between 0 and 1. We then generated three observed signals using the BL model (2), although the mixture of concentrations of chemical species is usually approximated by a linear-quadratic model.

Fig. 1 Activities of Na\(^+\) and K\(^+\) ions

Table 1 Simulation results using our algorithm (B-SO-BSS) with one and four lags, and using NMF-Grd-LQ algorithm

Experiment 3

The third experiment aims at evaluating the robustness of our method to noise. The mixtures were first generated in the same way as in Experiment 1, with \(N=10{,}000\). A zero-mean Gaussian i.i.d. noise was then added to the observed signals in order to obtain a noisy setting. The SNR (Signal-to-Noise Ratio) values were varied from 60 dB down to 30 dB.

Experiment 4

Following the same noise-addition strategy as in Experiment 3, we added a zero-mean Gaussian i.i.d. noise to the observed signals generated in Experiment 2. As in Experiment 3, the SNR values were varied from 60 dB down to 30 dB.

4.3 Results

In order to separate the mixed sources and to estimate the mixing parameters, we applied the steps described in Algorithm 1 and Sect. 3.6 in two different configurations: using only one lag (\(\tau _i=1\)), and using 4 lags (\(\tau _i=\{1,2,3,4\}\)).

Fig. 2 Comparison between estimated and actual synthetic sources, in the case of a noiseless artificial BL mixture, using 3 different methods: B-SO-BSS (\(\tau _i=1\)) (top), B-SO-BSS (\(\tau _i=1,2,3,4\)) (middle) and NMF-Grd-LQ (bottom). Only a part of the signals, corresponding to \(n \in [200,249]\), is shown here

Fig. 3 Comparison between estimated and actual chemical sources, in the case of a noiseless artificial BL mixture, using 3 different methods: B-SO-BSS (\(\tau _i=1\)) (top), B-SO-BSS (\(\tau _i=1,2,3,4\)) (middle) and NMF-Grd-LQ (bottom)

In the first two experiments, we performed 100 Monte Carlo simulations for our method and NMF-Grd-LQ. At each simulation, we randomly modified the source signals and the mixing parameters in the case of Experiment 1, and only the mixing parameters in the case of Experiment 2. The mean SIR and NMSE of the sources and of all the mixing parameters, together with the CPU time, averaged over the 100 simulations of Experiment 1 and Experiment 2, are shown in Table 1.
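The two performance criteria can be sketched as follows. These are common textbook definitions of SIR and NMSE, given here for illustration only; the paper's exact definitions may differ (e.g. in how scale indeterminacy is handled), and the function names are assumptions.

```python
import numpy as np

def sir_db(s, s_hat):
    """Signal-to-Interference Ratio in dB (one common definition), computed
    after a least-squares rescaling of the estimate onto the true source,
    since BSS recovers sources only up to a scale factor."""
    scale = np.dot(s, s_hat) / np.dot(s_hat, s_hat)   # optimal scale
    err = s - scale * s_hat
    return 10 * np.log10(np.sum(s ** 2) / np.sum(err ** 2))

def nmse(theta, theta_hat):
    """Normalized Mean Square Error between actual and estimated
    mixing parameters (or sources)."""
    theta, theta_hat = np.asarray(theta), np.asarray(theta_hat)
    return np.sum((theta - theta_hat) ** 2) / np.sum(theta ** 2)
```

In a Monte Carlo setting, these criteria would simply be averaged over all runs.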

Table 2 Simulation results obtained by Experiment 3 using B-SO-BSS with one and four lags and NMF-Grd-LQ when mixtures are corrupted by noise
Table 3 Simulation results obtained by Experiment 4 using B-SO-BSS with one and four lags and NMF-Grd-LQ when mixtures are corrupted by noise

As can be seen, our proposed method leads to the best results. In terms of run-time, the proposed method is much faster than NMF-Grd-LQ.

Moreover, to assess the quality of the source estimation, Figs. 2 and 3 compare the actual sources with those estimated by the different methods. The actual sources used in Fig. 2 correspond to an example of Experiment 1 where \(N = 1000\) and the random mixing matrix \(\mathbf{A}_\mathbf{Exp1}\) is given by

$$\begin{aligned} \mathbf{A}_\mathbf{Exp1}=\left( \begin{array}{ccc} 0.0075 &amp; \quad 0.1461 &amp; \quad 0.3237\\ 0.1221 &amp; \quad 0.4249 &amp; \quad 0.0723\\ 0.4813 &amp; \quad 0.2845 &amp; \quad 0.5262 \end{array} \right) , \end{aligned}$$
(32)

while Fig. 3 shows an example of Experiment 2 where the random mixing matrix \(\mathbf{A}_\mathbf{Exp2}\) is as follows

$$\begin{aligned} \mathbf{A}_\mathbf{Exp2}=\left( \begin{array}{ccc} 0.1549 &amp; \quad 0.1405 &amp; \quad 0.3916\\ 0.5258 &amp; \quad 0.2041 &amp; \quad 0.9370\\ 0.2047 &amp; \quad 0.5108 &amp; \quad 0.4310 \end{array} \right) . \end{aligned}$$
(33)

We notice in Fig. 2, which only shows 50 samples corresponding to \(n \in [200,249]\) for the sake of clarity, that both synthetic sources are much better estimated by our method than by the NMF-Grd-LQ method. Figure 3 shows that both chemical sources are well estimated by our method, while only one source is well estimated by the NMF-Grd-LQ method.

In the following, our goal is to evaluate the performance of the proposed method with only one, then four lags, when the observed signals are corrupted by noise. Moreover, comparisons with the NMF-Grd-LQ method are carried out.

In the last two experiments, we performed 100 Monte Carlo simulations and, at each simulation, we modified the source signals and the mixing parameters in the case of Experiment 3, and only the mixing parameters in the case of Experiment 4.

The mean SIR and NMSE of the sources and of all the mixing parameters obtained for these two experiments are presented in Tables 2 and 3. For clarity, \(SIR_s\) is also plotted versus the SNR values in Fig. 4 (Experiment 3) and Fig. 5 (Experiment 4).

Fig. 4 \(SIR_s\) versus SNR in the case of Experiment 3

Fig. 5 \(SIR_s\) versus SNR in the case of Experiment 4

Considering Experiment 3, the \(SIR_s\) values obtained by the B-SO-BSS method are acceptable down to an SNR of 40 dB. Comparing these results with those obtained by NMF-Grd-LQ, we notice that B-SO-BSS gives the best results for high SNR values, whereas NMF-Grd-LQ seems more efficient for low ones. Indeed, the \(SIR_s\) obtained by NMF-Grd-LQ remains acceptable down to \(SNR=30\) dB.

In the case of Experiment 4, the results obtained by the B-SO-BSS method are acceptable down to an SNR of 50 dB. In contrast, for all considered SNR values, NMF-Grd-LQ fails to yield accurate enough estimates.

As expected, the presence of noise in the observed signals thus decreases the separation performance of our method. Nevertheless, the performance remains acceptable for relatively high SNR values.

5 Conclusion

In this paper, we proposed a new and fast BSS method, called Bilinear Second-Order Blind Source Separation (B-SO-BSS), which extends linear SOS methods to sources mixed according to the bilinear model. First, we studied the statistical properties of the different extended sources, both when the actual sources are zero-mean and when they are not. Then, we presented the steps performed in order to separate the actual centred sources and to estimate the actual mixing parameters. Finally, we presented the experimental results obtained by the proposed method. As a first step, we evaluated its separation performance when applied to noiseless artificial mixtures of synthetic or chemical sources, which clearly showed the effectiveness of our method as compared to the NMF-Grd-LQ method. As a second step, we evaluated its robustness to noise: as expected, the presence of noise in the generated mixtures decreases the effectiveness of our method, but the performance remained acceptable for relatively high SNR values.