1 Introduction

Multichannel (array) signal processing has increasingly gained prominence in the medical field for the acquisition and analysis of bio-medical signals. The best-known examples are bio-potential recording devices, electroencephalography (EEG) in particular. In this context, an important but rather neglected issue is the recording setup and, in particular, the reference problem. Indeed, signal acquisition is performed with measuring electrodes, placed on or inside the human body and referenced to a reference electrode, itself placed on the body. Therefore, the electrical activity at the reference (never constantly zero) affects measurements at all other active electrode sites [1, 6, 7, 19]. In EEG, this type of acquisition setup is called common reference (CR) montage. In classical scalp EEG, the reference electrode is often placed on the head. In this case, this electrode is influenced by brain sources and by specific artefacts, depending on its location (eye artefacts for a frontally placed electrode, for example). The artefactual activity is thus present in all the measurements. To eliminate the influence of the reference electrode, and consequently to ease the interpretation and the use of different signal processing techniques (Footnote 1), several montages (average, bipolar, Laplacian) can be derived from the CR recordings by simple manipulations (see [3, 14] for more details of the recording setup).

In depth EEG recordings, like in the recording setup from [6, 10], the signals are acquired from intra-cerebral contacts, placed along an electrode implanted in the brain (see Fig. 2 for an example of depth EEG implantation scheme). The reference can be either a surface electrode [6] or a user chosen contact of some depth electrode [10]. In both setups, the reference contact is placed as far as possible from the region of interest (the supposed epileptogenic zone in our clinical context). The reference signal is then supposed uncontaminated by the electrical activity recorded by the measuring contacts, but not necessarily null: the surface reference electrode, besides potentially propagated brain signals (assumed negligible), records also physiological artefacts (muscle, eyes) or other recording device artefacts, while the distant intra-cranial reference contact might record local brain potential changes. Both these activities (extra-cerebral artefacts or different structure activity) appear on all measured signals, as they are obtained as a potential difference between the measuring electrodes and the reference one. Finally, noise also affects the reference electrode, especially when it is placed on the scalp.

To avoid the reference problem, all intracranial EEG (iEEG) signals are interpreted by clinicians using a bipolar (BL) derivation: neighbouring contacts on the same electrode are subtracted to obtain images of the local activity and to eliminate the reference (Footnote 2). Still, direct measures obtained by the CR montage can be useful for the interpretation, as they offer a global view, complementary to the local view furnished by the BL montage. Unfortunately, they are contaminated by the electrical activity recorded by the reference contact. An interesting attempt to reduce this influence, based on a constrained blind source separation (BSS) approach, was proposed by Hu et al. [6, 7] and further developed by [13]. The proposed idea was to estimate the reference signal and then eliminate this estimated reference signal from the CR montage. Ranta et al. [13] termed this montage the zero reference (ZR) montage.

This contribution presents a unifying analysis of the reference estimation problem for the specific setup of the independent reference. A framework is developed, under whose umbrella the above mentioned BSS-based methods are closely related. Within this framework, we further develop a simple reference estimation approach which is shown to be reliant only on the second-order statistics (SOS) of the signals and which is optimal in terms of SNR maximisation (see Sect. 2.3.1). We further demonstrate the equivalence of this approach to the well-known minimum power/variance distortionless response (MPDR/MVDR) approach to signal estimation. These are well-known approaches in the array signal processing field, and while briefly describing these approaches in Sect. 2.3.2, we would refer the interested reader to the excellent book of van Trees [17] for more details. Finally, simulated examples and results on real iEEG recordings are presented in Sect. 3.

2 Methods

2.1 Signal model

The underlying signal model we consider is:

$${\mathbf{x}} (n) = {\mathbf{A}} {\mathbf{s}} (n)$$
(1)

where \({\mathbf{x}} (n)\in {\mathbb R} ^{(M \times 1)}\) is the vector of \(M\) observations at time instant \(n\) (measured EEG signals after sampling and quantization) and \({\mathbf{s}} (n)\in {\mathbb R}^{(Q \times 1)}\) is the corresponding vector of \(Q\) source realisations (underlying brain activity) at the same instant. \({\mathbf{A}} = \left({\mathbf{a}}_1,\ldots ,{\mathbf{a}}_Q\right) \in {\mathbb R} ^{(M \times Q)}\) represents the linear combination of the sources that yields the observation vector \({\mathbf{x}}\), where \({\mathbf{a}}_q \in {\mathbb R} ^{(M \times 1)}\). This model, also known as the instantaneous mixture model, is widely accepted in the EEG processing field [15].

In the field of array signal processing, the vectors \({\mathbf{a}}_q\) are known as steering vectors, whereas in the field of EEG processing and in the BSS framework, these are often referred to as the mixing parameters. Note that we denote these terms as belonging in the real domain, as in the EEG applications, but the generalization to the complex case is immediate.

When using the common reference montage (subsequently referred to as CR), the signal model is obtained by modifying (1), as proposed in [6]. This implies that we will consider that the mixing \({\mathbf{A}}\) is unknown, except for one column whose elements are all \(-1\):

$$ {\mathbf{x}} ({n}) = \left(\begin{array}{cc} -1 \\ \vdots & {\mathbf{A}}_2 \\ -1 \\ \end{array}\right) \left( \begin{array}{c} r(n)\\ {\mathbf{s}}_2(n) \end{array}\right),$$
(2)
$${\mathbf{x}} ({n}) \mathop{=}\limits^{\triangle} ({{\mathbf{a}}}_{1}\; {\mathbf{A}}_2 ) \left( \begin{array}{c} r(n)\\ {\mathbf{s}}_2(n) \end{array}\right)$$
(3)
$${\mathbf{x}} ({n}) \mathop{=}\limits^{\triangle} {\mathbf{a}}_1r(n) + {\mathbf{v}} (n),$$
(4)

where \({\mathbf{x}}(n)\) shall subsequently denote the measured CR EEG signals; \(r(n)\), the non-zero common reference signal; \({{\mathbf{a}}}_{1}\), the \(M\times 1\) column vector with each element being \(-1\); \({\mathbf{A}}_2\), the matrix of the remaining mixing parameters; and \({\mathbf{s}}_2(n)\), the remaining sources. Equation (4), where \({\mathbf{v}} (n)={\mathbf{A}}_2{\mathbf{s}}_2(n)\), presents an alternative, compact expression for the signal model, which will also be used in the following development.

Our aim of reference estimation is to make the best estimate of \(r(n)\) from the observations \(x_{m}(n)\) by a weighted linear combination \({{\mathbf{w}}}\in {\mathbb R} ^{(M\times 1)}\):

$$\widehat{r}(n) = {{\mathbf{w}}}^{\rm T}{\mathbf{x}} (n)$$
(5)

The only necessary hypothesis is that the reference \(r(n)\) is independent (in fact uncorrelated is sufficient) from the other sources \({\mathbf{s}}_2(n)\) (i.e. \( E \left\{s_{q}r\right\}=0, \forall s_{q}\in {\mathbf{s}}_2\), where \( E \left\{\cdot \right\}\) stands for the statistical expectation operator).
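As an illustration of the model (2)-(4) and of the estimator (5), the following minimal sketch draws synthetic data and applies one particular (naive) weight vector. The function name `simulate_cr`, the source distributions, and the dimensions are assumptions made for this example only; they are not taken from the recording setup described above.

```python
import numpy as np

def simulate_cr(M=6, Q=4, N=2000, seed=0):
    """Draw synthetic data from x(n) = a_1 r(n) + A_2 s_2(n), Eqs. (2)-(4)."""
    rng = np.random.default_rng(seed)
    a1 = -np.ones((M, 1))                  # known reference column: every element is -1
    A2 = rng.standard_normal((M, Q - 1))   # unknown mixing of the remaining sources
    r = rng.standard_normal((1, N))        # reference signal r(n) (assumed Gaussian here)
    s2 = rng.laplace(size=(Q - 1, N))      # remaining sources, drawn independently of r
    v = A2 @ s2                            # v(n) = A_2 s_2(n), Eq. (4)
    x = a1 @ r + v                         # measured CR signals
    return x, r, a1, A2, s2

x, r, a1, A2, s2 = simulate_cr()

# One admissible (but generally sub-optimal) weight vector: minus the channel average.
# It has unit gain on the reference, w^T a_1 = 1, but leaves the residual w^T v(n).
w = np.full((x.shape[0], 1), -1.0 / x.shape[0])
r_hat = w.T @ x                            # Eq. (5)
```

Any weight vector with \({\mathbf{w}}^{\rm T}{\mathbf{a}}_1 = 1\) passes the reference undistorted; the methods discussed in the following sections differ in how they suppress the residual term \({\mathbf{w}}^{\rm T}{\mathbf{v}}(n)\).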

2.2 Analysis of the reference estimation problem

2.2.1 Non-blind estimation

For the sake of completeness, we consider first the case when the mixing \({\mathbf{A}}\) is known. In this case, the most immediate approach would be to try to invert the mixing, yielding estimates for all sources, \(r(n)\) included. The general approach followed in this case is to formulate the estimation as a least-squares optimisation problem:

$${\mathcal J}_{{\mathbf{w}}} = \mathop {\text{ argmin}}\limits_{\rm{{w}}} \Vert {\mathbf{w}} ^{\rm T}{\mathbf{A}} -{\mathbf{e}} ^{\rm T}_1 \Vert^2$$
(6)

where \({\mathbf{e}}_{m}\) is a column-vector of which the \(m\)th element is unity and the remaining elements are zero. What this cost function implies is the recovery of only the desired source, nulling the effect of other sources. Differentiating this cost function w.r.t. \({{\mathbf{w}}}\) and equating to 0 we obtain:

$${\mathbf{A}} {\mathbf{A}} ^{\rm T}{{\mathbf{w}}} = {\mathbf{A}} {\mathbf{e}}_1$$
(7)

Depending upon \(M\) and \(Q\), the analysis can be divided into three cases:

  1. well-determined case: square full rank mixing \({\mathbf{A}} (M=Q)\)

  2. over-determined case: rank deficient \({\mathbf{A}} (M>Q)\)

  3. under-determined case: full row-rank mixing \({\mathbf{A}} (M<Q)\).

Obviously, when the mixing matrix is known and full-rank square, \({{\mathbf{w}}}\) is obtained as:

$${\mathbf{w}} = \left({{\mathbf{A}} {\mathbf{A}} ^{\rm T}}\right)^{-1}{\mathbf{A}} {\mathbf{e}}_1 = {\mathbf{A}} ^{-\rm T}{\mathbf{e}}_1$$
(8)

hence we obtain \(\widehat{r}(n)\) as

$$\widehat{r}(n) ={\mathbf{w}} ^{\rm T}{\mathbf{x}} (n)=r(n)$$
(9)
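A small numerical check of (8) and (9) for the well-determined case can be sketched as follows. The random mixing matrix, its size, and the variable names are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
M = Q = 5
N = 1000
A = rng.standard_normal((M, Q))
A[:, 0] = -1.0                    # first column is the reference column a_1
s = rng.standard_normal((Q, N))   # row 0 plays the role of r(n)
x = A @ s                         # Eq. (1)

e1 = np.zeros(Q)
e1[0] = 1.0
w = np.linalg.solve(A.T, e1)      # w = A^{-T} e_1, Eq. (8), without forming the inverse
r_hat = w @ x                     # Eq. (9)
```

Here `w @ A` equals \({\mathbf{e}}_1^{\rm T}\) up to rounding, so `r_hat` reproduces the first source exactly and all other sources are nulled.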

When the mixing is known and over-determined (\(\text{ rank}({\mathbf{A}} ) = Q < M\)), the solution for \({{\mathbf{w}}}\) is not unique, but it can be determined by reducing the dimension of the observations \({\mathbf{x}}\) (and thus of the mixing matrix \({\mathbf{A}}\)) to \(Q\) in order to obtain a full-rank invertible mixture, and then applying (6) in this reduced space. A classical approach for such dimension reduction is principal component analysis (PCA). The estimated \(\widehat{r}(n)\) will be an exact reconstruction of \(r(n)\) in this case too.

Finally, when the mixture is known but under-determined, the solution will be given by:

$${\mathbf{w}} = \left({\mathbf{A}} {\mathbf{A}} ^{\rm T}\right)^{-1}{\mathbf{A}} {\mathbf{e}}_1$$
(10)

where \({\mathbf{e}}_1\) is now of dimension \(Q\times 1\).

In this case, we take recourse to the singular value decomposition (SVD) [16] of \({\mathbf{A}}\) as: \({\mathbf{A}} \mathop{=}\limits^{\triangle}{\mathbf{U}} \left(\Upsigma\;{\mathbf 0} \right){\mathbf V} ^{\rm T}\), where \({\mathbf{U}} \in {\mathbb R} ^{(M\times M)}\) and \({\mathbf V} \in {\mathbb R} ^{(Q\times Q)}\) are orthogonal matrices, \(\Upsigma \in {\mathbb R} ^{(M\times M)}\) is a diagonal matrix of the singular values of \({\mathbf{A}}\), and \({\mathbf 0}\) is an \(M\times (Q-M)\) matrix of zeroes. Using this decomposition, the least-squares estimate of \(\widehat{r}(n)\) from (10) can be obtained as:

$$\begin{aligned} {\widehat{r}}(n)&= {\mathbf{w}} ^{\rm T}{\mathbf{x}} (n)\\&= {\mathbf{e}} ^{\rm T}_1{{\mathbf V}} \left( \begin{array}{cc} {\mathbf I}_{M,M}& {{\mathbf 0}}_{M,Q-M}\\ {\mathbf 0}_{Q-M,M}&{\mathbf 0}_{Q-M,Q-M} \end{array} \right) {\mathbf V} ^{\rm T} \left( \begin{array}{c} r(n)\\ {\mathbf{s}}_2(n) \end{array} \right)\\&= {\mathbf{e}} ^{\rm T}_1 {\mathbf V}_{1:Q,1:M}{\mathbf V} ^{\rm T}_{1:Q,1:M} \left( \begin{array}{c} r(n)\\ {\mathbf{s}}_2(n) \end{array} \right) \end{aligned}$$
(11)

where \(\mathbf I\) is the identity matrix and the subscripts for the matrices in the above equation indicate the corresponding dimensions of the matrices. We also use the notation \({\mathbf B}_{a:b,c:d}\) to indicate the sub-matrix of \({\mathbf B}\) consisting of rows \(a\) through \(b\) and columns \(c\) through \(d\). The \(Q\times Q\) matrix \({\mathbf V}_{1:Q,1:M}{\mathbf V} ^{\rm T}_{1:Q,1:M}\) in (11) is a projector of rank \(M<Q\) onto the row space of \({\mathbf{A}}\), not the identity; the off-diagonal entries of its first row are in general non-zero, which means that residual interference is present. Furthermore, unit-gain on \(r(n)\) is not guaranteed. While this can be enforced, it should be clear that in the under-determined case, a clean extraction of the reference signal is not possible.
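The behaviour of the least-squares solution (10) in the under-determined case can be checked numerically. The sketch below, with an arbitrarily chosen random mixing (an assumption for illustration), computes the response of \({\mathbf{w}}\) to every source and compares it with the first row of the projector \({\mathbf V}\,\text{diag}({\mathbf I},{\mathbf 0})\,{\mathbf V}^{\rm T}\) that appears in the second line of (11).

```python
import numpy as np

rng = np.random.default_rng(2)
M, Q = 3, 6                             # fewer observations than sources
A = rng.standard_normal((M, Q))
A[:, 0] = -1.0                          # reference column a_1
e1 = np.zeros(Q)
e1[0] = 1.0

w = np.linalg.solve(A @ A.T, A @ e1)    # Eq. (10)
gain = w @ A                            # response of w to each of the Q sources

# Same response via the SVD: first row of the rank-M projector V_p V_p^T
_, _, Vt = np.linalg.svd(A)             # full SVD, Vt has shape (Q, Q)
Vp = Vt[:M].T                           # first M right singular vectors, shape (Q, M)
projector_row = (Vp @ Vp.T)[0]
```

In such an experiment `gain[0]` is smaller than one (no unit gain on \(r(n)\)) and the remaining entries of `gain` are non-zero (residual interference).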

2.2.2 Reference estimation via blind source separation

When nothing about \({\mathbf{A}}\) or \({\mathbf{s}}(n)\) is known, model inversion needs to be done in a completely blind manner, and this is generally accomplished through an appropriate BSS approach. In the case where some a priori information is available (on the mixing or on the sources), the BSS becomes semi-blind source separation (sBSS). This is exactly our problem setting, where the mixing column for the reference source (the source of interest) is known.

The solutions proposed by [6] start by deriving from the measured \({\mathbf{x}}(n)\) the bipolar montage (BL) \({\mathbf{x}}_b(n)\). This BL montage is constructed from the CR montage by computing pairwise differences among the \(x_{m}(n)\), which eliminates the influence of the reference \(r(n)\) in the resulting \({\mathbf{x}}_b(n)\) signals. Separating the \({\mathbf{x}}_b\) by FastICA [8], one obtains statistically independent estimates of \({\mathbf{s}}_2\) sources (Footnote 3) (if the number of measures is too small, \(M<Q\), one still obtains independent signals, but not necessarily close to \({\mathbf{s}}_2\)). Exploiting the absence of the reference \(r(n)\) in the new estimates, [6] propose two methods for estimating \(r(n)\) by comparing the \({\mathbf{x}} (n)\) from the CR montage (which includes the reference) with the sources obtained from \({\mathbf{x}}_b(n)\) (for details, see [6]).
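That pairwise differences cancel the reference follows directly from (4), since every channel contains the same term \(-r(n)\). A minimal sketch, assuming that consecutive rows of the data matrix correspond to neighbouring contacts (an ordering assumed here for illustration only):

```python
import numpy as np

rng = np.random.default_rng(3)
M, N = 5, 500
v = rng.standard_normal((M, N))   # reference-free potentials v(n) at the M contacts
r = rng.standard_normal(N)        # reference signal r(n)
x = v - r                         # CR montage, Eq. (4) with a_1 = (-1, ..., -1)^T

# Bipolar montage: x_b[m] = x[m+1] - x[m]
x_b = np.diff(x, axis=0)
```

The resulting `x_b` has \(M-1\) channels and coincides with the differences of the rows of `v`; the reference no longer appears.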

Ranta et al. [13] exploited the same model (2) to derive a more robust and faster method. The basic idea is the following: complete source separation needs a two-step approach (whitening + rotation), but if one wants to estimate only one source, the rotation matrix does not need to be completely determined; determining one row is sufficient. It can be shown that such a constrained approach where one column of the mixing matrix is known has an optimal estimator that ties into a central framework dependent only on the SOS of the signals. These relations are subsequently described and the model generalized to this case.

2.3 Unified framework

2.3.1 The semi-blind source separation solution of [13]

In the absence of any a priori knowledge of the sources or the mixing system, the aim of blind source separation algorithms is to invert the mixing system to obtain the underlying sources. Such completely blind approaches suffer from the fundamental, unavoidable indeterminacy regarding the amplitude of the sources. From the perspective of blind source separation, a mixing of the kind in (1) is equivalent to:

$$\begin{aligned} {{\mathbf{x}}(n)} &= {\mathbf {ADD}}^{-1}{{\mathbf{s}}}(n)\\ &= {\widetilde{{\mathbf{A}}}}{\widetilde{{\mathbf{s}}}}(n)\end{aligned}$$
(12)

where \({{\mathbf{D}}}\) is some arbitrary diagonal scaling matrix which changes the amplitude, but not the time course of the sources. A unique solution of (12) for \({\widetilde{{\mathbf{A}}}}\) and \({\widetilde{{\mathbf{s}}}}{(n)}\) is therefore impossible (Footnote 4).

Traditionally, therefore, BSS approaches consider unit variance sources \({\widetilde{{\mathbf{s}}}}{(n)}\). This, combined with the independence assumption, means that:

$${\varvec{\Upphi}}_{{\widetilde{\rm s}}{\widetilde{\rm s}}}= {\rm E} \left \{{\widetilde{{\mathbf{s}}}}{(n)}{\widetilde{{\mathbf{s}}}}^{\rm T}(n)\right\} = {\mathbf I},$$
(13)

which implies that \({{\mathbf{D}}} =\varvec{\Upphi}_{\rm ss}^{1/2}\). The aim of BSS approaches is then to invert the mixing system \({\widetilde{{\mathbf{A}}}}\).

Note that \({\widetilde{{\mathbf{A}}}}\) may be expressed in terms of its constituent components from the SVD as:

$${\widetilde{{\mathbf{A}}}} \mathop{=}\limits^{\triangle}{\widetilde{{\mathbf{U}}}}{\widetilde{\varvec{\Upsigma}}}{\widetilde{{\mathbf V}}}^{\rm T},$$
(14)

where \({\widetilde{{\mathbf{U}}}}\in {\mathbb R} ^{(M\times M)}\) and \({\widetilde{{\mathbf V}}}\in {\mathbb R} ^{(Q\times Q)}\) are orthogonal matrices and \({\widetilde{\varvec{\Upsigma}}}\in {\mathbb R} ^{(M\times Q)}\) contains the singular values.

Classic blind source separation algorithms demand that \(M\ge Q\) in order to obtain plausible source estimates. Our first analysis focusses therefore on the case when \({\widetilde{{\mathbf{A}}}}\) (thus \({{\mathbf{A}}}\)) is full-rank square, i.e., linearly independent rows, \(M=Q\). Note that when \({\widetilde{{\mathbf{A}}}}\) is row rank deficient (\(M>Q\)), the inversion problem can be easily separated into multiple well-behaved sub-problems by considering subsets of \(Q\) signals. An alternative approach is to perform dimension reduction using PCA followed by BSS on the reduced space.

From (14), obtaining the inverse of \({\widetilde{{\mathbf{A}}}}\) is equivalent to computing the individual components of its SVD. This is done in two stages: a whitening, followed by a rotation.

Consider \(\varvec{\Upphi}_{{\mathbf{xx}}}= {\rm E} \left\{{\mathbf{x}} (n){\mathbf{x}} ^{\rm T}(n)\right\}\). The eigenvalue decomposition (EVD) of \(\varvec{\Upphi}_{{\mathbf{xx}}}\) can be written as:

$$\varvec{\Upphi}_{{\mathbf{xx}}}={\mathbf{U}}_{\mathbf{x}}\varvec{\Upsigma}_{\mathbf{x}}{\mathbf{U}} ^{\rm T}_{\mathbf{x}}$$
(15)

Under the assumption of (13) and considering (14), \(\varvec{\Upphi}_{{\mathbf{xx}}}\) can also be expressed as:

$$\begin{aligned} \varvec{\Upphi}_{{\mathbf{xx}}}&= {\rm E} \left\{{\widetilde{{\mathbf{A}}}}{\widetilde{{\mathbf{s}}}}(n){\widetilde{{\mathbf{s}}}}^{\rm T}(n){\widetilde{{\mathbf{A}}}}^{\rm T}\right\} = {\widetilde{{\mathbf{A}}}}{\varvec{\Upphi}}_{{\widetilde{\rm s}}{\widetilde{\rm s}}}{\widetilde{{\mathbf{A}}}}^{\rm T}\\ &= {\widetilde{{\mathbf{A}}}}{\widetilde{{\mathbf{A}}}}^{\rm T}={\widetilde{{\mathbf{U}}}}{\widetilde{\varvec{\Upsigma}}}^2{\widetilde{{\mathbf{U}}}}^{\rm T} \end{aligned}$$
(16)

From (15) and (16) \({\widetilde{\mathbf{U}}}={\mathbf{U}}_{\mathbf{x}}\) and the non-zero singular values of \({\widetilde{\varvec{\Upsigma}}}\) are given by \(\varvec{\Upsigma}_{\mathbf{x}}^{1/2}\).

Thus, we see that (16) can already give us two components of \({\widetilde{\mathbf{A}}}\): \({\widetilde{\mathbf{U}}}\) and \({\widetilde{\varvec{\Upsigma}}}\). We use this to first whiten the data, which yields:

$$\begin{aligned} \overset{\circ}{{{\mathbf{x}}}}(n)&= \varvec{\Upsigma}^{-1/2}_{\mathbf{x}}{\mathbf{U}} ^{\rm T}_{\mathbf{x}}{\mathbf{x}} (n)\\&= \varvec{\Upsigma}^{-1/2}_{\mathbf{x}}{{\mathbf{U}}} ^{\rm T}_{\mathbf{x}}{\widetilde{{\mathbf{A}}}}{\widetilde{\rm{\mathbf{s}}}}(n)\\&= {\widetilde{{\mathbf V}}}^{\rm T}{\widetilde{\rm{\mathbf{s}}}}(n) \end{aligned}$$
(17)
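The whitening step (15)-(17) can be sketched as follows, using the sample estimate of \(\varvec{\Upphi}_{{\mathbf{xx}}}\). The function name `whiten` and the test mixture are assumptions for illustration; as in the text, no mean removal is applied, since \(\varvec{\Upphi}_{{\mathbf{xx}}}\) is defined as \({\rm E}\{{\mathbf{x}}{\mathbf{x}}^{\rm T}\}\).

```python
import numpy as np

def whiten(x):
    """Return the whitened data and the whitening matrix Sigma_x^{-1/2} U_x^T."""
    N = x.shape[1]
    Phi_xx = (x @ x.T) / N                   # sample estimate of E{x x^T}
    eigvals, U_x = np.linalg.eigh(Phi_xx)    # EVD of the symmetric matrix, Eq. (15)
    T = np.diag(eigvals ** -0.5) @ U_x.T     # requires a full-rank Phi_xx
    return T @ x, T

rng = np.random.default_rng(4)
A = rng.standard_normal((4, 4))
s = rng.standard_normal((4, 5000))
x_white, T = whiten(A @ s)
```

By construction, the sample correlation matrix of `x_white` is the identity. In the over-determined case, where \(\varvec{\Upphi}_{{\mathbf{xx}}}\) is rank deficient, the zero eigenvalues would have to be discarded first (PCA), which this sketch does not do.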

What remains is to estimate the orthogonal rotation matrix \({\widetilde{{\mathbf V}}}\) or, more precisely, its transpose, which is the factor that multiplies the sources in (17). Denote the estimate of \({\widetilde{{\mathbf V}}}^{\rm T}\) by \({\widetilde{\mathbf W}}\). ICA approaches estimate \({\widetilde{\mathbf W}}\) by optimizing functions that maximise the statistical independence between the outputs \(y_{q}(n)\) of \({\mathbf y} (n)\), where \({\mathbf y} (n)={\widetilde{\mathbf W}}^{\rm T}\overset{\circ}{{\mathbf{x}}}(n)\). Effectively, what these methods aim to achieve is:

$$\begin{aligned} {\widetilde{\mathbf W}}^{\rm T}\overset{\circ}{{\mathbf{x}}}(n)&= {\widetilde{\mathbf W}}^{\rm T} \varvec{\Upsigma}^{-1/2}_{\mathbf{x}}{\mathbf{U}} ^{\rm T}_{\mathbf{x}}{\mathbf{x}} (n)\\&= {\widetilde{\mathbf W}}^{\rm T} \varvec{\Upsigma}^{-1/2}_{\mathbf{x}}{{\mathbf{U}}} ^{\rm T}_{\mathbf{x}}{\widetilde{{\mathbf{A}}}}{\widetilde{\rm{\mathbf{s}}}}(n)\\&= {\widetilde{\mathbf W}}^{\rm T} \varvec{\Upsigma}^{-1/2}_{\mathbf{x}}{{\mathbf{U}}} ^{\rm T}_{\mathbf{x}} \left( \begin{array}{llll} {\widetilde{{\mathbf{a}}}}_1&{\widetilde{{\mathbf{a}}}}_2&\cdots&{\widetilde{{\mathbf{a}}}}_{M} \end{array} \right) {\widetilde{\rm{\mathbf{s}}}}(n)\\&\mathop{=}\limits^{\triangle}{\widetilde{{\mathbf{D}}}}{\widetilde{\rm{\mathbf{s}}}}(n)\\&\mathop{=}\limits^{\triangle}{{\mathbf{D}}} {{\mathbf{s}}} (n) \end{aligned}$$
(18)

where, as before, \({\widetilde{{\mathbf{D}}}},\,{\mathbf{D}}\) are diagonal matrices of scale values. For a completely blind approach, such as ICA, this is the best result possible under no knowledge of the mixing system or source auto-correlations.

In contrast, when we know \({{\mathbf{a}}}_1\) and wish to estimate only the corresponding source \(r(n)\), we do not need complete inversion of \({\widetilde{{\mathbf{A}}}}\) and can adopt another strategy, namely:

$${\widetilde{\mathbf W}}^{\rm T}\overset{\circ}{\mathbf x}(n)= {\widetilde{\mathbf W}}^{\rm T} {\varvec{\Upsigma}}_{\mathbf{x}}^{-1/2} {\mathbf U}_{\mathbf{x}}^{\rm T} ( {\widetilde{{\mathbf{a}}}}_1\;{\widetilde{\mathbf A}}_2) {\widetilde{\rm {\mathbf{s}}}}(n)$$
(19)

whereby from (12) and the implications of (13),

$${{\mathbf{D}}} {\mathbf{s}} (n)= {\widetilde{\mathbf W}}^{\rm T} \varvec{\Upsigma}^{-1/2}_{\mathbf{x}}{\mathbf{U}} ^{\rm T}_{\mathbf{x}}(\Upphi ^{1/2}_{rr}{{\mathbf{a}}}_{1}\; {\widetilde{{\mathbf{A}}}}_2 ) \varvec{\Upphi}_{\rm ss}^{-1/2}{\mathbf{s}} (n)$$
(20)

From this, and given that \({\widetilde{\mathbf W}}\) is unitary, it follows that:

$${\widetilde{\mathbf W}}{\mathbf{D}} {\mathbf{s}} (n)= (\Upphi ^{1/2}_{rr}\varvec{\Upsigma}^{-1/2}_{\mathbf{x}}{\mathbf{U}} ^{\rm T}_{\mathbf{x}} {{\mathbf{a}}}_1\; \varvec{\Upsigma}^{-1/2}_{\mathbf{x}}{\mathbf{U}} ^{\rm T}_{\mathbf{x}}{\widetilde{{\mathbf{A}}}}_2 ) \varvec{\Upphi}_{\rm ss}^{-1/2}{\mathbf{s}} (n),$$
(21)

where \({\widetilde{{\mathbf{A}}}}_2 = \left(\begin{array}{lll}{\widetilde{{\mathbf{a}}}}_2,\ldots,{\widetilde{{\mathbf{a}}}}_{M}\end{array}\right)\). Thus, the first column of \({\widetilde{\mathbf W}}\) can be determined, except for the unknown scale factor of \(\Upphi ^{1/2}_{rr}\) as:

$${\widetilde{{\mathbf{w}}}}_1 = \alpha \varvec{\Upsigma}^{-1/2}_{\mathbf{x}}{\mathbf{U}} ^{\rm T}_{\mathbf{x}}{{\mathbf{a}}}_1,$$
(22)

where \(\alpha\) is the unknown scale factor to be determined.

From (18) and (22), we have an effective demixing filter for \(r(n)\) which we express compactly as:

$$\begin{aligned} {\mathbf{w}}_1&= {\mathbf{U}}_{\mathbf{x}}\varvec{\Upsigma}^{-1/2}_{\mathbf{x}}{\widetilde{{\mathbf{w}}}}_1\\&= \alpha {\mathbf{U}}_{\mathbf{x}}\varvec{\Upsigma}^{-1}_{\mathbf{x}}{\mathbf{U}} ^{\rm T}_{\mathbf{x}}{{\mathbf{a}}}_1\\&= \alpha \varvec{\Upphi}_{{\mathbf{xx}}}^{-1}{{\mathbf{a}}}_{1} \end{aligned}$$
(23)

It remains now to fix the scale, which is done by ensuring that \({\mathbf{w}}_1\) introduces no distortion along \({{\mathbf{a}}}_{1}\), i.e., \({\mathbf{w}}_1^{\rm T}{{\mathbf{a}}}_{1} = 1\). Introducing this constraint yields:

$$ \alpha = \left({{\mathbf{a}}}_1^{\rm T} \varvec{\Upphi}_{{\mathbf{xx}}}^{-1}{{\mathbf{a}}}_{1}\right)^{-1}.$$
(24)
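The chain (22)-(24) can be followed step by step in code. The sketch below is an illustration under stated assumptions (sample covariance in place of \(\varvec{\Upphi}_{{\mathbf{xx}}}\), a square random mixture, Gaussian sources, and the hypothetical function name `sbss_reference_filter`); it is not the reference implementation of [13].

```python
import numpy as np

def sbss_reference_filter(x, a1):
    """Demixing filter for the source with known mixing column a1, Eqs. (22)-(24)."""
    N = x.shape[1]
    Phi_xx = (x @ x.T) / N
    eigvals, U_x = np.linalg.eigh(Phi_xx)
    T = np.diag(eigvals ** -0.5) @ U_x.T   # whitening matrix Sigma_x^{-1/2} U_x^T
    w_tilde = T @ a1                       # Eq. (22) with alpha = 1
    w = T.T @ w_tilde                      # Eq. (23): equals Phi_xx^{-1} a_1
    return w / (a1 @ w)                    # Eq. (24): enforce w^T a_1 = 1

rng = np.random.default_rng(5)
M, N = 4, 20000
A = rng.standard_normal((M, M))
A[:, 0] = -1.0                             # reference column a_1
s = rng.standard_normal((M, N))            # row 0 plays the role of r(n)
x = A @ s
a1 = A[:, 0]

w1 = sbss_reference_filter(x, a1)
r_hat = w1 @ x
```

With a finite number of samples the sources are only approximately uncorrelated, so `r_hat` matches `s[0]` up to a small residual that shrinks as \(N\) grows.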

Note, also, that whereas traditional BSS approaches cannot be applied to the under-determined case, the sBSS solution described above still applies. For this case, where \(M<Q\), we can write (14) as:

$$\begin{aligned} {\widetilde{{\mathbf{A}}}}&= {\widetilde{{\mathbf{U}}}}{\widetilde{\varvec{\Upsigma}}}{\widetilde{{\mathbf V}}}^{\rm T}\\ (\Upphi ^{1/2}_{rr}{{\mathbf{a}}}_1\; {\widetilde{{\mathbf{A}}}}_2 )&= {\mathbf{U}}_{\mathbf{x}} ( \varvec{\Upsigma}_{\mathbf{x}}^{1/2}\; {\mathbf 0}_{M,Q-M}) {\widetilde{{\mathbf V}}}^{\rm T}\\&= {\mathbf{U}}_{\mathbf{x}}\varvec{\Upsigma}_{\mathbf{x}}^{1/2}{\widetilde{{\mathbf V}}}_p^{\rm T}, \end{aligned}$$
(25)

where \({\widetilde{{\mathbf V}}}_p = {\widetilde{{\mathbf V}}}_{1:Q,1:M}\).

Thus, the mixing model of (12) may be written in terms of (25) as:

$${\mathbf{x}} (n) = ( \Upphi ^{1/2}_{rr}{\mathbf{a}}_1\; {\widetilde{{\mathbf{A}}}}_2 ){\widetilde{\rm{\mathbf{s}}}}(n) = {\mathbf{U}}_{\mathbf{x}}\varvec{\Upsigma}_{\mathbf{x}}^{1/2}{\widetilde{{\mathbf V}}}_p^{\rm T}{\widetilde{\rm{\mathbf{s}}}}(n)$$
(26)

Applying the whitening transform to (26) then yields:

$$\begin{aligned} {\widetilde{{\mathbf V}}}_p^{\rm T}{\widetilde{\rm{\mathbf{s}}}}(n) = \varvec{\Upsigma}_{\mathbf{x}}^{-1/2}{\mathbf{U}} ^{\rm T}_{\mathbf{x}} ( \Upphi ^{1/2}_{rr}{{\mathbf{a}}}_1\; {\widetilde{{\mathbf{A}}}}_2 ) {\widetilde{\rm{\mathbf{s}}}}(n). \end{aligned}$$
(27)

The solution of this equation for \({\widetilde{\rm{\mathbf{s}}}}(n)\) requires the right pseudo-inverse of \({\widetilde{{\mathbf V}}}_p^{\rm T}\) which, given that the columns of \({\widetilde{{\mathbf V}}}_p\) are orthogonal, is simply \({\widetilde{{\mathbf V}}}_p\). Moreover, for extracting only \(r(n)\), we require just the first row of \({\widetilde{{\mathbf V}}}_p\) (and, correspondingly, the first column of \({\widetilde{{\mathbf V}}}_p^{\rm T}\)). This is, in effect, the same solution as (22) and the demixing filter is identical to the solution for the determined case, when imposing unit gain along \({{\mathbf{a}}}_{1}\).

We shall show next that the solution we obtain for \({{\mathbf{w}}}_1\) using this scale factor in (23) is identical to well-known approaches from array technology, which only consider the SOS of the signals for the extraction of a ‘desired’ or ‘target’ signal along a known direction. As will be shown, the sBSS solution is also the best achievable in terms of SNR maximisation.

2.3.2 MPDR estimator

Recall that our aim is to find a linear combination \({{\mathbf{w}}}\), able to estimate the unknown source \(r(n)\). One possible approach is to minimise the output power of the resultant signal \(y(n) = {\mathbf{w}}^{\rm T} {\mathbf{x}} (n)\), under the constraint that the gain on the estimated reference signal (also denoted as the ‘target’ or ‘desired’ signal in this estimation context) remains unity. This may be posed as the following optimisation:

$$\begin{aligned} \mathcal J_{{\mathbf{w}}}&= {\rm E} \left \{|{\mathbf{w}} ^{\rm T}{\mathbf{x}} (n)|^2\right\} + \lambda ({\mathbf{w}} ^{\rm T}{{\mathbf{a}}}_1-1)\\&= {\mathbf{w}} ^{\rm T} \varvec{\Upphi}_{{\mathbf{x}} {{\mathbf{x}}}}{{\mathbf{w}}} + \lambda ({\mathbf{w}} ^{\rm T}{{\mathbf{a}}}_1-1), \end{aligned}$$
(28)

where \(\varvec{\Upphi}_{{\mathbf{x}} {{\mathbf{x}}}}\) is the correlation matrix of \({\mathbf{x}}\), and \(\lambda\) is the Lagrange multiplier.

The solution to this constrained optimisation is obtained, after some manipulation, as:

$${{\mathbf{w}}} = \frac{\varvec{\Upphi}_{{\mathbf{x}} {{\mathbf{x}}}}^{-1}}{{{{\mathbf{a}}}_1^{\rm T} \varvec{\Upphi}_{{\mathbf{x}} {{\mathbf{x}}}}^{-1}{{\mathbf{a}}}_{1}}}{{\mathbf{a}}}_{1}.$$
(29)

This solution is well known in array technology as the MPDR approach [17]. The title aptly describes the design considerations behind this approach: minimising the output power while keeping the desired signal undistorted.
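For completeness, the manipulation leading from (28) to (29) can be sketched as follows, assuming that \(\varvec{\Upphi}_{{\mathbf{x}} {{\mathbf{x}}}}\) is invertible. Setting the gradient of (28) with respect to \({\mathbf{w}}\) to zero gives:

$$2\varvec{\Upphi}_{{\mathbf{x}} {{\mathbf{x}}}}{\mathbf{w}} + \lambda {\mathbf{a}}_1 = {\mathbf 0} \quad \Rightarrow \quad {\mathbf{w}} = -\frac{\lambda}{2}\varvec{\Upphi}_{{\mathbf{x}} {{\mathbf{x}}}}^{-1}{\mathbf{a}}_1 .$$

Inserting this expression into the constraint \({\mathbf{w}}^{\rm T}{\mathbf{a}}_1 = 1\) fixes the multiplier:

$$-\frac{\lambda}{2}\,{\mathbf{a}}_1^{\rm T}\varvec{\Upphi}_{{\mathbf{x}} {{\mathbf{x}}}}^{-1}{\mathbf{a}}_1 = 1 \quad \Rightarrow \quad \lambda = -\frac{2}{{\mathbf{a}}_1^{\rm T}\varvec{\Upphi}_{{\mathbf{x}} {{\mathbf{x}}}}^{-1}{\mathbf{a}}_1},$$

and substituting \(\lambda\) back into the expression for \({\mathbf{w}}\) yields (29).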

2.3.3 Maximum SNR approach

We may also pose the search for the optimal \({{\mathbf{w}}}\) as an SNR maximising criterion. Consider the signal model of (4), under the linear combination \({{\mathbf{w}}}\):

$${\mathbf{w}} ^{\rm T}{\mathbf{x}} (n) = {\mathbf{w}} ^{\rm T}{{\mathbf{a}}}_1r(n) + {\mathbf{w}} ^{\rm T}{\mathbf{v}}(n)$$
(30)

The SNR after applying \({{\mathbf{w}}}\) is then easily obtained as:

$$\begin{aligned} \text{ SNR}&= \frac{\text{ E}\{|{\mathbf{w}} ^{\rm T}{{\mathbf{a}}}_{1}r({\it{n}})|^2\}}{\text{ E}\{|{\mathbf{w}} ^{\rm T}{\mathbf{v}} ({\it{n}})|^2\}} \\&= \frac{|{\mathbf{w}} ^{\rm T}{{\mathbf{a}}}_1|^2 {\Upphi_{rr}}}{{\mathbf{w}} ^{\rm T}\varvec{\Upphi}_{{\mathbf{v}} {{\mathbf{v}}}}{{\mathbf{w}}}} \end{aligned}$$
(31)

As the SNR is a positive value, maximising the SNR w.r.t. \({{\mathbf{w}}}\) is also equivalent to maximising the following cost function:

$$\begin{aligned} \mathcal J_{{\mathbf{w}}}&= \frac{1}{1 + {\text{ SNR}}^{-1}}\\&= \frac{|{\mathbf{w}} ^{\rm T}{{\mathbf{a}}}_1|^2{\Upphi_{rr}}}{|{\mathbf{w}} ^{\rm T}{{\mathbf{a}}}_1|^2{\Upphi_{rr}} + {\mathbf{w}} ^{\rm T} \varvec{\Upphi}_{{\mathbf{v}} {{\mathbf{v}}}} {{\mathbf{w}}}}\\&= \frac{|{\mathbf{w}} ^{\rm T}{{\mathbf{a}}}_1|^2{\Upphi_{rr}}}{{\mathbf{w}} ^{\rm T} \varvec{\Upphi}_{{\mathbf{x}} {{\mathbf{x}}}}{{\mathbf{w}}}} \end{aligned}$$
(32)

The solution is obtained as:

$$\frac{|{\mathbf{w}} ^{\rm T}{{\mathbf{a}}}_1|^2{\Upphi_{rr}}}{{\mathbf{w}} ^{\rm T} \varvec{\Upphi}_{{\mathbf{x}} {{\mathbf{x}}}}{{\mathbf{w}}}}\varvec{\Upphi}_{{\mathbf{x}} {{\mathbf{x}}}}{{\mathbf{w}}} = {\Upphi_{rr}}{{\mathbf{a}}}_{1}{{\mathbf{a}}}_{1}^{\rm T}{\mathbf{w}},$$
(33)

from which (and recognising that \({{\mathbf{a}}}_{1}^{\rm T} {\mathbf{w}}\) is a scalar) we can conclude that

$${{\mathbf{w}}} \propto \varvec{\Upphi}_{{\mathbf{xx}}}^{-1}{{\mathbf{a}}}_{1}.$$

Selecting the constant of proportionality to yield unit gain along \({{\mathbf{a}}}_{1}\) results in \(\alpha = ({{\mathbf{a}}}_1^{\rm T} {\varvec{\Upphi}_{{\mathbf{x}} {{\mathbf{x}}}}^{-1}}{{\mathbf{a}}}_{1})^{-1}\) and the resulting \({{\mathbf{w}}}\) is identical to the sBSS and MPDR solutions.
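The SNR optimality can be verified numerically when the covariances are known. The sketch below assumes an arbitrary full-rank \(\varvec{\Upphi}_{{\mathbf{v}}{\mathbf{v}}}\), an arbitrary reference power, and a reference uncorrelated with \({\mathbf{v}}(n)\), so that \(\varvec{\Upphi}_{{\mathbf{x}}{\mathbf{x}}} = \Upphi_{rr}{\mathbf{a}}_1{\mathbf{a}}_1^{\rm T} + \varvec{\Upphi}_{{\mathbf{v}}{\mathbf{v}}}\); all numerical values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(6)
M = 5
a1 = -np.ones(M)
B = rng.standard_normal((M, M))
Phi_vv = B @ B.T + 0.1 * np.eye(M)             # assumed full-rank covariance of v(n)
Phi_rr = 2.0                                   # assumed power of the reference r(n)
Phi_xx = Phi_rr * np.outer(a1, a1) + Phi_vv    # r(n) uncorrelated with v(n)

def snr(w):
    """Output SNR of Eq. (31) for a weight vector w."""
    return (w @ a1) ** 2 * Phi_rr / (w @ Phi_vv @ w)

w_opt = np.linalg.solve(Phi_xx, a1)            # direction Phi_xx^{-1} a_1
w_opt = w_opt / (a1 @ w_opt)                   # unit gain along a_1

# Compare against many random weight vectors
best_random = max(snr(rng.standard_normal(M)) for _ in range(1000))
snr_bound = Phi_rr * (a1 @ np.linalg.solve(Phi_vv, a1))
```

None of the random weight vectors exceeds `snr(w_opt)`, which coincides with the value \(\Upphi_{rr}\,{\mathbf{a}}_1^{\rm T}\varvec{\Upphi}_{{\mathbf{v}}{\mathbf{v}}}^{-1}{\mathbf{a}}_1\) stored in `snr_bound`. Note that the SNR is invariant to the scaling of \({\mathbf{w}}\); the unit-gain normalisation only fixes the amplitude of the estimate.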

2.3.4 Reduction to MVDR

Observe that maximising the SNR directly is equivalent to minimising the error term in Eq. (30), under the same unit gain constraint. The optimization problem is strictly similar to the one posed in (28):

$$\begin{aligned} \mathcal J_{{\mathbf{w}}}&= {\rm E} \left\{|{\mathbf{w}} ^{\rm T}{\mathbf{v}} (n)|^2\right\} + \lambda ({\mathbf{w}} ^{\rm T}{{\mathbf{a}}}_1-1)\\&= {\mathbf{w}} ^{\rm T}\varvec{\Upphi}_{{\mathbf{v}} {{\mathbf{v}}}}{{\mathbf{w}}} + \lambda ({\mathbf{w}} ^{\rm T}{{\mathbf{a}}}_1-1). \end{aligned}$$
(34)

Assuming that \(\varvec{\Upphi}_{{\mathbf{v}} {{\mathbf{v}}}}\) is full ranked, this will lead to:

$${{\mathbf{w}}} = \frac{\varvec{\Upphi}_{{\mathbf{v}} {{\mathbf{v}}}}^{-1}}{{\mathbf{a}}_1^{\rm T} \varvec{\Upphi}_{{\mathbf{v}} {{\mathbf{v}}}}^{-1}{{\mathbf{a}}}_{1}}{{\mathbf{a}}}_{1}.$$
(35)

This solution is known as the MVDR approach as it minimises the variance of the output signal about \(r(n)\).

To prove the equivalence between the MVDR solution and the previous approaches, we factorise \(\varvec{\Upphi}_{{\mathbf{x}} {{\mathbf{x}}}}\) as:

$$\varvec{\Upphi}_{{\mathbf{x}} {{\mathbf{x}}}} = \left({{\mathbf{a}}}_{1}{\mathbf{a}}_1^{\rm T}{\Upphi_{rr}} + \varvec{\Upphi}_{{\mathbf{v}} {{\mathbf{v}}}}\right)$$
(36)

Applying Woodbury’s identity [18] (and assuming \(\varvec{\Upphi}_{{\mathbf{v}} {{\mathbf{v}}}}\) is full-rank and invertible), \(\varvec{\Upphi}_{{\mathbf{x}} {{\mathbf{x}}}}^{-1}\) can be written as:

$$\varvec{\Upphi}_{{\mathbf{x}} {{\mathbf{x}}}}^{-1} = \varvec{\Upphi}_{{\mathbf{v}} {{\mathbf{v}}}}^{-1} - {\Upphi_{rr}}\frac{\varvec{\Upphi}_{{\mathbf{v}} {{\mathbf{v}}}}^{-1}{{{\mathbf{a}}}_{1}{{\mathbf{a}}}_{1}^{\rm T} \varvec{\Upphi}_{{\mathbf{v}} {{\mathbf{v}}}}^{-1}}}{1 + {\Upphi_{rr}}{\mathbf{a}}_1^{\rm T} \varvec{\Upphi}_{{\mathbf{v}} {{\mathbf{v}}}}^{-1}{{\mathbf{a}}}_{1}}.$$
(37)

Substituting this value of \(\varvec{\Upphi}_{{\mathbf{x}} {{\mathbf{x}}}}^{-1}\) in (29) and after some trivial algebraic manipulations, we obtain again the solution in (35).
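The manipulations can be sketched as follows, writing \(\beta = {\mathbf{a}}_1^{\rm T}\varvec{\Upphi}_{{\mathbf{v}} {{\mathbf{v}}}}^{-1}{\mathbf{a}}_1\) (a shorthand introduced here for brevity). Multiplying (37) from the right by \({\mathbf{a}}_1\) gives:

$$\varvec{\Upphi}_{{\mathbf{x}} {{\mathbf{x}}}}^{-1}{\mathbf{a}}_1 = \varvec{\Upphi}_{{\mathbf{v}} {{\mathbf{v}}}}^{-1}{\mathbf{a}}_1\left(1 - \frac{\Upphi_{rr}\beta}{1+\Upphi_{rr}\beta}\right) = \frac{\varvec{\Upphi}_{{\mathbf{v}} {{\mathbf{v}}}}^{-1}{\mathbf{a}}_1}{1+\Upphi_{rr}\beta},$$

and hence

$${\mathbf{a}}_1^{\rm T}\varvec{\Upphi}_{{\mathbf{x}} {{\mathbf{x}}}}^{-1}{\mathbf{a}}_1 = \frac{\beta}{1+\Upphi_{rr}\beta}.$$

The common factor \((1+\Upphi_{rr}\beta)^{-1}\) cancels in the ratio of the two expressions, leaving \(\varvec{\Upphi}_{{\mathbf{v}} {{\mathbf{v}}}}^{-1}{\mathbf{a}}_1/\beta\).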

Finally, if we return to the SNR maximisation problem (31) under the hypothesis that \(\varvec{\Upphi}_{{\mathbf{v}} {{\mathbf{v}}}}\) is invertible, we can directly write:

$$\text{SNR}\ \varvec{\Upphi}_{{\mathbf{v}} {{\mathbf{v}}}}{{\mathbf{w}}} = {\Upphi_{rr}}{{\mathbf{a}}}_{1}{{\mathbf{a}}}_{1}^{\rm T}{{\mathbf{w}}}.$$
(38)

Again, as \({\mathbf{a}}_1^{\rm T} {\mathbf{w}}\) is a scalar, (38) reduces to:

$$\frac{\text{ SNR}}{{\Upphi_{rr}}{\mathbf{a}}_1^{\rm T}{{\mathbf{w}}}}\varvec{\Upphi}_{{\mathbf{v}} {{\mathbf{v}}}}{{\mathbf{w}}} = {{\mathbf{a}}}_{1},$$
(39)

from which:

$${\mathbf{w}}= \frac{{\Upphi_{rr}}{\mathbf{a}}_1^{\rm T} {\mathbf{w}}}{\text{ SNR}}\varvec{\Upphi}_{{\mathbf{v}} {{\mathbf{v}}}}^{-1} {{\mathbf{a}}}_{1}$$
(40)
$${\mathbf{w}}= \alpha \varvec{\Upphi}_{{\mathbf{v}} {{\mathbf{v}}}}^{-1}{{\mathbf{a}}}_{1}$$
(41)

In other words, \({{\mathbf{w}}}\) is a scaled version of \(\varvec{\Upphi}_{{\mathbf{v}} {{\mathbf{v}}}}^{-1}{{\mathbf{a}}}_{1}\). To impose the distortionless constraint, we may again redefine the scale factor such that:

$${\mathbf{w}} ^{\rm T}{{\mathbf{a}}}_{1} = 1$$
(42)

yielding

$$\alpha = ({\mathbf{a}}_1^{\rm T} \varvec{\Upphi}_{{\mathbf{v}} {{\mathbf{v}}}}^{-1}{{\mathbf{a}}}_{1})^{-1}.$$
(43)

To conclude, if \(\varvec{\Upphi}_{{\mathbf{v}} {{\mathbf{v}}}}\) is also full-rank, the MVDR, MPDR and sBSS solutions are strictly equivalent. Nevertheless, in practical applications, \(\varvec{\Upphi}_{{\mathbf{v}} {{\mathbf{v}}}}\) is seldom known, while \(\varvec{\Upphi}_{{\mathbf{x}} {{\mathbf{x}}}}\) can be estimated from the data, so we will focus in the next sections on the solution based on \(\varvec{\Upphi}_{{\mathbf{x}} {{\mathbf{x}}}}\) (MPDR/sBSS) only.
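The equivalence of (29) and (35), and the Woodbury form (37), can be checked numerically. The covariance values below are arbitrary assumptions, and the helper name `distortionless` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(7)
M = 6
a1 = -np.ones(M)
B = rng.standard_normal((M, M))
Phi_vv = B @ B.T + 0.1 * np.eye(M)             # assumed full-rank covariance of v(n)
Phi_rr = 3.0                                   # assumed reference power
Phi_xx = Phi_rr * np.outer(a1, a1) + Phi_vv    # Eq. (36)

def distortionless(Phi, a):
    """Return Phi^{-1} a / (a^T Phi^{-1} a), the unit-gain filter for covariance Phi."""
    u = np.linalg.solve(Phi, a)
    return u / (a @ u)

w_mpdr = distortionless(Phi_xx, a1)            # Eq. (29)
w_mvdr = distortionless(Phi_vv, a1)            # Eq. (35)

# Woodbury form of Phi_xx^{-1}, Eq. (37)
Pv = np.linalg.inv(Phi_vv)
Phi_xx_inv = Pv - Phi_rr * (Pv @ np.outer(a1, a1) @ Pv) / (1 + Phi_rr * (a1 @ Pv @ a1))
```

Both filters agree to numerical precision when the exact covariances are used. With estimated covariances, the MPDR filter is the one that can actually be computed from the data.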

2.3.5 Source estimate

As seen previously, the linear combination permitting an optimal recovery of the unknown reference signal \(r(n)\) can be written as:

$${\mathbf{w}} = \frac{\varvec{\Upphi}_{{\mathbf{x}} {{\mathbf{x}}}}^{-1}}{{\mathbf{a}}_1^{\rm T} \varvec{\Upphi}_{{\mathbf{x}} {{\mathbf{x}}}}^{-1}{\mathbf{a}}_{1}}{{\mathbf{a}}}_{1}.$$
(44)

Applying this linear combination to \({\mathbf{x}} (n)\), we obtain the following estimate of the target signal:

$$\begin{aligned} \widehat{r}(n)&= {\mathbf{w}} ^{\rm T} {\mathbf{x}}(n)\\&= r(n) + \frac{{\mathbf{a}}_1^{\rm T} \varvec{\Upphi}_{{\mathbf{x}} {{\mathbf{x}}}}^{-1}{\mathbf{v}} (n)}{{{\mathbf{a}}_1^{\rm T} \varvec{\Upphi}_{{\mathbf{x}} {{\mathbf{x}}}}^{-1}}{{\mathbf{a}}}_{1}} \end{aligned}$$
(45)

One can easily prove that this estimate is identical to \(r(n)\) if the mixing is over- or well-determined (\({\mathbf{A}}\) of full column rank \(Q \le M\)).

Consider first the case of a square invertible matrix \({\mathbf{A}}\) (\(M \times M\)). This case also implies that \(\varvec{\Upphi}_{{\mathbf{x}} {{\mathbf{x}}}}\) is invertible and permits a factorisation of its inverse as:

$$\begin{aligned} \varvec{\Upphi}_{{\mathbf{x}}{{\mathbf{x}}}}^{\text{-1}}&= \left({\mathbf{A}} \varvec{\Upphi}_{\mathbf {ss}}{\mathbf{A}} ^{\rm T}\right)^{\text{-1}}\\&= {\mathbf{A}} ^{\rm {-T}} \varvec{\Upphi}_{{\mathbf{s}} {{\mathbf{s}}}}^{\text{-1}}{\mathbf{A}} ^{\text{-1}} \end{aligned}$$
(46)

The corresponding estimate \(\widehat{r}(n)\) is:

$$\begin{aligned}\widehat{r}(n)&= {\mathbf{w}} ^{\rm T}{\mathbf{x}} (n)\\&=\frac{{\mathbf{a}}_1^{\rm T} \varvec{\Upphi}_{{\mathbf{x}}{{\mathbf{x}}}}^{-1}}{{\mathbf{a}}_1^{\rm T}\varvec{\Upphi}_{{\mathbf{x}}{{\mathbf{x}}}}^{-1}{{\mathbf{a}}}_{1}}{\mathbf{A}}{\mathbf{s}} (n)\\&=\frac{{\mathbf{e}}_1^{\rm T}{\mathbf{A}}^{\rm T}\left({\mathbf{A}}^{-\rm T} \varvec{\Upphi}_{{\mathbf{s}}{{\mathbf{s}}}}^{-1}{\mathbf{A}}^{-1}\right)}{{\mathbf{e}} ^{\rm T}_1{\mathbf{A}} ^{\rm T}\left({\mathbf{A}} ^{-\rm T}\varvec{\Upphi}_{{\mathbf{s}}{{\mathbf{s}}}}^{-1}{\mathbf{A}}^{-1}\right){\mathbf{A}}{\mathbf{e}}_1}{\mathbf{A}}{\mathbf{s}} (n)\\&= \frac{{\mathbf{e}} ^{\rm T}_1\varvec{\Upphi}_{{\mathbf{s}} {{\mathbf{s}}}}^{-1}}{{\mathbf{e}}^{\rm T}_1\varvec{\Upphi}_{{\mathbf{s}}{{\mathbf{s}}}}^{-1}{\mathbf{e}}_1}{\mathbf{s}} (n)\\&= r(n)\end{aligned}$$
(47)
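The chain of equalities in (47) can be illustrated with a small numerical sketch. It assumes that the reference is the first source, that it is uncorrelated with the other sources (diagonal \(\varvec{\Upphi}_{{\mathbf{s}}{\mathbf{s}}}\) here, which is what makes the last step of (47) hold), and that the theoretical covariance is available. The mixing matrix and the source powers are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
Q = M = 4

# Hypothetical square mixing matrix; strictly diagonally dominant, hence invertible.
A = np.eye(M) + 0.2 * rng.uniform(-1.0, 1.0, size=(M, Q))

# Assumed source covariance: mutually uncorrelated sources, arbitrary powers.
phi_ss = np.diag([1.0, 0.5, 2.0, 4.0])
phi_xx = A @ phi_ss @ A.T

# sBSS/MPDR weights (44), using only the first column of A.
a1 = A[:, 0]
z = np.linalg.solve(phi_xx, a1)
w = z / (a1 @ z)

# Global response w^T A: it should equal e_1^T, i.e. only r(n) passes through.
response = w @ A
print(np.round(response, 12))

# Consequently the estimate equals the first source for any source matrix S.
S = rng.standard_normal((Q, 1000))
recovery_error = np.max(np.abs(w @ (A @ S) - S[0]))
print(f"largest recovery error: {recovery_error:.2e}")
```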

When \({\mathbf{A}}\) has rank lower than \(M\) (as in the case when \(M>Q\), for example), \(\varvec{\Upphi}_{{\mathbf{x}} {{\mathbf{x}}}}\) is not invertible and a dimension reduction step is necessary. Again, as in the case of the known matrix \({\mathbf{A}}\), principal component analysis (PCA) can be employed. The number of non-null eigenvalues of the covariance matrix \(\varvec{\Upphi}_{{\mathbf{x}} {{\mathbf{x}}}}\) indicates the new dimension of the system, equal to \(Q\). In theory, any well-conditioned linear transform \({\mathbf{P}}\) (\(Q \times M\)) applied to the measured signals will lead to similar results. Indeed, the original model (1) can be rewritten as:

$$\begin{aligned}{\mathbf{P}} {\mathbf{x}} (n) &= {\mathbf{P}} {\mathbf{A}} {\mathbf{s}} (n)\\ {\mathbf{x}}_{\mathbf P} (n) &= {\mathbf{A}}_{\mathbf P} {\mathbf{s}} (n),\end{aligned}$$
(48)

where \({\mathbf{x}}_{\mathbf P}\) and \({\mathbf{A}}_{\mathbf P}\) denote the observations and the mixing system obtained after applying the linear transform \({\mathbf{P}}\). The sBSS/MPDR approach can then be applied in this reduced, well-conditioned space to obtain a \(Q\)-dimensional weight vector \({\mathbf{w}}_{\mathbf{P}}\) as:

$${{\mathbf{w}}}_{{\mathbf{P}}} = \frac{\varvec{\Upphi}^{-1}_{{\mathbf{x}}_{\mathbf{P}}{\mathbf{x}}_{\mathbf{P}}}}{{\mathbf{a}} ^{\rm T}_{{\mathbf{P}} ,1}\varvec{\Upphi}^{-1}_{{\mathbf{x}}_{\mathbf{P}}{\mathbf{x}}_{\mathbf{P}}}{\mathbf{a}}_{{\mathbf{P}} ,1}}{\mathbf{a}}_{{\mathbf{P}} ,1}.$$
(49)

An analysis similar to (47) shows that \(\widehat{r}(n)=r(n)\), i.e., the estimation is perfect.
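A minimal sketch of this dimension reduction step is given below for an over-determined mixture. It assumes that \({\mathbf{P}}\) is built from the eigenvectors of \(\varvec{\Upphi}_{{\mathbf{x}}{\mathbf{x}}}\) associated with its non-null eigenvalues (one possible PCA choice), and that the sources are uncorrelated. The dimensions, the mixing matrix, the source powers and the eigenvalue threshold are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
M, Q = 6, 4  # more channels than sources (illustrative values)

# Hypothetical tall mixing matrix with full column rank by construction.
top = np.eye(Q) + 0.2 * rng.uniform(-1.0, 1.0, size=(Q, Q))
A = np.vstack([top, rng.uniform(-1.0, 1.0, size=(M - Q, Q))])

phi_ss = np.diag([1.0, 0.5, 2.0, 4.0])   # assumed uncorrelated sources
phi_xx = A @ phi_ss @ A.T                # rank Q < M, hence singular

# PCA: keep the eigenvectors associated with the non-null eigenvalues.
eigval, eigvec = np.linalg.eigh(phi_xx)  # eigenvalues in ascending order
n_nonnull = int(np.sum(eigval > 1e-10 * eigval.max()))
P = eigvec[:, -n_nonnull:].T             # (Q x M) reduction matrix

# Reduced model (48) and reduced weights (49).
A_P = P @ A
a_P1 = A_P[:, 0]
phi_P = P @ phi_xx @ P.T
z = np.linalg.solve(phi_P, a_P1)
w_P = z / (a_P1 @ z)

# Global response in the reduced space: expected to be e_1^T.
response = w_P @ A_P
print(n_nonnull, np.round(response, 10))
```

The estimate is then obtained as \({\mathbf{w}}_{\mathbf{P}}^{\rm T}{\mathbf{P}}{\mathbf{x}}(n)\), so the equivalent weights in the original measurement space are \({\mathbf{P}}^{\rm T}{\mathbf{w}}_{\mathbf{P}}\).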

Finally, when the mixing is under-determined (rank of \({\mathbf{A}} =M<Q\)), the estimated source will be equal to the original, plus some residue:

$$\begin{aligned} \widehat{r}(n)&= \frac{{\mathbf{a}}_1^{\rm T} \varvec{\Upphi}_{{\mathbf{x}} {{\mathbf{x}}}}^{-1}}{{\mathbf{a}}_1^{\rm T} \varvec{\Upphi}_{{\mathbf{x}} {{\mathbf{x}}}}^{-1}{{{\mathbf{a}}}_{1}}}{\mathbf{A}} {\mathbf s}(n)\\&= \frac{{\mathbf{a}}_1^{\rm T} \varvec{\Upphi}_{{\mathbf{x}} {{\mathbf{x}}}}^{-1}}{{\mathbf{a}}_1^{\rm T} \varvec{\Upphi}_{{\mathbf{x}} {{\mathbf{x}}}}^{-1}{{\mathbf{a}}}_{1}}{{\mathbf{a}}}_{1}r(n) + \frac{{\mathbf{e}}_1^{\rm T} {{\mathbf{A}} ^{\rm T}} \varvec{\Upphi}_{{\mathbf{x}} {{\mathbf{x}}}}^{-1}}{{\mathbf{a}}_1^{\rm T} \varvec{\Upphi}_{{\mathbf{x}} {{\mathbf{x}}}}^{-1}{{\mathbf{a}}}_{1}}{\mathbf{v}} (n)\\&= r(n) + {\widetilde{{\mathbf{v}}}}(n) \end{aligned}$$
(50)

An interesting point must be noted here: unlike (10), the weight vector obtained by sBSS/MPDR methods is the solution of a constrained optimization problem. Therefore, the residue \({\widetilde{{\mathbf{v}}}}(n)\) contains a weighted average of the remaining signals, the weighting being inversely proportional to the power of each source. Thus, powerful sources are more strongly suppressed than weaker ones. Such a weighting is advantageous and, therefore, for under-determined conditions, this method is recommended even when the mixing matrix \({\mathbf{A}}\) is completely known.
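The following sketch illustrates this point for an under-determined mixture. It assumes that the known-\({\mathbf{A}}\) estimate is obtained from the first row of the pseudo-inverse of \({\mathbf{A}}\) (which may differ in detail from (10)), that the sources are uncorrelated, and that the theoretical covariance is used. The mixing matrix and source powers are hypothetical. Under these assumptions, the correlation between \(r(n)\) and \({\mathbf{w}}^{\rm T}{\mathbf{x}}(n)\) is \({\mathbf{w}}^{\rm T}{\mathbf{a}}_1\sqrt{\Upphi_{rr}}/\sqrt{{\mathbf{w}}^{\rm T}\varvec{\Upphi}_{{\mathbf{x}}{\mathbf{x}}}{\mathbf{w}}}\), which the MPDR weights maximise over all linear estimators.

```python
import numpy as np

def theoretical_correlation(w, a1, phi_rr, phi_xx):
    """Correlation between r(n) and w^T x(n) under the model covariances."""
    return (w @ a1) * np.sqrt(phi_rr) / np.sqrt(w @ phi_xx @ w)

# Hypothetical full-row-rank mixing matrix with M = 3 channels, Q = 4 sources.
A = np.array([[ 0.9, -0.3,  0.5,  0.2],
              [ 0.4,  0.8, -0.6,  0.7],
              [-0.5,  0.1,  0.3, -0.9]])
powers = np.array([1.0, 1.0, 4.0, 9.0])  # hypothetical source powers
phi_xx = A @ np.diag(powers) @ A.T       # invertible (3 x 3)
a1 = A[:, 0]

# sBSS/MPDR weights (44): only a1 and Phi_xx are needed.
z = np.linalg.solve(phi_xx, a1)
w_mpdr = z / (a1 @ z)

# Pseudo-inverse weights: the whole matrix A is needed (assumed form of (10)).
w_pinv = np.linalg.pinv(A)[0]

rho_mpdr = theoretical_correlation(w_mpdr, a1, powers[0], phi_xx)
rho_pinv = theoretical_correlation(w_pinv, a1, powers[0], phi_xx)
residual_gains = w_mpdr @ A[:, 1:]       # gains applied to the other sources

print(f"correlation, sBSS/MPDR: {rho_mpdr:.3f}; pseudo-inverse: {rho_pinv:.3f}")
print("gains on the remaining sources:", np.round(residual_gains, 3))
```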

2.4 Experimental setup

We illustrate the reference estimation approach and the benefits of the zero-reference montage (ZR), obtained by eliminating the estimated reference signal from the CR montage, using both simulated signals and real depth EEG measurements, as described next.

2.4.1 Simulation

The aim of this section is to compare the sBSS/MPDR method to matrix inversion (if \({\mathbf{A}}\) is known) or to BSS estimation. The three estimates of \(r\) are \(\hat{r}_{\rm sBSS}\), \(\hat{r}_{A}\) and \(\hat{r}_{\rm BSS}\), respectively.Footnote 5 Two simulation setups are possible:

  1. Determined case \(M\ge Q\). We have grouped the over-determined (\(M>Q\)) and the well-determined (\(M=Q\)) cases together, as the first reduces to the second after dimension reduction. Therefore, we simulate in the sequel only the latter case, that is, a full-rank square matrix.

  2. Under-determined case \(M<Q\). An important particular case of under-determined mixture is the noisy case: indeed, when considering noisy measurements, well- or over-determined mixtures become under-determined, as noise can be considered as a source. This general formulation makes it possible to consider independent noise on every channel (in which case the extra columns of the mixing matrix corresponding to noise sources are zero, except for one element) or spatially correlated noise (arbitrary supplementary columns in the mixing matrix).

We have considered, for the simulation, four sources (\(Q=4\)), with the setup presented in Table 1 (see also Fig. 1a).

Table 1 Setup for the simulations
Fig. 1 Simulated sources and under-determined mixture (\(M=Q-1\))

As seen in the table, only one source is Gaussian, in order to respect the basic hypothesis of independent component analysis (ICA) based blind source separation. Indeed, taking two or more Gaussian sources prevents ICA from succeeding, because the higher-order cumulants of Gaussian variables vanish.Footnote 6

The powers of the sources are indicated relative to the first (target) one. The mixing matrix \({\mathbf{A}}\) was randomly generated (uniform distribution in \([-1, 1]\), in order to simulate a dissipative propagation medium and dipolar-like sources). We have considered 1,000 mixing matrices, the results presented here being the mean values over all simulations.

The considered performance criterion was the correlation coefficient between the target source and its estimate obtained by the three tested approaches (matrix inversion, BSS and sBSS/MPDR). Although mean square error can also be considered for matrix inversion and sBSS/MPDR, it penalizes BSS approaches, as they are unable to estimate correct amplitudes.
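The reason for preferring the correlation coefficient can be illustrated with a short sketch: an estimate affected by an arbitrary scale (and sign), as produced by BSS, keeps the same absolute correlation with the target but has a very different mean square error. The signals, the scale factor and the use of the absolute value to absorb the sign ambiguity are illustrative assumptions.

```python
import numpy as np

def abs_corr(u, v):
    """Absolute value of the correlation coefficient between two signals."""
    return abs(np.corrcoef(u, v)[0, 1])

def mse(u, v):
    """Mean square error between two signals."""
    return np.mean((u - v) ** 2)

rng = np.random.default_rng(5)
N = 2000
r = rng.standard_normal(N)                    # hypothetical target source
r_hat = r + 0.3 * rng.standard_normal(N)      # estimate with correct amplitude
r_hat_scaled = -2.5 * r_hat                   # same estimate, arbitrary scale and sign

print(f"|corr|: {abs_corr(r, r_hat):.4f} vs {abs_corr(r, r_hat_scaled):.4f}")
print(f"MSE:    {mse(r, r_hat):.4f} vs {mse(r, r_hat_scaled):.4f}")
```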

2.4.2 Intra-cerebral EEG recordings

Intra-cerebral EEG recordings are acquired from multi-contact depth electrodes implanted in the brain in order to localise the epileptogenic zone (see Fig. 2 for an example). The reference is placed sufficiently far away from this zone, so it can be considered as independent, although unknown and different from 0. The depth electrodes might have from 10 to 15 contacts each, with a 2 mm distance between them. The total number of acquired signals is around 100, depending on the number of implanted electrodes per patient. The unknown reference signal contributes to all recordings according to model (2).

According to this description of the recording setup, one might choose to estimate the reference either using all the recorded signals or after some dimension reduction. We will not discuss here the different choices and their influence on the quality of the estimation from a medical interpretation point of view; this analysis will be presented elsewhere. We will only focus on three examples of reference estimation and elimination considering a subset of the recorded signals. The corrected montages obtained in this way will be called zero-referenced (ZR).

The considered signals were obtained from three patients diagnosed with temporal lobe epilepsy at the University Hospital (CHU) of Nancy, France. Each patient gave informed consent and the study was approved by the ethics committee of the hospital.

Fig. 2 Depth EEG implantation example. a Implantation scheme (the electrode insertion points are superimposed on the MRI image in a sagittal view). b Axial view at the level of a horizontally inserted multi-contact depth electrode (left side up)

3 Results

3.1 Simulated signals

3.1.1 Determined case \(M=Q\)

In this simple case, all approaches should be essentially equivalent and the solutions should be ideal (assuming that the independence condition necessary for BSS approaches is respected). Indeed, extensive numerical simulations confirm this expectation: matrix inversion leads to perfect reconstruction (correlation \({\approx}1\)), while the BSS and sBSS/MPDR solutions are very close to each other and to the ideal solution (mean correlation coefficients >0.99).

3.1.2 Under-determined case \(M<Q\)

A more crucial case arises when \(M<Q\). In such a case, the MPDR solution weights the residual sources by the inverse of their power, thus giving lower weights to the more powerful sources. We therefore expect the data-adaptive MPDR solution to perform better than the solution in (10) applied alone.

This is presented in the simulation below, where we considered only three measuring channels for the four sources from Table 1 (\(M=3\), \(Q=4\), see an example of mixed signals in Fig. 1b). An example of source estimate \(\widehat{r}(n)\) is presented in Fig. 3.
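A simulation of this kind can be sketched as follows. This is not the code used for the reported results: the four sources (one Gaussian target and three non-Gaussian signals with equal powers), the number of samples and the number of trials are hypothetical stand-ins for the setup of Table 1, the BSS estimate is omitted, and the known-\({\mathbf{A}}\) estimate is assumed to be the first row of the pseudo-inverse of \({\mathbf{A}}\).

```python
import numpy as np

rng = np.random.default_rng(6)
N, M, Q, n_trials = 2000, 3, 4, 200   # samples, channels, sources, mixing matrices

# Hypothetical sources: only the first (target) one is Gaussian.
t = np.arange(N)
S = np.vstack([
    rng.standard_normal(N),            # target r(n)
    np.sin(2 * np.pi * 0.01 * t),      # sinusoid
    rng.uniform(-1.0, 1.0, size=N),    # uniform noise
    rng.laplace(size=N),               # Laplacian noise
])
S -= S.mean(axis=1, keepdims=True)
S /= S.std(axis=1, keepdims=True)      # equal (unit) powers, an assumption
r = S[0]

corr_mpdr = np.empty(n_trials)
corr_pinv = np.empty(n_trials)
for k in range(n_trials):
    A = rng.uniform(-1.0, 1.0, size=(M, Q))   # random mixing, uniform in [-1, 1]
    X = A @ S                                 # under-determined mixture
    phi_xx = X @ X.T / N                      # sample covariance of the measurements
    a1 = A[:, 0]
    z = np.linalg.solve(phi_xx, a1)
    w = z / (a1 @ z)                          # sBSS/MPDR weights (44)
    corr_mpdr[k] = np.corrcoef(r, w @ X)[0, 1]
    corr_pinv[k] = np.corrcoef(r, np.linalg.pinv(A)[0] @ X)[0, 1]

print(f"sBSS/MPDR:      mean {corr_mpdr.mean():.3f}, std {corr_mpdr.std():.3f}")
print(f"pseudo-inverse: mean {corr_pinv.mean():.3f}, std {corr_pinv.std():.3f}")
```

The numerical values produced by this sketch depend on the assumed sources and should not be compared directly with those of Table 2.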

Fig. 3 Original (\(r\), dashed line) and estimated reference signals using the different approaches when \(M<Q\). \(\widehat{r}_{\mathbf A}\) is the estimate when using complete knowledge of \({\mathbf{A}}\) (correlation coefficient = 0.83), \(\widehat{r}_{\rm BSS}\) is the solution using FastICA (correlation = 0.89) and \(\widehat{r}_{\rm sBSS}\) is the estimate using the sBSS/MPDR approach (correlation = 0.91)

Mean and standard deviation values of the correlation coefficient between \(r\) and its estimates \(\hat{r}_{A}\), \(\hat{r}_{\rm BSS}\) and \(\hat{r}_{\rm sBSS}\) (1,000 simulations) are given in Table 2. The distribution of the correlation coefficient over all simulations is presented in the box-plots of Fig. 4.

Table 2 Performance evaluation using the correlation coefficient between the simulated original source \(r\) and different estimates (mean values over 1,000 simulations) for the under-determined case
Fig. 4 Distribution of the correlation coefficients over 1,000 simulations. As can be seen, the proposed sBSS method surpasses both matrix inversion and classical BSS, in terms of mean value as well as robustness to the mixing characteristics

3.2 Real depth EEG signals

3.2.1 Noisy ictal iEEG

The analyzed time window was 20 s long, recorded at a sampling frequency of 512 Hz. The signals are measured relative to a reference contact placed in the skull bone (one of the exterior recording contacts of one depth electrode). The complete depth EEG has 112 common-reference channels.

The considered subset of signals is recorded by electrode \(OT\), implanted in the right median and lateral occipital lobe below the calcarine sulcus. We chose this electrode because it was the first to be involved in the epileptic discharge in this case. It had 12 measuring contacts inside the brain (\(OT_1\) to \(OT_{12}\), \(OT_1\) being the deepest).

Figure 5 shows an example of the clinical use of the estimated zero-reference montage (ZR). After an initial fast low-voltage activity starting from second 4, the discharge appears as a rhythmic activity in the theta band (4–8 Hz) from second 10, on the lateral contacts \(OT_7\) to \(OT_{10}\). We used the proposed method to estimate the reference signal and to correct the original acquisition montage. As can be seen in Fig. 5a, in the original common reference montage (CR) all signals had a rather noisy appearance, while in the corrected zero-reference montage (ZR, Fig. 5b) the obtained signals were much cleaner. This noise is due to unexpected electrical noise appearing on the reference channel.

Fig. 5 Depth ictal EEG example using different montages (20 s). Shaded column on the right approximately represents the brain structures explored by the \(OT\) electrode (according to the patient scanner). One can notice the elimination of the reference noise in b compared with a, while amplitude and topography information is preserved better than in c

To illustrate the clinical use of a zero-referenced montage, we compare it to the usual bipolar montage (BL), routinely employed for iEEG interpretation (Fig. 5c). Clearly, the BL montage eliminates the reference artefact and it provides a local view of the brain activity. On the other hand, it loses the amplitude and propagation information, preserved on the (zero)-referenced signals.
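The relation between the three montages can be sketched as follows. The code assumes the sign convention \(x_i(n) = v_i(n) - r(n)\) for the common reference recordings (i.e. \({\mathbf{a}}_1\) is a vector of \(-1\)), which may differ from the convention of model (2), and it uses synthetic signals with the true reference in place of its estimate. Function names are hypothetical.

```python
import numpy as np

def zero_reference(X_cr, r_hat, a1):
    """ZR montage: remove the reference contribution a1 * r_hat from the CR signals."""
    return X_cr - np.outer(a1, r_hat)

def bipolar(X):
    """BL montage: differences between neighbouring contacts (row i minus row i+1)."""
    return X[:-1] - X[1:]

rng = np.random.default_rng(7)
n_contacts, N = 12, 1000
V = rng.standard_normal((n_contacts, N))   # hypothetical reference-free potentials
r = rng.standard_normal(N)                 # hypothetical reference signal
a1 = -np.ones(n_contacts)                  # assumed convention: x_i = v_i - r
X_cr = V + np.outer(a1, r)                 # common reference (CR) montage

X_zr = zero_reference(X_cr, r, a1)         # ideal correction (true r used as estimate)
X_bl = bipolar(X_cr)                       # one signal fewer than contacts

print(np.allclose(X_zr, V))                # ZR recovers the referenced potentials
print(np.allclose(X_bl, bipolar(V)))       # BL cancels the reference but keeps only differences
```

The bipolar montage removes the reference exactly, whatever its value, but returns only local differences; the ZR montage keeps one signal per contact, at the price of depending on the quality of the reference estimate.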

The spatial decrease of the signal amplitude is significantly different for the corrected ZR montage compared to the bipolar montage. In Fig. 6, the CR and ZR montages clearly indicate a gradual decrease of amplitude roughly matching the separation between grey matter (containing the electrical sources) and white matter (electrically inactive). Maximum amplitude is noticed on contact \(OT_8\) with these montages. In contrast, the bipolar montage shows maximum amplitude on \(OT_9-OT_{10}\) and a decrease of amplitude on \(OT_8-OT_9\). Because of the low signal, this local minimum could be falsely interpreted as an electrically inactive area (white matter).

Fig. 6 Normalised power of the signals recorded by the different contacts of the \(OT\) electrode. The corresponding powers of the signals issued from the BL montage are represented in intermediate positions among the contacts. Shaded row on the bottom of the figure approximately represents the brain structures explored by the electrode

Concerning the electrical diffusion of the potentials inside the brain, visual analysis of the ZR montage allows identifying a clear electrical propagation in the white matter (contacts 4–6) from generators located in the grey matter (contacts 7–10). This diffusion, absent on the BL montage, might be useful to better estimate the location and the orientation of the neural generator.

3.2.2 Artefacted spontaneous iEEG

In the previous example, all signals came from a single implanted depth electrode with 12 contacts. In the general case, this is not the most favourable situation, as the signals might be highly correlated and thus the covariance matrix \(\varvec{\Upphi}_{{\mathbf{x}} {{\mathbf{x}}}}\) might be badly conditioned (i.e. numerically difficult to invert in, for example, Eq. 29). The second example we present here uses one contact per implanted depth electrode (7 electrodes, thus 7 contacts in all), measured with respect to a scalp reference placed in the FPz position according to the 10–20 system. The signals are 5 s long and are sampled at 512 Hz. The right temporal lobe was implanted with depth electrodes, from the anterior to the posterior part, in order to delineate the epileptogenic zone. The anatomical structures explored by the considered contacts are the insula (\(T_1\) and \(H_1\)), the entorhinal cortex (\(TB_1\)), the hippocampus (\(B_1\) and \(C_1\)), the temporal pole (\(P_1\)) and the amygdala (\(A_1\)). The raw signals are presented in Fig. 7a.

As seen in Fig. 7a, all signals are perturbed by additive noise and artefacts, very likely affecting the reference electrode. Both noise and artefacts disappear, as expected, when using the bipolar montage obtained by subtracting neighbouring contacts from the same depth electrode, see Fig. 7b. For example, the peak appearing around second 4 on the common reference montage is completely removed. On the other hand, this visualization also drastically reduces the amplitude of some patterns not present on the reference, but still appearing on several contacts (positive peak after second 1). For the corrected ZR montage, this pattern is preserved and clearly identified on \(TB_1\), \(P_1\) and \(A_1\), which are implanted in neighbouring and connected regions of the brain. Roughly,Footnote 7 this can be evaluated by computing the correlation coefficients \(\rho\) between the involved signals: \(\rho_{TB_1,P_1}=0.69\), \(\rho_{TB_1,A_1}=0.84\) and \(\rho_{P_1,A_1}=0.80\). These correlations are significantly higher than all the other correlation values among electrodes: the next value equals 0.46 between contacts \(A_1\) and \(C_1\), still situated in closely connected regions (amygdala and hippocampus). The relations between these signals (and presumably between the corresponding brain areas, anatomically connected in the human brain) are masked on the CR montage (more than half of the correlations are >0.8 because of the reference artefact) and they are reduced on the BL montage because of the elimination of the activity appearing simultaneously on two neighbouring contacts (\(\rho_{TB_1-TB_2,P_1-P_2}=0.60\), \(\rho_{TB_1-TB_2,A_1-A_2}=0.32\) and \(\rho_{A_1-A_2,P_1-P_2}=0.04\)).
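The masking effect of a common reference on inter-channel correlations can be reproduced on synthetic data. In the sketch below, the seven channel activities are mutually independent, the reference artefact is three times larger in amplitude, and the sign convention \(x_i(n) = v_i(n) - r(n)\) is assumed; all these choices are hypothetical and unrelated to the patient data.

```python
import numpy as np

def offdiag_abs_corr(X):
    """Absolute correlation coefficients between all distinct pairs of channels."""
    C = np.corrcoef(X)
    rows, cols = np.triu_indices_from(C, k=1)
    return np.abs(C[rows, cols])

rng = np.random.default_rng(8)
n_channels, N = 7, 5 * 512                 # 7 contacts, 5 s at 512 Hz
V = rng.standard_normal((n_channels, N))   # hypothetical independent activities
r = 3.0 * rng.standard_normal(N)           # hypothetical strong reference artefact

X_cr = V - r                               # assumed convention: x_i = v_i - r
X_zr = X_cr + r                            # ideal correction with the true reference

med_cr = np.median(offdiag_abs_corr(X_cr))
med_zr = np.median(offdiag_abs_corr(X_zr))
print(f"median |correlation|, CR: {med_cr:.2f}; ZR: {med_zr:.2f}")
```

With these settings the common reference alone pushes most pairwise correlations close to 0.9, although the underlying activities are independent.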

Fig. 7 Depth EEG obtained using one contact for each electrode (two neighbouring contacts for the BL montage). The signals are ordered by anatomical structure: insula (\(T_1\) and \(H_1\)), entorhinal cortex (\(TB_1\)), hippocampus (\(B_1\) and \(C_1\)), temporal pole (\(P_1\)) and amygdala (\(A_1\))

3.2.3 Interictal spikes

A last example is presented for the enhancement of interictal spikes. These pathological EEG patterns appear between seizures in epileptic patients; they are markers of the epileptic disease and have a characteristic morphology. They are usually present on several recorded signals, thus increasing the correlation between them. Visual analysis of these patterns helps neurologists to localize malfunctioning regions in the brain. In classical iEEG analysis, a spike changing its polarity on two neighbouring signals of a bipolar montage indicates that the generator is situated close to the common contact (see the example in Fig. 8b, signals \(A_{10}-A_{11}\) and \(A_{11}-A_{12}\)). On the other hand, the BL montage might also diminish the amplitude of a spike, as for example on signals \(TB_9-TB_{10}\) and \(TB_{10}-TB_{11}\). This effect of the BL montage is corrected on the ZR montage from Fig. 8c, where the spikes are preserved and can clearly be distinguished from the background activity (although they are almost masked on the original montage, perturbed by the reference artefact, see Fig. 8a).

Fig. 8 Depth EEG obtained using three contacts for three electrodes, all placed in the external right temporal lobe (four neighbouring contacts per electrode for the BL montage). The enhancement of the interictal spikes is delineated by the dotted lines

4 Discussion

The main objective of this work was to revisit the reference problem in iEEG signal processing and present a unified multichannel signal processing framework. Generalizing previous approaches presented in the literature, we have shown that, under certain realistic hypotheses, the reference signal can be estimated from the measurements using an sBSS approach, based on partial knowledge of the mixing model.

We have furthermore shown that the sBSS approach is strictly equivalent to the MPDR filter and that it achieves optimal performance in terms of SNR. The developed sBSS/MPDR algorithm was compared with completely blind approaches (i.e. considering that the mixing is completely unknown) and with direct matrix (pseudo-)inversion (i.e. considering that the mixing is completely known). We have shown, using different measuring setups (well-, over- and under-determined mixtures), that our method yields comparable or better results than both classical blind source separation and matrix inversion.

The main benefit of the reference estimation is the construction of a corrected reference-free montage ZR, which can help both clinical interpretation and further automatic EEG analysis. Possible applications (see the first and second examples in Sect. 3.2) are the direct clinical analysis of ictal, interictal and background iEEG without transforming the data into a bipolar montage, thus offering a complementary view of the brain electrical activity. A promising research direction is intra-cerebral source localization using iEEG measurements, potentially allowing the localization of sources which are not situated in structures implanted with iEEG electrodes (see for example the interictal spikes, enhanced in the third example). The usefulness of our method should be validated further for evoked potential (EP) analysis: by construction, it makes no hypothesis on the morphology of the informative signals, so it should also accurately correct the reference for EP recordings. On the other hand, as EPs are mostly studied after averaging over several trials, the reference influence is diminished anyway and it probably disappears if the number of trials is sufficiently large.

As a final note, we would like to add that we have been able to prove in [12] that even the methods of [5, 7] converge to the MPDR solution. Thus, the state-of-the-art approaches for reference estimation all fall within our proposed framework.

5 Conclusion

We present a novel and rigorous methodology for the reference estimation problem in depth EEG signal processing. We prove that this method is optimal in terms of SNR maximisation and demonstrate, further, that it also encompasses existing approaches. The practical usefulness of the proposed approach was illustrated on simulated and real iEEG recordings taken in different situations (background, interictal spikes, epileptic seizure).