Introduction

Electroencephalography (EEG) records the dynamics of brain networks on a sub-second time scale. The high temporal resolution of EEG makes it possible to study how brain activity propagates and interacts in large-scale networks by applying connectivity measures to the recorded signals. However, connectivity measures based on scalp electrode measurements (sensor space) do not reveal the true interactions among brain sources. Neighbouring electrodes measure highly correlated signals, which leads connectivity algorithms to estimate spurious links. Indeed, the voltage potentials measured at various locations on the scalp result from the simultaneous activity of many different configurations of distributed current generators in the brain (De Munck et al. 1988; Van de Steen et al. 2016; Haufe et al. 2013; Brunner et al. 2016). To obtain physiologically plausible results, brain source activity must therefore be reconstructed before connectivity is computed. However, the underlying brain source activity cannot be estimated uniquely from the scalp data without invoking priors or constraints on the inverse solution. Approaches to functional connectivity analysis in the source space fall into two main groups (Barzegaran and Knyazeva 2017). One group of methods employs neuronal models of interacting brain regions, i.e., dynamic causal models (DCM), as priors which are added to the spatial forward model to reconstruct the scalp EEG data. This requires reasonably realistic assumptions about the source dynamics (Kiebel et al. 2006; Daunizeau and Friston 2007). Provided that the model assumptions are physiologically meaningful, the DCM approach makes it possible to infer not only the source dynamics but also the coupling parameters shaping interactions among sources (Daunizeau and Friston 2007). The other group of methods makes no assumptions about the network structure and is characterized by a two-step procedure.
First, the scalp data are inverted to the source space using distributed source models, and then functional connectivity measures are applied to the estimated sources. A priori assumptions have to be introduced to solve the ill-posed inverse problem. For instance, Local AUtoRegressive Average (LAURA), the distributed linear inverse solution used here, incorporates biophysical laws into the minimum norm solution (De Peralta Menendez et al. 2004). By incorporating such priors, the distribution of the simultaneously active sources at each moment in time can be estimated from the high-density EEG scalp potentials, informed by the individual anatomy derived from magnetic resonance imaging (MRI) and by realistic volume conduction physics (Michel and He 2018, 2012; Michel et al. 2004; Michel and Murray 2012; Grech et al. 2008). The estimated activity at each solution point in the brain is described by a three-dimensional dipole (x, y, z). After the dipole activity has been estimated at each solution point, the brain is usually parcellated into regions before connectivity estimation, because the full spatial size of the data (more than 5000 solution points) is impractical in terms of computation and statistical power. The choice of the parcellation scheme and resolution is crucial because it affects the topological characteristics of the network. It depends on the type, quality and resolution of the data and on the purpose of the study, and can be based on either anatomical or functional assumptions (Reus and Van den Heuvel 2013). The most commonly used anatomy-based parcellation atlases include the automated anatomical labeling (AAL) atlas (Tzourio-Mazoyer 2002; Evans et al. 2012) and FreeSurfer’s Desikan-Killiany atlas (Desikan et al. 2006; Fischl et al. 2004). After parcellation, it is possible to build a graph representation of the brain (Rubinov and Sporns 2010) in which nodes are associated with the brain regions of interest (ROIs) and edge weights are given by functional (Nolte et al. 2004; Stam et al. 2007; Ioannides et al. 2000) or effective (Baccalà and Koichi 2014; Wibral et al. 2014) connectivity measures that are robust to volume conduction effects. To estimate either directed or undirected connectivity, all the solution points estimated in each ROI need to be summarized in a single time-series. The approaches proposed in the literature usually consist of two steps. In the first step, for each dipole, either the norm is computed or the direction of the dipole is fixed using one of several techniques. One approach is to compute the norm (i.e., the absolute dipole amplitude, discarding the orientation of the dipoles) or the power modulation using the Hilbert transform (Baker et al. 2014; Brookes et al. 2011). This, however, may be problematic for connectivity estimation because the phase information contained in the original signal is lost (Vidaurre et al. 2016). Other current methods to fix the dipole orientation within a ROI are the projection onto the refined average direction across time and epochs (Coito et al. 2016); the selection of the dipole orientation orthogonal to the segmented grey matter, based on the assumption that the orientation of the dipoles should resemble the orientation of the apical dendrites of the pyramidal neurons (Phillips et al. 2002); or the selection of the orientation maximizing the projected power (Barnes et al. 2004). The second step consists of either averaging all dipole time-series within the ROI once the dipole orientation is fixed (Hassan et al. 2017) or applying principal component analysis (PCA) to obtain the representative time-series (Supp et al. 2007; Gruber et al. 2008).

Other popular one-step solutions are either to compute the average cortical activity in each ROI as the instantaneous average of the signed magnitude of all the dipoles within the ROI (Astolfi et al. 2007) or to consider only the source activity of the solution point closest to the geometric center of each ROI, i.e., the centroid, as the representative source waveform (Coito et al. 2015; Sperdin et al. 2018; Canuet et al. 2011; Adebimpe et al. 2016). However, a single dipole selected out of hundreds does not necessarily represent the activity in a given ROI well. As for the averaging approach, a common observation is a drastic amplitude reduction. Indeed, due to the extensive folding of the human cerebral cortex, some sources in the ROI may be almost perfectly parallel to each other but inverted in orientation, leading to cancellation when they are averaged. The resulting reduction in signal amplitude could decrease the accuracy of the subsequent analysis and affect the final results. For instance, connectivity estimation involves computing the inverse of the matrix containing the representative source waveforms, which, if the values are small, may lead to an ill-conditioned matrix with a high condition number (Cline et al. 1979), i.e., even a small error in the data can produce a large deviation in the solution. Moreover, low-amplitude time-series may increase the rate of false positive connections, e.g., low-amplitude time-series may easily fit into large-amplitude time-series, leading to misleadingly high autoregressive coefficients (van Mierlo et al. 2018).

In this work, we propose to extract the dominant signal reflecting the main pattern of variation of all the solution points in the same ROI by using singular-value decomposition (SVD) and considering the first singular vector. This method makes it possible both to identify the main direction of all the dipoles of a ROI and to discard the contribution of outlier dipoles. The novelty with respect to the other approaches proposed in the literature is that SVD provides a population signal that incorporates the behavior of all the dipoles within the ROI without selecting specific active voxels, as is usually done (Barnes et al. 2004; Supp et al. 2007; Gruber et al. 2008; Zhou et al. 2009).

The evaluation of the performance of a functional connectivity analysis method on real data sets is difficult because an objective ground truth is usually not available. Two types of data that are often used for method evaluation are interictal spikes of patients with focal epilepsy, where the focus localization is known from intracranial recordings or from successful surgery (Brodbeck et al. 2011; Megevand et al. 2014), and evoked potentials, where the generation and the propagation of information in the brain is well understood (Kiebel et al. 2006). Here, we used these two types of datasets to demonstrate the validity of the proposed method. We analyzed spike data of patients with focal epilepsy who had a positive surgical outcome after 12 months. We expected our method to correctly identify the epileptic focus (the ROI lying within the resected zone) as the major driver of the epileptic network. For the evoked potentials, we analyzed a dataset of visual evoked potentials (VEP) recorded after presentation of face stimuli. We expected that the proposed method would reliably identify the main components of the VEP, i.e., the P100 and the N170, in the source space, and that the major driver of the network would be localized to the lateral, basal temporal and occipital cortices including the fusiform gyrus, as shown through EEG changes from implanted electrodes (Miller et al. 2017; Baroni et al. 2017; Hamamé et al. 2014). In addition, we evaluated the performance of the proposed method on realistically simulated data, where the ground truth was known. We also compared the method with two common procedures: extracting the time-series of the centroid in each ROI and extracting the time-series with the highest power in each ROI.

Methods

Data Description

Dataset 1: Visual Evoked Potential of Face Perception

Many behavioral studies have investigated the processing of visual stimuli such as face images (Haxby et al. 2000, 1999; Hoffman and Haxby 2000). Traditional measures are based on the N170 face-sensitive evoked response component (Rossion and Caharel 2011). Human faces evoke a large negative potential (N170) over the occipital-parietal scalp, more prominent over the right than the left hemisphere, which is reduced in evoked potentials elicited by other animate and inanimate non-face stimuli (Bentin et al. 1996). Applying effective connectivity to face perception, i.e., describing the network of directional effects of one brain region on another, may be a powerful instrument to study this visual process. To study these causal effects, it is important to reconstruct the response to the face stimulus precisely in the source space. For this reason, we investigated the ability of our method to reconstruct the dynamics of the visual evoked potential (VEP) in source space based on high-density EEG (hdEEG) data.

Participants (N = 13, two males, age = 24.15 ± 3.41) sat in a dimly lit, sound-attenuated and electrically shielded room with their head positioned on a chinrest at ~ 70 cm from the monitor. Each trial lasted 1.2 s and started with a blank screen lasting 500 ms. After the blank interval, one image (either a face or a scrambled image) was presented for 200 ms and participants had the remaining 1000 ms to respond. The task was to report whether they saw a face or not (yes/no task) by pressing one of two buttons on a response box. Faces and scrambled images were randomly interleaved across trials. After the participant’s response, there was a random interval (from 600 to 900 ms) before the beginning of a new trial.

The experiment consisted of four blocks of 150 trials each, for a total of 600 trials, i.e., 300 with faces and 300 with scrambled images (Ales et al. 2012). For this study, we used the EEG data recorded in response to the face images (300 trials per subject).

During the experiment, EEG data were recorded continuously at 1024 Hz through a 128-channel Biosemi Active Two EEG system (Biosemi, Amsterdam, The Netherlands). Electrode impedance was kept < 20 kΩ.

Dataset 2: Interictal Epileptiform Discharges in Focal Epilepsy

hdEEG source imaging plays a central role in diagnosis and management of patients with focal epilepsy (Brodbeck et al. 2011). However, recent work in the literature provided evidence that epilepsy is a disorder affecting neural networks (Sheybani et al. 2018). Thus, connectivity measures and graph analysis are promising tools to extract network information from both hdEEG and neuroimaging data (Van Diessen et al. 2014; Engel et al. 2013; Richardson 2012). We applied our method on pre-surgical interictal spikes (IEDs) recorded in patients with pharmaco-resistant focal epilepsy, who subsequently underwent epilepsy surgery.

The patients (N = 7, three males, age = 23 ± 14 years) were selected from those admitted for pre-surgical evaluation to the EEG & Epilepsy Unit, Department of Clinical Neurosciences, University Hospital of Geneva (HUG), Switzerland. They underwent hdEEG long-term (> 4 h) recording with 256 electrodes in the context of their pre-surgical evaluation, and subsequently underwent resection of the estimated epileptogenic zone causing their focal epilepsy. The outcomes of the surgery after 12 months along with the exact location of the resection zone were available from postoperative structural MRI and were used as validation for the localization of the generators of the interictal epileptic discharges.

The hdEEG was recorded with the Geodesic Sensor Net with 256 electrodes (Electrical Geodesics, Inc., Eugene, OR, U.S.A.). Electrode-skin impedances were maintained < 15 kΩ. The recordings were sampled at 1 kHz and referenced to Cz. An epileptologist, G.T., then marked 41 ± 18 hdEEG epochs containing interictal spikes for each patient. The 1-s hdEEG epochs centered on the spike peak were used as input to the analysis.

Preprocessing

The VEP EEG signals were downsampled to fs = 200 Hz and detrended to remove slow fluctuations and linear trends (Bigdely-Shamlo et al. 2015). The line and monitor noise (50 and 75 Hz, plus harmonics) were attenuated with an adaptive multitaper filter (Cleanline plugin for EEGLAB). EEG epochs were then extracted from the continuous dataset and time-locked from −1000 to 1000 ms relative to the onset of each image. Noisy channels were identified by visual inspection and removed before preprocessing. Individual epochs containing non-stereotyped artifacts, peri-stimulus eye blinks and eye movements (occurring within ± 500 ms from stimulus onset) were also identified by visual inspection and removed from further analysis (mean number of epochs removed across participants: 6 ± 5). The remaining physiological artifacts (eye blinks, horizontal and vertical eye movements, muscle potentials and other artifacts) were removed from the data through a PCA-informed ICA algorithm implemented in EEGLAB. After ICA cleaning, the channels previously identified as noisy were interpolated using the nearest-neighbor spline method and the data were re-referenced to the average reference.

The EEG data containing the spikes of the epileptic patients were band-pass filtered between 0.5 and 40 Hz with a 5th-order Butterworth filter applied so as to avoid phase distortion. Finally, the data were downsampled to fs = 250 Hz.
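As an illustration of this filtering step, the following minimal Python sketch band-pass filters and downsamples multichannel data. It is not the pipeline used in this study: it assumes that "avoiding phase distortion" is achieved by forward-backward (zero-phase) filtering, and the function name `bandpass_and_downsample` and its parameters are hypothetical. Note that forward-backward filtering doubles the effective filter order.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, resample_poly


def bandpass_and_downsample(eeg, fs_in=1000, fs_out=250, band=(0.5, 40.0), order=5):
    """Zero-phase band-pass filter followed by downsampling.

    eeg : array [n_channels x n_samples] sampled at fs_in (Hz).
    Assumption: phase distortion is avoided by filtering forward and backward.
    """
    # Second-order sections are numerically safer than (b, a) coefficients
    # for a low cut-off frequency such as 0.5 Hz at a 1 kHz sampling rate.
    sos = butter(order, band, btype="bandpass", fs=fs_in, output="sos")
    filtered = sosfiltfilt(sos, eeg, axis=-1)
    # resample_poly reduces the ratio fs_out/fs_in internally (here 1/4)
    # and applies its own anti-aliasing FIR filter.
    return resample_poly(filtered, up=fs_out, down=fs_in, axis=-1)
```

With 1-s epochs sampled at 1 kHz, the function would return 250 samples per channel; in practice the filter would normally be applied to the continuous recording before epoching to limit edge effects.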

EEG Source Estimation

In this study, we applied the LAURA algorithm implemented in Cartool (58) to compute the source reconstruction in the individual MRI applying the local spherical model with anatomical constraints (LSMAC) and taking into account the patient’s age to calibrate the skull conductivity (De Peralta Menendez et al. 2004; Brunet et al. 2011; Spinelli et al. 2000). The LSMAC method restricts the solution space to the gray matter of the individual brain.

Whole Brain Segmentation and Parcellation

Starting from the high-resolution T1-weighted image, and using the Connectome Mapper open-source processing pipeline (Daducci et al. 2012), which calls version 6 of the FreeSurfer image analysis suite (62), we resampled the image to an isotropic resolution of 1 × 1 × 1 mm³ and segmented the whole brain into white matter, grey matter (i.e., cortical and sub-cortical structures) and cerebrospinal fluid based on the anatomical atlases of Desikan et al. (2006) and Destrieux et al. (2010). At the end of the process, the cortex was parcellated into a total of 83 regions, which accounted for all the cortical structures of the Desikan-Killiany anatomical atlas, as well as the deep-grey nuclei and the brainstem (Daducci et al. 2012).

Projection Method Based on SVD

An estimate of the x–y–z components of the approximately M = 5000 cortical dipoles was obtained for each time point. The estimate of the dipole sources \(\tilde {x}\) is the solution of the following regularized problem:

$$\tilde {x}=\arg {\hbox{min} _{\varvec{x}}}\left( {\left\| {{\varvec{L}}{\varvec{x}} - {\varvec{y}}} \right\|_{N}^{2}+\varvec{\gamma}\left\| {\varvec{x}} \right\|_{M}^{2}} \right)~$$
(1)

where \({\left\| \cdot \right\|_{\text{M}}}\) represents the M-norm, L is the lead field matrix, x the dipole sources, y the EEG scalp potentials and γ is the regularization parameter which can be estimated by different criteria, e.g., by the L-curve approach (Hansen 1992).
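To make the role of the regularization parameter in Eq. (1) concrete, the sketch below solves the problem in closed form for the simplest case, in which both norms are taken to be the ordinary Euclidean norm. This is an assumption made only for illustration: LAURA uses specific weighting matrices that are not reproduced here, and the function name `min_norm_inverse` is hypothetical.

```python
import numpy as np


def min_norm_inverse(L, y, gamma):
    """Tikhonov-regularized minimum-norm estimate of the sources.

    L     : lead field matrix [n_sensors x n_sources]
    y     : scalp potentials [n_sensors x n_times]
    gamma : regularization parameter (> 0)

    Assumption: identity weighting in both norms of Eq. (1), so that
    x = L^T (L L^T + gamma I)^(-1) y.
    """
    n_sensors = L.shape[0]
    gram = L @ L.T + gamma * np.eye(n_sensors)
    # Solving the small [n_sensors x n_sensors] system is cheaper and more
    # stable than inverting the [n_sources x n_sources] normal equations.
    return L.T @ np.linalg.solve(gram, y)
```

Larger values of `gamma` shrink the source estimate towards zero and make it less sensitive to noise in `y`, which is the trade-off that criteria such as the L-curve are meant to balance.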

We then proceeded in two steps. First, we assigned the dipoles to their respective ROIs based on the 82 atlas labels (the brainstem was excluded). Second, for each ROI separately, we collected all the N time samples of the x–y–z components of the n dipoles included in the ROI and organized the zero-mean data in a matrix D, [N × (n × 3)], as follows:

$${\varvec{D}}=\left[ {\begin{array}{*{20}{l}} {{d_{{x_1}}}\left( {{t_1}} \right)}&{{d_{{y_1}}}\left( {{t_1}} \right)}&{{d_{{z_1}}}\left( {{t_1}} \right)}& \cdots &{{d_{{x_n}}}\left( {{t_1}} \right)}&{{d_{{y_n}}}\left( {{t_1}} \right)}&{{d_{{z_n}}}\left( {{t_1}} \right)} \\ {}& \vdots &{}& \ddots &{}& \vdots &{} \\ {{d_{{x_1}}}\left( {{t_N}} \right)}&{{d_{{y_1}}}\left( {{t_N}} \right)}&{{d_{{z_1}}}\left( {{t_N}} \right)}& \cdots &{{d_{{x_n}}}\left( {{t_N}} \right)}&{{d_{{y_n}}}\left( {{t_N}} \right)}&{{d_{{z_n}}}\left( {{t_N}} \right)} \end{array}} \right]$$
(2)

After that we applied the SVD to this matrix:

$${\varvec{D}}={\varvec{U}}{\varvec{S}}{{\varvec{V}}^{\varvec{T}}}$$
(3)

where the superscript T denotes the transpose, and the columns of D can be seen as linear combinations of the columns of U with coefficients given by the columns of \({\varvec{S}}{{\varvec{V}}^{T}}\). Because the singular values on the diagonal of S appear in decreasing order, we took the first column of U, u1 [N × 1], i.e., the unit-norm vector associated with the axis that represents the major orientation of all the dipoles, as the signal that explains most of the variability of the data and as the best representation of the ROI content. In other words, Eq. (3) assumes that the data matrix D comprises hidden components ui that are mixed together through the coefficients in \({\varvec{S}}{{\varvec{V}}^{T}}\). Standard matrix factorizations in linear algebra, such as SVD, owe their uniqueness to hard and restrictive constraints such as orthogonality (Cichocki et al. 2015).
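The projection described by Eqs. (2) and (3) can be sketched in a few lines of Python. This is a minimal illustration, not the code used in this study; the function name `roi_representative` and the [N × n × 3] input layout are assumptions. The sign of a singular vector is arbitrary, so the polarity of the returned time-series may need to be fixed by convention.

```python
import numpy as np


def roi_representative(dipoles):
    """First left singular vector of the dipole time-courses of one ROI.

    dipoles : array [N x n x 3], N time samples of the x, y, z components
              of the n dipoles in the ROI (hypothetical layout).
    Returns (u1, scaled, v1): the unit-norm time-series u1 [N], the same
    time-series rescaled by the first singular value, and the
    corresponding spatial pattern v1 [3n].
    """
    n_samples = dipoles.shape[0]
    # Columns ordered x1, y1, z1, ..., xn, yn, zn as in Eq. (2)
    D = dipoles.reshape(n_samples, -1)
    D = D - D.mean(axis=0, keepdims=True)  # zero-mean columns
    U, s, Vt = np.linalg.svd(D, full_matrices=False)
    u1 = U[:, 0]
    return u1, s[0] * u1, Vt[0]
```

For two dipoles with identical time-courses but opposite orientations, the plain average across dipoles is zero, whereas `u1` still follows the common time-course (up to sign). Using `u1` gives a normalized time-series, while `s[0] * u1` retains the amplitude of the data.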

Connectivity Estimation

Among the different techniques for extracting effective connectivity, information partial directed coherence (iPDC) properly accounts for size effects in gauging connection strength, as reported in detail in (Sameshima and Baccala 2014). In particular, iPDC is a multivariate spectral measure to compute only the directed influences between any given pair of signals (i,j) of a multivariate dataset. This information is condensed in a complex function \(~iPD{C_{i \leftarrow j}}\left( f \right)\) of the frequency f, which measures the relative interaction of the signal j with regard to signal i as compared to all j’s interactions to other signals in the multivariate dataset. While we refer the reader to (Takahashi et al. 2010) for the mathematical details, the procedure for computing iPDC is briefly described by the following two steps.

In the first step, the cortical waveforms \(\tilde {x}\), computed by applying the projection method described in the previous section, are fitted against a time-variant (tv) multivariate autoregressive (MVAR) model to overcome the problem of non-stationarity of the EEG data. If the EEG data are available as several trials of the same length, the cortical waveforms computed from the EEG data form a collection of realizations of a multivariate stochastic process, which can be combined in a multivariate, multi-trial time series:

$$\tilde {{\varvec{x}}}(t)=\left[ {\begin{array}{*{20}{l}} {\tilde {{\varvec{x}}}_{1}^{{(1)}}(t)}& \cdots &{\tilde {{\varvec{x}}}_{d}^{{(1)}}(t)} \\ \vdots & \ddots & \vdots \\ {\tilde {{\varvec{x}}}_{1}^{{(K)}}(t)}& \cdots &{\tilde {{\varvec{x}}}_{d}^{{(K)}}(t)} \end{array}} \right]\;t={t_1}, \ldots ,{t_N}$$
(4)

where t refers to the time points, N the length of the time-series, K the number of trials and d the number of ROIs.

Then the data in \(\tilde {x}\) are fitted against a tvMVAR model in the general form:

$$\tilde {{\varvec{x}}}(t)=\sum\limits_{{r=1}}^{p} {{{\varvec{A}}_{\varvec{r}}}\left( t \right)\tilde {{\varvec{x}}}\left( {t - r} \right)} +{\varvec{W}}\left( t \right)$$
(5)

where Ar(t) are the [d × d] AR matrices containing the model coefficients, W(t) is a stationary zero-mean white noise process, also called the innovation process, with covariance matrix ∑w, and p is the model order, usually estimated by means of the Akaike Information Criterion for MVAR processes (Akaike 1998). The general linear Kalman filter approach is applied to estimate the coefficients of the time-variant AR matrices and the innovation covariance matrix ∑w (Milde et al. 2010).

Once the MVAR model has been estimated, for each time-point t we define the complex matrix B(f) as:

$${\varvec{B}}\left( f \right)={{\varvec{I}}_{\varvec{d}}} - \sum\limits_{{r=1}}^{p} {{{\varvec{A}}_{\varvec{r}}}{e^{ - j2\pi fr}}}$$
(6)

where Id is the identity matrix and j is the imaginary unit in this equation, the iPDC complex function from the time-series j to the time-series i is obtained by:

$$iPD{C_{i \leftarrow j}}\left( f \right)=\sigma _{{{w_{ii}}}}^{{ - 1/2}}\frac{{{b_{ij}}(f)}}{{\sqrt {{\varvec{b}}_{{\varvec{j}}}^{{\varvec{H}}}(f)\varvec{\varSigma}_{{\varvec{w}}}^{{ - 1}}{{\varvec{b}}_{\varvec{j}}}(f)} }}$$
(7)

where bj(f) and bij(f) are respectively the j-th column and the (i,j)-th element of matrix B(f), σw_ii is the (i,i)-th element of the innovation covariance matrix ∑w, and the superscript H in \({\varvec{b}}_{j}^{H}\) stands for the Hermitian transpose, i.e., obtained from bj by taking the transpose and then the complex conjugate of its components.

The complex function \(iPD{C_{i \leftarrow j}}\left( f \right)~\) of Eq. (7) is usually analysed in terms of its absolute value.
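The computation in Eqs. (6) and (7) can be sketched as follows for one set of AR matrices, e.g. those estimated at a single time-point. This is an illustrative sketch rather than the implementation used in this study. It assumes that the AR matrices follow the sign convention under which B(f) takes the form of Eq. (6), and that the frequency is given in Hz and normalized by the sampling rate; the function name `ipdc` and its arguments are hypothetical.

```python
import numpy as np


def ipdc(A, sigma_w, freqs, fs):
    """Information partial directed coherence from MVAR coefficients.

    A       : array [p x d x d], AR matrices A_1..A_p (assumed convention:
              x(t) = sum_r A_r x(t - r) + w(t))
    sigma_w : innovation covariance matrix [d x d]
    freqs   : frequencies in Hz at which to evaluate iPDC
    fs      : sampling rate in Hz
    Returns a complex array [n_freqs x d x d]; entry [k, i, j] is the
    iPDC from signal j to signal i at freqs[k].
    """
    p, d, _ = A.shape
    sigma_inv = np.linalg.inv(sigma_w)
    scale = 1.0 / np.sqrt(np.diag(sigma_w))  # sigma_ii^(-1/2)
    lags = np.arange(1, p + 1)
    out = np.empty((len(freqs), d, d), dtype=complex)
    for k, f in enumerate(freqs):
        phase = np.exp(-2j * np.pi * f * lags / fs)
        B = np.eye(d) - np.tensordot(phase, A, axes=(0, 0))  # Eq. (6)
        # b_j^H Sigma_w^(-1) b_j for every column j (real and positive)
        quad = np.real(np.einsum("ij,ik,kj->j", B.conj(), sigma_inv, B))
        out[k] = scale[:, None] * B / np.sqrt(quad)[None, :]  # Eq. (7)
    return out
```

Taking `np.abs(ipdc(...))` gives the magnitude that is usually analysed. For a time-variant model, the function would be called once per time-point with the corresponding `A` and `sigma_w`.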

Simulation

To test whether the SVD method is capable of detecting an effective connectivity map of the human brain, we simulated a simple four-node (ROI) network with different delays. We generated the time-course of the dipoles lying in the right occipital region of the brain and then a version with the same profile, delayed by 3 ms, in the left occipital region. The same signals with a reduced amplitude (80%) and a delay of 5 ms were placed in the left and right inferior temporal regions of the brain. The time-course chosen was the average VEP reconstructed in the source space of Dataset 1 in the right occipital cortex in the first 500 ms after the stimulus. The orientation of the dipoles was chosen perpendicular to the cortex. Each realization had a sampling rate of 200 Hz and 100 time points.

After generating these waveforms, we added white Gaussian noise with SNR = 5 to the simulated waveforms; white Gaussian noise was also used to generate the background activity of the other dipoles of the model. The activity of these M = 5000 dipoles was then multiplied by the lead field matrix L estimated for each subject of Dataset 1 to obtain the simulated EEG. We obtained 20 epochs for each subject by adding 20 different noise realizations.
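The structure of the simulated network can be illustrated at the ROI level with the short sketch below. It is a simplification under several stated assumptions, not the simulation code of this study: delays are implemented by linear interpolation (a 3-ms delay is shorter than the 5-ms sampling period), "reduced amplitude (80%)" is read as a gain of 0.8, SNR is interpreted as a ratio of signal power to noise power, and the projection through the lead field is omitted. The function name `simulate_network` is hypothetical.

```python
import numpy as np


def simulate_network(template, fs=200.0, snr=5.0, rng=None):
    """Four delayed, scaled copies of a template plus white Gaussian noise.

    template : 1-D source time-course (e.g. an average VEP source waveform)
    Returns (clean, noisy), both [4 x n_samples], with rows ordered as
    right occipital, left occipital, left and right inferior temporal.
    """
    rng = np.random.default_rng() if rng is None else rng
    t = np.arange(template.size) / fs

    def delayed(delay_s, gain):
        # Assumption: sub-sample delays via linear interpolation
        return gain * np.interp(t - delay_s, t, template, left=template[0])

    clean = np.stack([
        delayed(0.000, 1.0),  # right occipital (driver)
        delayed(0.003, 1.0),  # left occipital, 3 ms delay
        delayed(0.005, 0.8),  # left inferior temporal, 5 ms delay
        delayed(0.005, 0.8),  # right inferior temporal, 5 ms delay
    ])
    # Assumption: SNR defined as signal power / noise power
    noise_power = np.mean(clean ** 2) / snr
    noisy = clean + np.sqrt(noise_power) * rng.standard_normal(clean.shape)
    return clean, noisy
```

Calling the function 20 times with different random generators would give 20 noisy epochs sharing the same underlying network.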

Results

Application on Visual Evoked Potentials

In Fig. 1a, we report the dipoles (arrows in the figure) representing the source waveforms in the right lateral occipital cortex of a representative subject (sub #1) in the 500 ms after the face stimulus, from different perspective views. We chose the right lateral occipital cortex to visualize the results because we clearly localized the N170 component in this region. This source localization is also consistent with the literature (Grill-Spector et al. 2004), in which MRI localizer scans revealed two additional extrastriate regions beyond the fusiform face area that responded more strongly when subjects viewed faces than when they viewed objects. These include brain regions in the occipital gyri and in the superior temporal sulcus (Hoffman and Haxby 2000). To compress four dimensions, i.e., the x, y, z and time axes, into a 2-D figure, the x-axis carries the information of both the x dimension and the time dimension. In other words, we rigidly translated each dipole along the x-dimension to represent its time evolution. In Fig. 1b, we report the three projection planes of the space represented in Fig. 1a. In Fig. 1c, the time-courses of the source waveforms in the right lateral occipital cortex projected onto the x, y and z axes are depicted from top to bottom. Interestingly, in the estimated VEP source waveforms after the visual stimulus, the orientation over time of the whole set of dipoles is not random. Furthermore, we qualitatively observe a main direction that maximizes the magnitude of the majority of dipoles. Given this observation, summarizing the dipole content of each ROI by the orthonormal vector projected along the axis in space that represents the major orientation of all the dipoles should explain most of the variability of the data and provide an accurate representation of the ROI content.

Fig. 1

a All dipoles representing the solution points in the source space are reported as arrows. Each colour represents the dynamics of a different dipole over time in the right lateral-occipital brain region in a representative subject (sub #1). b Views of the x–z-plane, x–y-plane and z–y plane are represented from top to bottom for dipoles of (a). c x–y–z–time-components of all the dipoles in the right lateral-occipital region

As previously stated, in several previous studies, e.g. (Coito et al. 2015; Sperdin et al. 2018), the dipole lying in the centroid is considered representative of the entire ROI. For this reason, in Fig. 2, we compared the temporal patterns and the frequency content of the hdEEG recordings in the 500 ms following stimulus presentation (Fig. 2a) with the reconstructed time-series in the inverse space obtained from the proposed SVD method (Fig. 2b) and with the source activity in the centroid (Fig. 2c) for sub#1. Fig. 2b shows the time-series computed from the first eigenvector for each ROI, and Fig. 2c the first principal component of the three x–y–z components of the source activity in the centroid. After applying SVD, whether to work directly with the first eigenvector or to re-project the first eigenvector onto the original data space is the user's choice. It depends on whether the user needs normalized time-series or cares about the amplitude content of the signal. Comparing the proposed reconstruction (Fig. 2b) with the centroid time-series (Fig. 2c), we found that they differ strongly in amplitude, as visible in their absolute power spectral density values. However, the relative power distribution among the canonical EEG frequency bands does not differ significantly between the two reconstructions (Mann–Whitney U-test, p > 0.98).

Fig. 2

Signal and corresponding power spectral density, averaged across trials, of a the 128 high-density EEG time-courses representing the visual evoked potential in a representative subject (sub #1), b the ROI time-series computed through SVD in sub #1, and c the first principal component of the time-series lying in the centroid of each ROI

To emphasize the differences between the two methods, we compared the ability of the proposed SVD-based representative time-series and of the centroid time-series to detect the P100 and N170 peaks. P100 is the first dominant component in response to visual stimuli, with a lateral occipital positivity (Alonso Prieto et al. 2011), followed by the N170. The N170 is a component of the evoked potential that reflects the neural processing of faces, and its response should be maximal over occipital-temporal electrodes (Ghuman et al. 2014; Rossion and Jacques 2008). In Fig. 3, we report, for a representative subject and for all the subjects, the average EEG signal in the sensor space at electrode B11 (P8) located over the right parietal lobe (Fig. 3a, b) and the reconstructed time-series in the source space obtained through the SVD and the centroid in the right lateral-occipital cortex (Fig. 3c, d). Figure 3 shows that, in the source space, the centroid time-series has a lower amplitude and a flatter morphology than the SVD time-series, both in a representative subject (Fig. 3c) and across subjects (Fig. 3d). The results in Fig. 3 confirm that the SVD time-series present a pattern coherent with the signal recorded on the scalp and that the amplitude and the latency of the peaks of interest can be easily estimated. To check for latency differences between the methods, we computed as a reference the Global Field Power (GFP) (Brunet et al. 2011) from the hdEEG of each subject and determined the latency of the maxima of the P100 and N170 components. For instance, for sub#1, the two detected latencies were t = 105 ms for P100 and t = 145 ms for N170. We then calculated the inverse solution on the average evoked potential with Cartool (Brunet et al. 2011) and localized the ROI containing the maximum of the norm of the source waveforms for both peaks. We then compared the latencies estimated in the time-series obtained by the proposed projection method (Fig. 4a) with those of the time-series derived from the centroid method (Fig. 4b) in the selected ROI. The boxplots in Fig. 4c show the latencies estimated through the GFP (blue), the SVD time-series (green) and the centroid time-series (red). Figure 4d shows that the absolute difference between the latencies estimated through the GFP and those estimated from the reconstructed time-series in the source space is larger for the centroid than for the SVD time-series. From this evaluation, the SVD time-series seem to estimate the peak latencies in the VEP more reliably.
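The reference latencies can be obtained along the following lines. The sketch uses the common definition of GFP as the standard deviation of the potentials across electrodes at each time point; the search windows passed to `peak_latency_ms` are hypothetical and would have to be chosen around the expected P100 and N170 latencies. Function names are illustrative and do not correspond to Cartool routines.

```python
import numpy as np


def global_field_power(erp):
    """GFP of a trial-averaged evoked potential.

    erp : array [n_channels x n_samples]
    The population standard deviation across electrodes equals the
    root-mean-square of the average-referenced potentials.
    """
    return erp.std(axis=0)


def peak_latency_ms(signal, times_ms, window_ms):
    """Latency of the largest absolute value inside a search window.

    signal    : 1-D time-series (GFP or a source-space time-series)
    times_ms  : time axis in ms, same length as signal
    window_ms : (start, stop) of the search window in ms (hypothetical)
    """
    start, stop = window_ms
    mask = (times_ms >= start) & (times_ms <= stop)
    idx = np.argmax(np.abs(signal[mask]))
    return times_ms[mask][idx]
```

The same `peak_latency_ms` call can be applied to the SVD and centroid time-series of the selected ROI, and the absolute difference from the GFP latency then gives the quantity compared across methods.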

Fig. 3

a Average ± SEM among trials of the EEG signal recorded on B11 (P8) electrode in sub#1. b Average ± SEM among subjects of the average of the EEG signal on B11 (P8). c Average ± SEM among trials of the proposed representative time-series in the source space computed through SVD (green) and of the centroid (red) in sub#1 in the right lateral-occipital cortex. d Average ± SEM among subjects of the average proposed representative time-series computed through SVD (green) and of the average centroid (red) in the right lateral-occipital cortex

Fig. 4

a Proposed representative time-series (green) computed through SVD for the right lateral-occipital region. b Norm (violet) and the first principal component (green) of the x–y–z-time-components (respectively, blue, orange and yellow dotted lines) of the dipole lying in the centroid of the right lateral-occipital region. c Boxplot representing the latency in ms for each subject for P100 and N170 estimated through the EEG GFP (blue), the representative time-series computed through SVD (green) and the centroid time-series (red). d Boxplot representing the absolute difference in latency in ms for each subject for P100 and N170 estimated between the EEG GFP and the representative time-series computed through SVD (green) and the centroid time-series (red)

We then computed the values of the explained variance (average among trials) of each of the 82 representative time-series summing up the information content of the ROIs for all the subjects (Fig. 5a). The majority of the brain areas expected to be involved in face perception (red circles in Fig. 5) show higher explained variance. In Fig. 5b, c, we report the histogram containing all the explained variances for all the trials for sub#1 fitted against the generalized extreme value distribution (McFadden 1978). For instance, the average value of the location parameter was 94% for the left lateral occipital cortex and 70% for the right lateral occipital cortex in sub#1.
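For reference, one standard way to quantify the variance explained by the first singular vector is the ratio of the first squared singular value to the sum of all squared singular values. The text does not spell out the exact formula used, so the sketch below is an assumption, and the function name is hypothetical.

```python
import numpy as np


def explained_variance_first_component(D):
    """Percentage of variance captured by the first singular vector.

    D : data matrix [N x 3n] of one ROI and one trial (columns are the
        x, y, z components of the dipoles).
    Assumption: explained variance = s_1^2 / sum_i s_i^2 after removing
    the mean of each column.
    """
    D = D - D.mean(axis=0, keepdims=True)
    s = np.linalg.svd(D, compute_uv=False)
    return 100.0 * s[0] ** 2 / np.sum(s ** 2)
```

Applying the function to every trial of a ROI yields the distribution of values that is summarized in the histograms.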

Fig. 5

VEP: a Boxplot representing the percentage of explained variability for the proposed representative time-series for each ROI computed through SVD for all the subjects. Red circles highlight the ROIs that are mainly involved in the VEP. b Histogram representing the percentage of explained variance in the representative subject (sub #1) for all the time-series representing the left lateral-occipital brain region. c Histogram representing the percentage of explained variance in sub #1 for all the time-series representing the right lateral-occipital brain region

Finally, after computing the |iPDC| values during the first 500 ms after the stimulus, we compared the outflow from each ROI at N170 among the reconstructions based on SVD, on the centroid of each ROI and on the time-series with the maximum power in each ROI (Barnes et al. 2004). The connectivity patterns between the cortical regions were summarized by the total outflow from each region toward the others, computed as the sum of the values of all the statistically significant links obtained by applying the iPDC to the cortical waveforms. The total outflow of each ROI is represented by a sphere centered on the cortical region, whose radius is linearly related to the magnitude of all the outgoing directed links to the other regions. The same information is also coded through a color scale. The ROIs with the greatest information outflow are regarded as the main sources (drivers) of functional connections to the other ROIs (Babiloni et al. 2005). In Fig. 6, we report the outflow averaged across subjects for the SVD time-series, the centroid time-series and the maximum-power time-series. The ROIs with the maximum outflow (above the 95th percentile) were localized in the right lateral-occipital cortex and in the inferior temporal cortex when using the SVD reconstruction (Fig. 6a), in the mesial temporal cortex near the hippocampus when using the centroid time-series (Fig. 6b), and in the right lateral-occipital cortex and the right inferior temporal cortex when using the maximum-power time-series (Fig. 6c). In the literature, the generation of the N170 has been attributed to neural sources in lateral, basal temporal and extrastriate occipital cortices (Grill-Spector et al. 2004, 75; Dalrymple et al. 2011; Itier and Taylor 2004; Botzel et al. 1995; Schweinberger et al. 2002) and to the fusiform gyrus of the inferior temporal cortex (Kropotov 2016) in the recognition of faces, which is in accordance with our estimation through the SVD reconstruction. The SVD reconstruction results in a more precise and less blurry localization of the major drivers for the proposed VEP.
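The outflow summary described above can be sketched as follows. This is only a minimal illustration under stated assumptions: the connectivity matrix is taken to follow the convention that entry [i, j] is the link from ROI j to ROI i, the significance mask is a placeholder for a proper statistical test, and the function names and synthetic values are hypothetical.

```python
import numpy as np

def total_outflow(conn, significant):
    """Sum of the significant outgoing link strengths of each ROI.

    conn[i, j]: strength of the directed link from ROI j to ROI i (assumed convention).
    significant: boolean mask of the same shape marking the significant links.
    """
    weights = np.where(significant, np.abs(conn), 0.0)
    np.fill_diagonal(weights, 0.0)   # ignore self-connections
    return weights.sum(axis=0)       # column j collects everything leaving ROI j

def main_drivers(outflow, percentile=95):
    """Indices of the ROIs whose outflow exceeds the given percentile."""
    return np.flatnonzero(outflow > np.percentile(outflow, percentile))

rng = np.random.default_rng(1)
n_rois = 82
conn = rng.uniform(0.0, 0.1, size=(n_rois, n_rois))
conn[:, 10] += 0.5                   # hypothetical strong driver at index 10
significant = conn > 0.05            # stand-in for a proper significance test

outflow = total_outflow(conn, significant)
drivers = main_drivers(outflow)
print(drivers)
```

In a figure such as Fig. 6, each entry of `outflow` would set the radius and colour of the sphere drawn on the corresponding ROI.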

Fig. 6

VEP: mean outflow across subjects computed from the iPDC matrix for a the SVD time-series, b the centroid time-series and c the maximum-power time-series. Node size and colour indicate the value of the outflow

Application on Interictal Spikes

For each epileptic patient, we applied our method to compute the representative time-series of each ROI. First, we evaluated whether the frequency distribution changed significantly when passing from the scalp EEG to our inverse representation. The Mann–Whitney U-test showed no significant difference between the relative power distributions of the scalp EEG and of our inverse representation in any frequency band for any patient (p > 0.95). Then, to compare the localization power of the SVD time-series, the centroid time-series and the maximum-power time-series, we selected seven patients with anterior-mesial temporal lobe epilepsy and ILAE class I outcome after surgery, i.e., completely seizure free, no auras (Brodie et al. 2018), in whom part of the left temporal lobe was removed. For each patient, after computing the iPDC matrices, we estimated the outflow of information from each ROI during the advent of the spike. In Fig. 7 we report the mean outflow across patients computed with the SVD time-series (Fig. 7a), the centroid time-series (Fig. 7b) and the maximum-power time-series (Fig. 7c). The ROIs with outflow above the 95th percentile, considered to be the main drivers during the advent of the spikes, were the left fusiform and middle-temporal brain areas for the SVD time-series, the left temporal-pole brain areas near the hippocampus for the centroid time-series, and the left inferior frontal brain areas for the maximum-power time-series. The first two methods correctly identified the left temporal lobe, but for the centroid time-series the range of outflow values (colorbar in b) is almost ten times smaller than that of the SVD time-series (colorbar in a); thus, the resolution obtained with the SVD was higher.
We used the postoperative structural MRI to validate the localization of the generators of the interictal epileptic discharges: the area removed by surgery was the left anterior temporal lobe in all seven patients classified as good outcome. Moreover, considering all the patients, we computed the laterality index defined as in (Coito et al. 2015) to assess whether this group of patients had more summed outflow ipsilateral or contralateral to the epileptic source. We found that seven out of seven patients had a greater ipsilateral outflow with the SVD time-series, compared with four out of seven with the centroid time-series. In addition, we computed the mean efficiency of the network across patients. Efficiency measures how well each node exchanges information with the rest of the network. Using the SVD time-series, we found that the most efficient nodes of the network (with values above the 95th percentile) were the left fusiform and the left middle-temporal brain areas, the same brain areas labeled as main drivers by the outflow measure. High efficiency in a brain region suggests a high capacity to communicate with the rest of the brain during the advent of the spike (Uehara et al. 2013).
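To make the ipsilateral versus contralateral comparison concrete, the sketch below computes a laterality index from a vector of ROI outflows. The exact definition used in Coito et al. (2015) is not restated in this text, so the normalized difference (ipsilateral − contralateral) / (ipsilateral + contralateral) used here is an assumption, as are the function name, the ROI ordering and the example values.

```python
import numpy as np

def laterality_index(outflow, ipsi_idx, contra_idx):
    """Normalized difference between ipsilateral and contralateral summed outflow.

    Assumed form: (ipsi - contra) / (ipsi + contra), ranging from -1 to 1.
    Positive values indicate more summed outflow on the ipsilateral side.
    """
    ipsi = float(np.sum(outflow[ipsi_idx]))
    contra = float(np.sum(outflow[contra_idx]))
    total = ipsi + contra
    if total == 0.0:
        return 0.0  # no outflow on either side
    return (ipsi - contra) / total

# Hypothetical 82-ROI layout: first 41 ROIs ipsilateral (left), last 41 contralateral
outflow = np.concatenate([np.full(41, 3.0), np.full(41, 1.0)])
ipsi_idx = np.arange(0, 41)
contra_idx = np.arange(41, 82)

li = laterality_index(outflow, ipsi_idx, contra_idx)
print(li)
```

Under this assumed definition, a patient with greater ipsilateral outflow would have a positive index.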

Fig. 7

Interictal spikes: mean outflow across good-outcome patients with left temporal lobe epilepsy computed from the iPDC matrix for a the SVD time-series, b the centroid time-series and c the maximum-power time-series. Node size and color indicate the value of the outflow

Finally, we computed the explained variance of each of the 82 representative time-series summarizing the information content of the ROIs in all the trials/epochs for each subject (Fig. 8). Each histogram was fitted with the generalized extreme value distribution (McFadden 1978). The average value of the location parameter ± scale parameter was 75% ± 15%. Considering that we summarize the content of three different time-series in a single signal, explaining more than 60% of the variance of all the dipoles in a ROI means capturing at least the information contained in two out of three components. Some data loss is unavoidable in dimensionality reduction, but the fraction of the variance of the original data explained by our one-dimensional representation appears satisfactory.
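The distribution fit mentioned above can be sketched with SciPy's `genextreme`, which implements the generalized extreme value distribution. The data below are synthetic percentages drawn for illustration only, and the chosen parameters are hypothetical. Note that SciPy's shape parameter `c` uses the opposite sign convention to the shape parameter found in several other sources.

```python
import numpy as np
from scipy.stats import genextreme

rng = np.random.default_rng(3)

# Synthetic explained-variance percentages for one ROI across epochs (hypothetical values)
explained = genextreme.rvs(0.4, loc=75.0, scale=8.0, size=3000, random_state=rng)

# Maximum-likelihood fit; SciPy returns (shape, location, scale)
c_hat, loc_hat, scale_hat = genextreme.fit(explained)
print(f"location = {loc_hat:.1f}%, scale = {scale_hat:.1f}%, shape c = {c_hat:.2f}")
```

The fitted location and scale play the role of the "location parameter ± scale parameter" summary reported for each histogram.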

Fig. 8

Interictal spikes: a boxplot representing the percentage of explained variability for the proposed representative time-series for each ROI computed through SVD for all the subjects. b Histogram representing the percentage of explained variance in the representative subject (sub #1) for all the time-series representing the left middle temporal cortex. c Histogram representing the percentage of explained variance in sub #1 for all the time-series representing the left fusiform brain region

Application on Simulated Data

In Fig. 9a, the simulated 128 hdEEG time-courses averaged among the 20 trials for one of the 13 simulated subjects are shown. These hdEEG signals were the input of the LAURA algorithm to estimate the source waveforms. The obtained SVD time-series averaged among the 20 trials for the same simulated subject of Fig. 9a are reported in Fig. 9b. After computing the |iPDC| values during the first 500 ms after the stimulus, we compared the values of the outflow from each ROI at N170 for all the simulated subjects. The ROIs with the maximum outflow (above the 95th percentile) were consistently localized in the right lateral-occipital cortex and in the inferior temporal cortex, as imposed by the simulation. The average outflow across all simulated subjects is displayed in Fig. 9c.

Fig. 9

Simulated VEP: a 128 high-density EEG time-courses and b ROI time-series computed through SVD, averaged among trials in a representative simulated subject, and c mean outflow across all simulated subjects computed from the iPDC matrix for the SVD time-series. Node size and colour indicate the value of the outflow

Conclusion and Discussion

With the final aim of improving connectivity estimation, we proposed a method that both overcomes the dipole orientation problem and summarizes the information of the different solution points in the same region of interest. The proposed projection method based on singular value decomposition condenses the information carried by hundreds of 3-D time-series into a single 1-D signal representing most of the variability of the sources in each region of interest. Thanks to the orthogonality constraints (U and V are orthogonal matrices and S is a diagonal matrix), the solution of the SVD is unique (up to the sign of the singular vectors when the singular values are distinct) and can be considered a reliable way for dimensionality reduction. The amplitude of the representative signal, computed as the first orthonormal vector of the unitary matrix U, is by definition independent of the original signal amplitudes. Thus, this solution overcomes a major drawback of the common procedure of averaging the dipoles, namely the drastically reduced amplitudes obtained after averaging all the dipoles in the same region of interest. Dealing with smaller amplitudes may distort the results of the connectivity estimation because the estimation involves computing the inverse of the matrix containing the data (Baccalà and Koichi 2014; Moraca 2008).
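A minimal sketch of the projection summarized above is given here, assuming that the ROI data are arranged as a matrix with time samples in rows and the x, y, z components of all solution points in columns, and that the representative signal is the first column of U. The function name, the layout and the synthetic data are hypothetical. The example also contrasts the unit-norm SVD signal with plain averaging when the source amplitudes are very small.

```python
import numpy as np

def representative_time_series(roi_data):
    """First left singular vector of the ROI data matrix (unit norm).

    roi_data: array of shape (n_times, 3 * n_dipoles); this layout is an assumption.
    """
    u, s, vt = np.linalg.svd(roi_data, full_matrices=False)
    return u[:, 0]

rng = np.random.default_rng(2)
n_times, n_dipoles = 300, 40
t = np.linspace(0, 0.5, n_times)
waveform = np.sin(2 * np.pi * 8 * t)

# Synthetic ROI: shared waveform with random dipole orientations and strengths, plus noise
roi_data = np.outer(waveform, rng.normal(size=3 * n_dipoles))
roi_data += 0.05 * rng.normal(size=roi_data.shape)

rep = representative_time_series(roi_data)
rep_scaled = representative_time_series(1e-6 * roi_data)  # same data, tiny amplitude
mean_series = (1e-6 * roi_data).mean(axis=1)              # plain averaging, for comparison

print(np.linalg.norm(rep), np.linalg.norm(rep_scaled))
print(np.abs(mean_series).max())
```

Because singular vectors are defined only up to sign, the polarity of the returned signal is arbitrary in this sketch and would need a separate convention if polarity matters for interpretation.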

Additionally, we proposed a method that creates a population signal summarizing the source activity in each region of interest (ROI), giving an indication of the global explained variance and considering all gray matter solution points in the brain. In the majority of previous studies, a few voxels are selected for each ROI, for example the most active voxels, and the information carried by these voxels is then summarized in a single signal by a decomposition method. Indeed, in (Supp et al. 2007; Gruber et al. 2008), the authors defined the ROIs by carefully selecting voxels corresponding to cortical areas that showed significant differences in the gamma-band range. To analyze the information transfer between the identified regions of interest in source space through partial directed coherence, a multivariate autoregressive model was fitted to the time series revealed by the inverse solution at each ROI. To overcome the problem that each current source density consists of three directions (X, Y and Z), they computed the first principal component of each triplet. In our work, we aimed to create a population signal directly from the activity of all the voxels contained in the same ROI without introducing any a priori condition to select specific points/areas. Similarly, in (Zhou et al. 2009), an fMRI connectivity analysis approach combining principal component analysis (PCA) and Granger causality was proposed to study directional influence between functional brain regions, but before applying this combined measure, the authors selected only the activated brain regions/voxels with BrainVoyager QX.

Moreover, the computational cost should also be considered, as it influences the usefulness of the method in practice. The computational cost of singular value decomposition is much lower than that of other approaches based on the canonical polyadic decomposition (Cong et al. 2015). We also showed that the projection method based on SVD provides robust results for visual evoked potentials and epileptic spikes. The results have also been confirmed by simulations. Furthermore, when analysing the frequency content of the proposed time-series and comparing its features with the centroid time-series, the signal based on the SVD seemed both to resemble the scalp EEG features and to avoid amplitudes that are too low for the subsequent connectivity estimation. The novelty of the SVD method also lies in the fact that it exploits the information of the overall population of dipoles in each ROI instead of considering only one time-series as representative of the complex activity pattern in a given brain region. Despite the lack of an objective ground truth for both the source activities and the causal interactions among them, observing the dynamics and the orientation of the dipoles over time in visual evoked potentials and epileptic spikes seems to confirm the existence of a principal component that accounts for most of the variability in the data.

While the proposed method is computationally cheap and easy to implement, it relies on certain ad-hoc assumptions and constraints that can influence the accuracy of the results. The first assumptions are embedded in the source localization method used to solve the inverse problem. Here, we used the linear inverse solution LAURA, which assumes that the strength of each source falls off with the reciprocal of the cubic distance for vector fields and with the reciprocal of the squared distance for potential fields, according to Maxwell’s laws of the electromagnetic field (De Peralta Menendez et al. 2004). Other assumptions might lead to different results. The second constraint lies in the definition of the regions of interest in the parcellation of the brain. Here, we used anatomical ROI definitions as proposed in previous studies (Coito et al. 2016; Milde et al. 2010; Astolfi et al. 2007), which might lead to wrong connectivity estimations in cases where estimated sources cross anatomical boundaries or when distinct sources are located in the same anatomical region (Daunizeau and Friston 2007). Anatomical segmentation is appropriate in structures that are anatomically well defined, but is less ideal in areas such as the frontal and parietal cortices, where there is the risk of mixing temporal signals into heterogeneous ROIs (Constable et al. 2013). Analysing the patterns and the frequency content of the final waveforms computed through SVD, e.g., Fig. 2, and checking the explained variability of each singular vector, e.g., Figs. 5 and 8, can help the user choose the most suitable parcellation.

Because the results may be influenced by the choice of the algorithm for estimating the source waveforms and by the brain parcellation, other approaches have been proposed that define EEG networks while circumventing the issue of how best to segment the source maps into ROIs, by explaining the EEG in terms of a discrete set of causally interacting clusters (Olier et al. 2013). While such direct approaches are theoretically appealing since they are based on a generative model of how the data are probabilistically produced, they also rely on several a priori assumptions and include many parameters, leading to significant computational costs. The main assumptions in one direct approach (Olier et al. 2013) are that the dynamics of the sources can be modelled as random fluctuations of a small number of mesostates interacting according to a full Dynamical Causal Network that can be estimated, and that the dynamics of the mesostates can switch between multiple approximately linear operating regimes that are stable over finite periods of time. Critically, this model accommodates constraints on the number of meso-sources (a meso-source represents the mean field approximation to its underlying neuronal population dynamics), while retaining the flexibility of distributed source models in explaining the data (Daunizeau and Friston 2007). For experimental situations in which there is some a priori belief that there are multiple approximately linear dynamical regimes, this direct approach provides a natural modelling tool (Olier et al. 2013).

Whether to apply a direct method based on Bayesian statistics or a two-stage method, such as the one proposed in this work, depends on the user's hypothesis and the final application. On the one hand, Bayesian approaches provide a natural and principled way of combining prior information with the data within a solid decision-theoretical framework, but they come with a high computational cost, and the user's prior assumptions have to be translated into a mathematically formulated prior. Posterior distributions can be heavily influenced by these priors. On the other hand, two-stage approaches do not need to define priors and are less computationally demanding, but they may be influenced by the choice of the algorithm for estimating the source waveforms and by the anatomical segmentation used to define the ROIs. Our intention was to estimate source activity in the whole brain without any a priori assumption about the generative model of how the data are probabilistically produced. In our opinion, such an approach is preferable in studies that aim to compare and combine the effective connectivity among ROIs with the structural connectivity estimated by diffusion MRI in the same framework.