Introduction

Functional magnetic resonance imaging (fMRI) (Ogawa et al. 1990) measures the slowly changing blood oxygenation level dependent (BOLD) signal as an indirect measure of neural activity and is a popular neuroimaging technique thanks to its high spatial resolution (typically a few millimeters). Electroencephalography (EEG), on the other hand, measures neural activity almost directly using scalp electrodes. Although EEG has a superior temporal resolution (milliseconds), its spatial reconstruction is an ill-posed problem (projection of the 2D scalp measurements into the 3D brain space) and its spatial resolution is poor. Hence, the fusion/integration of these two modalities is of great value due to their complementary spatiotemporal properties (Babiloni et al. 2004).

Approaches to the fusion of EEG and fMRI are divided into two main categories: model-driven and data-driven. In model-driven methods, one modality is estimated from the other using computational biophysical models. Since precise knowledge about the neuronal substrates is rarely available, the use of model-driven methods has been decreasing (Valdes-Sosa et al. 2009; Ferdowsi et al. 2015). Data-driven methods, on the other hand, incorporate both modalities to estimate fused components. There is an extensive literature on data-driven approaches for EEG-fMRI integration, most of which are based on decomposing the EEG and fMRI into different components. Traditional matrix decomposition techniques such as principal component analysis (PCA) and independent component analysis (ICA) have been used as the primary engines for the preprocessing, feature extraction, and, in a more versatile way, the fusion of the two neuroimaging techniques. One reason for the popularity of these two techniques is the convenience of presenting the time-varying EEG and fMRI as a matrix of time \(\times\) space (channels or voxels).

PCA has a long history of application in EEG preprocessing, such as in artifact rejection and the extraction of meaningful brain activity (Lagerlund et al. 1997; Soong and Koles 1995). A basic problem is that components are defined by only two signatures (space and time) which are not determined uniquely; therefore, orthogonality is imposed between the corresponding signatures of different components (Miwakeichi et al. 2004). ICA has been used extensively in both the EEG and fMRI literature (Makeig et al. 2004; Jonmohamadi et al. 2014a; Jonmohamadi and Jones 2015; Comon and Jutten 2010) and is a popular tool for space/time decomposition. While it avoids the orthogonality constraint, uniqueness is achieved at the price of imposing an even stronger non-physiological constraint, namely statistical independence of the sources (Miwakeichi et al. 2004).

In a typical EEG or fMRI experiment the data has a higher dimension than time × space and could go up to seven dimensions (Cong et al. 2015), e.g., in the case of EEG, time × space × frequency × trial × condition × participant × group. In order to use ICA and/or PCA for such higher dimensional data, it is necessary to unfold some modes onto others and reduce the data to a matrix, which is typically done by concatenation or stacking of the data (Delorme and Makeig 2004; Eichele et al. 2011; Dien 2012; Calhoun and Adali 2012; Cong et al. 2013). Such unfolding inevitably loses some potentially existing interactions between/among the folded modes (Cong et al. 2015) and makes the interpretation of the results difficult (Mørup et al. 2006).

Considering that the EEG and fMRI data can be expressed conveniently as a three or higher dimensional array (tensor), it is possible and favorable to use tensor decomposition techniques to break the data into components with corresponding signatures from each dimension. One such technique is the parallel factor analysis (PARAFAC), also known as canonical polyadic decomposition (Carroll and Chang 1970; Harshman 1970). Besides being able to conveniently deal with multidimensional data, the second main advantage of PARAFAC over PCA and ICA is that uniqueness is achieved without the need for imposing strong, physiologically irrelevant assumptions such as orthogonality and statistical independence.

Tensor Factorization-Based EEG-fMRI Analysis

Tensors are higher order generalizations of matrices, i.e., multiway arrays, which can represent additional types of variables in their higher dimensions (Hunyadi et al. 2016; Cichocki et al. 2007; Samadi et al. 2016). Recently, tensor decomposition techniques such as PARAFAC or Tucker have become attractive in signal processing (Cichocki et al. 2015; Cong et al. 2015; Sen and Parhi 2017) and in neuroimaging applications because they naturally represent the inherently multidimensional data and preserve the structural information defined by the inter-dependencies among various modes of variability such as time, space, participant, or frequency (Hunyadi et al. 2016). In neuroimaging, one or more sources of variability are often shared between the measurements from different modalities; for example, neural activity might have a similar temporal pattern in EEG and MEG (magnetoencephalography) (Hunyadi et al. 2016). Alternatively, using mathematical manipulations such as convolution of the EEG (or MEG) signals/features with a haemodynamic response function (HRF), it can be assumed that EEG and fMRI share temporal similarities. Other sources of variability could include participant, task, and group.

So far, all but one of the tensor-based EEG-fMRI fusion methods have been based on variants of the coupled matrix-tensor factorization (CMTF) (Acar et al. 2013), in which the data is structured such that a matrix contains the fMRI data and a 3rd order tensor contains the EEG activity, and it is assumed that there is one common mode of variability between the matrix and the tensor, for example, time (Martinez-Montes et al. 2004) or participant (Acar et al. 2013, 2014; Hunyadi et al. 2016, 2017). The factorization of the structured data is achieved by imposing constraints on the optimization algorithm. Hence, the CMTF is based on the strong assumption that the components in the shared dimension are equal. To relax this assumption, several alternatives have been introduced, such as Advanced CMTF (Karahan et al. 2015; Acar et al. 2014), which allows both shared and nonshared components in the common mode between the matrix and the tensor, or Relaxed Advanced coupled tensor-tensor factorization (Rivet et al. 2015), soft coupling (Seichepine et al. 2014), and approximate coupling (Farias et al. 2016), which require similarity rather than equality between the common components. Only recently, Chatzichristos et al. (2018) used, for the first time, the coupled tensor-tensor decomposition (CTTD) for EEG and fMRI data fusion. They also used the so-called ’soft’ coupling approach to alleviate the strong assumption related to the canonical haemodynamic response function (HRF). Using simulated data, they demonstrated the superiority of their approach over ICA-based fusion approaches (Calhoun et al. 2009, 2006; Mijović et al. 2012).

To the best knowledge of the authors, only two papers have reported the use of source-level EEG for fusion with fMRI data (Karahan et al. 2015; Jonmohamadi et al. 2018). All the other EEG-fMRI techniques mentioned above have used sensor-level EEG, and post-processing was applied to the fused sensor-level EEG maps to estimate the brain maps. The source-level EEG has two main advantages over its sensor-level counterpart. Firstly, the data is less mixed, as it is spatially filtered, typically using a minimum norm or a minimum variance filter. Secondly, the fused components are directly accompanied by the EEG brain maps, and it is desirable to be able to compare the fMRI and EEG brain maps of the same activity directly. Since we use both sensor-level and source-level EEG from multiple bands, the fused components are accompanied by scalp and brain maps from each frequency band as well as the corresponding brain map from the fMRI.

Methods

In this manuscript, italic lower case letters refer to scalars (\(a\)), bold italic lower case letters to vectors (\(\varvec{a}\)), bold italic upper case letters to matrices (\(\varvec{A}\)), and calligraphic upper case letters to tensors (\(\mathcal {A}\)). The glossary of the mathematical symbols and operators used in the following sections is provided in Table 1.

Table 1 Glossary of the characters and operands used in the manuscript

Data Structure

The millisecond resolution of the EEG provides rich temporal information on the dynamic changes of brain activity. However, since EEG captures activity from a large number of physiological and non-physiological sources using a limited number of sensors, mathematical operations are required to filter out noise sources and at the same time extract the spectral, temporal, and spatial information of the sources of interest. By default the EEG recorded from \(N_e\) electrodes during \(N_t\) time samples, i.e., \({{\textbf{X}}}(e,t)\in {\mathbb {R}}^{(N_e\times N_t)}\), does not contain the spectral information as a 3rd dimension. Hence, time-frequency analysis techniques such as wavelets (e.g., Kronland-Martinet et al. 1987) can be applied to extract the spectral signatures of the activities at each electrode, or alternatively, temporal band-pass filtering can be applied to divide the wide band EEG, \({\mathbf {X}}(e,t)\), into the popular delta (0–4 Hz), theta (4–8 Hz), alpha (8–12 Hz), beta (12–28 Hz), and gamma (>28 Hz) sub-bands,

$$\begin{aligned} {\textbf{X}}(e,t)= \sum _{f=1} ^{N_f} {\textbf{X}}_f(e,t). \end{aligned}$$
(1)

Using the band-pass filtering the EEG data can be presented in a 3rd order tensor

$$\begin{aligned} \varvec{\mathcal {X}}(e,t,f), \varvec{\mathcal {X}}\in {\mathbb {R}}^{N_e\times N_t\times N_f}, \end{aligned}$$
(2)

where f refers to frequency and \(N_f\) is the number of the sub-bands.
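As an illustration of Eq. (2), the following minimal sketch stacks band-pass filtered copies of a sensor-level EEG matrix into a 3rd order tensor. The function name `bandpass_tensor`, the zero-phase 4th order Butterworth filter, the 100 Hz sampling rate, and the random placeholder data are assumptions made for this example only; two bands (theta and alpha) are shown, so the slices do not sum to the wide band signal as in Eq. (1).

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def bandpass_tensor(X, fs, bands, order=4):
    """Stack band-pass filtered copies of X (N_e x N_t) into an N_e x N_t x N_f tensor."""
    slices = []
    for low, high in bands:
        # Butterworth band-pass in second-order sections for numerical stability
        sos = butter(order, [low, high], btype="bandpass", fs=fs, output="sos")
        # Forward-backward filtering along time gives zero phase distortion
        slices.append(sosfiltfilt(sos, X, axis=1))
    return np.stack(slices, axis=2)

rng = np.random.default_rng(0)
fs = 100.0                               # assumed sampling rate in Hz
X = rng.standard_normal((64, 3000))      # placeholder for 64-channel EEG
bands = [(4.0, 8.0), (8.0, 12.0)]        # theta and alpha band edges in Hz
X_tensor = bandpass_tensor(X, fs, bands)  # shape (N_e, N_t, N_f)
```

The frequency mode is placed last so that the array layout follows the index order \((e,t,f)\) of Eq. (2).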

One inherent limitation of EEG is that the source-level activity (3D brain space) is recorded using the sensor-level electrodes (2D space). As a result, the activities of different sources are highly overlapped at the sensor-level. Inverse solutions such as minimum-variance (Van Veen et al. 1997; Jonmohamadi et al. 2014b) spatial filters can be used to project the sensor-level data into the source-level \(\Omega\), to reduce the overlap of the sources and at the same time estimate the brain maps associated with certain activities. A scanning grid is required to cover the brain space and spatial filter coefficients \({\textbf{w}}(g)\) should be applied for every point of the scanning grid g to estimate the EEG activity from that point,

$$\begin{aligned} {\textbf{x}}(g,t)={\textbf{w}}^T(g){\textbf{X}}(e,t), g \in {\Omega }. \end{aligned}$$
(3)

The EEG projected into the brain space is a matrix with \(N_g\) grid points and \(N_t\) time points, \({\textbf{X}}(g,t)\in {\mathbb {R}}^{N_g\times N_t}\), which represents a substantial increase in the spatial dimension from \(N_e\) to \(N_g\) compared with the sensor-level EEG, for example, 64 sensors compared with a few thousand voxels. Similar to the sensor-level EEG, the source-level EEG can also be band-pass filtered (\({\textbf{X}}_f(g,t)\)) to create the 3rd order tensor,

$$\begin{aligned} \varvec{\mathcal {X}}'(g,t,f), \varvec{\mathcal {X}}'\in {\mathbb {R}}^{N_g\times N_t\times N_f}. \end{aligned}$$
(4)
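The projection of Eq. (3) can be sketched for all grid points at once. The weights below follow the standard unit-gain minimum-variance form \({\textbf{w}}(g)={\textbf{C}}^{-1}{\textbf{l}}(g)/({\textbf{l}}^T(g){\textbf{C}}^{-1}{\textbf{l}}(g))\), where \({\textbf{C}}\) is the sensor covariance and \({\textbf{l}}(g)\) the leadfield of grid point g. A fixed source orientation, the small diagonal loading `reg`, and the random placeholder leadfields are assumptions of this sketch and not necessarily the settings used in this work.

```python
import numpy as np

def min_variance_weights(C, L, reg=1e-6):
    """Unit-gain minimum-variance weights, one column per grid point.

    C: (N_e, N_e) sensor covariance; L: (N_e, N_g) fixed-orientation leadfields.
    """
    n_e = C.shape[0]
    # Diagonal loading keeps the covariance well conditioned (assumed value)
    C_reg = C + reg * np.trace(C) / n_e * np.eye(n_e)
    Ci_L = np.linalg.solve(C_reg, L)          # C^{-1} l(g) for every g
    gain = np.einsum("eg,eg->g", L, Ci_L)     # l(g)^T C^{-1} l(g)
    return Ci_L / gain                        # (N_e, N_g)

rng = np.random.default_rng(1)
N_e, N_g, N_t = 16, 50, 2000
X = rng.standard_normal((N_e, N_t))   # placeholder sensor-level EEG
L = rng.standard_normal((N_e, N_g))   # placeholder leadfields
C = X @ X.T / N_t                     # sample covariance of the sensors
W = min_variance_weights(C, L)
X_src = W.T @ X                       # Eq. (3) applied to every grid point
```

Each column of `W` passes its own leadfield with unit gain, which is the constraint that distinguishes this filter from an unconstrained least-squares projection.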

The fMRI data from \(N_v\) voxels and \(N_{t'}\) time points, i.e., \(\varvec{{Y}}(v,t') \in {\mathbb {R}}^{N_v\times N_{t'}}\), is recorded with a much slower sampling rate (typically one sample every few seconds) compared with the EEG. In order to temporally relate the EEG to the fMRI, the preprocessed EEG data is convolved (\(\otimes\)) along the time mode with a canonical haemodynamic response function (HRF) \(h(t)\) (Glover 1999) to take the haemodynamic delay into account,

$$\begin{aligned} \breve{\mathcal {X}}(e,t,f)=\mathbf {\mathcal {X}}(e,t,f)\otimes h(t) \in {\mathbb {R}}^{N_e\times N_t\times N_f}. \end{aligned}$$
(5)

Since the participants are performing a task, the experimentally specified paradigm signal can be used as a temporal constraint to extract the task and non-task related EEG and fMRI intervals. In our example data the paradigm \({\textbf{p}}(t)\) (often a boxcar in fMRI studies) consists of 30 s of a working-memory task followed by a 30 s non-task (resting state) period, repeated for 9 cycles. The paradigm signal is also convolved with the canonical HRF,

$$\begin{aligned} \breve{{\textbf{p}}}(t)={{\textbf{p}}}(t)\otimes h(t) \in {\mathbb {R}}^{N_{t}}. \end{aligned}$$
(6)
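The convolutions of Eqs. (5) and (6) can be sketched as follows. The double-gamma HRF and its parameters (`a1`, `a2`, `c`) are illustrative assumptions and not necessarily the parameterisation of Glover (1999); the 1.1 s sampling step and the boxcar that starts with the task block are likewise assumed for the example.

```python
import math
import numpy as np

def double_gamma_hrf(dt, duration=32.0, a1=6.0, a2=16.0, c=1.0 / 6.0):
    """Double-gamma HRF sampled every dt seconds (illustrative parameters)."""
    t = np.arange(0.0, duration, dt)
    peak = t ** (a1 - 1) * np.exp(-t) / math.gamma(a1)
    undershoot = t ** (a2 - 1) * np.exp(-t) / math.gamma(a2)
    h = peak - c * undershoot
    return h / h.sum()            # unit area, so a constant input keeps its level

def convolve_hrf(x, h):
    """Causal convolution along the last (time) axis, truncated to the input length."""
    n = x.shape[-1]
    return np.apply_along_axis(lambda v: np.convolve(v, h)[:n], -1, x)

dt = 1.1                                       # assumed sampling step in seconds
n_cycles, block = 9, 30.0
t = np.arange(0.0, 2 * block * n_cycles, dt)   # 540 s of paradigm
p = ((t % (2 * block)) < block).astype(float)  # 30 s task, 30 s non-task
p_conv = convolve_hrf(p, double_gamma_hrf(dt))
```

Because `convolve_hrf` acts on the last axis, the same call applies to an EEG array whose last axis is time; for the layout \((e,t,f)\) of Eq. (5) the time axis would first be moved to the end, e.g. with `np.moveaxis`.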

It is known that the brain has a lag structure (Mitra et al. 2014; Feige et al. 2017), meaning that different brain regions have different HRFs. To account for these varying HRFs, the convolved paradigm signal was shifted several times to create a matrix of paradigm signals in which each row is slightly shifted in time with respect to the previous row. This matrix can be used as a temporal constraint to extract the corresponding spatial, spectral, and participant related features,

$$\begin{aligned} \breve{{\textbf{P}}}(t)= \begin{pmatrix} \breve{{\textbf{p}}}(t-\phi ) \\ \breve{{\textbf{p}}}(t-(\phi -1)) \\ \vdots \\ \breve{{\textbf{p}}}(t-(\phi -2\phi )) \end{pmatrix}, \end{aligned}$$
(7)

where \(\phi\) is the number of time-sample shifts in each direction, set to 4 in this work. Both the EEG and the paradigm signals were downsampled to twice the fMRI sampling rate (\(t\Rightarrow t'\)), and the fMRI was upsampled to 0.909 Hz to match the sampling rate of the EEG and paradigm signals. This upsampling of the fMRI allows the signals to be shifted by less than one original fMRI sample, so that \(\breve{{\textbf{P}}}(t)\) covers −4.4 s to +4.4 s in 1.1 s steps with respect to the paradigm signal.
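The construction of the shifted paradigm matrix of Eq. (7) can be sketched as below. Holding the edge values when shifting (rather than wrapping around or zero padding) is an assumption of this example, as are the function name and the placeholder signal.

```python
import numpy as np

def shifted_paradigm_matrix(p, phi):
    """Rows are p delayed by phi, phi-1, ..., -phi samples (edge values held)."""
    n = p.shape[0]
    padded = np.pad(p, phi, mode="edge")
    # A delay of k samples reads p[i - k], i.e. padded[i - k + phi]
    rows = [padded[phi - k : phi - k + n] for k in range(phi, -phi - 1, -1)]
    return np.vstack(rows)                   # shape (2*phi + 1, n)

phi, dt = 4, 1.1
p = np.sin(2 * np.pi * np.arange(200) * dt / 60.0)   # placeholder 1/60 Hz signal
P = shifted_paradigm_matrix(p, phi)
delays_s = np.arange(phi, -phi - 1, -1) * dt         # delay of each row in seconds
```

With \(\phi = 4\) and a 1.1 s sampling step, the rows span delays of 4.4 s down to −4.4 s, and the middle row is the unshifted signal.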

Another dimension of the data arises when the recording involves several participants (\(N_s\)), which is typical in EEG and fMRI studies. Therefore, the fMRI is represented by a 3rd order tensor and the sensor-level or source-level EEG by a 4th order tensor:

$$\begin{array}{*{20}l} \mathcal {Y}(v,t',s)\in&{\mathbb {R}}^{N_v\times N_{t'}\times N_s}, \\ \breve{\mathcal {X}}(e,t',f,s)\in&{\mathbb {R}}^{N_e\times N_{t'}\times N_f\times N_s}, \\ \breve{\mathcal {X}}'(g,t',f,s)\in&{\mathbb {R}}^{N_g\times N_{t'}\times N_f\times N_s}. \end{array}$$
(8)

PARAFAC

PARAFAC approximates the original tensor as the sum of \(N_c\) rank-one tensors,

$$\begin{aligned} \begin{aligned} \mathcal {Z}=\sum _{i=1} ^{N_c}\mathcal {Z}_i + \mathcal {E} \approx \sum _{i=1} ^{N_c}\mathcal {Z}_i \end{aligned} \end{aligned}$$
(9)

where \(\mathcal {E}\) is the error tensor and \(\mathcal {Z}_i\) is the rank-one tensor corresponding to component i. In the case of EEG and fMRI this can be written as

$$\begin{aligned} \begin{aligned} \mathcal {Y}(v,t',s) \approx&\sum _{i=1} ^{N_c}\varvec{{\hat{v}}}_i \circ \varvec{{\hat{t}}}_i \circ \varvec{{\hat{s}}}_i, \\ \breve{\mathcal {X}}(e,t',f,s) \approx&\sum _{i=1} ^{N_c}\varvec{{\hat{e}}}_i \circ \varvec{{\hat{t}}}_i \circ \varvec{{\hat{f}}}_i \circ \varvec{{\hat{s}}}_i, \\ \breve{\mathcal {X}}'(g,t',f,s) \approx&\sum _{i=1} ^{N_c}\varvec{{\hat{g}}}_i \circ \varvec{{\hat{t}}}_i \circ \varvec{{\hat{f}}}_i \circ \varvec{{\hat{s}}}_i, \\ \end{aligned} \end{aligned}$$
(10)

where \(\circ\) is the outer product and \(\varvec{{\hat{v}}}_i\), \(\varvec{{\hat{t}}}_i\), \(\varvec{{\hat{s}}}_i\), \(\varvec{{\hat{f}}}_i\), \(\varvec{{\hat{e}}}_i\), and \(\varvec{{\hat{g}}}_i\) are known as “loading vectors”, which correspond to the fMRI spatial, temporal, participant, EEG spectral, sensor-level EEG spatial, and source-level EEG spatial signatures of the ith component, respectively. Here, for simplicity, the loading vectors are also referred to as ’components’. Similarly, the loading matrices \([{\varvec{{\hat{V}}}}, {\varvec{{\hat{T}}}}, {\varvec{{\hat{S}}}}], [{\varvec{{\hat{E}}}}, {\varvec{{\hat{T}}}}, {\varvec{{\hat{F}}}}, {\varvec{{\hat{S}}}}],\) and \([{\varvec{{\hat{G}}}}, {\varvec{{\hat{T}}}}, {\varvec{{\hat{F}}}}, {\varvec{{\hat{S}}}}]\) contain the spatial, temporal, spectral, and participant signatures of all components for the fMRI and EEG data.
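The sums of rank-one tensors in Eq. (10) can be evaluated directly from the loading matrices. The sketch below, with assumed toy dimensions and random loadings, reconstructs a 3rd order (fMRI-like) and a 4th order (EEG-like) tensor; column i of each loading matrix holds the loading vector of component i.

```python
import numpy as np

def cp_reconstruct_3way(V, T, S):
    """Sum over components of v_i o t_i o s_i, giving an (N_v, N_t, N_s) tensor."""
    return np.einsum("vr,tr,sr->vts", V, T, S)

def cp_reconstruct_4way(E, T, F, S):
    """Sum over components of e_i o t_i o f_i o s_i, giving an (N_e, N_t, N_f, N_s) tensor."""
    return np.einsum("er,tr,fr,sr->etfs", E, T, F, S)

rng = np.random.default_rng(2)
N_v, N_e, N_t, N_f, N_s, N_c = 7, 5, 20, 2, 6, 3   # toy sizes (assumed)
V = rng.standard_normal((N_v, N_c))
E = rng.standard_normal((N_e, N_c))
T = rng.standard_normal((N_t, N_c))
F = rng.standard_normal((N_f, N_c))
S = rng.standard_normal((N_s, N_c))

Y_hat = cp_reconstruct_3way(V, T, S)
X_hat = cp_reconstruct_4way(E, T, F, S)
```

Here the temporal and participant loadings `T` and `S` are reused for both tensors, which mirrors the shared \(\varvec{{\hat{t}}}_i\) and \(\varvec{{\hat{s}}}_i\) in Eq. (10).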

Coupled Tensor–Tensor Factorization

In the case of coupled tensor-tensor factorization of the fMRI and sensor-level EEG, the cost function of the original CMTF (Sorber et al. 2015) can be written as

$$\begin{aligned} \begin{aligned}&J(\varvec{{\hat{V}}},\varvec{{\hat{T}}},\varvec{{\hat{S}}},\varvec{{\hat{E}}},\varvec{{\hat{F}}}) \\&\quad =\Vert \mathcal {Y}-\sum _{i=1} ^{N_c}\varvec{{\hat{v}}}_i \circ \varvec{{\hat{t}}}_i \circ \varvec{{\hat{s}}}_i\Vert ^2 + \Vert \breve{\mathcal {X}}-\sum _{i=1} ^{N_c}\varvec{{\hat{e}}}_i \circ \varvec{{\hat{t}}}_i \circ \varvec{{\hat{f}}}_i \circ \varvec{{\hat{s}}}_i\Vert ^2. \end{aligned} \end{aligned}$$
(11)

In the above cost function, the fMRI and EEG are coupled in the time and participant modes. The coupled components are assumed to be exactly the same in the two tensors, which is known to be a strong assumption. Moreover, it is highly likely that the EEG and fMRI have different ranks, i.e., \({N_{EEG}}\) and \({N_{fMRI}}\). One way to relax this assumption is to use partial coupling, where only some of the components in the time and participant modes are coupled between the two tensors,

$$\begin{aligned} \begin{aligned}&J(\varvec{{\hat{V}}},\varvec{{\hat{T}}}_{fMRI},\varvec{{\hat{S}}}_{fMRI},\varvec{{\hat{E}}},\varvec{{\hat{T}}}_{EEG},\varvec{{\hat{F}}},\varvec{{\hat{S}}}_{EEG}) \\&\quad =\Vert \mathcal {Y}-\sum _{i=1} ^{N_{fMRI}}\varvec{{\hat{v}}}_i \circ \varvec{{\hat{t}}}_i \circ \varvec{{\hat{s}}}_i\Vert ^2 \\&\qquad + \Vert \breve{\mathcal {X}}-\sum _{i=1} ^{N_{EEG}}\varvec{{\hat{e}}}_i\circ \varvec{{\hat{t}}}_i \circ \varvec{{\hat{f}}}_i \circ \varvec{{\hat{s}}}_i\Vert ^2. \end{aligned} \end{aligned}$$
(12)

The \(\varvec{{\hat{T}}}_{fMRI}\) and \(\varvec{{\hat{T}}}_{EEG}\) are partially the same which is also the case with \(\varvec{{\hat{S}}}_{fMRI}\) and \(\varvec{{\hat{S}}}_{EEG}\):

$$\begin{aligned} \begin{aligned} \varvec{{\hat{T}}}_{fMRI} = \big [\varvec{{\hat{T}}}_{com};~\varvec{{\hat{T}}}'_{fMRI}\big ], \varvec{{\hat{T}}}_{fMRI} \in {\mathbb {R}}^{N_{t'}\times N_{fMRI}} \\ \varvec{{\hat{T}}}_{com} \in {\mathbb {R}}^{N_{t'}\times N_{com}}, \varvec{{\hat{T}}}'_{fMRI} \in {\mathbb {R}}^{N_{t'}\times (N_{fMRI}-N_{com})}, \end{aligned} \end{aligned}$$
(13)

and similarly

$$\begin{aligned} \begin{aligned} \varvec{{\hat{T}}}_{EEG} = \big [\varvec{{\hat{T}}}_{com};~\varvec{{\hat{T}}}'_{EEG}\big ],~~\varvec{{\hat{T}}}_{EEG} \in {\mathbb {R}}^{N_{t'}\times N_{EEG}} \\ \varvec{{\hat{T}}}_{com} \in {\mathbb {R}}^{N_{t'}\times N_{com}},~~\varvec{{\hat{T}}}'_{EEG} \in {\mathbb {R}}^{N_{t'}\times (N_{EEG}-N_{com})}, \end{aligned} \end{aligned}$$
(14)

where \(N_{com}\) is the number of common components in the time and participant domains between EEG and fMRI, and ’;’ separates the matrix blocks, which are concatenated along the component (column) dimension, as the stated dimensions require. Equations analogous to Eqs. (13) and (14) apply to the participant loading matrices (\(\varvec{{\hat{S}}}_{fMRI}\) and \(\varvec{{\hat{S}}}_{EEG}\)).
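The partially coupled cost of Eq. (12) can be evaluated as in the following sketch. The common and modality-specific blocks are stacked side by side, so that the temporal loading matrices have \(N_{fMRI}\) and \(N_{EEG}\) columns, as the dimensions in Eqs. (13) and (14) indicate. All function and variable names, the toy dimensions, and the noiseless synthetic data are assumptions of this example.

```python
import numpy as np

def partially_coupled_cost(Y, X, V, E, F, T_com, T_f, T_e, S_com, S_f, S_e):
    """Cost of Eq. (12); the first N_com columns of the T and S factors are shared."""
    T_fmri = np.hstack([T_com, T_f])      # N_t' x N_fMRI
    T_eeg = np.hstack([T_com, T_e])       # N_t' x N_EEG
    S_fmri = np.hstack([S_com, S_f])      # N_s x N_fMRI
    S_eeg = np.hstack([S_com, S_e])       # N_s x N_EEG
    Y_hat = np.einsum("vr,tr,sr->vts", V, T_fmri, S_fmri)
    X_hat = np.einsum("er,tr,fr,sr->etfs", E, T_eeg, F, S_eeg)
    return np.sum((Y - Y_hat) ** 2) + np.sum((X - X_hat) ** 2)

rng = np.random.default_rng(3)
N_v, N_e, N_t, N_f, N_s = 10, 8, 30, 2, 6      # toy sizes (assumed)
N_com, N_fmri, N_eeg = 2, 3, 4
V = rng.standard_normal((N_v, N_fmri))
E = rng.standard_normal((N_e, N_eeg))
F = rng.standard_normal((N_f, N_eeg))
T_com = rng.standard_normal((N_t, N_com))
S_com = rng.standard_normal((N_s, N_com))
T_f = rng.standard_normal((N_t, N_fmri - N_com))
T_e = rng.standard_normal((N_t, N_eeg - N_com))
S_f = rng.standard_normal((N_s, N_fmri - N_com))
S_e = rng.standard_normal((N_s, N_eeg - N_com))

# Noiseless data generated from the same factors, so the cost should vanish
Y = np.einsum("vr,tr,sr->vts", V, np.hstack([T_com, T_f]), np.hstack([S_com, S_f]))
X = np.einsum("er,tr,fr,sr->etfs", E, np.hstack([T_com, T_e]), F, np.hstack([S_com, S_e]))
cost = partially_coupled_cost(Y, X, V, E, F, T_com, T_f, T_e, S_com, S_f, S_e)
```

Only `T_com` and `S_com` appear in both data terms; the remaining columns are free to differ between the modalities, which is what allows different ranks for the two tensors.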

Since the subjects are performing a common task, i.e., 30 s of 2-Back followed by 30 s of 0-Back memory tasks, the temporal signatures of desired EEG and fMRI features are approximately known

$$\begin{aligned} \varvec{{\hat{T}}}_{com}=\breve{{\textbf{P}}}(t')^T, N_{com}=2\times \phi + 1. \end{aligned}$$
(15)

Therefore, \(\varvec{{\hat{T}}}_{com}\) can be used as a temporal constraint for solving Eq. (12), and its corresponding spatial, spectral, and participant loading matrices can be extracted as task related features. However, applying \(\varvec{{\hat{T}}}_{com}\) as a partially known temporal loading matrix is also a strong assumption, since fluctuations in the power of the EEG and fMRI are expected which will not match the shape of \(\varvec{{\hat{T}}}_{com}\). In coupled data factorization, the initialization of the loading matrices is known to be an important step (Vervliet et al. 2016). Hence, rather than considering \(\varvec{{\hat{T}}}_{com}\) as a known loading matrix, it was used as the initializer of the coupled EEG and fMRI loading matrix.
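The role of the temporal initializer can be illustrated with a plain alternating least squares (ALS) scheme. This is a stand-in chosen for brevity and not the solver used in this work; for the same reason it treats the fully coupled cost of Eq. (11) rather than the partially coupled one. The function names, the toy dimensions, the noise levels, and the random initialization of the remaining factors are all assumptions.

```python
import numpy as np

def gram(*mats):
    """Hadamard product of the Gram matrices M^T M of the given factors."""
    return np.prod([M.T @ M for M in mats], axis=0)

def coupled_cost(Y, X, V, T, S, E, F):
    Y_hat = np.einsum("vr,tr,sr->vts", V, T, S)
    X_hat = np.einsum("er,tr,fr,sr->etfs", E, T, F, S)
    return np.sum((Y - Y_hat) ** 2) + np.sum((X - X_hat) ** 2)

def coupled_als(Y, X, T_init, n_iter=30, seed=0):
    """ALS for Eq. (11): T and S are shared by Y (v,t,s) and X (e,t,f,s)."""
    rng = np.random.default_rng(seed)
    R = T_init.shape[1]
    T = T_init.copy()                       # temporal factor starts from the prior
    V = rng.standard_normal((Y.shape[0], R))
    S = rng.standard_normal((Y.shape[2], R))
    E = rng.standard_normal((X.shape[0], R))
    F = rng.standard_normal((X.shape[2], R))
    history = [coupled_cost(Y, X, V, T, S, E, F)]
    for _ in range(n_iter):
        # Modality-specific factors: ordinary least-squares updates
        V = np.einsum("vts,tr,sr->vr", Y, T, S) @ np.linalg.pinv(gram(T, S))
        E = np.einsum("etfs,tr,fr,sr->er", X, T, F, S) @ np.linalg.pinv(gram(T, F, S))
        F = np.einsum("etfs,er,tr,sr->fr", X, E, T, S) @ np.linalg.pinv(gram(E, T, S))
        # Shared factors: both data terms enter the normal equations
        rhs = np.einsum("vts,vr,sr->tr", Y, V, S) + np.einsum("etfs,er,fr,sr->tr", X, E, F, S)
        T = rhs @ np.linalg.pinv(gram(V, S) + gram(E, F, S))
        rhs = np.einsum("vts,vr,tr->sr", Y, V, T) + np.einsum("etfs,er,tr,fr->sr", X, E, T, F)
        S = rhs @ np.linalg.pinv(gram(V, T) + gram(E, T, F))
        history.append(coupled_cost(Y, X, V, T, S, E, F))
    return V, T, S, E, F, history

rng = np.random.default_rng(4)
N_v, N_e, N_t, N_f, N_s, R = 12, 8, 40, 2, 6, 3    # toy sizes (assumed)
V0, E0 = rng.standard_normal((N_v, R)), rng.standard_normal((N_e, R))
T0, F0 = rng.standard_normal((N_t, R)), rng.standard_normal((N_f, R))
S0 = rng.standard_normal((N_s, R))
Y = np.einsum("vr,tr,sr->vts", V0, T0, S0) + 0.01 * rng.standard_normal((N_v, N_t, N_s))
X = np.einsum("er,tr,fr,sr->etfs", E0, T0, F0, S0) + 0.01 * rng.standard_normal((N_e, N_t, N_f, N_s))

T_init = T0 + 0.1 * rng.standard_normal(T0.shape)  # approximately known time courses
V, T, S, E, F, history = coupled_als(Y, X, T_init)
```

Because `T` is only initialized and then updated like any other factor, the estimated time courses are free to deviate from the prior, which corresponds to using the paradigm matrix as an initializer rather than as a fixed loading matrix.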

There are several methods for deciding the number of components (\(N_{EEG}\) and \(N_{fMRI}\)). However, these methods can give substantially different results. For example, the rank estimate provided by the TensorLab toolbox (Vervliet et al. 2016) indicated 44 components for the sensor-level EEG, which gives a low error in the PARAFAC estimation, whereas the core consistency diagnostic of the N-way toolbox (Andersson and Bro 2000) gave a score of 92% with only \(N_{EEG}=5\). While setting \(N_{EEG}=44\) resulted in many similar/identical components, \(N_{EEG}=5\) appeared to be too small. Trial and error was suggested as an alternative in (Vervliet et al. 2016). Using the trial-and-error approach, setting \(N_{EEG}=20\) and \(N_{fMRI}=14\) resulted in components which were not similar/identical and, at the same time, were mostly reproducible when rerunning the PARAFAC. Similarly, \(N_{com}\) was set to 9, which covered +4.4 s to −4.4 s in 1.1 s steps with respect to the paradigm signal. In general, \(N_{EEG}\) and \(N_{fMRI}\) depend on factors such as the type of experiment or the recording systems, and therefore the aforementioned values cannot be assumed to hold for other EEG-fMRI coupling analyses. Seven coupled components were reproducible when rerunning the coupled tensor-tensor factorization.

The coupling of the EEG and fMRI can be done in two ways: source-level EEG with fMRI, or sensor-level EEG with fMRI. Both result in similar features; however, the former is more time consuming due to the size of the source-level EEG tensor (i.e., \(\approx\) 285 min as compared with \(\approx\) 40 min). Hence, the coupling was first applied between the sensor-level EEG and fMRI tensors (\(\breve{\mathcal {X}}\) and \(\mathcal {Y}\)), and the resultant common temporal and participant loading matrices were then used as a known prior for PARAFAC on the source-level EEG (\(\breve{\mathcal {X}}'\)). The spectral loading matrix of the sensor-level EEG was used for the initialization of the PARAFAC on the source-level EEG. The block diagram of the proposed method is shown in Fig. 1.

Fig. 1 The block diagram illustrates the spatial, temporal, and spectral operations required to create the 4th order EEG and 3rd order fMRI tensors. The EEG and fMRI tensors could be coupled in temporal and participant domains. The paradigm signal could be used as a temporal constraint for the coupled tensor-tensor decomposition

Data Acquisition

Part of the data for this study was obtained from a recent study (Jonmohamadi et al. 2018) and the remainder was newly recorded. Participants performed an N-Back working-memory task with alternating 0-Back and 2-Back conditions. During the 0-Back condition, participants responded to the current arrow on the display screen, whereas during the 2-Back condition, participants responded to the arrow shown two trials earlier. Each arrow was on-screen for 0.5 s. The experimental conditions alternated in a 30 s boxcar, with each condition repeated nine times for a total duration of 540 s. Before the recording, participants practiced the task at least two times outside the scanner until an accuracy of at least 80% was achieved.

In total there were 6 participants: 3 males (participants 1, 2, and 3) aged 20, 33, and 38, and 3 females (participants 4, 5, and 6) aged 24, 26, and 22. All participants scored more than 95% correct in the memory task.

EEG data were continuously recorded from 64 channels using the Standard BrainCap MR and BrainAmp MR Plus amplifiers (Brain Products, Munich, Germany) inside an MRI machine. The EEG was acquired using the manufacturer’s standard cap layout, with the ground electrode located at AFz, the reference electrode at FCz, and a drop-down electrode attached centrally to the participant’s back for the recording of electrocardiography. The impedances of the electrodes were below 10 kOhm.

In order to compute each participant’s individual leadfield, EEG-MR co-registration was achieved by placing vitamin E capsules at electrode positions Cz, F5, CP5, and FC6.

MR images were acquired on a 3 T Siemens Skyra scanner (Siemens, Erlangen, Germany) with a 20-channel head coil. BOLD fMRI data were acquired using a T2*-weighted echo planar imaging sequence (voxel size 3 \(\times\) 3 \(\times\) 3 mm).

Data Preprocessing

The analysis of the EEG was done using the FieldTrip toolbox (Oostenveld et al. 2011). Two MATLAB-based toolboxes, TensorLab (Vervliet et al. 2016) and the N-way toolbox (Andersson and Bro 2000), were used for the PARAFAC tensor decomposition. The brain source images are shown using FSLeyes (McCarthy 2018).

The gradient artefact was removed using realignment parameter-informed artefact correction (Moosmann et al. 2009), and the ballistocardiogram artefacts were rejected using statistical feature extraction for artefact removal (Liu et al. 2012). The EEG was band-pass filtered to 1–20 Hz using a 4th order Butterworth filter, downsampled to 100 Hz, and re-referenced to the FCz electrode. The EEG was then projected into the source-level using the minimum-variance beamformer. The leadfields of the beamformer were calculated using a five-layer realistic finite element model of the head, obtained from the individual structural MRI scan of each participant (Vorwerk et al. 2018). The grid size for the beamformer was 8 mm.

It is well known that EEG sources with theta and alpha band activities are associated with working-memory tasks (Debener et al. 2005; Khader et al. 2010; Dong et al. 2015; Stokić et al. 2015; Esposito et al. 2009); therefore, both the sensor-level and source-level EEGs were band-pass filtered to the theta and alpha bands. Next, the Hilbert envelopes of the EEGs were computed, convolved with a canonical HRF (Glover 1999), and downsampled to the frequency of the fMRI recording (0.4545 Hz). Similarly, the paradigm signal was also convolved with the canonical HRF.
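The envelope step can be sketched as follows: the amplitude envelope of a band-limited signal is the magnitude of its analytic signal. The function name, the 100 Hz sampling rate, and the synthetic 10 Hz test signal are assumptions of this example; the subsequent HRF convolution and downsampling are not shown.

```python
import numpy as np
from scipy.signal import hilbert

def band_envelope(x_band):
    """Amplitude envelope of band-limited signals along the last (time) axis."""
    return np.abs(hilbert(x_band, axis=-1))

fs = 100.0
t = np.arange(1000) / fs                          # 10 s of samples
x_alpha = 2.0 * np.sin(2 * np.pi * 10.0 * t)      # synthetic alpha-band oscillation
env = band_envelope(x_alpha)
```

For this synthetic oscillation, which completes a whole number of cycles within the window, the envelope is flat at the oscillation amplitude; for real band-passed EEG the envelope tracks the slow fluctuations of band power.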

The fMRI processing included brain extraction, registration, and motion correction using the FSL toolbox (Jenkinson et al. 2012) and normalization of the data.

Spatial Standardization

In order to present the source-level EEG of different participants in the same space, an 8 mm scanning grid template was created using the T1 brain and warped into the individual brain spaces (MRIs). In the case of fMRI, a 3 mm T1 MNI brain template was used to co-register the individual fMRIs. The white matter was masked out.

Temporal Normalization and Centering

Centering (removal of the non-zero mean from the data) is required prior to scaling (Bro 1997). Since the task includes 9 cycles of 30 s of 0-Back memory task and 30 s of 2-Back memory task, the paradigm signal has a frequency of 1/60 Hz. Centering was achieved by band-pass filtering the fMRI and EEG time courses of Eq. (8) between 1/100 and 1/50 Hz. This removes the transient activities so that only the stationary ones remain. In order to scale the sensor-level EEG, source-level EEG, and fMRI time courses, the mean of the Frobenius norms of all the time courses of each tensor was calculated and each time course was then divided by the corresponding mean. Using this approach, the sensor-level EEG, source-level EEG, and fMRI time courses have similar variances.
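The centering and scaling described above can be sketched for one tensor as follows. The 2nd order zero-phase Butterworth filter, the function name, and the random placeholder tensor are assumptions of this example; the sampling rate of 1/1.1 Hz corresponds to the 0.909 Hz mentioned in the Data Structure section.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def center_and_scale(Z, fs, band=(1.0 / 100.0, 1.0 / 50.0), order=2, time_axis=1):
    """Band-pass the time courses, then divide by the mean of their norms."""
    sos = butter(order, band, btype="bandpass", fs=fs, output="sos")
    Zc = sosfiltfilt(sos, Z, axis=time_axis)       # removes the mean and slow drifts
    norms = np.linalg.norm(Zc, axis=time_axis)     # one norm per time course
    return Zc / norms.mean()

rng = np.random.default_rng(5)
fs = 1.0 / 1.1                                     # about 0.909 Hz
Y = 5.0 + rng.standard_normal((20, 491, 6))        # placeholder (N_v, N_t', N_s) tensor
Y_scaled = center_and_scale(Y, fs)
```

Applying the same function separately to the sensor-level EEG, source-level EEG, and fMRI tensors gives each of them a mean time-course norm of one, so that no modality dominates the coupled cost merely because of its units.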

Results

In order to identify the task related components, a correlation test was performed between the temporal signatures of the components and the paradigm signal shifted by 4.4 s, 0.0 s, and −4.4 s. Out of the 9 coupled components, components 2–8 had a high correlation coefficient (>0.85). The result of the correlation test is shown in Fig. 2, where subfigure (a) refers to the EEG and subfigure (b) refers to the fMRI components. The first 9 scores are the same between the EEG and fMRI as they were coupled. Besides the coupled components, components 12 and 18 of the EEG also have high correlation scores.
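The correlation test can be sketched as a matrix of Pearson coefficients between every temporal loading vector and every shifted paradigm signal. The function name, the synthetic reference signals, and the rule of keeping a component when its best coefficient exceeds 0.85 are assumptions of this example.

```python
import numpy as np

def correlate_with_paradigm(T_hat, P_ref):
    """Pearson correlation of each column of T_hat with each row of P_ref."""
    Tz = (T_hat - T_hat.mean(axis=0)) / T_hat.std(axis=0)
    Pz = (P_ref - P_ref.mean(axis=1, keepdims=True)) / P_ref.std(axis=1, keepdims=True)
    return Pz @ Tz / T_hat.shape[0]               # (n_refs, n_components)

rng = np.random.default_rng(6)
n = 491
t = np.arange(n) * 1.1
# Three reference signals at delays of +4.4 s, 0.0 s, and -4.4 s
P_ref = np.vstack([np.sin(2 * np.pi * (t - d) / 60.0) for d in (4.4, 0.0, -4.4)])
# Toy temporal loadings: one task-locked, one sign-flipped, one pure noise
T_hat = np.column_stack([P_ref[1], -P_ref[1], rng.standard_normal(n)])

R = correlate_with_paradigm(T_hat, P_ref)
task_related = R.max(axis=0) > 0.85               # threshold taken from the text
```

In this toy case only the first loading is flagged, since the sign-flipped and noise loadings do not reach the threshold for any of the three shifts.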

Fig. 2 The correlation coefficients obtained by the correlation test between the temporal loading matrix of the EEG (upper) and fMRI (lower) and three shifted paradigm signals of +4.4 s, 0.0 s, and −4.4 s. The first 9 factors are coupled between the EEG and fMRI and are therefore identical

The temporal, participant, spectral, and spatial signatures of these components are shown in Figs. 3 and 4. According to Fig. 3a, component 2 is in the theta band (5.5 Hz) and several participants have contributed to this component to different extents. The EEG sensor map of coupled component 2 shows that the medial frontal channel AFz and its surrounding channels pick up the highest amount of the memory related theta activity. There is also a smaller amount of negativity, which is bilateral, on the mid posterior channels. The presence of medial frontal theta during memory tasks has been shown in many previous memory studies (Debener et al. 2005; Dong et al. 2015; Michels et al. 2010; Berger et al. 2014), and the slight negativity of the posterior channels in the theta band is shown in (Berger et al. 2014), which is consistent with the scalp map shown in Fig. 3a. The theta sources have also been reported to have various anterior and posterior origins (Berger et al. 2014) due to the use of different memory tasks. The corresponding EEG source map of coupled component 2 indicates that the theta activity originates from areas which cover the frontal cingulate gyrus. According to the fMRI map of coupled component 2, areas such as the premotor, left/medial frontal pole, dorsal cingulate, and posterior parietal areas, shown in red in Fig. 3a, have increases in BOLD due to the 2-Back memory task. These areas have been shown to be part of the dorsal attention network (Moosmann et al. 2009), and the map is similar to those found by other studies (Owen et al. 2005; Jonmohamadi et al. 2018).

Fig. 3 The result of the coupled partial tensor-tensor PARAFAC decomposition on a 4th order EEG tensor and 3rd order fMRI tensor, partially coupled in the temporal and participant domains. In each subfigure, the upper row, from left to right, shows the temporal, participant, and spectral signatures, whereas the lower row, left to right, shows the sensor-level EEG, source-level EEG, and the fMRI signatures of a coupled component. The hot colors of the EEG and fMRI maps indicate a positive relation and the cold colors a negative relation to the paradigm signal. The plot of the temporal signature shows the paradigm signal (in blue) as well as the actual temporal signature of each component (in red). This figure only shows coupled components 2–6. Coupled components 7 and 8 together with the EEG components 12 and 18 are shown in Fig. 4 (Color figure online)

Fig. 4 Continuation of Fig. 3. a, b are the result of the coupled partial tensor-tensor PARAFAC decomposition on a 4th order EEG tensor and 3rd order fMRI tensor, partially coupled in the temporal and participant domains. In each subfigure, the upper row, from left to right, shows the temporal, participant, and spectral signatures, whereas the lower row, left to right, shows the sensor-level EEG, source-level EEG, and the fMRI signatures of a coupled component. The hot colors of the EEG and fMRI maps indicate a positive relation and the cold colors a negative relation to the paradigm signal. The plot of the temporal signature shows the paradigm signal (in blue) as well as the actual temporal signature of each component (in red). c, d correspond to components 12 and 18 and are specific to the EEG as they are not coupled to the fMRI (Color figure online)

While there was only one theta band EEG activity related to the memory task, there were several alpha band components which were negatively time locked to the paradigm, either dominated by one participant, as shown in Figs. 3b, d and 4d, or common to several participants with relatively similar contributions, as in Figs. 3c, e and 4c. The corresponding fMRI maps of these components mostly show the default mode network, which includes the frontal gyrus, medial prefrontal cortex, amygdala, cerebral cortex, and areas of the precuneus cortex and posterior cingulate cortex. The scalp maps of these alpha band activities resemble the resting EEG maps shown in Ma et al. (2015), where inter and intra subject differences were also acknowledged in the resting state alpha band scalp maps. The EEG source maps show areas such as the lateral occipital cortex, occipital pole, and occipital fusiform gyrus as the origins of the alpha activity. The coupled component 4 belongs to the female participants.

According to the temporal signatures, most of the components had latencies similar to that of the canonical HRF (6 s), but two components maintained consistent lags. Coupled component 5 in Fig. 3d remained 2.2 s ahead of the paradigm signal, whereas EEG component 18 in Fig. 4d remained 6 s behind the paradigm signal. Both of these components are dominated by participant 3, with component 5 originating from the left occipital area and component 18 from the right side.
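A lead or lag of this kind can be quantified by cross-correlating a component's temporal signature with the paradigm signal. The following is a minimal sketch of that idea, not necessarily the procedure used to obtain the values above; the function name `estimate_lag_seconds` and the `sampling_interval_s` parameter are hypothetical, and the sketch assumes a positively time-locked signature.

```python
import numpy as np

def estimate_lag_seconds(signature, paradigm, sampling_interval_s):
    """Lag of `signature` relative to `paradigm`, in seconds.

    Positive values mean the signature trails the paradigm signal,
    negative values mean it leads. Illustrative helper (hypothetical name).
    """
    s = np.asarray(signature, dtype=float)
    p = np.asarray(paradigm, dtype=float)
    # Remove the mean so the peak reflects co-variation, not a shared offset.
    s = s - s.mean()
    p = p - p.mean()
    # Full linear cross-correlation: entry k is sum_n s[n + k] * p[n].
    xcorr = np.correlate(s, p, mode="full")
    # Lag (in samples) associated with each entry of `xcorr`.
    lags = np.arange(-(len(p) - 1), len(s))
    return lags[np.argmax(xcorr)] * sampling_interval_s
```

For a negatively time-locked component, such as the alpha band components described earlier, the signature would first be sign-flipped (or the minimum of the cross-correlation taken instead of the maximum).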

Discussion

In biomedical imaging, using more than one imaging modality to capture the variability of a physiological process is common practice. A primary goal of multimodal data processing is to find components in each of the modalities that are related to the physiological process of interest.

Traditionally, matrix factorization techniques such as PCA and ICA have been used for multimodal data processing. These techniques rely on nonphysiological assumptions, such as orthogonality and statistical independence, to achieve uniqueness among the components. In neuroimaging, typical modes of variability in the data are time, space, participant gender, participant age, participant condition, and the tasks in the paradigm. Additionally, mathematical operations on the data, such as time-frequency analysis and spatial and temporal filtering, can increase the dimension of the data. Matrices by default represent variation of the data in two modes; hence, in order to present extra modes of variability in a matrix, unfolding is necessary, which is known to lead to a loss of information (Cong et al. 2015). On the other hand, tensors conveniently present multidimensional data, and tensor decomposition techniques such as PARAFAC or Tucker decomposition do not impose constraints in the optimization process. Tensor-based analysis of concurrent EEG-fMRI has received increasing attention in recent years (Vanderperren et al. 2010; Karahan et al. 2015; Ferdowsi et al. 2015; Hunyadi et al. 2016, 2017; Acar et al. 2017a, b; Deshpande et al. 2017; Sen and Parhi 2017; Van Eyndhoven et al. 2017; Chatzichristos et al. 2018; Kinney-Lang et al. 2019). However, all the tensor-based EEG-fMRI fusion methods, except that of Chatzichristos et al. (2018), have in fact used the matrix-tensor factorization framework, i.e., fMRI in a matrix and EEG in a 3rd order tensor, and only one mode of variability, such as participant or time, has been used to provide the common loading vectors for the decomposition of the two modalities. Here, we introduced a framework in which a 4th order EEG tensor (time \(\times\) space \(\times\) frequency \(\times\) participant) was partially coupled in the time and participant domains with a 3rd order (time \(\times\) space \(\times\) participant) fMRI tensor.
Moreover, a matrix containing the paradigm signal and its shifted versions (\(\times\) 9) was used for the initialization of the coupled temporal loading matrix. Out of the nine coupled components, seven were found to be time-locked to the paradigm signal. The corresponding spatial, temporal, spectral, and participant signatures of these components recapitulated the well-known resting state and attention networks available in the literature. A further two EEG components were also related to the task but were among the non-coupled components. One of these two components had a substantial delay of 6 s with respect to the paradigm signal. Moreover, one coupled component had contributions only from the female participants.
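The initialization described above, a matrix whose columns are the paradigm signal and shifted copies of it, can be sketched as follows. This is an illustration under stated assumptions: the shift values are not given in this passage, so the `shifts` argument and the function name `shifted_paradigm_matrix` are hypothetical, and samples shifted in from outside the recording are zero-filled.

```python
import numpy as np

def shifted_paradigm_matrix(paradigm, shifts):
    """Stack shifted copies of `paradigm` as columns (time x number of shifts).

    A positive shift delays the signal, a negative shift advances it.
    Samples shifted in from outside the recording are set to zero.
    Illustrative helper (hypothetical name and zero-padding choice).
    """
    p = np.asarray(paradigm, dtype=float)
    columns = []
    for k in shifts:
        col = np.zeros_like(p)
        if k > 0:
            col[k:] = p[:-k]      # delay by k samples
        elif k < 0:
            col[:k] = p[-k:]      # advance by |k| samples
        else:
            col[:] = p            # unshifted paradigm signal
        columns.append(col)
    return np.column_stack(columns)

# Example: a block-design signal and nine columns (shifts of -4..4 samples, assumed).
paradigm = np.tile(np.r_[np.zeros(10), np.ones(10)], 5)
init_temporal = shifted_paradigm_matrix(paradigm, range(-4, 5))
```

Such a matrix would then be passed as the starting value of the coupled temporal factor; the decomposition remains free to move the factor away from it during optimization.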

Although the use of only 6 healthy subjects did not yield any major finding regarding the resting state and attention task EEG and fMRI data, the approach described here demonstrated that, with a single CTTD process, the temporal, spectral, and spatial similarities and differences between the participants can be identified all at once.

Conclusion

Compared to matrix decomposition, tensor decomposition techniques are superior because they inherently represent multidimensional data and achieve uniqueness without imposing strong assumptions. In recent years, tensor decomposition techniques have gained popularity in the fusion of biomedical data. Here, a novel tensor-based fusion method was introduced for the extraction of task-related EEG and fMRI features. In the proposed method, a single 4th order EEG tensor was partially coupled in the time and participant modes to a 3rd order fMRI tensor. The application of the method was demonstrated on simultaneous EEG-fMRI data from 6 subjects performing 0-Back and 2-Back memory tasks.