Introduction

The characterization and identification of typical brain developmental patterns can provide important insights into brain organization and function. Brain function can be described by cognitive processes, which may include, but are not limited to, sensory, grammatical, and semantic processing, memory retrieval, and motor events (Hernández et al., 2022). One way of studying the timing properties of cognitive processes is to focus on their underlying brain mechanisms. This can be achieved using functional neuroimaging techniques such as magnetoencephalography (MEG) and electroencephalography (EEG) (Hernández et al., 2022) by extracting neural sources (latent components) that describe these processes. MEG is a powerful neuroimaging technique that measures the magnetic fields generated by neuronal activity, which arise collectively within the brain from population neuronal responses to target stimuli (Hämäläinen et al., 1993). The direct measurement of neuronal currents by MEG, its high spatial resolution, and its excellent temporal resolution make it an especially useful noninvasive technique for studying brain function. Thus, MEG is the preferred method for studying the rapid spatiotemporal dynamics of brain activity (He & Liu, 2008). To this end, event-related fields (ERFs) and event-related potentials (ERPs) have become important research tools for understanding brain developmental patterns in pediatric cohorts. ERFs and ERPs represent time-locked MEG/EEG activity that captures brain responses elicited by stimuli.

Multi-subject MEG studies that focus on large pediatric cohorts have high potential to provide important insights into brain organization and brain development in children and adolescents. One approach is multi-subject latent component analysis, in which latent components are learned jointly across multiple MEG datasets by exploiting dependence among the datasets (Gabrielson et al., 2020; Akhonda et al., 2018). However, as we discuss next, many existing multi-subject latent component analysis techniques in neuroimaging are based on matrix factorization methods. Matrix representations cannot account for the multiple dimensions of the data, such as participant, stimulus condition, and variations in time and space, or for the relationships across these dimensions.

Over the past decades, substantial efforts have been made to model and extract common hemodynamic or electrophysiological components from multi-subject task and resting-state neuroimaging (NI) data such as fMRI and MEG/EEG. The common components detected during a task may be indicative of a typical or atypical patient state and can be further used to discover prognostic imaging biomarkers. Data fusion and joint analysis methods based on matrix decompositions, such as joint independent component analysis (jICA) (Calhoun et al., 2006), group ICA (GICA) (Calhoun et al., 2001; Labounek et al., 2018; Calhoun & Adali, 2012; Salman et al., 2019), dictionary learning (Jin et al., 2020; Akhavan et al., 2022), and independent vector analysis and its transposed variant (tIVA) (Adali et al., 2015), have been used for the analysis and fusion of NI data. One reason for the popularity of these methods is the convenience of representing time-varying NI data as a \(\mathrm {time} \times \mathrm {space}\) matrix. Many studies have shown that these matrix-based approaches are powerful tools for extracting meaningful components (Calhoun & Adali, 2012). Group-level ICA methods exploit higher-order statistics of the data (Hyvärinen & Oja, 2000) and enable the assessment of complex spatiotemporal relationships (Calhoun & Adali, 2012). However, a primary problem of two-way techniques is that components are defined by only two signatures, which are not determined uniquely without further constraints on the model. Uniqueness is achieved by imposing constraints such as independence or sparsity (Lahat et al., 2015; Adali et al., 2014; Acar et al., 2013; Jin et al., 2020; Akhavan et al., 2022). To apply matrix-based methods to higher-dimensional data, the data must first be unfolded and reduced to a matrix by concatenation or stacking (Calhoun et al., 2009; Delorme & Makeig, 2004). Such unfolding inevitably discards the inherent multilinear structure of brain imaging data and therefore may ignore important and complex interactions between or among the folded modes (Cong et al., 2015).

Given that most NI data can be conveniently expressed as a higher-order array, tensor decomposition techniques are preferred to represent the original data as a mixture of latent components with corresponding signatures in each dimension. In addition, certain tensor decompositions are unique under mild constraints. The uniqueness property is critical for an unambiguous interpretation of the components and for matching them to neural processes and/or component signatures. Moreover, tensors provide a natural representation of the inherently multidimensional NI data and preserve the structural information among the tensor modes, thus effectively exploiting the multilinear correlation structure and enabling robust group-level statistical analyses across multiple datasets.

Among the tensor decomposition techniques, the canonical polyadic (CP) decomposition and the Tucker decomposition (Carroll & Chang, 1970; Kolda & Bader, 2009) are particularly useful in fMRI and MEG/EEG processing of real (Cong et al., 2015) and complex-valued data (Kuang et al., 2019). The main advantage of the CP model (Carroll & Chang, 1970) is that it is essentially unique up to scaling and permutations (Sidiropoulos & Bro, 2000). However, it is worth noting that the CP model cannot effectively take into account higher-order statistical information as ICA-based methods do (Kroonenberg & De Leeuw, 1980). The disadvantage of the Tucker decomposition compared with the CP model is its limited interpretability without imposing an orthogonality constraint, which is unrealistic for brain components. Thus, in this paper we chose the CP tensor format as our primary model of interest.

Tensor-based analysis of MEG/EEG has received increased attention during the last decade (Cong et al., 2012; Wang et al., 2018, 2020; Zhu et al., 2020; Liu et al., 2021; Chatzichristos et al., 2022). The CP model has been extensively used for high-order decompositions of multi-subject EEG data (organized as a channel \(\times\) time \(\times\) subject third-order tensor) or wavelet-transformed EEG (organized as a channel \(\times\) time \(\times\) frequency \(\times\) subject fourth-order tensor) (Cong et al., 2012; Wang et al., 2018; Wang et al., 2020; Vanderperren et al., 2013). Specialized multiway algorithms have been proposed for ERP analysis of EEG to deal with noisy and nonstationary signals, including the Bayesian CP model (Wu et al., 2014) and fifth-order ERP feature extraction (Wang et al., 2018). In (Kinney-Lang et al., 2018, 2019), the authors employed the CP decomposition for developmental feature extraction from pediatric EEG datasets. In (Zhu et al., 2020; Liu et al., 2021), the CP model was used to study the functional connectivity patterns of MEG data (organized as a time \(\times\) frequency \(\times\) connectivity third-order tensor).

Despite a substantial number of studies dedicated to high-order ERP analysis, the multidimensional nature of MEG has not been fully exploited for the data-driven extraction of sensor-level ERF components. MEG ERF components can better inform about the rapid spatiotemporal dynamics of brain information processing than EEG owing to the higher spatial resolution of MEG. Provided that ERFs are collected using the same stimuli, the assumption is that activity elicited by the same stimuli is highly correlated among subjects, which can be seen as a prerequisite for applying the CP decomposition. Thus, multi-subject MEG studies generate ERFs that can be naturally represented in CP tensor format.

Several works (Stephen et al., 2013; Ablin et al., 2021; Ikeda & Toyama, 2000; Jung et al., 2001; Boonyakitanont et al., 2022) focus on the characterization and identification of sensor-level MEG ERFs using matrix-based approaches such as ICA/jICA algorithms. The algorithms used in (Stephen et al., 2013; Pinner et al., 2020; Ablin et al., 2021; Ikeda & Toyama, 2000; Jung et al., 2001; Boonyakitanont et al., 2022) inherently transform three-dimensional (3D) multi-subject MEG data into a two-dimensional (2D) matrix representation. For multi-subject MEG data, such a 2D transformation loses the multidimensional low-rank structure that may provide an intrinsic description of the spatiotemporal interactions. On the other hand, the low-rank structure of MEG data can be fully captured by the CP tensor format, as we propose. Hence, we model the multi-subject MEG data as a 3D tensor with dimensions of \(\mathrm {subject} \times \mathrm {time} \times \mathrm {channel}\). This high-order representation of the multi-subject MEG dataset maximizes the simultaneous use of spatiotemporal modes and multilinear interactions across modes within the data.

Our goal is the identification of typical brain developmental patterns that could be used as descriptive imaging signatures in a healthy population of children and adolescents. Using the CP decomposition, we propose a group-level tensor analysis method to characterize and identify sensor-level ERF components in task-related multi-subject MEG data. The proposed model enables the analysis of a multi-subject MEG dataset as a third-order tensor and thus exploits the multidimensional nature of the group-level data. We use the hierarchical clustering on principal components (HCPC) approach (Husson et al., 2010; Argüelles et al., 2014) to identify subject groups using a supplementary cognitive measures dataset.

We summarize our contributions as follows: The paper presents a CP analysis framework to robustly identify common brain developmental patterns from multi-subject sensor-level MEG data. The proposed formulation of the CP model, shown in Fig. 3, was capable of extracting typical early (M150) and late latency (M300a and M400) ERF components representative of visual spatial attention, associative memory, and semantic processing, similar to the results reported in existing ERF/ERP studies. We develop a group-level inference approach that allows robust statistical inferences directly on the CP component matrices. We demonstrate the statistical significance of the tensor group-level analyses by identifying discriminative ERF components that can differentiate between high performance and low performance groups. We show that the discriminative ERF components were significantly correlated with major cognitive domains such as attention, episodic memory, executive function, and language comprehension.

A preliminary work using the same MEG data and a similar clustering of subjects was based on an ICA model and was presented as a conference contribution (Boonyakitanont et al., 2022). The current paper presents a novel formulation for group-level analysis using the CP model, a detailed description of the clustering approach (see “Subject Subgroup Identification”), and novel experimental results.

This paper is organized as follows. The notation and definitions for the CP decomposition are introduced in “Notations and Definitions”. We describe the multidimensional generative data model and the CP tensor decomposition for multi-subject MEG data in “Tensor Analysis of MEG Data for Brain Pattern Extraction”. The experimental setup is described in “Experimental Design”. In “Results”, the typical ERF components extracted from the CP model and the group-level tensor-based statistical inference results are presented. The experimental results are discussed in “Discussion”. The conclusions and future work are presented in “Conclusion”.

Materials and Methods

Notations and Definitions

In this paper, the mathematical notations and definitions are adopted from (Kolda & Bader, 2009) and (Cichocki et al., 2016). We denote scalars with lowercase letters (x), vectors with boldface lowercase letters (\(\mathbf {x}, \mathbf {y}, \mathbf {z}, \cdots\)), matrices with boldface capital letters (\(\mathbf {X}, \mathbf {Y}, \mathbf {Z}, \cdots\)), and tensors with bold calligraphic uppercase letters (\(\varvec{\mathcal {X}}, \varvec{\mathcal {Y}}, \varvec{\mathcal {Z}}, \cdots\)). The number of dimensions is called the order, and each dimension is referred to as a mode. \(\Vert \cdot \Vert _F\) denotes the Frobenius norm, \(\mathbf {A} \otimes \mathbf {B}\) denotes the Kronecker product, \(\mathbf {A} \odot \mathbf {B}\) denotes the Khatri-Rao product, and \(\langle \mathbf {a}, \mathbf {b} \rangle = \mathbf {a}^T \mathbf {b}\) denotes the inner product of two vectors. A rank-1 tensor is expressed as the outer product of vectors, i.e., \(\varvec{\mathcal {X}} = \mathbf {a} \circ \mathbf {b} \circ \mathbf {c}\), where \(\circ\) represents the vector outer product. The mode-n matricization of a given tensor along dimension n is denoted by \(\mathbf {X}_{(n)} \in \mathbb {R}^{I_n \times I_1 I_2 \cdots I_{n-1} I_{n+1} \cdots I_N}\) (Kolda & Bader, 2009). The \(n\)-mode product of a tensor \(\varvec{\mathcal {X}} \in \mathbb {R}^{I_1 \times \cdots \times I_n \times \cdots \times I_N}\) and a matrix \(\mathbf {A} \in \mathbb {R}^{J_n \times I_n}\) along the nth mode, denoted as \(\varvec{\mathcal {X}} \times _{n} \mathbf {A}\), is a tensor of size \(I_1 \times \cdots \times J_n \times \cdots \times I_N\).
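To make the notation concrete, the short sketch below (a minimal illustration, assuming NumPy and the TensorLy toolbox that is also used in “Execution Details”; array sizes are arbitrary) shows the mode-n matricization, the n-mode product, and a rank-1 tensor built from outer products.

```python
import numpy as np
import tensorly as tl
from tensorly import tenalg

# A third-order tensor of size I1 x I2 x I3 (sizes chosen only for illustration).
X = np.random.randn(4, 5, 6)

# Mode-n matricization X_(n): mode n becomes the rows, the remaining modes are flattened into columns.
X0 = tl.unfold(X, mode=0)           # shape (4, 30)
X1 = tl.unfold(X, mode=1)           # shape (5, 24)

# n-mode product X x_n A: multiplies A (J_n x I_n) along the n-th mode, replacing I_n with J_n.
A = np.random.randn(7, 5)
Y = tenalg.mode_dot(X, A, mode=1)   # shape (4, 7, 6)

# Rank-1 tensor a o b o c from vector outer products.
a, b, c = np.random.randn(4), np.random.randn(5), np.random.randn(6)
rank1 = np.einsum('i,j,k->ijk', a, b, c)
print(X0.shape, X1.shape, Y.shape, rank1.shape)
```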

Participants

The participants included 170 healthy children (89 male, 81 female) and adolescents between the ages of 9 and 15 (\(M=11.92\) years, \(SD = 1.18\)) with no reported clinical diagnoses from the Mind Research Network (MRN) in Albuquerque, New Mexico (90) and the University of Nebraska Medical Center (UNMC) in Omaha, Nebraska (80) as part of the Developmental Chronnecto-Genomics (Dev-CoG) study (Stephen et al., 2021). The participants and parents signed consent forms approved by each institutional review board (IRB) prior to joining the study. All procedures were approved by the MRN and UNMC IRBs prior to the start of the experiment.

Neuropsychological Testing

All the participants completed the Wechsler Abbreviated Scale of Intelligence (Second Edition; WASI-II; (Wechsler, 2011)) to assess full-scale IQ (FSIQ) and the NIH Toolbox Cognitive Battery (NIHTB-CB) tests (Weintraub et al., 2013) assessing age-adjusted neuropsychological (T) scores in six cognitive domains: attention, episodic memory, executive function, language, processing speed, and working memory. The data collection also included the Conners 3 Inattention/Hyperactivity scores (Conners, 2008) for assessing attention-deficit hyperactivity disorder (ADHD), and children with diagnosed ADHD were excluded from the study. There were no significant differences (\(p > 0.05\)) between the MRN and UNMC participants in terms of age, gender, or the neuropsychological measures.

MEG Experimental Paradigm

Participants completed a multisensory task while MEG data were recorded (see Fig. 1). The visual stimulus was a full-screen, black and white vertical grating (0.25 cycles/degree). The auditory stimulus was a 40 Hz modulated 1000 Hz tone. For the multisensory stimulus, the auditory and visual stimuli were presented simultaneously. The baseline fixation was a red box in the center of the screen. Subjects were instructed to respond with their index finger when they saw anything, heard anything, or both. Each MEG trial began with a fixation for an intertrial interval (ITI) that pseudo-randomly varied between 2400 and 2600 milliseconds (ms) in 10 ms increments. Following fixation, a sensory stimulus (auditory (AUD), visual (VIS), or audio-visual (AV)) was presented for 800 ms (Stephen et al., 2021). The total task duration was approximately 18 min.

Fig. 1
figure 1

Multisensory task paradigm. The presentation started with a baseline fixation screen, followed by the appearance of auditory (AUD), visual (VIS) or multisensory stimulus (AV)

MEG Data Acquisition and Image Preprocessing

The MEG data acquisition and preprocessing details were previously published in (Stephen et al., 2021). MEG recordings were acquired with the Elekta/MEGIN MEG system with 306 magnetic sensors (204 gradiometers and 102 magnetometers) in a magnetically shielded room. The MEG data were continuously sampled at 1000 Hz with a passband between 0.1 and 330 Hz. We used preprocessing techniques such as signal-space separation (SSS) (Taulu & Kajola, 2005) for MEG data denoising and to ensure comparability between magnetometer and gradiometer source reconstructions (Garcés et al., 2017). MEG sensor-level artifacts were removed during preprocessing at both the MRN and UNMC sites. MEG epochs between \({-}100\) and 1000 ms (1100 time points) around the stimulus onset were averaged across 300 trials within the respective stimulus condition to form sensor-level ERFs time-locked to the stimulus (AUD, VIS, or AV).
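For readers who want to reproduce this kind of sensor-level ERF formation, a minimal MNE-Python sketch is given below. It is only an approximation of the Dev-CoG pipeline described in (Stephen et al., 2021); the file name, stimulus channel, and trigger codes are hypothetical placeholders.

```python
import mne

# Hypothetical SSS-cleaned raw file and trigger mapping (placeholders, not the actual Dev-CoG values).
raw = mne.io.read_raw_fif("subject01_task_sss.fif", preload=True)
events = mne.find_events(raw, stim_channel="STI 014")
event_id = {"AUD": 1, "VIS": 2, "AV": 3}

# Epoch -100 to 1000 ms around stimulus onset with baseline correction,
# then average within a condition to obtain the sensor-level ERF (channels x time).
epochs = mne.Epochs(raw, events, event_id, tmin=-0.1, tmax=1.0,
                    baseline=(None, 0), preload=True)
erf_vis = epochs["VIS"].average()   # mne.Evoked object
S_k = erf_vis.data                  # C x T matrix for this subject and condition
print(S_k.shape)
```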

Prior to MEG recording, 3D digitization was performed to collect positioning data for four head-position indicator (HPI) coils and the scalp surface. The HPI coil data were collected throughout the recordings, which allowed offline head movement correction (Stephen et al., 2021). The Maxfilter program was used to adjust the location of the head to a common head location within the dewar. The movement compensation extension of the Maxfilter program allows one to correct for head movement throughout the scan (effectively correcting small changes in head position by re-mapping the MEG data to a constant head position) (Taulu & Simola, 2006). Another use of this capability is to map each subject’s MEG data to a common head position within the dewar. Prior work has shown that too much adjustment of the head position can lead to noise amplification. Therefore, we chose a head position that was closest to the average participant head location within the dewar and mapped all data to this common head position. Once the data were mapped to a common head position, the results were compared across participants, as is often done with EEG sensor data. We did not perform source reconstruction and worked in sensor space when we applied the tensor decomposition. Using the sensors’ spatial adjacency matrix, we associated each sensor with a sensor spatial region (Occipital, Frontal, Parietal, Temporal, and left/right hemisphere). These sensor regions, shown in Fig. 2, do not correspond to brain regions or anatomical labels. Throughout the paper, these approximate sensor groups are used to describe ERF component spatial patterns on the scalp topographic map (topomap).

Fig. 2
figure 2

Top-down view of MEG magnetometers. MEG regional division based on the sensors’ spatial adjacency matrix. Insets show the global field power of the MEG. The approximate sensor groups were used to describe the spatial patterns of the ERF components on the scalp topographic map. R - right, L - left. Adapted from (Stephen et al., 2013)

Fig. 3
figure 3

Brain developmental pattern discovery via tensor decomposition. a–b Illustration of tensor analysis for multi-subject MEG data. a Tensor formation by arranging the subjects along the first dimension. b MEG tensor decomposition into R rank-1 components. Each rank-1 component represents a distinct spatiotemporal brain activity pattern with subject weights (\(\mathbf {a}_r\)), temporal signatures (\(\mathbf {b}_r\)), and spatial signatures (\(\mathbf {c}_r\)). c Tensor group-level analysis, which includes subgroup identification and group-level statistical inference. d Left: Component association patterns with cognitive domains. Right: Sensor spatial locations associated with the components

Tensor Analysis of MEG Data for Brain Pattern Extraction

Multidimensional Model for Multi-Subject MEG Data

The MEG experimental paradigm shown in Fig. 1 results in simultaneously recorded neural measurements elicited in C common sensors at T timepoints across K subjects. As a result, the observed MEG recordings are modeled as a mixture of the underlying neural sources of interest synchronized in time across subjects within a specific task. To identify the common brain developmental patterns elicited by sensory stimuli across subjects, we applied the CP tensor decomposition to extract the latent brain activity patterns. The proposed approach has two important advantages. First, the CP representation of MEG data allows us to take into account the higher-order structure of the multi-subject data to extract common patterns across subgroups. Second, by virtue of the CP decomposition, the MEG factorization provides a unique solution under mild conditions (Kruskal, 1977; Sidiropoulos & Bro, 2000). The importance of the uniqueness condition cannot be overstated, since it allows meaningful components to be found unambiguously and matched to the true brain processes.

To preserve the intrinsic multidimensional nature of multichannel MEG data, the data are tensorized as a third-order tensor \(\varvec{\mathcal {X}} \in \mathbb {R}^{K \times T \times C}\) by stacking the subject ERF matrices \(\mathbf {S}_k \in \mathbb {R}^{C \times T}\) along the subject mode. Fig. 3a shows the generative model for the multidimensional representation of multi-subject MEG data.
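A minimal sketch of this tensorization step, assuming the per-subject ERF matrices \(\mathbf {S}_k \in \mathbb {R}^{C \times T}\) are already available as NumPy arrays (smaller placeholder dimensions are used here; the study uses \(K=170\), \(T=1100\), \(C=306\)):

```python
import numpy as np

def build_meg_tensor(subject_erfs):
    """Stack K subject ERF matrices (each C x T) into a K x T x C tensor."""
    # Transpose each C x T matrix to T x C so the modes are subject x time x channel.
    return np.stack([S.T for S in subject_erfs], axis=0)

# Placeholder dimensions for illustration.
K, C, T = 10, 306, 1100
subject_erfs = [np.random.randn(C, T) for _ in range(K)]
X = build_meg_tensor(subject_erfs)
print(X.shape)   # (10, 1100, 306)
```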

Multi-Subject MEG Tensor Decomposition

By adopting the tensorization strategy shown in Fig. 3a, we present the CP model of the multi-subject MEG data as a third-order tensor \(\varvec{\mathcal {X}} \in \mathbb {R}^{K \times T \times C}\) (\(\mathbb {R}^{\mathrm {subject} \times \mathrm {time} \times \mathrm {channel}}\)). The CP decomposition approximates tensor \(\varvec{\mathcal {X}} \in \mathbb {R}^{K \times T \times C}\) as a sum of rank-1 tensors:

$$\begin{aligned} \varvec{\mathcal {X}} \approx \sum _{r=1}^{R} \lambda _r \, \mathbf {a}_r \circ \mathbf {b}_r \circ \mathbf {c}_r = \mathbf {\Lambda } \times _1 \mathbf {A} \times _2 \mathbf {B} \times _3 \mathbf {C} \end{aligned}$$
(1)

where \(\mathbf {a}_r \in \mathbb {R}^{K}, \mathbf {b}_r \in \mathbb {R}^{T}, \mathbf {c}_r \in \mathbb {R}^{C}\) are the factor vectors normalized to unit 2-norm; \(\lambda _r\) represents the scale factor of each component, and the norms are absorbed into the diagonal matrix \(\mathbf {\Lambda }\); \(\mathbf {A} \in \mathbb {R} ^{K \times R}, \mathbf {B} \in \mathbb {R} ^{T \times R}, \mathbf {C} \in \mathbb {R} ^{C \times R}\) are the factor matrices, and R is the rank, or number of components. Each rank-1 tensor \(\lambda _r \, \mathbf {a}_r \circ \mathbf {b}_r \circ \mathbf {c}_r\) can be interpreted as a distinct spatiotemporal brain pattern, where \(\mathbf {a}_r\), \(\mathbf {b}_r\), and \(\mathbf {c}_r\) are the subject weights, timecourses, and spatial maps of that pattern, respectively, as illustrated in Fig. 3b. The CP model is fitted by minimizing the following least-squares cost function (Kolda & Bader, 2009):

$$\begin{aligned} \min _{\mathbf {\Lambda }, \mathbf {A}, \mathbf {B}, \mathbf {C}} \ f(\mathbf {\Lambda }, \mathbf {A}, \mathbf {B}, \mathbf {C}) = \frac{1}{2} \Vert \varvec{\mathcal {X}} - \mathbf {\Lambda } \times _1 \mathbf {A} \times _2 \mathbf {B} \times _3 \mathbf {C} \Vert _F^2, \quad \mathrm {s.t.} \ \Vert \mathbf {a}_r\Vert _2 = \Vert \mathbf {b}_r\Vert _2 = \Vert \mathbf {c}_r\Vert _2 = 1, \ \forall r = 1, \cdots, R. \end{aligned}$$
(2)

We apply alternating least squares (ALS) to estimate the factor matrices (Cichocki et al., 2016; Kolda & Bader, 2009). The minimization problem is solved by cyclically fixing two of the factor matrices and optimizing over the third. Each least-squares subproblem is convex and has a closed-form solution (Kolda & Bader, 2009).
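Since the model is fitted with CP-ALS from the TensorLy toolbox (see “Execution Details”), the decomposition step can be sketched as follows; the placeholder tensor and rank are illustrative, and the stopping criteria mirror those reported in “Model Selection and Evaluation”.

```python
import numpy as np
import tensorly as tl
from tensorly.decomposition import parafac

# Placeholder subject x time x channel tensor (the study uses 170 x 1100 x 306).
X = np.random.randn(20, 200, 50)
R = 3   # tensor rank chosen by the criteria described in "Component Number Estimation"

# CP-ALS with unit-norm factors; the component scales lambda_r are returned in `weights`.
weights, (A, B, C) = parafac(tl.tensor(X), rank=R, init="random",
                             normalize_factors=True,
                             n_iter_max=1000, tol=1e-8)

# A: subject loadings (K x R), B: timecourses (T x R), C: spatial maps (C x R).
print(A.shape, B.shape, C.shape, weights.shape)
```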

Component Number Estimation

As in many dimensionality reduction methods, a critical step is the selection of the optimal number of components. We use three methods to make this choice for the CP decomposition, each evaluated as a function of the tensor rank R: the core consistency diagnostic (CORCONDIA/CCD) (Bro & Kiers, 2003), the average congruence product (ACP) (Tomasi & Bro, 2005), and the Bayesian information criterion (BIC) (Schwarz, 1978). The CCD measures the similarity between the estimated core and the ideal superdiagonal core obtained in the absence of noise (Bro & Kiers, 2003). According to (1), the CP core can be estimated as

$$\begin{aligned} \varvec{\mathcal {G}} = \varvec{\mathcal {X}} \times _1 \mathbf {A}^\dagger \times _2 \mathbf {B}^\dagger \times _3 \mathbf {C}^\dagger . \end{aligned}$$
(3)

The CCD (in \(\%\)) is defined as in (Bro & Kiers, 2003)

$$\begin{aligned} \mathrm {CCD}(\%) = 100 \times \Big {(} 1- \frac{\Vert \varvec{\mathcal {G}}_R - \varvec{\mathcal {I}}_R\Vert _F^2}{R} \Big {)}, \end{aligned}$$
(4)

where \(\varvec{\mathcal {G}}_R \in \mathbb {R}^{R \times R \times \cdots \times R}\) and \(\varvec{\mathcal {I}}_R \in \mathbb {R}^{R \times R \times \cdots \times R}\) are the estimated and ideal CP cores, respectively. We choose the model with the highest number of components such that

$$\begin{aligned} \hat{R}_{\mathrm {CCD}} = \mathrm {arg} \max _{r} (\mathrm {CCD}) \ \ \mathrm {s.t} \ \mathrm {CCD}(r) \ge \eta, \end{aligned}$$
(5)

where \(0< \eta < 100\%\) is the threshold coefficient, with \(r = 1, \cdots R\). Typically, \(80\%<\eta < 90\%\) is used.
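A sketch of the core consistency computation following Eqs. (3)–(4) is given below; it assumes the component scales \(\lambda _r\) have been absorbed into one factor matrix (e.g., \(\mathbf {A}\)) so that the ideal core is superdiagonal with ones.

```python
import numpy as np
from numpy.linalg import pinv
from tensorly.tenalg import multi_mode_dot

def core_consistency(X, A, B, C):
    """Core consistency diagnostic (CCD, in %) for a rank-R CP model of a 3rd-order tensor X.
    Assumes the lambda_r scales are absorbed into A (e.g., A * weights after a normalized fit)."""
    R = A.shape[1]
    # Estimated Tucker core G = X x1 A^+ x2 B^+ x3 C^+ (Eq. 3).
    G = multi_mode_dot(X, [pinv(A), pinv(B), pinv(C)], modes=[0, 1, 2])
    # Ideal superdiagonal core with ones on the superdiagonal.
    I = np.zeros((R, R, R))
    I[np.arange(R), np.arange(R), np.arange(R)] = 1.0
    # Eq. (4): 100 * (1 - ||G - I||_F^2 / R).
    return 100.0 * (1.0 - np.sum((G - I) ** 2) / R)
```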

Furthermore, to assess the number of components, we computed the ACP measure of all fitted models for a given tensor rank R. The ACP metric measures the correlation between components extracted from different models for a given tensor rank R:

$$\begin{aligned} \mathrm {ACP} = \max _{\mathbf {P}} \ \mathrm {tr} \Big ( ({\mathbf {A}_{r}^{(1)}}^{T} \mathbf {A}_{r}^{(2)}) ({\mathbf {B}_{r}^{(1)}}^{T}\mathbf {B}_{r}^{(2)}) ({\mathbf {C}_{r}^{(1)}}^{T} \mathbf {C}_{r}^{(2)}) \mathbf {P}\Big ), \end{aligned}$$
(6)

where \(\mathbf {A}_{r}^{(i)}, \mathbf {B}_{r}^{(i)}, \mathbf {C}_{r}^{(i)}\) represent the rth component of the ith solution, \(i=1, \cdots, I\), \(r = 1, \cdots, R\); \(\mathbf {P}\) is the permutation matrix that accounts for the permutation ambiguity (Harshman et al., 1970) in the ordering of components across solutions; and \(\mathrm {tr}(\cdot )\) is the trace of a matrix. We select the model that produces the highest ACP value such that

$$\begin{aligned} \hat{R}_{\mathrm {ACP}} = \mathrm {arg} \ \max _{r} \ (\mathrm {ACP}), \ r = 1, \cdots R. \end{aligned}$$
(7)
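Eq. (6) scores the agreement between two fitted solutions after an optimal permutation of their components. One common way to compute such a score, sketched below, forms the element-wise product of the three factor congruence matrices (the factors are unit-normalized, so each entry is a Tucker congruence coefficient) and finds the best component matching with the Hungarian algorithm; this is our reading of the ACP metric averaged over components, not necessarily the authors’ exact implementation.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def acp(A1, B1, C1, A2, B2, C2):
    """Average congruence product between two CP solutions with unit-normalized factor columns."""
    # Entry (r, s) scores how well component r of solution 1 matches component s of solution 2.
    M = (A1.T @ A2) * (B1.T @ B2) * (C1.T @ C2)
    # Optimal permutation (the max over P in Eq. 6) via the Hungarian algorithm.
    rows, cols = linear_sum_assignment(-np.abs(M))
    return np.abs(M[rows, cols]).mean()
```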

We further used the BIC metric, an information-theoretic criterion, to assess the number of components. The BIC measure is based on the negative log-likelihood and the maximum a posteriori (MAP) approximation (Stoica & Selen, 2004).

The BIC metric is defined in terms of the sum of squared errors (\(\mathrm {SSE} = \Vert \varvec{\mathcal {X}} - \hat{\varvec{\mathcal {X}}} \Vert _F^2\)) (Mørup & Hansen, 2009), where \(\varvec{\mathcal {X}}\) stands for the original data tensor and \(\hat{\varvec{\mathcal {X}}}\) denotes the fitted model:

$$\begin{aligned} \mathrm {BIC} = S \log {\frac{SSE}{S}} + F\log {S}, \end{aligned}$$
(8)

where F is the number of degrees of freedom and \(S = \prod _{n=1}^N I_n\) is the number of tensor data elements. We chose the model that produces the lowest BIC value such that

$$\begin{aligned} \hat{R}_{\mathrm {BIC}} = \mathrm {arg} \ \min _{r} \ (\mathrm {BIC}), \ r = 1, \cdots R. \end{aligned}$$
(9)
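A sketch of the BIC computation in Eq. (8); the accounting of the degrees of freedom F as the total number of CP parameters, \(R(K+T+C)\), is an assumption for illustration and may differ from the exact accounting used in the paper.

```python
import numpy as np

def cp_bic(X, X_hat, rank):
    """BIC for a fitted CP model of a tensor X with approximation X_hat (Eq. 8)."""
    S = np.prod(X.shape)               # number of tensor data elements
    sse = np.sum((X - X_hat) ** 2)     # sum of squared errors ||X - X_hat||_F^2
    # Degrees of freedom: assumed here to be R times the sum of the mode dimensions.
    F = rank * sum(X.shape)
    return S * np.log(sse / S) + F * np.log(S)
```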

Clustering Analysis for Subject Subgroup Identification

In this section, we present the clustering analysis methodology for identifying subgroups. A preliminary version of the clustering approach presented here, using cognitive measures collected during the Dev-CoG study, was presented as a conference contribution (Boonyakitanont et al., 2022). The detailed clustering protocol is described in the Supplementary Methods Sect. 3.1. We partitioned the subject cohort (\(N=170\)) into distinct subgroups using the neuropsychological dataset. We performed HCPC clustering using nine cognitive variables from six cognitive domains, including the Conners 3 inattention/hyperactivity scores: WASI-II FSIQ, ORRENG, PICVOCAB, PSM, LSWM, DCCS, FICA, INATTENTION, and HYPERACTIVITY. The HCPC method (Husson et al., 2010; Argüelles et al., 2014) combines three standard techniques (principal component analysis (PCA), hierarchical clustering, and the K-means algorithm) to obtain a higher-quality clustering solution. A schematic view of the subgroup identification using the HCPC algorithm is presented in Fig. 4.

First, the PCA algorithm is applied to the neuropsychological dataset, represented as a subject score matrix \(\mathbf {P} \in \mathbb {R}^{K \times L}\), \(K= 170\), \(L= 9\), to reduce the dataset to fewer dimensions, called principal components (PCs), which are uncorrelated with one another. We then compute a distance matrix \(\mathbf {D} \in \mathbb {R}^{K \times K}\) from these PCs using a dissimilarity measure such as the distance correlation (Székely et al., 2007). The distance correlation allows the detection of nonlinear relationships (Székely et al., 2007) that might not be identified by the Pearson correlation, which could otherwise degrade the performance of the downstream tasks. Next, we apply hierarchical clustering using Ward’s D2 method (Murtagh & Legendre, 2014) on the distance matrix \(\mathbf {D}\) and select clusters based on the height of the hierarchical tree. Significant clusters are selected on the basis of approximately unbiased (AU) probability (Efron et al., 1996) p-values with \(p < 0.05\). The quality of the clustering is assessed with compactness metrics (Halkidi et al., 2002a, b) (see Supplementary Methods Sect. 3.1 and Supplementary Fig. S.3), and cluster stability is evaluated as a function of the number of clusters using the Jaccard similarity index (J) (Jaccard, 1912) via a nonparametric bootstrap with \(n=1000\) repetitions (Supplementary Methods Sect. 3.1). The final clustering solution is obtained by applying the K-means algorithm to the hierarchical clustering output.
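The HCPC pipeline was run in R; purely for illustration, a rough Python approximation of the same sequence (PCA, Ward hierarchical clustering, K-means consolidation) is sketched below. It omits the distance-correlation dissimilarity and the bootstrap stability checks described above, uses Euclidean distances between PC scores, and fills the score matrix with placeholder data.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

# P: K x L matrix of z-scored cognitive (T) scores (placeholder random data here).
K, L = 170, 9
P = np.random.randn(K, L)

# Step 1: PCA on the cognitive score matrix.
pcs = PCA(n_components=5).fit_transform(P)

# Step 2: hierarchical clustering with Ward's criterion on distances between PC scores
# (the paper uses a distance-correlation dissimilarity instead of Euclidean distance).
Z = linkage(pdist(pcs), method="ward")
labels_hc = fcluster(Z, t=2, criterion="maxclust")

# Step 3: K-means consolidation initialized from the hierarchical clusters' centroids.
centroids = np.vstack([pcs[labels_hc == g].mean(axis=0) for g in np.unique(labels_hc)])
labels = KMeans(n_clusters=2, init=centroids, n_init=1).fit_predict(pcs)
```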

Fig. 4
figure 4

Hierarchical clustering on principal components of neuropsychological (T) scores for subject group identification using Ward’s D2 criterion (Murtagh & Legendre, 2014). PCA is applied to the subject cognitive matrix \(\mathbf {P} \in \mathbb {R}^{K \times L}\) to replace the highly correlated continuous variables with uncorrelated principal components. Next, we apply hierarchical clustering using Ward’s D2 method on the distance matrix \(\mathbf {D}\) to select the clusters based on the height of the hierarchical tree. The distance matrix \(\mathbf {D} \in \mathbb {R}^{K \times K}\) is computed using a dissimilarity measure such as the distance correlation (Székely et al., 2007) of the PCs. The initial number of clusters \(N_{{C}_{k}}\) is assessed according to the compactness metrics (Halkidi et al., 2002a, b), and the cluster stability is evaluated using the Jaccard similarity index (Jaccard, 1912) via a nonparametric bootstrap technique with \(n=1000\) repetitions (see the detailed protocol in Supplementary Methods Sect. 3.1). We select significant clusters based on the approximately unbiased probability p-values (Efron et al., 1996), as shown in Fig. 8a. We provide the final clustering solution by applying the K-means algorithm to the hierarchical clustering output

Numerical Experiments

Data Preprocessing

The MEG multi-subject dataset consists of 170 subjects taken from the Dev-CoG study (Stephen et al., 2021). Before the tensor analysis, we normalized the data by centering the third-order MEG tensor across the time mode and scaling within the subject mode by its standard deviation (Bro & Smilde, 2003). We used all 306 sensors (204 planar gradiometers and 102 magnetometers) after SSS preprocessing (see “MEG Data Acquisition and Image Preprocessing”). Thus, the data preprocessing resulted in 170 \(C \times T\) ERF subject datasets with \(C=306\) and \(T=1100\). Tensor analyses were performed separately for the three stimulus conditions (AUD, VIS, and AV). We selected nine age-adjusted cognitive (T) scores from the available neuropsychological measures (see “Neuropsychological Testing”) for the data analyses: the WASI-II FSIQ, Picture Sequence Memory (PSM) (T) score (Weintraub et al., 2013), Picture Vocabulary (PICVOCAB) (T) score (Weintraub et al., 2013), Oral Reading Recognition (ORRENG) (T) score (Weintraub et al., 2013), List Sorting Working Memory (LSWM) (T) score (Weintraub et al., 2013), Flanker Inhibitory Control and Attention (FICA) (T) score (Weintraub et al., 2013), Dimensional Change Card Sort (DCCS) (T) score (Weintraub et al., 2013), and the Conners 3 Inattention/Hyperactivity scores. The neuropsychological (T) scores were aggregated to construct a cognitive score matrix \(\mathbf {P} \in \mathbb {R}^{K \times L}\), where K is the number of subjects and L is the number of cognitive tests. Prior to clustering, the matrix was standardized by the \(\textit{z}\)-score to account for scale differences. In addition to the neuropsychological (T) scores, we used parental socioeconomic status (SES), age, and gender as model covariates (see the detailed protocol in “Correlation Analysis between Component Loading Factors and Neuropsychological (T) Scores”).
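A minimal sketch of this normalization step (centering across the time mode and scaling within the subject mode); the exact implementation details follow our reading of Bro & Smilde (2003) and are for illustration only.

```python
import numpy as np

def center_and_scale(X):
    """Normalize a subject x time x channel tensor: center across the time mode and
    scale each subject slab by its standard deviation."""
    # Centering across the time mode removes the mean over time for every subject/channel pair.
    Xc = X - X.mean(axis=1, keepdims=True)
    # Scaling within the subject mode divides each subject slab by its own standard deviation.
    scales = Xc.std(axis=(1, 2), keepdims=True)
    return Xc / scales

X = np.random.randn(10, 1100, 306)    # placeholder tensor (the study uses 170 subjects)
Xn = center_and_scale(X)
```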

Experimental Design

The goal of this study was to estimate common imaging patterns representing typical brain development in healthy children and adolescents. Three MEG data tensors (\(\varvec{\mathcal {X}}_{\mathrm {VIS}}\), \(\varvec{\mathcal {X}}_{\mathrm {AUD}}\), and \(\varvec{\mathcal {X}}_{\mathrm {AV}}\)) were constructed, one for each stimulus condition, according to the generative model shown in Fig. 3a. The tensor rank R was estimated for each original data tensor as described in “Component Number Estimation”. A separate CP decomposition was then conducted for each stimulus condition with the chosen tensor rank. The fitted CP models yielded, for each condition, R-component factor matrices \(\mathbf {A} \in \mathbb {R}^{K \times R}\), \(\mathbf {B} \in \mathbb {R}^{T \times R}\), and \(\mathbf {C} \in \mathbb {R}^{C \times R}\) that describe the latent ERF spatiotemporal brain patterns.

To associate brain function with the cognitive performance observed in the neuropsychological tests, we partitioned the subject dataset into two distinct subgroups, high performance (HP) and low performance (LP), using the HCPC method (Husson et al., 2010; Argüelles et al., 2014). This allowed us to perform group-level statistical analyses on the extracted ERF components to identify group-level discriminative brain developmental signatures. We identify associations between the extracted latent ERF components and cognitive processes by correlating these latent components with children’s scores in the cognitive domains. We hypothesize that statistically significant latent ERF components can differentiate the brain patterns of children with low vs. high performance and could indicate brain developmental trajectory or cognitive development status. Fig. 3 illustrates the application of tensor decomposition to identify brain developmental patterns using MEG data.

Execution Details

The CP model (2) was fit using CP-ALS (Kolda & Bader, 2009) from the TensorLy toolbox (Kossaifi et al., 2019), MNE-Python (Gramfort et al., 2013) was used to generate topographic maps, and R software (Team RC et al., 2013) version 3.6.0 (R Foundation for Statistical Computing, Vienna, Austria) was used for statistical analyses. All experiments were performed on a Linux workstation with 4 Quad-Core Intel Xeon 3.1 processors and 16 GB memory.

Fig. 5
figure 5

Estimation of the number of components for the CP model, showing the mean values (\(N=100\)) of the CCD, ACP, BIC and RMSE metrics as a function of tensor rank R for 100 random initializations for each stimulus condition (VIS, AUD, and AV). a–b Boxplots summarize the distribution of the mean CCD and mean ACP as a function of tensor rank. Median values are represented by black lines inside the boxplot, with the top of the whisker lines indicating the 25th and 75th percentile values. Mean values are plotted in white circles, and red circles represent outliers. Error bars represent the standard error of the mean. The plot of mean values of RMSE suggests that all runs at fixed R yielded the same RMSE with a standard error of the mean \(< 0.0001\). These results suggest that all CP-ALS local minima are similar and presumably also similar to the global minimum. a CCD boxplot. b ACP boxplot. c Mean and standard error of the RMSE and ACP as a function of tensor rank R. d Mean BIC as a function of tensor rank R

Model Selection and Evaluation

The model performance was assessed with qualitative and quantitative metrics. The qualitative assessment used interpretations of the extracted components and comparisons with findings reported in the existing literature on adolescent cohorts. We computed the reconstruction error of the CP model as \(\mathrm {RMSE} = \Vert \varvec{\mathcal {X}} - \hat{\varvec{\mathcal {X}}}\Vert _F/\prod _{n=1}^N I_n\), and the model fit as \(\mathrm {FIT} = \big (1 - \Vert \varvec{\mathcal {X}} - \hat{\varvec{\mathcal {X}}}\Vert _F^2/{\Vert \hat{\varvec{\mathcal {X}}}\Vert _F^2}\big )\). The CP-ALS stopping criteria were reaching 1000 iterations or achieving a convergence tolerance of \(\epsilon \le 10^{-8}\). We investigated the model order and stability by running the CP-ALS algorithm 100 times for each stimulus condition and for R values from one to ten, with each run randomly initialized. This procedure allowed us to determine whether some runs converged to local minima with high reconstruction error. The error plot in Fig. 5c reveals that all runs at fixed R yielded the same RMSE with a standard error of the mean (SEM) \(< 0.0001\). These results suggest that all CP-ALS local minima are similar and presumably also similar to the global minimum.

We assessed the number of components for the CP model (2) by generating average CCD (4), ACP (6), and BIC (8) plots as a function of tensor rank R for \(R=1, \cdots, 10\). Fig. 5a–b and d show boxplots of the mean CCD (4), mean ACP (6) and mean BIC (8) metrics for each stimulus condition, demonstrating the sensitivity of the solution to the selection of R and the initialization parameters of the CP-ALS algorithm.

According to (Bro & Kiers, 2003), the tensor rank of the CP model should be chosen such that the CCD value is greater than \(90\%\). Fig. 5a reveals that \(R=2\) should be chosen for the AUD (\(M = 97.9, SD = 2.28\)) and VIS (\(M = 96.2, SD = 2.25\)) conditions, while \(R=3\) should be chosen for the AV (\(M = 93.1, SD = 1.37\)) condition.

The ACP values for different R provided another method for assessing the number of components. Fig. 5b shows that adding more components resulted in lower mean and higher SEM values for the ACP metric. Similar to the CCD boxplot, the ACP boxplot confirms that \(R = 2\) is the appropriate number of components for the AUD (\(M = 0.988, SD = 0.05\)) and VIS (\(M = 0.998, SD = 0.02\)) conditions, while \(R = 3\) is best for the AV (\(M = 0.988, SD = 0.05\)) condition.

The BIC (8) was used as a model-driven measure to complement the CCD and ACP metrics when assessing the number of components for different R. Fig. 5d shows that, based on the minimum BIC value, \(R=2\) should be chosen for the AUD (\(M= 4.53 \times 10^5, SD = 1015\)) and VIS (\(M= 6.63\times 10^5, SD = 930\)) conditions and \(R=3\) for the AV (\(M= 3.01 \times 10^5, SD = 845\)) condition. As shown in Fig. 5d, the BIC criterion agrees with the CCD and ACP measures in terms of the number of components. The final solution was selected based on the chosen R as the run that produced the minimum RMSE, maximum CCD and ACP, and minimum BIC values.

Statistical Analysis

We quantified the ability of the CP model to produce latent factors that differentiate subject subgroups using mixed-measures analysis of covariance (ANCOVA). We performed post hoc analyses with two-tailed parametric t-tests, with corrections for multiple comparisons using the false discovery rate (FDR) (Benjamini et al., 2001) at a significance level of \(\alpha = 0.05\), to determine statistical significance. The ANCOVA and post hoc analysis results are accompanied by F-statistics, t-statistics, p-values and effect sizes. The effect size was evaluated by generalized eta squared (\(\eta _{G}^2\)) (Olejnik & Algina, 2003) and Cohen’s d values and characterized as small (\(< 0.06\)), medium (0.06–0.14), or large (\(> 0.14\)), according to (Cohen, 2013). Additionally, we report the mean (M) and standard deviation (SD) of the measures of interest.

Group-Level Statistical Inference of CP Component Matrices

The columns of the factor matrix \(\mathbf {A} \in \mathbb {R}^{K \times R}\) in subject mode contain the component loading factors (coefficients), with the rth column corresponding to the loading factors of the rth component. The loading factors of each component indicate how much of that component is required to reconstruct the subject’s source data (Acar et al., 2019). A higher subject loading factor signifies an increased contribution of that component (Stephen et al., 2013). Therefore, group-discriminative components can be obtained by statistically comparing the component loading factors of the subgroups to determine significant between-group differences.

Group differences in the component loading factors were assessed with \(2 \times 2\) mixed-measures ANCOVAs with the stimulus condition (AUD, VIS, or AV) as a within-subject factor and subgroup (HP vs. LP) as a between-subject factor. The ANCOVAs were calculated for each component and condition while controlling for age, gender and parental SES. In addition to the ANCOVA tests, planned direct comparisons between the HP and LP groups were made for each component and condition separately to determine whether the subgroups differed significantly in the component loading factors of any specific stimulus condition while controlling for the same covariates. We applied a two-tailed level of significance (\(p < 0.05\)) and an FDR correction for the number of tests performed for each condition.
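As a minimal illustration of the planned direct comparisons (the full analysis used mixed-measures ANCOVAs in R with age, gender, and parental SES as covariates), the sketch below compares HP and LP loading factors for each component of one condition with Welch t-tests and FDR correction; the covariate adjustment is omitted and the data are placeholders.

```python
import numpy as np
from scipy.stats import ttest_ind
from statsmodels.stats.multitest import multipletests

# A: K x R subject-mode loading matrix for one condition; labels: group membership per subject.
K, R = 170, 3
A = np.random.randn(K, R)                      # placeholder loadings
labels = np.array(["HP"] * 89 + ["LP"] * 81)   # placeholder group labels

pvals = []
for r in range(R):
    hp, lp = A[labels == "HP", r], A[labels == "LP", r]
    t, p = ttest_ind(hp, lp, equal_var=False)  # two-tailed Welch t-test
    pvals.append(p)

# FDR correction (Benjamini-Hochberg) across the R components of this condition.
reject, p_fdr, _, _ = multipletests(pvals, alpha=0.05, method="fdr_bh")
print(p_fdr, reject)
```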

Prior to performing group-level statistical analyses, we examined group differences in subject head motion to determine whether head motion might cause differences in the component loading factors. We assessed group differences in the head motion values with a one-way ANCOVA with subgroup (HP vs. LP) as a between-subject factor while controlling for age. There was no significant difference in the mean head motion values (\(F_{1, 167}=-1.051\), \(p > 0.05\)) between subgroups. A post hoc independent samples two-tailed t-test (FDR corrected, \(p <0.05\)) with unequal variances correction confirmed that there was no significant difference in head motion between the HP (\(M= 0.922\), \(SD=0.807\)) and LP (\(M= 1.12\), \(SD=1.036\)) groups (\(t_{167}= -1.051, p = 0.295\)). These results suggest that head motion would not impact the results of the group-level statistical analyses. A summary of the head motion statistical analysis is presented in Supplementary Fig. S.4.

Correlation Analysis between Component Loading Factors and Neuropsychological (T) Scores

To identify the specific neuropsychological scores associated with the ERF components, we separately correlated the ERF components with the neuropsychological (T) scores in the HP group, LP group and full sample. The relationships were evaluated with Pearson’s correlation tests. Partial correlation analyses (controlling for age, gender and parental SES) were performed between the component loading factors in subject mode (columns of matrix \(\mathbf {A}\)) and neuropsychological age-adjusted (T) scores. Specifically, we computed the two-tailed Pearson’s partial correlation coefficient (r) between the component loading factors and the nine cognitive variables, namely, WASI-II FSIQ, PSM, PICVOCAB, ORRENG, LSWM, FICA, DCCS, and the Conners 3 inattention/hyperactivity scores. Partial correlations were considered significant below the FDR-corrected threshold (\(p < 0.05, N=170\)).
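A sketch of the partial-correlation step via residualization (regressing age, gender, and parental SES out of both variables before computing Pearson’s r); this is one standard way to compute partial correlations and not necessarily the authors’ exact implementation.

```python
import numpy as np
from scipy.stats import pearsonr

def partial_corr(x, y, covars):
    """Pearson partial correlation of x and y controlling for the columns of covars."""
    Z = np.column_stack([np.ones(len(x)), covars])        # design matrix with intercept
    # Residualize both variables with ordinary least squares.
    rx = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
    ry = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    return pearsonr(rx, ry)                               # (r, two-tailed p-value)

# Example: loading factors of one component vs. one cognitive (T) score (placeholder data).
K = 170
loadings, score = np.random.randn(K), np.random.randn(K)
covars = np.random.randn(K, 3)                            # age, gender, parental SES
r, p = partial_corr(loadings, score, covars)
```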

Fig. 6
figure 6

Tensor decomposition results of sensor-level MEG data for target stimuli. Temporal and spatial patterns from the components of the CP model. Top: The topographic maps (magnetometer view) show the density of spatial patterns that correspond to prominent time peaks denoted with red and blue arrows. Bottom: ERF component with signal traces from all individual MEG sensors averaged across subject ERF components. The shaded areas around each line depict the standard error of the mean. The average stimulus-related ERF timecourse is shown in yellow, and the average ERF (average across sensors) component is plotted in cyan. a–b Occipital 130–150 ms component in the VIS and AV conditions. c–d Right temporal 280–300 ms component in the AUD and AV conditions. e–i Late central 350–430 ms component in the VIS, AUD, and AV conditions

Results

The proposed model was used to extract MEG ERF components using CP decomposition, followed by statistical group-level analyses (see “Group-Level Statistical Inference of CP Component Matrices” and “Correlation Analysis between Component Loading Factors and Neuropsychological (T) Scores”). In this section, we describe the results of the multi-subject tensor analyses for extracting typical brain developmental patterns from the original MEG data.

Multi-Subject MEG Tensor Analysis Using the CP Model

We employed CP decomposition to determine the component structure of MEG ERF responses to the multisensory task. The MEG tensor \(\varvec{\mathcal {X}} \in \mathbb {R}^{K \times T \times C}\) was decomposed using the CP factorization model (2) shown in Fig. 3b. The data tensor was fitted with the number of components R determined in “Model Selection and Evaluation”. The average model FIT (VIS: \(R = 2, M = 0.93, SD = 0.01\); AUD: \(R = 2, M= 0.91,SD =0.01\); AV: \(R = 3, M = 0.91, SD =0.01\); see experimental setup in “Model Selection and Evaluation”) indicates that the extracted factors account for a large part of the variance of the original datasets. Supplementary Fig. S.5 and Supplementary Table S.2 show the mean values of the \(\mathrm {FIT}\) metric of the fitted CP decomposition for each stimulus condition (VIS, AUD, and AV). To quantify the common associations between the original subject-level MEG ERF datasets and the extracted ERF components, we performed repeated measures correlation analyses (Bakdash & Marusich, 2017) between these paired datasets (see Supplementary Methods Sect. 3.2).

Table 1 Summary of MEG ERF components

The tensor analysis yielded seven ERF components (see Table 1), which are described by temporal factors (patterns of temporal variance), spatial factors (patterns of spatial variance), and subject factors. The temporal loadings measure the activity in the MEG ERF as a function of time for each spatial factor. The subject loadings modulate the magnitude of these spatiotemporal patterns, representing the pattern’s activation strength for a specific subject. We categorized the components as functional MEG ERF components corresponding to prominent spatiotemporal peaks (Stephen et al., 2013) and report the spatiotemporal variance explained (\(R^2\)), determined by the repeated measures correlation analyses.

Fig. 6 depicts the extracted temporal and spatial components time-locked to the target stimuli after CP tensor decomposition of the sensor-level MEG data (magnetometer view); the gradiometer view of the ERF components is presented in Supplementary Fig. S.6. The temporal ERF components generated from individual sensor data, averaged across subject ERF components, are shown. The MEG topographic maps show the density of spatial patterns that correspond to prominent time peaks. The average ERF component (average across sensors, in cyan) and the average ERF timecourse (in yellow) for each stimulus condition are plotted. The ERF components are well matched to distinct peaks present in the average ERF timecourses. The temporal evolution of the MEG ERF topographic maps is shown in Supplementary Fig. S.5.

The repeated measures correlation analyses (see Supplementary Methods Sect. 3.2) found significant correlations (\(p < 0.001\)) between ERF components and the original data (VIS, AUD, and AV conditions) for the overall common slope (Supplementary Table S.3). The ERF components (Supplementary Table S.3) accounted for \(72\%\), \(76\%\), and \(74\%\) of the spatiotemporal variance (\(R^2\)) in the VIS, AUD and AV conditions, respectively.

Fig. 7
figure 7

Early latency M50 and M100 subcomponents within 0–150 ms time window of the right temporal 280–300 ms ERF component for the AUD (Fig. 6c) and AV (Fig. 6d) conditions are shown. a AUD subcomponents. b AV subcomponents

Occipital Component/M150

The occipital component was found in the VIS and AV conditions, as shown in Fig. 6a–b. This component was associated with the first prominent visual peak at a latency of 130–150 ms. The spatial distribution map at 145–149 ms (see Fig. 6a–b) shows that the positive deflection reflects MEG activity in the bilateral occipital sensors. The positive deflection resembles the visual P100 wave described in previous MEG/EEG studies which could reflect the allocation of attentional resources (Boehler et al., 2008; Zhang & Luck, 2009).

Right Temporal Component/M300a

The right temporal component with the peak 280–300 ms was consistently found in the AUD and AV conditions in the right temporal and inferior left/right frontal sensors (see Fig. 6c–d). The positive deflection at a latency of 280–300 ms corresponds to the early phase of the P300a component, which has been linked to different processes, such as detecting and evaluating novel and orienting responses (Polich, 2007; Pfefferbaum et al., 1985; Vogel et al., 1998).

The M300a component revealed two separate early latency subcomponents at about 53–56 ms (M50) and 82–86 ms (M100). These early subcomponents were found in the right temporal sensors in the AUD and AV conditions. In addition, an M100a component followed the M100 component in the AV condition and peaked around 148 ms. A zoomed view of the M300a component within the 0–150 ms time window is shown in Fig. 7, which depicts the peak latencies of the M50 and M100 subcomponents after stimulus onset and the topographic scalp distribution of these early latency components. The M50 and M100 components exhibit much smaller amplitudes than the later M300a peak. Of note, the amplitude of the M100 component was more robust and more evident than the M50 amplitude.

Late Central Component/M400

The late central component was extracted for all stimulus conditions (AUD, VIS, and AV), as shown in Fig. 6e–i. This component consists of a sequence of negative (VIS and AV) and positive (AUD) peaks at approximately 126–134 ms and a prominent peak at 350–430 ms. Fig. 6e–i shows that this component is primarily distributed over the left temporal-parietal and right prefrontal sensors. The prominent negative deflection resembles the late phase of the parietally distributed N400 component (Halgren et al., 2002; Marinković, 2004).

Additionally, the M400 component revealed an early latency M100a subcomponent that peaked around 126–148 ms and was identified in the VIS, AUD and AV conditions.

Fig. 8
figure 8

Results of the hierarchical clustering on the principal components. a Dendrogram of hierarchical clustering based on Ward’s D2 criterion. The height of the branches indicates the dissimilarity between clusters. The number of retained clusters was chosen using the approximately unbiased (AU) probability measure (Efron et al., 1996). The significant clusters were selected based on AU probability p-values with \(p < 0.05\) corrected for multiple comparisons using FDR. The final clustering solution was obtained with the K-means algorithm. b Clustering solution projected on the principal components. c–d Subgroup associations with neuropsychological (T) scores. c Distribution of mean PCA scores averaged across subject subgroups. d Main effect of subject subgroup on neuropsychological (T) scores. Independent samples two-tailed t-tests (FDR corrected, \(p< 0.05\)) showed statistically significant differences in the WASI-II FSIQ, language, memory (\(p < 0.0001\)) and inattention scores (\(p < 0.01\)) between the LP and HP groups. Details can be found in Supplementary Table S.1. **** \(p < 0.0001\), ** \(p < 0.01\)

Subject Subgroup Identification

In this section, we present the results of the clustering analysis for identifying subgroups, described in “Clustering Analysis for Subject Subgroup Identification”. The hierarchical clustering results using Ward’s D2 (Murtagh & Legendre, 2014) distance are shown in the dendrogram in Fig. 8a, with the height of the branches indicating the distance, or dissimilarity, between clusters. As depicted in Fig. 8a, two significant clusters were selected according to the approximately unbiased (AU) probability (Efron et al., 1996) p-values with \(p < 0.05\). The clustering solution projected on the PCs is shown in Fig. 8b. We used the HCPC clustering output to identify two subgroups with distinct distributions of mean PCA scores and categorized them as high (\(n=89\)) or low (\(n=81\)) performance. The distribution of mean PCA scores shown in Fig. 8c indicates that subjects in the HP group have higher PCA loading factors than subjects in the LP group in all six cognitive domains, except for the Conners 3 inattention and hyperactivity scores, where a higher score implies greater inattention and hyperactivity.

We evaluated the effect of the subject group (HP vs. LP) on the cognitive assessments using independent samples two-tailed t-tests with unequal variances, corrected for multiple comparisons using FDR (\(p < 0.05\)). We present the summary statistics of the neuropsychological (T) score distribution according to subject subgroup in Supplementary Table S.1. The groups did not differ significantly in terms of gender (\(\chi ^2 = 0.00, p = 0.99\)), age (\(t_{167} = 0.03, p = 0.97\)) or parental SES (\(t_{167} = 0.517, p=0.61\)). However, the WASI-II FSIQs differed significantly (\(t_{167}=9.16, p < 0.0001\)), with higher FSIQ scores in the HP group than in the LP group. Similarly, the language (PICVOCAB, ORRENG), memory (PSM, LSWM), and executive function (DCCS, FICA) (T) scores differed significantly by group (Supplementary Table S.1; Fig. 8d; \(p < 0.0001\)), with cognitive (T) scores higher in the HP group than in the LP group. The Conners 3 hyperactivity score differed significantly by group (\(t_{167}=-2.17, p = 0.031\)), with lower scores in the HP group than in the LP group. The Conners 3 inattention score did not differ significantly between the HP and LP groups (\(t_{167}=0.52, p=0.61\)). Fig. 8d shows the subject subgroup distribution of the standard age-adjusted cognitive (T) scores.

Statistical Group-Level Analysis

In this section, we present the group-level analysis results described in “Group-Level Statistical Inference of CP Component Matrices” and “Correlation Analysis between Component Loading Factors and Neuropsychological (T) Scores”. This section has two subsections. The first subsection evaluates the statistical significance of the ERF components (see “Multi-Subject MEG Tensor Analysis Using the CP Model”) to differentiate between the subgroups identified in “Subject Subgroup Identification”. The second subsection assesses the covariant relationships between ERF components and neuropsychological measures to correlate brain responses with cognitive performance. The component loading factors in subject mode (columns of matrix \(\mathbf {A}\)) and neuropsychological (T) scores were evaluated for normality. All analyses were corrected for multiple comparisons using FDR with a significance level of \(\alpha = 0.05\) unless stated otherwise.

Prior to performing the group-level statistical (see “Group-Level Discriminative Components”) and component-cognitive scores correlation analyses (see “Analysis of ERF Component Association with Cognitive Domains”), we analyzed pairwise component correlations (corrected for multiple comparisons using FDR (\(p < 0.05\)) for the VIS, AUD, and AV conditions). There were no significant correlations between the CP components (\(p > 0.05\) for all tests; see Supplementary Table S.6). These findings suggest that there is no need to adjust planned group-level statistical (see “Group-Level Statistical Inference of CP Component Matrices”) and partial correlation analyses (see “Correlation Analysis between Component Loading Factors and Neuropsychological (T) Scores”) for the presence of other CP components as model covariates. It should be noted that the CP model produces unique components so that the specific component or its factors are not associated with any other factors or other components (Kruskal, 1977; Kolda & Bader, 2009).

Table 2 Summary of ERF component loading factor ANCOVA results
Fig. 9
figure 9

Main effect of the subject group (\(N=170\), HP vs. LP) on the component loading factors in subject mode for each stimulus condition. Boxplots summarize the distribution of the mean values of the component loading factors in subject mode for the HP and LP groups. Median values are represented by black lines inside the boxes, and the box edges indicate the 25th and 75th percentiles. Mean values are plotted as white circles. Error bars represent the standard error of the mean. Six of the seven components showed statistically significant group differences in the post hoc two-tailed t-tests (FDR corrected, \(p < 0.05\)); only the occipital/M150 component in the AV condition did not differ between the HP and LP groups (\(p=0.864\)). The post hoc t-test results are shown in Table 3. ****\(p < 0.0001\), ***\(p < 0.001\) denote significant group differences (post hoc, FDR corrected, \(p < 0.05\))

Table 3 Comparison of ERF component loading factors by subject subgroup

Group-Level Discriminative Components

We applied a mixed-measures two-way ANCOVA (see “Group-Level Statistical Inference of CP Component Matrices”) to the component loading factors in subject mode of each ERF component and stimulus condition to determine significant effects after controlling for the covariates. The mixed-measures two-way ANCOVA comparison of the component loading factors showed a statistically significant stimulus condition \(\times\) group interaction (see Table 2) for the Occipital/M150 (\(F_{1, 336} = 28.73, p < 0.0001, \eta ^2_{G} = 0.101\)) and R.Temporal/M300a (\(F_{1, 336} = 6.82, p = 0.03, \eta ^2_{G} = 0.098\)) components. There was no significant stimulus condition \(\times\) group interaction for the L.Central/M400 component (\(F_{2,494} = 0.79, p = 0.982, \eta ^2_{G} = 0.004\)). The main effect of subject subgroup was statistically significant for each component (Table 2; Occipital/M150: \(F_{1, 336} = 33.96, p < 0.0001, \eta ^2_{G} = 0.113\); R.Temporal/M300a: \(F_{1, 336} = 101.35, p < 0.0001, \eta ^2_{G} = 0.281\); L.Central/M400: \(F_{1, 494} = 176.73, p < 0.0001, \eta ^2_{G} = 0.311\)). Post hoc analyses with two-tailed t-tests corrected for multiple comparisons using FDR (\(p < 0.05\)) revealed six components with significant group differences (HP vs. LP) in the component loading factors. The details are shown in Table 3 and Fig. 9. Figure 10 depicts the group ERF components as solid lines (blue for HP and red for LP). The group ERF components peaked at the same time as the average group ERF timecourses, drawn as dashed lines (blue for HP and red for LP).
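
A hedged sketch of this step is given below: rather than the exact mixed-measures ANCOVA table, it fits a roughly equivalent linear mixed model with a random intercept per subject, a group × condition interaction, and hypothetical covariate columns (`age`, `ses`) in a long-format data frame; all column names are assumptions, not the study's actual variable names.

```python
import pandas as pd
import statsmodels.formula.api as smf

def fit_group_condition_model(df: pd.DataFrame):
    """Approximate the mixed-measures two-way ANCOVA with a linear mixed model.

    df: long-format frame with hypothetical columns
        subject, group (HP/LP), condition (VIS/AUD/AV), loading, age, ses.
    """
    model = smf.mixedlm("loading ~ C(group) * C(condition) + age + ses",
                        data=df, groups=df["subject"])
    return model.fit()

# result = fit_group_condition_model(df)
# print(result.summary())  # fixed effects include the group x condition interaction
```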

Occipital Component/M150

The Occipital/M150 (130–150 ms) group component is shown in Fig. 10a–b. The activity was concentrated in the left and right occipital sensors. The HP group demonstrated higher activation strength than the LP group in the VIS condition (see Fig. 10a). The main effect of group (\(N=170\), HP vs. LP) on the component loading factors was statistically significant for the VIS condition (Table 3; Figs. 9 and 10a; VIS: \(t_{165} = 7.86, p < 0.0001\), HP > LP, post hoc two-tailed t-test). However, there were no significant differences in the component loading factors of the Occipital/M150 component between the HP and LP groups for the AV condition (Table 3; Figs. 9 and 10b; AV: \(t_{165} = 0.166, p=0.864\), post hoc two-tailed t-test).

Right Temporal Component/M300a

The R.Temporal/M300a group component is shown in Fig. 11a–b. The component was associated with the peak at 280–300 ms and accounted for the activity in the right temporal and inferior left/right frontal sensors. Post hoc two-tailed t-tests found a significant main effect of group (\(N=170\), HP vs. LP) on the component loading factors for both the AUD and AV conditions (Table 3; Figs. 9 and 11a–b; AUD: \(t_{165} = 7.31, p < 0.0001\), HP > LP; AV: \(t_{165} = 5.65, p < 0.001\), HP > LP).

Late Central Component/M400

The L.Central/M400 group component is shown in Fig. 11c–e. This component was associated with activity in the left temporal-parietal and right prefrontal sensors. Post hoc two-tailed t-tests identified a significant main effect of group (\(N=170\), HP vs. LP) on the component loading factors for the VIS, AUD and AV conditions (Table 3; Figs. 9 and 11c–e; AUD: \(t_{165} = -8.5, p < 0.0001\), HP < LP; VIS: \(t_{165} = -7.22, p < 0.0001\), HP < LP; AV: \(t_{165} = -7.20, p < 0.0001\), HP < LP).

Fig. 10
figure 10

Discriminative group MEG ERF components from the CP decomposition for the VIS and AV conditions are shown. The ERF components are indicated by solid lines (HP: high performance – blue, LP: low performance – red). The average group ERF timecourses are indicated by dashed lines. The main effect of the subject group (HP vs. LP, \(N=170\)) on the component loading factors in subject mode is summarized in the boxplots. The boxplots summarize the distribution of the mean values of the component loading factors in subject mode for the HP and LP groups. The median values are represented by black lines inside the boxes, and the box edges indicate the 25th and 75th percentiles. The mean values are plotted in dark grey. The error bars represent the standard error of the mean. Post hoc analyses with two-tailed t-tests (FDR corrected, \(p < 0.05\)) indicate that the mean component loading factors of the HP group differed significantly from those of the LP group (\(p < 0.001\)) for six ERF components; only the occipital/M150 component in the AV condition did not differ (AV/M150: \(p=0.864\)). a-b Occipital M150 component. **** \(p < 0.0001\), *** \(p < 0.001\) indicate significant differences (FDR corrected, \(p < 0.05\)). The post hoc t-test results are shown in Table 3

Fig. 11
figure 11

Discriminative group MEG ERF M300a and M400 components from the CP decomposition for the VIS, AUD, and AV conditions are shown. a-b Right temporal M300a component. c-e Late central M400 component. See full caption and legend in Fig. 10

Group-Level Sensitivity Analyses

In this section, we compare the discriminative performance of the CP decomposition with that of a nonparametric method based on the permutation statistics of the ERF in sensor space. Previous studies have shown that statistical nonparametric mapping (SnPM) is a robust approach that can reliably detect ERF activity in sensor space (Pantazis et al., 2003; Nichols & Holmes, 2002).

To quantify differences between subject subgroups, group amplitudes were compared by running timepoint-by-timepoint nonparametric permutation two-tailed t-tests (Nichols & Holmes, 2002; Maris & Oostenveld, 2007) at each sensor from 0 to 800 ms poststimulus. The nonparametric statistical threshold \(t_{\mathrm {max}}\) was calculated from the permutation distribution of the maximum pseudo t-statistic to establish timepoint/sensor significance at the \(p < 0.05\) level. The sensors and timepoints identified by these t-tests denoted spatiotemporal regions where statistically significant differences in group amplitudes occurred.
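
A compact sketch of a max-statistic permutation procedure of this kind is shown below; the array shapes (subjects × sensors × timepoints) and the use of Welch's t-statistic are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def welch_t(x, y):
    """Welch's t-statistic over the subject axis (axis 0)."""
    nx, ny = x.shape[0], y.shape[0]
    vx, vy = x.var(axis=0, ddof=1), y.var(axis=0, ddof=1)
    return (x.mean(axis=0) - y.mean(axis=0)) / np.sqrt(vx / nx + vy / ny)

def max_t_threshold(hp, lp, n_perm=1000, alpha=0.05):
    """Permutation distribution of max |t| over all sensors x timepoints.

    hp, lp: hypothetical arrays of shape (subjects, sensors, timepoints).
    """
    data = np.concatenate([hp, lp], axis=0)
    n_hp = hp.shape[0]
    max_t = np.empty(n_perm)
    for i in range(n_perm):
        perm = rng.permutation(data.shape[0])
        max_t[i] = np.abs(welch_t(data[perm[:n_hp]], data[perm[n_hp:]])).max()
    return np.quantile(max_t, 1.0 - alpha)

# t_obs = welch_t(hp, lp)
# significant = np.abs(t_obs) > max_t_threshold(hp, lp)  # sensor/timepoint mask
```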

To compare the group-level sensitivity of the CP decomposition and SnPM, we computed timepoint/sensor-wise t-statistics from a two-tailed nonparametric permutation t-test and determined the significance of the group-level mean amplitudes between subject subgroups, taking the covariates into account. We evaluated the group-level sensitivity by investigating the SnPM method’s ability to discriminate between subject subgroups. We present the group-level ERF components and the statistical images (T-maps) after the CP and SnPM analyses in Fig. 12 and Supplementary Figs. S.8-S.13. The SnPM identified five significant components: Occipital/M150 in the VIS condition (Supplementary Table S.4; VIS: \(p< 0.001; t_{165} = 6.32\)), R.Temporal/M300a in the AUD and AV conditions (Supplementary Table S.4; AUD: \(p=0.022; t_{165} = 2.31\), AV: \(p=0.038; t_{165} = 2.09\)), and L.Central/M400 in the VIS and AUD conditions (Supplementary Table S.4; VIS: \(p=0.001; t_{165} = -3.76\); AUD: \(p=0.002; t_{165} = -3.41\)).

The ERF components and T-maps generated by the CP and SnPM methods for the Occipital/M150 component in the VIS condition at 144-145 ms are shown in Fig. 12. The results illustrate that the CP decomposition yielded a higher number of significant sensors than the SnPM method. We note a similar observation for the other ERF components presented in Supplementary Figs. S.8-S.13. It is evident from Fig. 12 and Supplementary Figs. S.8-S.13 that the number of adjacent significant sensors is smaller for the ERF components produced by the SnPM. We quantified the performance of each component estimation method using Cohen’s d effect size (the standardized magnitude of the difference between groups (Sullivan & Feinn, 2012)) and p-values. Fig. 13 and Table 4 show that the CP decomposition resulted in a larger magnitude of the group differences and lower p-values compared with the SnPM method. In summary, the results presented in Table 4, Figs. 12 and 13, and Supplementary Figs. S.8-S.13 demonstrate that the t-statistics and the magnitude of the effect are higher for the CP decomposition, suggesting better sensitivity than the nonparametric statistical approach.
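
For reference, the effect size used here can be computed as a pooled-standard-deviation Cohen's d; the sketch below takes two hypothetical arrays of per-subject values (e.g. loading factors or mean amplitudes) for the HP and LP groups.

```python
import numpy as np

def cohens_d(x, y):
    """Cohen's d for two independent groups using the pooled standard deviation."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * np.var(x, ddof=1) + (ny - 1) * np.var(y, ddof=1)) / (nx + ny - 2)
    return (np.mean(x) - np.mean(y)) / np.sqrt(pooled_var)

# d = cohens_d(hp_values, lp_values)   # hypothetical per-group value arrays
```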

Fig. 12
figure 12

Sensitivity analysis of the ERF components generated with different group imaging methods. Top row: estimation of the Occipital/M150 component for the VIS condition by the CP decomposition and SnPM methods. Bottom row: the group-level T-maps between the HP and LP groups for the CP and SnPM methods. The T-maps (nonparametric permutation two-tailed t-test with the maximum t-statistic) are thresholded at \(p< 0.05\). The yellow circles on the scalp maps show the locations of significant sensors. a CP VIS M150 component. b SnPM VIS M150 component. The significant time interval of group differences (140-150 ms) is depicted by the shaded area. e Left: CP VIS M150 T-map at 145 ms. Right: SnPM VIS M150 T-map at 144 ms

Fig. 13
figure 13

Effect size comparison of the ERF components generated with different group imaging methods. Cohen’s d effect sizes of the group-level discriminative (HP vs. LP) components using the CP decomposition and the SnPM method for the AUD, AV, and VIS conditions. The error bars represent the standard error of the mean. The effect size results are listed in Table 4

Table 4 Effect size comparison of group-level discriminative components after the CP and SnPM methods
Table 5 Component loading factor and cognitive (T) score associations

Analysis of ERF Component Association with Cognitive Domains

To correlate neuropsychological scores with ERF components, we performed two-tailed partial Pearson’s correlation tests between the component loading factors in subject mode (columns of matrix \(\mathbf {A}\)) and the nine age-adjusted neuropsychological (T) scores in each subject group (HP: \(n=89\); LP: \(n=81\)) and the full sample (\(N = 170\)) (see the detailed protocol in “Correlation Analysis between Component Loading Factors and Neuropsychological (T) Scores”). The partial correlation analyses were controlled for age, gender and parental SES. The correlations between component loading factors and cognitive scores were corrected for multiple comparisons using FDR with a significance threshold of \(p < 0.05\).
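
A minimal sketch of the partial correlation, assuming hypothetical 1-D arrays for a component's loading factors (`x`) and a (T) score (`y`) plus a covariate matrix (`covars` with age, gender and SES columns), is given below; it residualizes both variables on the covariates and adjusts the degrees of freedom accordingly.

```python
import numpy as np
from scipy import stats

def partial_pearson(x, y, covars):
    """Partial Pearson correlation between x and y controlling for covars.

    x, y: 1-D arrays (n subjects); covars: (n x k) array, e.g. age, gender, SES.
    """
    Z = np.column_stack([np.ones(len(x)), covars])
    rx = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]   # residualize x on covariates
    ry = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]   # residualize y on covariates
    r = stats.pearsonr(rx, ry)[0]
    dof = len(x) - 2 - covars.shape[1]                  # adjust df for covariates
    t = r * np.sqrt(dof / (1.0 - r ** 2))
    p = 2.0 * stats.t.sf(abs(t), dof)                   # two-tailed p-value
    return r, p
```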

The correlation analyses indicated that three functional ERF components were significantly associated with the language, episodic memory, executive function and attention cognitive domains (see Table 5). The results of the two-tailed partial Pearson’s correlation analyses are summarized in Table 5 and Figs. 14 and 15.

Among the nine cognitive variables, the PICVOCAB (T), PSM (T), DCCS (T), and Conners 3 hyperactivity scores were significantly correlated with ERF components. The significant functional association patterns between the ERF components and the cognitive scores are presented in Table 5.

The Occipital/M150 component in the VIS condition (Figs. 6a and 10a) was negatively correlated with the hyperactivity score in the HP group (\(r_{\mathrm {VIS-HYPERACTIVITY}}(86)=-0.233, p=0.03\); Table 5; Fig. 15b). However, the correlation was not significant (\(r_{\mathrm {VIS-HYPERACTIVITY}}(167) = -0.194, p = 0.051\); Fig. 14a) in the full sample. The correlation was consistent with large group differences (HP > LP; \(p < 0.0001\)) in the occipital component loading factors identified by two-tailed t-tests (see Table 3).

The R.Temporal/M300a component (Figs. 6c–d and 11a–b) had a statistically significant positive correlation with the PSM (T) score in both groups (Table 5; Fig. 15c–d; HP: \(r_{\mathrm {AUD-PSM}}(86)=0.342, p=0.003\), \(r_{\mathrm {AV-PSM}}(86)=0.364, p=0.001\); LP: \(r_{\mathrm {AUD-PSM}}(78) = 0.291, p = 0.039\), \(r_{\mathrm {AV-PSM}}(78) = 0.297, p = 0.016\)) and the full sample (\(r_{\mathrm {AUD-PSM}}(167) = 0.239, p = 0.004\); \(r_{\mathrm {AV-PSM}}(167) = 0.303, p = 0.001\); Fig. 14c–d).

Additionally, the correlation tests revealed significant associations between the R.Temporal/M300a component and the DCCS (T) score (see Table 5 and Fig. 15e–f) in the HP group (AUD: \(r_{\mathrm {AUD-DCCS}}(86) = 0.267, p = 0.018\); AV: \(r_{\mathrm {AV-DCCS}}(86) = 0.277, p = 0.014\)). The correlations between the R.Temporal/M300a component and the PSM/DCCS scores were consistent with the group differences (HP > LP; \(p < 0.001\)) in the component loading factors for the AUD and AV conditions (see Table 3).

The L.Central/M400 component in the VIS condition (Figs. 6e and 11c) was negatively correlated with the PICVOCAB (T) score in the HP group (\(r_{\mathrm {VIS-PICVOCAB}}(86) = -0.242, p = 0.013\); Table 5; Fig. 15a) and the full sample (\(r_{\mathrm {VIS-PICVOCAB}}(167) = -0.208, p = 0.017\); Fig. 14b).

No significant associations were found between the component loading factors and the remaining cognitive scores in the HP group, LP group or full sample (\(p > 0.05\) for all tests). There were no significant differences in the partial correlations between the groups (HP vs. LP) or between stimulus conditions (VIS vs. AUD vs. AV) within subject groups (\(p > 0.05\) for all tests).

Fig. 14
figure 14

Significant (FDR corrected, \(p < 0.05\)) two-tailed partial Pearson’s correlations (correlation coefficient, r) between ERF components and neuropsychological (T) scores in the full sample. The linear fit and \(95\%\) confidence intervals (CIs) are shown. The blue dots denote HP group, and red dots denote LP group. a The occipital component was negatively correlated with the hyperactivity score in the VIS condition. b The late central component was negatively correlated with the PICVOCAB score in the VIS condition. c-d The right temporal component was positively correlated with the PSM (T) score in the AUD and AV conditions

Fig. 15
figure 15

Significant (FDR corrected, \(p < 0.05\)) two-tailed partial Pearson’s correlations (correlation coefficient, r) between ERF components and neuropsychological (T) scores in the HP and LP groups (HP: blue dots, LP: red dots). The group linear fit and \(95\%\) CIs are shown (HP: blue line, LP: red line). a The late central (VIS) component was negatively correlated with the PICVOCAB score and significant in the HP group. b The occipital (VIS) component was negatively correlated with the hyperactivity score and significant in the HP group. c-d The right temporal component (AUD/AV) was positively correlated with the PSM score and significant in both groups. e-f The right temporal component (AUD/AV) was positively correlated with the DCCS score, significant in the HP group and showed a negative trend in the LP group. The correlation results are listed in Table 5

Discussion

This paper presents a tensor analysis-based model of MEG multi-subject data for identifying ERF components representative of typical brain developmental patterns in a healthy population of children and adolescents. The tensor analyses and tensor-based group-level statistical inferences outlined in this paper establish a foundational framework for extracting latent factors associated with children’s brain development from MEG datasets.

We contribute to the developmental neuroscience literature on the relationship between MEG activity and cognition by correlating ERF components from a healthy pediatric population with neuropsychological age-adjusted cognitive (T) scores and attentional indices.

ERF Components Extraction

To the best of our knowledge, this is the first study to model event-related field MEG data as a low-rank third-order tensor. We demonstrated that CP factorization can produce latent factors that result in functionally relevant ERF components and reveal meaningful spatiotemporal brain patterns. The CP model was shown to be highly effective in capturing an informative representation of the data. For example, Fig. 6 and Supplementary Table S.2 illustrate that the ERF components were well matched with the average ERF waveforms and demonstrated significant correlations (\(p < 0.001\)) with the original datasets (see Supplementary Methods Sect. 3.2; Supplementary Table S.3). The tensor analysis successfully identified latent brain developmental patterns across subjects for each stimulus condition. As expected, the Occipital/M150 component, similar to the visual P100/M100 wave, was extracted from the VIS and AV conditions, as shown in Fig. 6a–b. We observed a P300-like response in the R.Temporal/M300a component (see Fig. 6c–d) extracted from the AUD and AV conditions. The absence of the visual Occipital/M150 component in the AUD condition and the absence of the R.Temporal/M300a component in the VIS condition confirm that the CP model can extract meaningful patterns corresponding to the expected ERF responses. The L.Central/M400 component (Fig. 6e–i) was extracted from the AUD, VIS, and AV conditions, representing activity in the left temporal-parietal sensors and likely capturing the motor response required in all three conditions.
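
As an illustration of this modeling step, the sketch below fits a CP model to a hypothetical ERF tensor with tensorly; the tensor shape (subjects × timepoints × sensors), the rank of 3, and the random data are assumptions standing in for the preprocessed MEG data.

```python
import numpy as np
import tensorly as tl
from tensorly.decomposition import parafac

# hypothetical ERF tensor: 170 subjects x 800 timepoints x 204 sensors
X = np.random.default_rng(0).normal(size=(170, 800, 204))

# rank-3 CP decomposition: A holds subject loadings, B temporal signatures,
# C spatial (sensor) signatures
cp = parafac(tl.tensor(X), rank=3, init="random", n_iter_max=200, tol=1e-8, random_state=0)
A, B, C = cp.factors
```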

In the present study, we found the M50/M100 subcomponents to be less dominant in terms of amplitude for the AUD and AV conditions. Our results agree with previous MEG studies (Kotecha et al., 2009; Edgar et al., 2014; Cardy et al., 2004), which reported that children do not show the adult-like M50/M100 waveforms until early adolescence. It was shown in (Kotecha et al., 2009; Bruneau et al., 1997; Ponton et al., 2000) that the amplitude of the auditory component becomes more prominent with increasing age and remains stable through adulthood. Our future work may include longitudinal studies of the same MEG dataset, in which we can explore how the latency and amplitude of the early auditory components depend on age.

Comparison of Group-Level Sensitivity Analyses

A significant problem in MEG research is the detection of significant effects while controlling the family-wise error rate (FWER). In “Group‑Level Sensitivity Analyses”, we compared the sensitivity and statistical power of the CP tensor decomposition model and a conventional nonparametric statistical approach based on permutation testing. We demonstrated that the CP model provides 1.5–2 times higher effect sizes and lower p-values (see Fig. 13). The lower sensitivity of the SnPM method is most likely caused by the higher significance threshold required to control the FWER. The better performance of the CP model is thus associated with a lower rate of false negatives. In this way, tensor-based group-level inference alleviates the multiple comparison problem through dimensionality reduction and provides higher statistical power.

Functional Associations of Group-Discriminating Components

The CP decomposition of the multi-subject MEG data provides insight into how tensor analysis can be used to explore relationships between brain patterns and cognitive function in high- and low-performance subjects. The statistical tests confirmed the effect of cognitive subgroup membership on the component loading factors (see “Statistical Group-Level Analysis”). We showed that children in the HP group differed significantly from those in the LP group (HP vs. LP; Table 3) in terms of six components, with large effect sizes (\(\eta _{G}^2 > 0.113\), see “Group-Level Discriminative Components”). The group differences (HP > LP) were consistent with the neuropsychological (T) score t-tests, with the HP group scoring significantly higher than the LP group on all cognitive and behavioral tests, except for the Conners 3 inattention and hyperactivity scores. The subjects in the HP and LP groups did not differ in the spatial distributions of the components; however, they demonstrated significant differences in spatial activation strength and timecourse amplitude.

To identify ERF components as informative signatures of cognitive function, we correlated the ERF components with the neuropsychological (T) scores in the full sample and in each subject group. In the full sample, we found statistically significant correlations between specific ERF components and the PSM (T), PICVOCAB (T) and Conners 3 hyperactivity scores (see details in “Analysis of ERF Component Association with Cognitive Domains”).

The correlation analyses between the ERF components and neuropsychological scores revealed significant associations between ERF components and PSM (T) score in both the HP and LP pediatric groups, whereas the PICVOCAB (T), DCCS (T) and hyperactivity scores were significantly correlated with ERF components only in the HP group.

It has been shown in the literature that the PSM test measures episodic memory (Dikmen et al., 2014) and that the DCCS test measures cognitive flexibility and executive function (Weintraub et al., 2013). The PICVOCAB test measures verbal ability and language comprehension (Weintraub et al., 2013), and the Conners 3 hyperactivity score is an attentional index (Conners, 2008). These indices reflect foundational cognitive processes that change rapidly during development and vary across individuals.

Early Latency Components

The hyperactivity score was negatively correlated with the Occipital/M150 component in the VIS condition in the full sample. In addition, the analyses revealed a significant negative correlation between the hyperactivity score and the Occipital/M150 component in the HP group. Finally, as shown in Supplementary Table S.1, the hyperactivity score was significantly lower in the HP group than in the LP group. These findings indicate that the Occipital/M150 component patterns were consistent between the correlation results and the ERF component group-level analyses (HP > LP; Table 3), as well as with the group-level neuropsychological (T) score analyses (Supplementary Table S.1). It has been shown in the literature (Sokhadze et al., 2017; Ghani et al., 2020) that early and mid-latency ERP components (N100, N200, and P200) are related to involuntary attention selection mechanisms. Our findings are consistent with studies of healthy individuals (Kramer et al., 1995; Ghani et al., 2021; Allison & Polich, 2008) and children with ADHD (Liotti et al., 2010; van Meel et al., 2007), which reported lower early-latency component amplitudes with increasing cognitive workload and in patients relative to healthy controls.

Late Latency Components

The right temporal/M300a component in the AUD and AV conditions was positively associated with the PSM (T) score (\(p < 0.01\)) in the HP group, the LP group, and the full sample. The partial correlation analyses for the PSM (T) score shown in Figs. 14c–d and 15c–d suggest that an increase in the component loading factors was associated with a better PSM (T) score. Additionally, the right temporal/M300a component was significantly positively correlated with the DCCS (T) score in the HP group, which is consistent with the component loading factor group difference (HP > LP; Table 3). Similarly, the correlation analyses for the DCCS (T) score shown in Fig. 15e–f and Table 5 suggest that an increase in the component loading factors was associated with a higher DCCS (T) score. The HP group scored significantly higher (Supplementary Table S.1; \(p < 0.0001\)) than the LP group on both the PSM (T) and DCCS (T) scores. According to the literature, the P300a/M300a component amplitude represents an orienting response, reflecting involuntary orientation to attention-catching changes (Sur & Sinha, 2009). Additionally, the P300a/M300a component has been categorized as an indicator of implicit memory and item familiarity (Friedman & Johnson, 2000; Graf & Schacter, 1985; Rugg et al., 1998). The existing literature indicates that the P300a/M300a component can index sustained attention and that its amplitude decreases with increasing cognitive workload (Berti & Schröger, 2003; Dyke et al., 2015; Horat et al., 2016).

The late central/M400 component showed a significant negative correlation with the PICVOCAB (T) score in the HP group and the full sample. Similar to the N400 ERP component (Kutas & Federmeier, 2000), its spatial scalp distribution was maximally concentrated in the left temporal-parietal sensors.

The association of the late central/M400 component with the PICVOCAB (T) score (see Figs. 14b and 15a) indicates that a reduced M400 component amplitude was associated with a higher PICVOCAB (T) score. The literature indicates (Fitz & Chang, 2019) that the N400 ERP component may reflect prediction error signals needed for learning; thus, larger N400 amplitudes could reflect larger prediction errors.

Comparison of Group-Level Imaging Methods

In group-level brain imaging studies, the goal is to determine spatiotemporal patterns of variability between or among groups or conditions. The sensitivity and statistical power of group-level inferences depend on the stability and unique presentation of these patterns in determining where and when a specific brain activity occurs.

In MEG research, the most common approach to identifying the location of brain activity is to employ mass-univariate hypothesis testing methods (Groppe et al., 2011). Mass-univariate hypothesis testing is based on executing multiple tests, most often computing a parametric or nonparametric t-test for each timepoint/sensor. However, mass-univariate analyses in MEG have many shortcomings: (1) the high dimensionality of the data requires a large number of tests corrected for multiple comparisons; (2) sources of brain activity potentially overlap; (3) interactions between timepoints/sensors are not taken into account; and (4) peak/mean amplitude measures are sensitive to the choice of analysis window (Luck & Gaspelin, 2017). Nonparametric approaches based on randomized permutation and cluster-based permutation tests (Groppe et al., 2011; Maris & Oostenveld, 2007) have been developed that inherently address multiple comparison problems (Westfall & Young, 1993) and locate the spatiotemporal effect of interest. However, an important drawback of nonparametric statistics is that as the number of tests increases, the power of the permutation test diminishes due to an overly conservative estimate of the significance threshold (Groppe et al., 2011). Thus, with an increase in the dimensionality of multi-subject MEG data, the strong FWER control of the permutation method may impact the sensitivity of the analyses, resulting in Type II errors.

A more promising way to overcome the shortcomings of mass-univariate approaches is to use an effective multivariate approach to summarize the data. Group-level tensor decomposition is a multivariate latent-space group-analysis technique that has been shown to be capable of (1) localizing common, unique patterns of brain activity for a group of subjects in a data-driven way (Cong et al., 2012; Wang et al., 2018; Tangwiriyasakul et al., 2019); (2) dimensionality reduction (Cichocki et al., 2016); (3) extracting region-of-interest-independent signatures for group-level inferences (Cong et al., 2012; Tangwiriyasakul et al., 2019; Acar et al., 2017); (4) inherently alleviating the multiple-comparison problem; and (5) achieving higher sensitivity by capturing complex spatiotemporal interactions (Acar et al., 2019; Kinney-Lang et al., 2017, 2019).

Next, we discuss the differences in the statistical assessment of the CP model and conventional sensor-/source-level imaging methods. For the discussion below, we assume a time-frequency source reconstruction in which the subject datasets \(\mathbf {S} \in \mathbb {R}^{C \times T \times F}\) are joined in the subject mode, forming a fourth-order tensor \(\varvec{\mathcal {X}} \in \mathbb {R}^{K \times T \times C \times F}\) (\(\mathbb {R}^{\mathrm {subject} \times \mathrm {time} \times \mathrm {sensor} \times \mathrm {frequency}}\)).
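
Under this assumption, stacking the per-subject time-frequency arrays into the fourth-order tensor amounts to adding a subject mode and reordering the axes, as in the short numpy sketch below (the array shapes are hypothetical).

```python
import numpy as np

# hypothetical per-subject STF arrays S_k of shape (C, T, F) = (sensors, time, freqs)
subject_stf = [np.random.default_rng(k).normal(size=(204, 400, 40)) for k in range(170)]

X4 = np.stack(subject_stf, axis=0)        # (K, C, T, F): add the subject mode
X4 = np.transpose(X4, (0, 2, 1, 3))       # (K, T, C, F): subject x time x sensor x freq
```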

Tensor Analysis in the Sensor/Source Level Space

After the MEG/EEG data are factorized by the CP model, the underlying component matrices can be readily analyzed by group-level statistical inference algorithms (Cong et al., 2015). Since the CP model performs simultaneous factorization and is fully multivariate, each factor of a latent CP component is identified at all levels of the other factors.

Hence, the magnitude of the underlying CP component is quantified at each timepoint and sensor in the sensor-level space, which eliminates the need to select specific timepoints and sensor sites (timepoints, sensors and frequencies at the source level) for group amplitude extraction in the group-level inferences. Thus, statistical inference can be applied directly to the selected component signatures. For example, as shown in Fig. 3c, to determine the discriminative groups in the subject mode, the rth subject loading factor \(\mathbf {a}_r \in \mathbb {R}^{K}\) is used in multifactorial ANCOVAs to evaluate the experimental conditions. Similarly, the spatial \(\mathbf {c}_r \in \mathbb {R}^{C}\), temporal \(\mathbf {b}_r \in \mathbb {R}^{T}\) or frequency \(\mathbf {f}_r \in \mathbb {R}^{F}\) signatures can be statistically evaluated to determine the significance of the spatiotemporal extent or frequency bands. Notably, due to dimensionality reduction, the required number of statistical tests is dramatically reduced, since only a limited number of samples (on the order of \(10^2\)) is used from each signature instead of all spatiotemporal/space-time-frequency features (on the order of \(10^6\) or higher).
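
The sketch below illustrates this idea on hypothetical CP factor matrices: for the rth component, the rank-one spatiotemporal signature is recovered from \(\mathbf{b}_r\) and \(\mathbf{c}_r\), while a single test on \(\mathbf{a}_r\) replaces the \(T \times C\) univariate tests (a Welch t-test stands in here for the fuller ANCOVA described above).

```python
import numpy as np
from scipy import stats

def component_group_test(A, B, C, r, hp_mask):
    """Group inference on the r-th CP component's subject-mode signature.

    A (K x R), B (T x R), C (C x R): hypothetical CP factor matrices;
    hp_mask: boolean array of length K marking HP subjects.
    """
    a_r, b_r, c_r = A[:, r], B[:, r], C[:, r]
    spatiotemporal = np.outer(b_r, c_r)                  # rank-one time x sensor map
    t, p = stats.ttest_ind(a_r[hp_mask], a_r[~hp_mask],  # one test instead of T*C tests
                           equal_var=False)
    return spatiotemporal, t, p
```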

Thus, the CP model reduces the problem of multiple comparisons in the group-level analyses since the extracted component signatures are used to determine discriminatory ERF components. As a result, the CP model could provide a higher sensitivity by reducing Type II error. In contrast, the univariate and nonparametric tests may fail to find a significant effect if they are applied to the full \(\mathrm {subject} \times \mathrm {time} \times \mathrm {sensor}\) data cube.

In summary, compared with univariate parametric and nonparametric statistical methods, the CP tensor decomposition offers the following advantages: data-driven source separation, a region-of-interest-independent measure for group-level analyses, identification of common spatiotemporal patterns for a group of subjects, and alleviation of the multiple comparison problem due to dimensionality reduction, which can yield higher statistical power and better sensitivity, as shown in “Group-Level Sensitivity Analyses” and “Comparison of Group-Level Sensitivity Analyses”.

Tensor Analysis for Source Localization

The localization of brain sources based on MEG/EEG recordings has been an ongoing topic of active research due to increased demand in clinical applications (Asadzadeh et al., 2020).

In the past decade, several works have proposed tensor-based preprocessing (De Vos et al., 2007; Mørup et al., 2006; Becker et al., 2014a, b) for source localization. The proposed tensor-based source localization approaches are primarily based on transforming the evoked field data in the sensor space using a space-time-frequency (STF) or space-time-wave-vector (STWV) transform and subsequently applying the CP decomposition to the STF- or STWV-transformed data. The details of the transformations are described in (Becker et al., 2014b). The tensor group-level analysis in the source space would be similar to the tensor analysis in the sensor space (see “Multi-Subject MEG Tensor Decomposition”, “Statistical Group-Level Analysis” and “Tensor Analysis in the Sensor/Source Level Space”). As suggested in (Becker et al., 2014b), in order to fit the dipole model, the STF data tensor of each subject should be constructed with one source per time and frequency under the hypothesis of oscillatory signals. We refer readers to the existing key papers (Becker et al., 2014a; Asadzadeh et al., 2020) for the history and various applications of these methods.

Limitations and Future Work

The proposed generative model using CP decomposition implies that all subjects have the same number of latent components R and that all subjects share the common matrices \(\mathbf {B}\) and \(\mathbf {C}\). In other words, the CP model imposes the strict assumption that the underlying brain patterns have identical timecourses and spatial maps across subjects. However, with real ERF MEG data, individual differences may exist in the timing and origin of the subjects' neural responses to the stimuli. For example, individual differences in the cognitive processing of the stimuli would result in differences in the timing and spatial distribution of the MEG responses. To allow a variable number of components and spatiotemporal variability of brain patterns across subjects, a more flexible model can be used, such as constrained PARAFAC2 (Parallel Factor Analysis 2) (Afshar et al., 2018; Helwig & Snodgress, 2019; Chatzichristos et al., 2019) or higher-order block term decomposition (BTD2) (Chatzichristos et al., 2019). It has been shown in (Harshman et al., 1972; Helwig & Snodgress, 2019) that PARAFAC2 can handle the heterogeneity of subjects' responses and allows a variable number of latent components per subject (Afshar et al., 2018) via sparsity constraints. To address this limitation, our future work may include using the PARAFAC2 or BTD2 models to account for subjects' individual differences.
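
As a sketch of how such a relaxation might look in practice, the snippet below runs a PARAFAC2 decomposition with tensorly on hypothetical per-subject ERF matrices; the slice shapes, rank, and random data are assumptions, and the constrained or sparsity-regularized variants cited above are not reproduced.

```python
import numpy as np
from tensorly.decomposition import parafac2

# hypothetical per-subject ERF matrices (timepoints x sensors); PARAFAC2 shares the
# sensor-mode factor across subjects while allowing subject-specific temporal profiles
rng = np.random.default_rng(0)
slices = [rng.normal(size=(800, 204)) for _ in range(170)]

p2 = parafac2(slices, rank=3, n_iter_max=500, init="random", random_state=0)
weights, factors, projections = p2   # shared factors plus per-subject projections
```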

Another limitation of our generative model is that each stimulus condition is fitted as a separate CP decomposition. Alternatively, the multi-task multi-subject MEG data could be modeled as a coupled tensor-tensor decomposition (CTTD) (Chatzichristos et al., 2022; Jonmohamadi et al., 2020), in which each stimulus condition (VIS, AUD, and AV) is represented as a third-order tensor and coupled in the subject mode. Multi-task joint learning enables the use of complementary information (Lahat et al., 2015; Acar et al., 2013) from multiple stimuli and thus could yield latent components with higher discriminative power.

The extracted ERF components could be used as bioimaging markers for classification or prediction. Specifically, the subject loading factors obtained from the MEG data using the CP model can be interpreted as extracted features. The combination of machine learning techniques and multi-task tensor decomposition of MEG data could identify more reliable bioimaging markers that may enable the exploration of neurological differences associated with symptom onset, enabling early intervention. Thus, the application of multi-subject MEG tensor decomposition in the context of machine learning is a promising direction for future research in cognitive neuroscience.

Conclusion

We demonstrated that CP decomposition can be used for the effective identification and characterization of latent spatiotemporal components of multi-subject MEG data. We described the generative model for the multidimensional representation of multi-subject MEG data, latent component extraction and group-level statistical inference methodologies. We demonstrated that group-level tensor decomposition recovers meaningful, distinct brain patterns of varying spatiotemporal brain activity across subjects in a healthy population of children and adolescents and in its subgroups. The advantages of the proposed method include the identification of the underlying latent brain patterns in the form of factor matrices via tensor factorization, which allows for statistical assessment of the identified sources. The presented tensor-based group-level inference using CP component matrices eliminates the need to select specific regions of interest, such as time windows or specific sensor sites.

Using the proposed approach, we show that the tensor group-level analyses and tensor-based feature extraction allow us to investigate differences in brain activity between different subject groups. Given the importance of group-level inferences in neuroimaging studies, the extracted latent ERF components could be used to study differences in brain patterns across groups and aid in understanding how spatiotemporal brain activity can explain cognitive function and developmental changes directly from electrophysiological measurements. The application of MEG tensor decomposition used in this study is a promising direction for future research on other populations with different age ranges or developmental disorders.

Information Sharing Statement

The code used in this manuscript can be found at https://github.com/ibelyaeva/meg-tensor-decomposition.