Introduction

Multimedia learning theories are expanding rapidly with the increased use of educational technology, which offers more opportunities to design learning environments with a wide range of engaging audiovisual channels (Liu et al., 2018; Mayer & Moreno, 2003). These technologies deliver large amounts of information through a combination of text and visuals, creating a complex cognitive load that is not readily apparent (Brunken et al., 2003; Sweller, 1994). Because this load is intricate and implicit, educational researchers need advanced designs and techniques to assess it rather than relying only on subjective measures, which are fraught with limitations (Anmarkrud et al., 2019; Brunken et al., 2003). Given ongoing advances in technology, these challenges are expected to be mitigated by using EEG as a direct physiological measure (Wickens et al., 2013). Nonetheless, EEG-based research faces several noteworthy issues. First, researchers have captured different EEG frequency bands using electrodes in different brain regions. Second, cognitive load studies have been conducted in specific environmental and technological contexts, and their findings vary considerably. Therefore, this study adopted a meta-analytical approach to quantitatively review and integrate the findings of the existing literature, to organize the EEG measurement indices used in multimedia learning environments into a coherent system, and to analyze the factors that may affect experimental results.

An overview of EEG measurements of cognitive load

Multimedia learning theory posits that learners amalgamate information from speech and images with existing knowledge to construct new schemas. This process requires cognitive engagement (Mayer & Moreno, 1998, 2003). This process is obviously complex in multimedia learning environments, which provide extensive access to information, and learners have to invest in cognitive resources to process, integrate, think creatively, solve problems and organize their knowledge in order to complete generative learning tasks. Therefore, cognitive load measurement is well-suited for studying multimedia learning, offering a framework to comprehend and quantify cognitive load in this context (Mutlu-Bayraktar et al., 2019; van Merrienboer & Sweller, 2010; Paas et al., 2003).

EEG is a direct physiological measure of cognitive load (Taylor et al., 2010). Like techniques such as ECG and fMRI, EEG has been widely used in multimedia learning environments. The most common approach to analyzing cognitive load with EEG is frequency domain analysis, which transforms the signal into the frequency domain to reveal its frequency components and the relationships among them. Frequency domain analysis yields information about brain activity such as frequency characteristics and the distribution of energy across frequency bands (Kramer, 1991).

Many studies have used EEG spectral power to measure cognitive load in multimedia learning under different conditions, and after organizing them, we found that they include the following three main aspects:

(1) Instructional resource characteristics: examining how the design of learning resources influences cognitive load, such as the impact of text or image size and placement (Liu et al., 2021) and the role of the teacher’s hand gestures in lecture-type videos (Pi et al., 2022).

(2) Learner characteristics: exploring the relationship between cognitive load and different learning styles, such as generative vs. self-explanatory learning (Pi et al., 2021) and passive vs. generative learning (Pi, Zhang, Liu et al., 2023), as well as learners’ psychological characteristics, such as anxiety levels (Rajendran et al., 2022), and their level of specialization in the subject matter (Bilalić & Campitelli, 2018).

(3) Learning environment characteristics: investigating how the presence of a teacher (Wang et al., 2020), the presence of peers, and external interference can influence cognitive load (Nigbur et al., 2011).

While the specific goals of these studies differ, they share a common technical foundation: the use of EEG metrics to gauge learners’ cognitive load levels.

Although the aforementioned studies are informative, their aims and results are intricate, many possible moderating factors in the experimental process may affect the validity of the results, and no meta-analysis has specifically addressed the relationship between EEG spectral power and cognitive load in multimedia learning. Therefore, the primary aim of this quantitative review is to synthesize the results of the literature in order to quantify the effect of cognitive load on different EEG spectral powers, clarify the system of EEG measurement metrics as a whole, analyze the factors that may affect experimental results, and compare the spectral powers to determine which are more suitable for measuring cognitive load. The current meta-analysis considers the following moderating variables: frequencies, electrode position, number of electrodes, number of tasks, interactivity, learning resource, gender, age, and sample size.

Moderator variables in EEG measurements of cognitive load

Frequencies

Common frequency bands include δ (0.5–4 Hz), θ (4–8 Hz), α (8–13 Hz), β (13–30 Hz), and γ (above 30 Hz) (Staufenbiel et al., 2014). Increases in θ have been associated with high working memory activity and with successful memory encoding and retrieval (Miller et al., 2018). Increases in α reflect inhibition of non-task-related processes, whereas reductions in α reflect attentional concentration (Klimesch et al., 2007). Lobier et al. (2018) also found that the low α (8–10 Hz) and high α (10–13 Hz) subbands in occipital-parietal regions play different roles: low α is associated with general attentional demands, and high α with the coordination and modulation of neuronal processing in the frontoparietal and visual systems. Studies have confirmed the involvement of β in working memory, action processing, and attentional demands (Brinkman et al., 2014; Chen & Huang, 2016; Ray & Cole, 1985). The subbands of β have also been studied: Hanslmayr et al. (2009) found that an increase in β1 (13–20 Hz) helps to protect working memory representations, and Pavlov and Kotchoubey (2017) found that an increase in β2 (20–30 Hz) indicates increased information processing in learners. Furthermore, because increases in mental load are associated with decreases in α power and increases in θ power, as described above, the θ / α ratio has also been used to assess mental workload (Fernandez Rojas et al., 2020; Fritz et al., 2014). Ratios involving β have been used as well, although less commonly; for example, β / (θ + α) has been used to assess engagement or task difficulty (Kramer, 1991). The choice among these frequencies, and among the ratios derived from them, can therefore strongly influence measures of cognitive load.
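To make the band definitions and ratio indices above concrete, the following is a minimal sketch of how band power, θ / α, and β / (θ + α) might be computed from a single-channel signal. It is an illustration only, not the pipeline of any reviewed study: the function names, the half-open band edges, the synthetic signal, and the use of a plain discrete Fourier transform (rather than a Welch estimate from a signal-processing library, which would be more typical in practice) are all assumptions.

```python
import cmath
import math

# Band limits in Hz as listed in the text (gamma omitted for brevity).
# Treating each band as half-open [low, high) is an assumption of this sketch.
BANDS = {"delta": (0.5, 4.0), "theta": (4.0, 8.0),
         "alpha": (8.0, 13.0), "beta": (13.0, 30.0)}


def band_power(signal, fs, low, high):
    """Mean-square power of `signal` in [low, high) Hz via a direct DFT."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [v - mean for v in signal]
    power = 0.0
    # Positive-frequency bins only; the Nyquist bin is skipped for simplicity.
    for k in range(1, (n + 1) // 2):
        freq = k * fs / n
        if low <= freq < high:
            coef = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                       for t, x in enumerate(centered))
            power += 2.0 * abs(coef) ** 2 / (n * n)
    return power


def workload_indices(signal, fs):
    """Ratio indices mentioned in the text, computed from band powers."""
    p = {name: band_power(signal, fs, lo, hi)
         for name, (lo, hi) in BANDS.items()}
    return {
        "theta_alpha": p["theta"] / p["alpha"],
        "beta_over_theta_plus_alpha": p["beta"] / (p["theta"] + p["alpha"]),
    }


# Hypothetical 2 s recording at 128 Hz: a 6 Hz theta component (amplitude 2)
# plus a 10 Hz alpha component (amplitude 1). Real EEG is far noisier.
fs = 128
samples = [2.0 * math.sin(2 * math.pi * 6 * t / fs)
           + 1.0 * math.sin(2 * math.pi * 10 * t / fs)
           for t in range(2 * fs)]
theta = band_power(samples, fs, *BANDS["theta"])
alpha = band_power(samples, fs, *BANDS["alpha"])
indices = workload_indices(samples, fs)
```

In this synthetic case the θ component carries four times the power of the α component, so the θ / α index is 4; a rise in θ or a drop in α under higher load would push the index upward.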

In addition, the spectral power measured at a given frequency varies with electrode position. θ waves are more sensitive to cognitive processing activity in the frontal and central regions than in other regions, whereas α waves are more sensitive in the parieto-occipital region. However, many studies that use EEG to analyze cognition or attention in multimedia learning environments do not measure the spectral power of a single region; instead, they acquire and combine data from electrodes distributed across multiple brain regions (Suzuki et al., 2023). Therefore, we hypothesized that, at a given frequency, the choice of electrode regions would affect measures of cognitive load.

Electrode position

During an experiment, electrodes placed on the scalp or in an electrode cap capture signals from different positions, which may lead to differences in measurement results. Many studies have examined EEG frequencies recorded at key electrode sites such as the frontal midline, parietal midline, and occipital midline, as these sites offer more comprehensive insights into cognitive processes (Eschmann et al., 2018; Meltzer et al., 2007, 2008). We therefore used electrode position as a grouping condition in the moderation analysis, comparing midline electrodes with other electrodes.

Number of electrodes

Although measuring EEG spectral power from multiple channels is generally accepted, a few recent studies have captured spectral power from a single channel, either because they focused on activity at one electrode position or because of practical constraints (Suzuki et al., 2023). Therefore, we performed a moderation analysis that grouped experiments by whether they used single-electrode or multi-electrode data.

Number of tasks

Studies of multitasking conditions are very common. Participants in these studies must deal with multiple tasks at the same time, which leads to competition for, and allocation of, cognitive resources. In this case, changes in cognitive load may be more dramatic and easier to observe with EEG. Therefore, we analyzed the number of tasks as a moderating variable. We considered a study to involve multitasking when distractor tasks or real-time subtasks competed with the main task for learners’ cognitive resources.

Interactivity

Research has shown that increased learner interaction with the learning system can be effective in managing cognitive load, depending on the nature of the task and the relevance of the interactive elements to the subject matter (Darejeh et al., 2022; Paas et al., 2004). In task conditions with interaction, learners must devote cognitive resources to interacting with the task system in addition to learning, so their cognitive load tends to surge (Yang et al., 2021). Conversely, a lack of feedback may increase cognitive load due to self-doubt, or it may lead to cognitive laziness, which reduces cognitive load because the learner lacks cues for the next task (Clark, 1994). We therefore conducted a moderation analysis based on whether the task system used by participants was interactive or non-interactive.

Learning resource

In multimedia learning environments, the type of learning material has a significant effect on learners’ cognitive load. Videos and pictures are two common types of learning material that differ in how they convey information and stimulate learners’ perceptual processes. The dynamic information and diverse visual stimuli in video materials have been found to require learners to devote more cognitive resources at a given time than static pictures do, which increases measured mental effort (Liu et al., 2018). Because of the complex cognitive processes involved, cognitive load is also likely to change significantly under different workload conditions. Therefore, we conducted a moderation analysis that grouped learning resources into pictures and videos.

Gender

We also examined the gender of the individuals who participated in the study. de Moura et al. (2017) investigated using psychophysiological data and found that there were differences in cognitive workload between the negotiation styles of males and females. Güntekin et al. (2007) found significantly higher amplitude of maximal peak-to-peak δ response in females than in males in EEG measurements, and differences in β and γ oscillatory responses. We therefore examined the moderating effect by including gender as a variable.

Age

Another potential moderator is age. Lemke et al. (2016) found that as adults age, the availability of mental resources and the ability to utilize them effectively changes, and that older adults may increase listening effort when performing the same listening task, and thus changes in EEG activity in response to cognitive load may be more pronounced. Cellier et al. (2021) found that early childhood (3–7 years) was characterized by a predominance of θ oscillations at the posterior electrodes, whereas the peak frequency of the dominant oscillations in the α range increased between the ages of 7 and 24 years. Therefore, we also included age as a moderator variable to examine its effect on cognitive load measures.

Sample size

We also analyzed the effect of sample size on cognitive load measures. The purpose of meta-analysis is to synthesize the results of multiple studies to obtain more comprehensive and reliable conclusions. However, small-sample effects may distort the results of a meta-analysis, as small-sample studies tend to have lower statistical power and a higher risk of bias. Thus, analyzing sample size as a moderating variable helps us to rule out the possibility of a “small sample” effect.

Therefore, our study aims to address the following research questions:

  • RQ1: Which EEG frequencies and their subband frequencies are most suitable for quantifying cognitive load in multimedia learning? For a given frequency, is it more beneficial to measure cognitive load by acquiring signals from electrodes located in multiple brain regions?

  • RQ2: How do electrode position, number of electrodes, number of tasks, interactivity, learning resources, gender, age, and sample size affect EEG measures of learners’ cognitive load in multimedia learning?

Methods

Inclusion and exclusion criteria

To ensure the rigor and comprehensiveness of our review, we only considered studies published in peer-reviewed journals. Included studies had to meet the following criteria: (1) The study had to be an empirical study investigating the cognitive load of learners in a multimedia learning environment. Purely theoretical articles and reviews were excluded. (2) The dependent variable had to be the spectral power of at least one frequency band measured by EEG (i.e., δ, θ, α, β, or γ). (3) The independent variable had to be the level of cognitive load, and the experiment had to include at least two contrasting conditions, i.e., a high and a low cognitive load condition. (4) The study had to provide adequate statistical data to calculate effect sizes, such as means, standard deviations, sample sizes, or values from ANOVA and t-tests, as well as p-values. (5) The participants had to be students. (6) The study had to be published in English. (7) Studies involving other forms of EEG analysis (e.g., ERPs, brain network connectivity) were not considered.

Search strategy

Our literature search encompassed authoritative English language databases up until August 2023, including but not limited to: Medline, ERIC, Springer, IEEE Xplore, Web of Science, and Elsevier Science Direct. The search employed keyword combinations such as “cognitive load,” “cognitive workload,” “working memory,” “mental workload” in conjunction with “EEG,” “electroencephalogram,” “electroencephalography,” “brain waves,” and “spectral power.” The articles’ focus was limited to domains related to education and psychology, and the retrieval time range was set from January 1997 to July 2023. To further supplement the literature, we checked the references of identified articles and searched for relevant literature using Google Scholar and ResearchGate. Finally, we asked researchers in the field for unpublished manuscripts. We also contacted authors who did not provide specific data in their articles to obtain raw data but received no response.

The screening process was conducted in four phases: (1) Identification: duplicates were removed from the complete search record. (2) Screening: studies were screened by title and abstract to exclude those not meeting the inclusion criteria. (3) Eligibility: when abstracts lacked essential information (e.g., details about EEG frequency domain analysis), the full text was examined to determine eligibility according to the screening criteria. (4) Inclusion: studies that met the inclusion criteria were included in the analysis. Eventually, 26 records met the inclusion criteria for the meta-analysis, as illustrated in Fig. 1.

Fig. 1 Flow Diagram of the Study Selection

Coding procedures

Coding was done by the second author and double-checked by the first, third, and fourth authors. Uncertain items were decided by discussion. The following literature characteristics were extracted from each study: author, publication year, sample size, mean age of participants, standard deviation, gender, frequency band used, electrode position, number of electrodes, number of tasks, and type of learning resource. Characteristics and effect sizes of studies are summarized in Table 1.

Table 1 Characteristics of studies included in the Meta-Analysis

Statistical analysis

Calculation of effect sizes and weights

For the data analysis in this study, we employed Comprehensive Meta-Analysis software, specifically designed for conducting meta-analysis, along with its Meta-regression 2 program (Brüggemann & Rajguru, 2022). Following the recommendation of Rosenthal and DiMatteo (2001), the effect size chosen was the product-moment correlation coefficient, denoted as r. This metric is preferred because it provides a robust representation of the linear relationship between variables and is easier to interpret than effect sizes such as Cohen’s d or Hedges’ g. Therefore, this study adopted the correlation coefficient r as the effect size for quantifying the relationship between each EEG measure and cognitive load. In this context, a positive effect size indicates an increase in mean EEG power with a rise in cognitive load, while a negative effect size signifies the opposite. According to Cohen’s (2016) criteria, an absolute effect size r between 0.10 and 0.30 is considered small, between 0.30 and 0.50 medium, and above 0.50 large. The calculation of r is described in the supplementary material.
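Because the exact calculation of r is given only in the supplementary material, the sketch below shows commonly used textbook conversions from t and Cohen’s d to r, the Fisher z transform usually applied before pooling correlations, and the magnitude labels quoted above. These are assumptions for illustration, not necessarily the formulas the authors used; in particular, the conversions assume independent groups, whereas within-subject designs need additional information. The function names and the "negligible" label for |r| below 0.10 are hypothetical.

```python
import math


def r_from_t(t, df):
    """Correlation equivalent of a t statistic with `df` degrees of freedom."""
    return t / math.sqrt(t * t + df)


def r_from_d(d, n1, n2):
    """Correlation equivalent of Cohen's d for two independent groups."""
    a = (n1 + n2) ** 2 / (n1 * n2)  # correction factor; equals 4 when n1 == n2
    return d / math.sqrt(d * d + a)


def fisher_z(r):
    """Variance-stabilising transform commonly applied before pooling r."""
    return math.atanh(r)


def fisher_z_variance(n):
    """Approximate sampling variance of Fisher's z for sample size n."""
    return 1.0 / (n - 3)


def magnitude(r):
    """Label |r| using the thresholds quoted in the text."""
    size = abs(r)
    if size > 0.50:
        return "large"
    if size >= 0.30:
        return "medium"
    if size >= 0.10:
        return "small"
    return "negligible"  # label assumed; the text does not name this range
```

For example, `r_from_t(2.0, 12)` gives 0.5, and `magnitude(-0.54)` returns "large" because the label depends only on the absolute value.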

Heterogeneity analysis

Heterogeneity was assessed using the I² statistic, which quantifies the percentage of variation in effect sizes that reflects true heterogeneity rather than sampling error. According to Higgins et al. (2003), an I² of 25% represents low heterogeneity, 50% moderate heterogeneity, and 75% or more high heterogeneity. A chi-square test (Cochran’s Q statistic) was also employed for statistical evaluation. Because some effect sizes (r) in this study are negative (e.g., α power decreases with increasing cognitive load), all effect sizes were transformed to absolute values when conducting overall heterogeneity testing.
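The relationship between Cochran’s Q and I² can be sketched as follows. This is a minimal illustration using the standard definitions, with hypothetical effect sizes and variances; the function names and the "negligible" label below 25% are assumptions, and the absolute-value step mirrors the transformation described above.

```python
def cochran_q(effects, variances):
    """Cochran's Q and its degrees of freedom under inverse-variance weights."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    return q, len(effects) - 1


def i_squared(q, df):
    """I^2 in percent: the share of Q that exceeds its expected value df."""
    if q <= df:
        return 0.0
    return 100.0 * (q - df) / q


def heterogeneity_label(i2):
    """Benchmarks attributed to Higgins et al. (2003) in the text."""
    if i2 >= 75:
        return "high"
    if i2 >= 50:
        return "moderate"
    if i2 >= 25:
        return "low"
    return "negligible"  # label assumed for values under 25%


# Hypothetical effect sizes; the negative value is made absolute first,
# as the text describes for the overall heterogeneity test.
raw_effects = [0.40, -0.55, 0.35, 0.90]
variances = [0.04, 0.04, 0.04, 0.04]
effects = [abs(e) for e in raw_effects]
q, df = cochran_q(effects, variances)
i2 = i_squared(q, df)
label = heterogeneity_label(i2)
```

With these made-up values Q is 4.625 on 3 degrees of freedom, which gives an I² of about 35% and therefore the "low" label.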

Considering the expected diversity among the included studies and the relatively small number of studies available for each frequency, relying on random-effects estimates alone for each index was deemed inadvisable because of the potential risk of overestimating the effect size (Cooper et al., 2019). Therefore, when I² > 50% during the assessment of an index, a sensitivity analysis was conducted. This involved using the leave-one-out method to exclude studies that contributed excessive heterogeneity before determining the final results. However, when examining the other moderating variables related to experimental treatments, we re-included the previously excluded studies, because these analyses are intended to guide overall experimental design and data collection. Moderator variable analysis assesses whether a variable moderates the relationship between two other variables.
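The leave-one-out procedure described above can be sketched as follows: each study is dropped in turn, and the pooled effect and I² are recomputed on the remainder. The study labels and values are hypothetical, and the fixed inverse-variance pooling used here is a simplifying assumption rather than the exact computation performed by the software the authors used.

```python
def pooled_and_i2(effects, variances):
    """Inverse-variance pooled effect and I^2 (percent) for a set of studies."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    i2 = 0.0 if q <= df else 100.0 * (q - df) / q
    return pooled, i2


def leave_one_out(labels, effects, variances):
    """Recompute the pooled effect and I^2 with each study omitted in turn."""
    rows = []
    for i, label in enumerate(labels):
        rest_effects = effects[:i] + effects[i + 1:]
        rest_variances = variances[:i] + variances[i + 1:]
        pooled, i2 = pooled_and_i2(rest_effects, rest_variances)
        rows.append((label, pooled, i2))
    return rows


# Hypothetical data: study "E" is deliberately far from the others.
labels = ["A", "B", "C", "D", "E"]
effects = [0.40, 0.45, 0.38, 0.42, 1.20]
variances = [0.05, 0.05, 0.05, 0.05, 0.05]

full_pooled, full_i2 = pooled_and_i2(effects, variances)
rows = leave_one_out(labels, effects, variances)
# The omission that lowers I^2 the most flags the likeliest outlier.
most_influential = min(rows, key=lambda row: row[2])
```

In this toy data set I² exceeds 50% with all five studies and falls to 0% once study "E" is omitted, which is the pattern that would mark it as an outlier.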

Publication bias analysis

Publication bias refers to the distortion of meta-analytic results that arises because some findings are more likely to be published, while others may be ignored or reported selectively (Vevea & Woods, 2005). First, we used Egger’s regression test (Egger et al., 1997). Second, we used the Classic Fail-safe N, which indicates how many additional studies with null results would be needed before the pooled effect loses statistical significance. When this value is greater than 5k + 10 (where k is the number of effect sizes), there is no significant publication bias (Viechtbauer, 2007). Finally, the trim-and-fill method estimates the effect sizes missing from an asymmetric funnel plot, restores the symmetry of the plot, and calculates a corrected effect size.
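The fail-safe N criterion above can be illustrated with a short sketch. It uses Rosenthal’s classic formula, in which the squared sum of the studies’ standard normal deviates is divided by the squared critical value and k is subtracted. The default one-tailed critical value of 1.645, the function names, and the example z values are assumptions of this sketch and may differ from the settings of the software used in the study.

```python
def fail_safe_n(z_values, z_crit=1.645):
    """Rosenthal's fail-safe N: null studies needed to lose significance."""
    k = len(z_values)
    return (sum(z_values) ** 2) / (z_crit ** 2) - k


def tolerance_threshold(k):
    """The 5k + 10 benchmark described in the text."""
    return 5 * k + 10


def is_robust(n_fs, k):
    """True when the fail-safe N exceeds the 5k + 10 benchmark."""
    return n_fs > tolerance_threshold(k)


# Hypothetical example: ten studies, each with a standard normal deviate of 2.5.
z_values = [2.5] * 10
n_fs = fail_safe_n(z_values)
robust = is_robust(n_fs, len(z_values))
```

The benchmark grows linearly with the number of effect sizes, so a meta-analysis with more effect sizes needs a proportionally larger fail-safe N before its result is treated as robust.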

Results

Overall heterogeneity analysis

A total of 653 participants were involved in the studies, with a mean age of 24.09 ± 3.69, and 61% of the participants were female. A total of 58 effect sizes were calculated, including 19 from the θ band, 20 from the α band, and 10 from the β band, as well as several combined indicators. Heterogeneity analysis was performed, and the results showed an I² of 47.02% with a Q-value of 107.58. This suggests heterogeneity in the overall analysis, which is expected, and therefore a random-effects model was used in this study. Overall, EEG spectral power was generally valid in measuring learners’ cognitive load in a multimedia learning environment, r = 0.469, CI [0.401, 0.531], p < 0.001.
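As a consistency check, the reported heterogeneity figures agree with the standard definition of I²: with 58 effect sizes the Q test has 57 degrees of freedom, and (107.58 − 57) / 107.58 is approximately 47.02%. The sketch below also shows one common way to obtain a random-effects pooled correlation, using Fisher’s z and the DerSimonian-Laird estimate of between-study variance. The estimator, the 1.96 multiplier for a 95% interval, and the input values are assumptions for illustration; the paper does not state which estimator its software applied.

```python
import math


def random_effects_pool(rs, ns):
    """Pool correlations on Fisher's z scale with a DerSimonian-Laird tau^2."""
    zs = [math.atanh(r) for r in rs]
    vs = [1.0 / (n - 3) for n in ns]
    w = [1.0 / v for v in vs]
    sw = sum(w)
    fixed = sum(wi * zi for wi, zi in zip(w, zs)) / sw
    q = sum(wi * (zi - fixed) ** 2 for wi, zi in zip(w, zs))
    df = len(zs) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to each study's sampling variance.
    w_re = [1.0 / (v + tau2) for v in vs]
    z_re = sum(wi * zi for wi, zi in zip(w_re, zs)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    low, high = z_re - 1.96 * se, z_re + 1.96 * se
    # Back-transform the pooled z and its interval to the r scale.
    return math.tanh(z_re), math.tanh(low), math.tanh(high), tau2


# I^2 recomputed from the reported Q, assuming df = k - 1 = 57.
reported_i2 = round(100.0 * (107.58 - 57) / 107.58, 2)

# Hypothetical correlations and sample sizes, for illustration only.
pooled_r, ci_low, ci_high, tau2 = random_effects_pool(
    [0.30, 0.55, 0.45, 0.62], [24, 30, 18, 40])
```

Pooling on the z scale and back-transforming keeps the confidence limits inside the valid range for a correlation, which is why intervals for r are typically not symmetric around the point estimate.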

Publication bias test

Egger’s linear regression analysis produced a significant result (B = 1.907, CI [1.58–3.67], t = 5.071, p = 0.001). However, the Classic Fail-safe N indicates that an additional 4196 effect sizes would be required to overturn the existing meta-analytic results, which is considerably higher than the threshold of 300 (5k + 10 with k = 58). In addition, the overall effect size adjusted by the trim-and-fill procedure remained significantly above 0 (r = 0.354, CI [0.306–0.401], p < 0.05). Overall, we do not consider publication bias a serious problem in this meta-analysis.

Subgroup analysis of frequencies

Theta

Studies measuring the θ band showed a significant effect size under the random-effects model (r = 0.47, CI [0.33–0.59], p < 0.05, k = 19), suggesting that θ power in learners’ EEG is higher under high cognitive load conditions than under low cognitive load conditions. However, heterogeneity was substantial (I² = 61.2% > 50%), as presented in Fig. 2. Sensitivity analysis using the leave-one-out method identified one study as an outlier (Gevins, 1997). After excluding this study (k = 18), the overall effect size was moderate and stable, with no significant heterogeneity (r = 0.40, CI [0.31–0.48], p < 0.01, I² = 5.0%).

Fig. 2 Forest Plot of the Effect Sizes of θ

In the moderator analysis of electrode region for θ, the effect size for multiregional electrodes was greater (r = 0.46, CI [0.32, 0.58], p < 0.001, k = 6) than for frontal region electrodes (r = 0.36, CI [0.25, 0.46], p < 0.001, k = 12). However, the difference was not significant (p = 0.364 > 0.05).

Alpha

For the α band, the studies showed a significant effect size under the random-effects model (r = -0.54, CI [-0.64, -0.41], p < 0.001, k = 20), suggesting that α power in learners’ EEG is lower under high cognitive load conditions than under low cognitive load conditions. However, heterogeneity was substantial (I² = 52.9% > 50%), as presented in Fig. 3. Sensitivity analysis using the leave-one-out method identified two studies as outliers (Dasari et al., 2017; Gevins, 1997). After excluding the two studies (k = 18), the effect size and heterogeneity were reduced to acceptable levels (r = -0.50, CI [-0.58, -0.41], p > 0.05, I² = 7.2%). Subgroup analysis for the α band did not reveal any significant differences between the broadband subgroup, high α (10–12 Hz) subgroup, and low α (8–10 Hz) subgroup.

Fig. 3 Forest Plot of the Effect Sizes of α

In the moderator analysis of electrode region for α, the effect size for multiregional electrodes (containing parietal and other regions) was greater in magnitude (r = -0.55, CI [-0.68, -0.38], p < 0.001, k = 8) than for parietal region electrodes (r = -0.48, CI [-0.59, -0.35], p < 0.001, k = 9). However, the difference was not significant (p = 0.47 > 0.05). One frontal lobe study of α was excluded from this comparison (Pi, Zhang, Liu, et al., 2023).

Beta

For the β band, the studies were few (k = 10) and showed very low heterogeneity. The effect size for the β band (r = 0.335, CI [0.22, 0.44], p < 0.001, I² = 0%) suggested that β power in learners’ EEG is higher under high cognitive load conditions than under low cognitive load conditions. Subgroup analyses were not conducted for β1 and β2 because of the paucity of studies on them (k < 3); broadband β showed a medium effect size (r = 0.36, CI [0.22, 0.49], p < 0.01).

Frontal theta / parietal alpha

For the combined analysis of studies using θ in the frontal region and α in the parietal region, with lower heterogeneity in the data, the effect size was r = 0.61 (CI [0.44, 0.74], p < 0.001, Q = 6.8, df = 3, p = 0.079, I² = 55.9%). One study was excluded during sensitivity analysis (Negi & Mitra, 2022), and after exclusion, the analysis resulted in an effect size of r = 0.78 (CI [0.59, 0.89], p < 0.001, Q = 4.8, df = 3, p = 0.502, I² = 0%). This suggests that θ/α power in the high cognitive load condition was significantly greater than in the low cognitive load condition.

Fig. 4 summarizes the validated indicators for θ, α, β, and their subbands, together with the effect sizes for θ / α. Indicators that have not been studied in sufficient numbers (k < 3) are not discussed. Valid data could not be derived from subgroup calculations of effect sizes at electrode sites for β band studies, primarily because of the limited number of studies available.

Fig. 4 Forest Plot of Valid Measurement Indexes

Analysis of other moderating variables

In this section, the analysis of moderating variables provides insights into how specific experimental factors influence the assessment of cognitive load in multimedia learning. All studies, including those excluded from frequency subgroup analyses, were considered in these analyses, which were performed using a random-effects model.

Moderating variable analysis revealed that the midline electrode group (k = 18) had a larger effect size (r = 0.52, CI [0.43, 0.61], p < 0.001) than the other electrode group (k = 40; r = 0.41, CI [0.35, 0.46], p < 0.001). The difference between the two groups was of borderline significance (Q = 3.89, p = 0.049), as presented in Table 2. This suggests that collecting data from midline electrodes may be more effective for measuring cognitive load.

Table 2 Moderator analyses for variables coded

As shown in Table 2, moderating variable analysis comparing single and multiple electrodes showed that the single-electrode group (k = 18) had a smaller effect size (r = 0.39, CI [0.30, 0.48]) than the multiple-electrode group (k = 40; r = 0.51, CI [0.42, 0.59]). The between-group effect was marginal and did not reach the conventional significance level (Q = 3.58, p = 0.059), indicating that collecting data from multiple electrodes might be more effective for measuring cognitive load.

Moderating variable analysis of single-task and multi-task conditions revealed that the multi-task group (k = 16) had a significantly larger effect size (r = 0.69, CI [0.48, 0.89], p < 0.001) than the single-task group (k = 42; r = 0.41, CI [0.35, 0.46], p < 0.001). The between-group effect was significant (Q = 6.45, p = 0.011 < 0.05), as indicated in Table 2, suggesting that EEG measures of cognitive load are more sensitive under multitasking conditions in multimedia learning.

Moderating variable analysis of the interactive and non-interactive groups found that the interactive group (k = 17) had a significantly larger effect size (r = 0.61, CI [0.36, 0.78]) than the non-interactive group (k = 41; r = 0.40, CI [0.35, 0.46]). The between-group effect was significant (Q = 7.7, p = 0.006 < 0.01), as indicated in Table 2, suggesting that EEG measures of cognitive load are more sensitive when the task system is interactive.

A significant between-group effect was found in the comparison of pictures versus videos, which excluded the one study in which the learning resource was audio (Hsu et al., 2017). The picture group (k = 35) had a significantly larger effect size (r = 0.52, CI [0.43, 0.60]) than the video group (k = 22; r = 0.36, CI [0.28, 0.45]). The between-group effect was significant (Q = 6.6, p = 0.01), as presented in Table 2, indicating that the form of learning resource affects measures of cognitive load in multimedia learning.

No significant differences were found in the results of analyzing gender (p = 0.462 > 0.05) based on male (r = 0.44, CI [0.32,0.55], k = 14), female (r = 0.40, CI [0.25,0.52], k = 6), and mixed groups (r = 0.49, CI [0.40,0.58], k = 36). Two studies that did not specify gender were removed from the analysis.
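The between-group Q tests reported above compare subgroup estimates with a chi-square statistic on G − 1 degrees of freedom, where G is the number of subgroups. The sketch below shows the general form of that test and recovers the reported p-values for the two-group comparisons from their Q values, for example Q = 3.89 gives p ≈ 0.049 and Q = 6.45 gives p ≈ 0.011. The function names are hypothetical, the inputs are assumed to be subgroup estimates and standard errors on a common scale such as Fisher’s z, and only the closed-form chi-square tail probabilities for 1 and 2 degrees of freedom are implemented.

```python
import math


def q_between(estimates, standard_errors):
    """Between-subgroup Q statistic and its degrees of freedom."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    grand = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    q = sum(w * (e - grand) ** 2 for w, e in zip(weights, estimates))
    return q, len(estimates) - 1


def chi2_p_value(q, df):
    """Upper-tail chi-square probability (closed forms for df = 1 and 2)."""
    if df == 1:
        return math.erfc(math.sqrt(q / 2.0))
    if df == 2:
        return math.exp(-q / 2.0)
    raise ValueError("this sketch only supports df = 1 or df = 2")


# p-values recomputed from Q values reported for two-group comparisons.
p_midline = chi2_p_value(3.89, 1)
p_multitask = chi2_p_value(6.45, 1)
```

A three-group comparison such as the gender analysis would use 2 degrees of freedom, for which the same function applies.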

Meta-regression analysis using mean age as a covariate, which excluded two effect sizes from one study that lacked explicit age data (Lee, 2014), found that the effect of mean age on EEG measures of cognitive load was either not significant or could not be explained by a simple linear model (B = 0.057, p < 0.01, R² = 0.23), as presented in Table 3.

Table 3 Moderation analysis of continuously coded variables by Meta-Regression

Meta-regression analysis with sample size as a covariate revealed that a smaller sample size was associated with a larger effect size (B = -0.01, p < 0.01, R² = 0.31). This indicates that effect sizes may be negatively correlated with sample size, but the correlation is relatively weak, as presented in Table 3.
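A meta-regression with one continuous covariate amounts to a weighted regression of effect sizes on that covariate. The sketch below is a simplified, assumed version that uses inverse-variance weights and ignores between-study variance, so it is not the procedure of the Meta-regression 2 program; the function name and the data are hypothetical.

```python
def weighted_meta_regression(x, y, variances):
    """Weighted least-squares intercept and slope with weights 1 / variance."""
    w = [1.0 / v for v in variances]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar)
              for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical data in which the effect size falls by 0.01 per participant.
sample_sizes = [10, 20, 30, 40]
effect_sizes = [0.8 - 0.01 * n for n in sample_sizes]
variances = [1.0 / (n - 3) for n in sample_sizes]
intercept, slope = weighted_meta_regression(
    sample_sizes, effect_sizes, variances)
```

A negative slope of this kind is the pattern a small-study effect would produce, since smaller samples would be paired with larger effect sizes.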

Discussion

Frequencies

In this study, we quantitatively analyzed articles that used EEG spectral power to assess learners’ cognitive load in multimedia learning. We first conducted a subgroup analysis, with particular attention to the three most frequently used EEG bands: α, β, and θ. We also explored both broadband and subband features and considered regional characteristics. Furthermore, the “frontal θ/parietal α” index merits use because of its substantial effect size. Our investigation also identified several categorical and continuous moderators that have a significant impact on effect sizes. Indices supported by too few studies are excluded from our discussion.

Theta

θ is associated with working memory and executive function processes and typically correlates positively with the amount of information to be remembered (Jensen & Tesche, 2002). In the current study, θ responded consistently to increasing cognitive load, with a moderate effect size. The consistent findings across the included studies indicate that θ is the most dependable indicator of cognitive load in EEG frequency domain analysis for learners.

Numerous studies have emphasized the significance of θ waves, particularly those measured by electrodes in regions near the frontal lobe, in responding to variations in cognitive load (Nigbur et al., 2011). However, our moderator analysis did not find a significant difference between electrodes in multiple regions and those near the frontal lobe. Instead, the results showed a slight advantage for multiple regions over frontal lobe regions. This suggests that electrode data from multiple regions may provide a more comprehensive characterization of cognitive load. For instance, frontal and central θ power increases with heightened cognitive demands (Castro-Meneses et al., 2020; Wang et al., 2020), while parietal and temporal lobe θ power increases during active maintenance of working memory representations (Sauseng et al., 2010). Regions near the frontal lobe alone may not fully capture the intricacies of cognitive load. Therefore, when feasible under experimental conditions, electrodes should be placed across multiple regions to monitor activity in various brain regions, and interactions between electrodes from different regions can be examined through spatial filtering and other techniques to investigate changes in cognitive load (Babiloni et al., 2002; Gevins & Smith, 2000).

Alpha

α power tends to decrease with increasing cognitive load, although not universally. Many studies have found that a decrease in α power is related to an increase in cognitive load (Pergher et al., 2019; Grissmann et al., 2017; Rietschel et al., 2012). Nevertheless, the findings were inconsistent, showing substantial heterogeneity before sensitivity analysis. An explanation proposed by van Ede (2018) suggests that α power increases when encoding verbal material, even if the stimulus is visual, and decreases when encoding visual material. Consequently, this inconsistency cannot be attributed solely to whether the stimuli are visual or verbal. Hence, while α has the highest calculated effect size, its application in general multimedia learning environments may be unstable because of as-yet-undiscovered factors. As a conservative approach, it is advisable to consider α only after θ.
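The heterogeneity and sensitivity analysis mentioned above can be illustrated with a minimal sketch. It computes an inverse-variance pooled effect, Cochran's Q, and I², and then repeats the calculation with each study left out in turn. The effect sizes and variances are hypothetical values chosen for illustration, not data from the included studies, and the function names are our own; the exact heterogeneity and sensitivity procedures used in this meta-analysis may differ.

```python
from math import fsum


def pooled_and_i2(effects, variances):
    """Return (pooled effect, Cochran's Q, I^2 in percent) using inverse-variance weights."""
    weights = [1.0 / v for v in variances]
    pooled = fsum(w * e for w, e in zip(weights, effects)) / fsum(weights)
    q = fsum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    # I^2 is the share of total variation attributable to heterogeneity, floored at 0.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, q, i2


def leave_one_out(effects, variances):
    """Recompute the pooled statistics with each study removed in turn."""
    results = []
    for i in range(len(effects)):
        rest_e = effects[:i] + effects[i + 1:]
        rest_v = variances[:i] + variances[i + 1:]
        results.append(pooled_and_i2(rest_e, rest_v))
    return results


# Hypothetical alpha-band effect sizes: four negative, one positive outlier.
effects = [-0.6, -0.5, -0.7, 0.9, -0.4]
variances = [0.04, 0.05, 0.04, 0.05, 0.06]

pooled_all, q_all, i2_all = pooled_and_i2(effects, variances)
loo = leave_one_out(effects, variances)

print(f"All studies: pooled={pooled_all:.3f}, Q={q_all:.2f}, I2={i2_all:.1f}%")
for i, (p, q, i2) in enumerate(loo):
    print(f"Without study {i}: pooled={p:.3f}, Q={q:.2f}, I2={i2:.1f}%")
```

In this toy data set, removing the single study with an opposite-signed effect brings I² down sharply, which is the pattern a leave-one-out sensitivity analysis is designed to reveal.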

Subsequent analysis of α subbands indicated that broadband α did not significantly differ from high α or low α. Nevertheless, the analysis revealed that the subband frequencies of α offered a more robust explanation of cognitive load compared to broadband α. Moreover, Klimesch (1999) and Lobier et al. (2018) also argued for dividing the α band into high α and low α subbands. Therefore, the appropriate α frequency for measuring cognitive load may vary across different learning tasks. This study recommends following the precedent set by previous studies by measuring and analyzing both α frequencies in subband analyses.

Similar to the findings regarding θ waves, the moderator analysis comparing multiregional and parietal electrodes within the α band revealed a slightly larger effect for multiple regions than for the parietal region alone. Consequently, it is also advisable to consider placing electrodes across multiple regions when measuring cognitive load in the α band.

Beta

The findings regarding the β frequency range suggest that β power increases with cognitive load, and this relationship has a medium-sized effect. Although β frequency has been associated with various cognitive processes in previous studies, it is underutilized in measuring cognitive load. Additionally, it is worth noting that the β1 and β2 subbands have different functional roles (Plechawska-Wójcik et al., 2019; Pereira & Wang, 2015). However, this study could not definitively determine their specific roles because of the insufficient sample size.

Frontal theta / parietal alpha

The results of our analyses suggest that frontal θ/parietal α has the highest effect size. The index is based on the assumption that an increase in mental load is associated with a decrease in α power and an increase in θ power. As mentioned above, the frontal θ wave reflects executive function load, while the parietal α wave reflects visuospatial attention load. Calculating their ratio therefore yields more comprehensive information on cognitive load, combining the two key aspects of executive function and visuospatial attention and enabling complementary analyses. Related research has demonstrated that the frontal θ/parietal α ratio serves as a reliable cognitive load index (Fernandez Rojas et al., 2020; Fritz et al., 2014; Qin & Bulbul, 2022). In addition, Chik (2013) linked cross-frequency synchronization of θ and α to changes in cognitive load and reported significant measurement results. Therefore, frontal θ/parietal α should be used more often to measure cognitive load in multimedia learning.
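As an illustration of how such a ratio index can be computed, the following sketch estimates band power from a simple periodogram and divides frontal θ power by parietal α power. Several elements are assumptions made for the example rather than details taken from the included studies: the band limits (4–8 Hz for θ, 8–13 Hz for α), the 128 Hz sampling rate, the synthetic sine-wave signals, and all function names. A naive DFT from the standard library is used only to keep the example self-contained; real analyses would normally rely on an established spectral estimator such as Welch's method and on artifact-cleaned data.

```python
import cmath
import math


def band_power(signal, fs, low, high):
    """Mean periodogram power of `signal` for frequencies in [low, high) Hz (naive DFT)."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]
    powers = []
    for k in range(1, n // 2):
        freq = k * fs / n
        if low <= freq < high:
            coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                        for t, x in enumerate(centered))
            powers.append(abs(coeff) ** 2 / (fs * n))
    if not powers:
        raise ValueError("no frequency bins fall inside the requested band")
    return sum(powers) / len(powers)


def theta_alpha_ratio(frontal, parietal, fs, theta=(4.0, 8.0), alpha=(8.0, 13.0)):
    """Frontal theta power divided by parietal alpha power (band limits are assumptions)."""
    return band_power(frontal, fs, *theta) / band_power(parietal, fs, *alpha)


def sine(freq, amplitude, fs, seconds):
    """Synthetic single-frequency signal standing in for one EEG channel."""
    n = int(fs * seconds)
    return [amplitude * math.sin(2 * math.pi * freq * t / fs) for t in range(n)]


fs = 128  # hypothetical sampling rate in Hz

# Hypothetical low-load condition: weak frontal theta, strong parietal alpha.
ratio_low = theta_alpha_ratio(sine(6, 1.0, fs, 2), sine(10, 2.0, fs, 2), fs)
# Hypothetical high-load condition: strong frontal theta, weak parietal alpha.
ratio_high = theta_alpha_ratio(sine(6, 2.0, fs, 2), sine(10, 1.0, fs, 2), fs)

print(f"theta/alpha ratio, low load:  {ratio_low:.4f}")
print(f"theta/alpha ratio, high load: {ratio_high:.4f}")
```

Because θ power rises and α power falls as load increases, the two changes push the ratio in the same direction, which is one plausible reason the combined index can be more sensitive than either band alone.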

Other moderating variables

The analysis in this study revealed that EEG data from midline electrode placements tend to yield larger effect sizes compared to data from other electrode placements. This finding aligns with previous research indicating that EEG activity in the midline position of the brain is distinct and often associated with focused attention and autonomic activity (Aftanas & Golocheikine, 2001). In practical studies, electrode placement can be a resource-intensive process. Given this, when there are limitations in electrode placement or when seeking a practical approach, prioritizing midline electrode placement, particularly at positions such as Fz, FCz, Cz, CPz, and Pz, is recommended (Eschmann et al., 2018). These midline positions are associated with important neural activities related to frontal θ and parietal α, making them valuable for measuring cognitive load.

The study’s analysis also indicated that many previous studies used data from a single electrode in their measurements, sometimes with the aim of investigating that specific electrode’s function (Lee, 2014; Plechawska-Wójcik et al., 2019). However, the results showed that experiments with multiple electrodes tended to yield higher effect sizes. This difference may be due to the additional processing steps involved in multi-electrode data acquisition. In multi-electrode setups, data from all acquired electrodes within the valid experimental time period are processed through spatial filtering or source analysis (Sazgar & Young, 2019). This comprehensive processing allows for a more thorough extraction of the EEG signal and enhances the ability to interpret changes in cognitive load. Consequently, data obtained from multiple electrodes provide a more robust explanation of variations in cognitive load.

The moderator analysis showed that multi-tasking, as opposed to single-tasking, resulted in significantly larger effect sizes. This suggests that multi-tasking imposes a higher demand on cognitive resources, leading to increased cognitive load. Changes in cognitive load within the primary task can be indirectly observed by monitoring performance in the additional tasks. When cognitive load in the primary task intensifies, performance in the additional tasks may be affected. This combined approach allows for increased sensitivity to variations in cognitive load and provides insights into the dynamics of cognitive load. In multimedia learning contexts, individuals often need to manage multiple tasks simultaneously, making multi-tasking a more realistic simulation of the cognitive load experienced in such environments. This aligns with the idea that studying cognitive load in multi-tasking situations can better mirror the real-world cognitive demands people face when engaging with multimedia content (Brunken et al., 2003; Pashler, 1993; Schumacher et al., 2001).

The moderator analysis indicated that studies using task systems with interactive feedback features had significantly larger effect sizes compared to those without interaction. This finding aligns with existing research that suggests increased learner interaction with the learning system can be an effective way to manage cognitive load, contingent on the task’s nature and the relevance of the interaction elements to the subject matter. In the included studies, the distinction between interaction and non-interaction was based on whether the task system provided learners with timely, real-time feedback on their responses and guided them in their next steps. The results suggest that when designing or selecting task systems for multimedia learning, providing learners with timely feedback is crucial to prevent “cognitive idleness” or attentional drift, which could otherwise compromise the accuracy of experimental results. Furthermore, presenting pre-problems before starting a task can enhance learners’ engagement, especially among highly motivated individuals, as it provides them with a cognitive framework to approach the task. This aligns with the idea that providing context or preparatory activities can help learners focus and better manage their cognitive load (Yang et al., 2021).

The effect size for dynamic video learning resources was found to be smaller than that for static picture-based learning resources. Static pictures typically adhere to well-established psychological research paradigms (e.g., N-back, oddball). In contrast, video often presents continuous and uncontrollable stimuli for learners’ cognition, leading to more noise during research. Consequently, the group working with static pictures is more likely to produce valid results. However, it is essential to acknowledge that multimedia learning necessitates video. To address this challenge, researchers should consider adopting established psychological paradigms or developing new ones specific to video learning to enhance the reliability of experimental results.

The moderation analysis on gender indicates that there is no significant difference between males and females in cognitive load as reflected in EEG. Despite previous studies showing waveform differences in certain frequencies between genders (Güntekin & Başar, 2007), the differences in average power are not substantial. Therefore, using average power as a measure of cognitive load is not affected by gender-related moderation effects.

Because the age span of the subjects included in the analysis was too small (all spread across the college academic spectrum) to be grouped, we attempted to construct a meta-regression model using age as a continuous variable, and found that age had little or no simple linear relationship with cognitive load as measured by the EEG. Brain waves of different frequencies may exhibit different patterns as a result of increasing or decreasing age, e.g., some studies have shown that the frequency and amplitude of slow waves (δ) may increase with age, while the frequency and amplitude of fast waves (α, β, etc.) may decrease (Cellier et al., 2021; Christov & Dushanova, 2016). Therefore, the relationship between age and cognitive load in EEG spectral power cannot be explained through simple linear analysis.
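To make the meta-regression idea concrete, the sketch below fits an inverse-variance weighted straight line relating a study-level covariate (here, mean participant age) to effect size. It is a simplified fixed-effect version written for illustration; the model actually fitted in this study may have been a mixed-effects meta-regression, and the ages, effect sizes, variances, and function name below are all hypothetical.

```python
def weighted_linear_fit(x, y, variances):
    """Inverse-variance weighted least squares fit of y = intercept + slope * x."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    x_bar = sum(w * xi for w, xi in zip(weights, x)) / total
    y_bar = sum(w * yi for w, yi in zip(weights, y)) / total
    sxx = sum(w * (xi - x_bar) ** 2 for w, xi in zip(weights, x))
    sxy = sum(w * (xi - x_bar) * (yi - y_bar) for w, xi, yi in zip(weights, x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical study-level data: mean age (years), effect size, sampling variance.
mean_ages = [19.5, 20.1, 21.0, 22.3, 23.4]
effect_sizes = [0.52, 0.61, 0.48, 0.57, 0.50]
sampling_variances = [0.05, 0.04, 0.06, 0.05, 0.04]

intercept, slope = weighted_linear_fit(mean_ages, effect_sizes, sampling_variances)
print(f"intercept={intercept:.3f}, slope per year of age={slope:.4f}")
```

With a covariate range this narrow, the slope is estimated from very little spread in age, so a near-zero or unstable slope says more about the restricted sample than about the true age relationship.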

In the meta-regression analysis, we found that sample size as a covariate appears to be negatively correlated with effect size, although the relationship is weak. This correlation may reflect the “small sample effect”: random variation is more likely to cause large fluctuations in small samples, making effect sizes appear larger, as in the experiments of Dasari et al. (2017) and Gevins et al. (1998), which recruited fewer than 10 subjects but achieved large effect sizes. Although the examination of outcome moderating effects found no significant impact, we still recommend that researchers recruit as many subjects as possible when conducting experiments in which EEG measures cognitive load.
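The reasoning behind the small sample effect can be shown with the standard large-sample approximation for the sampling variance of a standardized mean difference d between two independent groups, Var(d) = (n1 + n2)/(n1·n2) + d²/(2(n1 + n2)). The sketch below is illustrative only: the group sizes and the value of d are hypothetical, the function names are our own, and many EEG studies use within-subject designs, for which a different variance formula applies.

```python
import math


def smd_variance(d, n1, n2):
    """Approximate sampling variance of a standardized mean difference (independent groups)."""
    return (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))


def ci95(d, n1, n2):
    """Approximate 95% confidence interval for d using a normal approximation."""
    se = math.sqrt(smd_variance(d, n1, n2))
    return d - 1.96 * se, d + 1.96 * se


d = 0.5  # hypothetical true effect size
for n_per_group in (5, 20, 50):
    low, high = ci95(d, n_per_group, n_per_group)
    print(f"n={n_per_group:>2} per group: variance={smd_variance(d, n_per_group, n_per_group):.4f}, "
          f"95% CI=({low:.2f}, {high:.2f})")
```

The interval for the smallest groups is several times wider than for the largest, so an individual small study can report a very large effect by chance alone, and if mainly such results are published, small studies will show inflated average effects.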

Limitations and future directions

At the methodological level, several factors limit the results of this quantitative review.

Firstly, our meta-analysis did not exhaust all studies relevant to the topic. Not all studies reported their results with sufficient quantitative data. Some studies meeting the inclusion criteria were not included because their authors did not respond, and some non-English literature was also excluded. In addition, our publication bias test indicated a small risk of publication bias in this study, suggesting that our results should be interpreted cautiously.

Secondly, many researchers, although conducting studies measuring multiple EEG indicators, tend to report only statistically significant indicators. This leads to unreliable effect size estimates for indicators like the β sub-band due to lack of data support. Therefore, we hope that authors of relevant studies can supplement complete indicator results, regardless of their statistical significance.

Thirdly, in our analysis of the electrode position adjustment effect, we only distinguished between “midline positions” and “other positions”. This is because the electrode placements used in the studies included in our analysis typically span multiple brain regions, making it difficult to categorize the electrodes in each brain region individually. In the future, there is a need to include more studies and systematically compare the sensitivity to changes in cognitive load using differently positioned electrodes.

Finally, as mentioned earlier, the age range or educational stage included in our meta-analysis was too narrow. Directly treating age as a continuous moderator variable in the meta-regression analysis may introduce bias. Therefore, future meta-analyses should ideally include broader age ranges or educational stages to elucidate their effects on cognitive load measurement.

Given these considerations, it’s crucial to design cognitive load-inducing tests for future studies and utilize them to develop eye-movement metrics or even multimodal measures that effectively characterize learners’ cognitive load.

Conclusion

In this study, we conducted a meta-analysis to examine the sensitivity of learners’ EEG frequencies to cognitive load in multimedia learning environments, as well as other moderating variables that may affect the measurements. However, we cannot completely rule out the risk of publication bias, so the results should be treated with caution. In most cases, θ is most sensitive to increases in cognitive load, whereas α is most responsive to decreases in load. In contrast, β increases with increasing cognitive load. It is also recommended that cognitive load be measured using subband frequencies in the α range. Furthermore, for both θ and α, combining data from multiple brain regions may be more beneficial for measurement. The combination of metrics such as frontal θ and parietal α has great potential for measuring cognitive load. During data collection, multiple electrode channels are preferred for collecting signals. If experimental conditions limit the placement of electrodes, it is recommended that midline positions be prioritized. The use of multitasking and interactive task systems is beneficial. The use of static images as learning resources is also recommended.