Introduction

Modern technologies (e.g., computers, tablets, videos, online classes, apps, electronic books) have become increasingly common in instructional designs and curricula (Martin et al. 2018). The use of educational videos (see Guo et al. 2014), homework assessment, tutoring programs (see Cole and Todd 2003; Ma et al. 2014), and online discussion forums (see Mazzolini and Maddison 2003; Thomas 2002) highlights how technology is shaping our society today (Department of Education, Office of Educational Technology, and American Institutes for Research 2017). The affordances of such technologies have resulted in extensive use by both students and instructors (Comi et al. 2017). The variety of technologies available and the frequency with which they are used in education present learners with opportunities to learn in non-traditional learning environments and from alternative formats that promote multimedia learning.

Multimedia learning is learning that occurs from words and pictures (Mayer 2014a). Specifically, words can be presented in spoken (i.e., a narration of a text) or printed (i.e., text on a computer screen or a handout) formats, and pictures can be static (e.g., photos, graphs) or dynamic (e.g., videos, animations; Mayer 2017). Compelling evidence suggests that learning from both words and pictures is more beneficial than learning from words alone (Butcher 2014; Mayer 2014a) because information presented in each format may complement the other, allowing learners to form better connections and representations of the information (Mayer 2014a). However, meaningful learning does not occur simply because multimedia materials are used. The complexity of the learning material, the design of the learning environment, and one’s level of prior knowledge are some of the factors that influence learning from multimedia materials. To enhance learning from multimedia materials, one might consider using signaling to guide learners’ attention to critical components of the material or to highlight central ideas for the learner (Horvath 2014; van Gog 2014). To explain why signals or cues aid learning from multimedia materials, an overview of both the Cognitive Load Theory and the Cognitive Theory of Multimedia Learning is provided in the section below.

Theoretical framework

Both the Cognitive Load Theory (CLT; Sweller 1988) and the Cognitive Theory of Multimedia Learning (CTML; Mayer 2014b) provide an understanding of how people learn and how instructional designs influence learning. The CLT comprises three types of cognitive load: intrinsic, extraneous, and germane. The amount of cognitive load experienced by the learner is characterized by the extent to which individual elements interact with each other (Sweller 2010). Intrinsic cognitive load arises from the complexity of the learning material in relation to learners’ prior knowledge of the material. Extraneous cognitive load is imposed by factors external to the learning content, such as poor instructional design or a confusing learning environment interface. Germane cognitive load, unlike intrinsic and extraneous cognitive load, is independent of the design of the learning material and concerns how learners allocate their working memory resources to learn. Since intrinsic cognitive load is fixed given learners’ prior knowledge, the CLT is primarily concerned with instructional design techniques that reduce extraneous cognitive load (Sweller 2010).

The CTML, like the CLT, is also concerned with how instructional designs support meaningful learning. Specifically, the CTML is based on three assumptions about how cognitive processing occurs: (a) there are separate channels for processing visual and auditory information (Baddeley 1992; Paivio 1986); (b) there is a limited amount of information that can be processed in each channel at one time (Baddeley 1992); and (c) meaningful learning occurs when learners actively engage in selecting, organizing, and integrating incoming information with prior knowledge to form coherent mental representations (Mayer 2014b; Wittrock 1989). According to the CTML, three cognitive processing demands can arise during learning: extraneous, essential, and generative (Mayer 2014b). These processing demands are analogous to the three cognitive loads (i.e., extraneous, intrinsic, and germane) described in the CLT. Extraneous cognitive processing occurs when instructional procedures are less than optimal and the cognitive load required to process information exceeds learners’ cognitive capacity (Mayer 2014b). During multimedia learning, essential processing occurs during the selection of relevant incoming information and its representation in working memory. Generative processing, on the other hand, occurs when one attempts to make sense of the information using prior knowledge (i.e., the organization and integration of information). To ensure that meaningful learning occurs, learning strategies that support essential and generative processing demands should be identified and integrated into learning environments. In the context of this meta-analysis, we are particularly interested in how signals or cues support meaningful learning by directing learners’ attention to critical information, thereby reducing extraneous processing demands (Kriz and Hegarty 2007; Mayer and Fiorella 2014). Specifically, reducing extraneous load may free up cognitive resources that can be devoted to the selection, organization, and integration of new knowledge with prior knowledge, thus promoting essential and generative processing.

Signaling or cueing

Signaling or cueing is the use of text, pictures, gestures, or others’ eye movements to guide learners’ attention to essential learning material. Specific examples of signals and cues include text-based cues, such as sentences that precede the desired learning material (Hayes and Reinking 1991); picture-based cues, such as arrows or colors that point to or emphasize the focal point of the learning material (Lin and Atkinson 2011); and vocal cues, which are intonations in spoken words (Mautone and Mayer 2001, Exp. 3).

In terms of multimedia learning, applying signals to presentations has three benefits for learners. First, the use of signals may be particularly beneficial for learners who struggle to identify critical information on their own. Since the selection of key information precedes the other processes in learning (Mayer 2014b), it is crucial that signals or cues are used to guide learners’ attention to the relevant material, thus increasing essential processing and reducing extraneous processing of irrelevant material (Mayer and Fiorella 2014; van Gog 2014). Second, the use of signals supports the organization and integration of relevant information with learners’ prior knowledge. According to Mautone and Mayer (2001), signals can help guide learners to relevant information concerning the overall structure of the material. This process allows learners to organize information better by making the relations between concepts explicit, allowing them to draw better inferences and conclusions about a topic. For example, text cues that outline steps in a process highlight the relationships between the steps and ensure that learners pay closer attention to the information (see Harp and Mayer 1998). Third, the use of signals can decrease the cognitive load placed on learners. Evidence suggests that learning materials with signals and cues can ease the demands placed on working memory, resulting in lower perceived cognitive load (Moreno and Abercrombie 2010, Exp 2). In sum, the CLT and CTML provide theoretical support for the use of signals and cues in multimedia materials to improve learning.

Despite these benefits, the use of signaling has three limitations in practice (van Gog 2014). First, many variations of signals are available in the literature. As a result, selecting an appropriate signal for a specific topic and presentation context may be difficult for instructional designers. Second, the signaling literature is unclear as to (a) when signals are needed, (b) which cues are more useful in practice, and (c) what material (e.g., subject domain) can benefit from signals. Third, signaling may be detrimental for high prior knowledge students who already possess sufficient knowledge of the learning material. The inclusion of signals or cues in material for high prior knowledge learners may interrupt the process of learning or interfere with the learners’ mental representation of the information, hindering learning (Kalyuga et al. 2003). This limitation indicates that there may be differential advantages of signals for different groups of learners (i.e., middle school, high school, post-secondary). Given these limitations and the lack of guidance on the use of signals, an examination of the signaling literature is needed to identify the conditions under which signals are beneficial in practice and research.

One way to assess the effectiveness of signals and cues in learning materials is to evaluate students’ learning after their exposure to the materials. In the multimedia learning literature, transfer, retention, or a combination of the two are used as proxies for learning. Retention is a form of learning that focuses on retaining information presented during learning, while transfer focuses on the ability of the student not only to remember what has been learned but also to apply it to other contexts (Salomon and Perkins 1989). The current meta-analysis uses these as learning outcomes. Although the effects of signaling on learning have been examined, the findings appear inconsistent. On one end, some studies provide support that the inclusion of signals in presentations increases learning outcomes compared to non-signaled presentations (De Koning et al. 2007; Jamet et al. 2008; Tabbers et al. 2004). On the other end, studies show that presentations with signals are no more beneficial than non-signaled presentations for increasing learning outcomes (Harp and Mayer 1998; Lin and Atkinson 2011; Ozcelik et al. 2010; Rey 2010). Scholars have suggested that this difference in the effects of signaling occurs because studies used different signals, methods, and samples (van Gog 2014). Given the mixed findings in the existing literature, the present study addresses van Gog’s (2014) recommendation that a systematic review (e.g., meta-analysis) should be conducted to robustly examine the different conditions under which signaling is or is not effective. Indeed, exploring moderating factors or conditions under which signaling is effective (or not) could help improve the design and presentation of signaling for different people in different contexts.

Objectives of the meta-analysis

Signaling can be used to help capture learners’ attention and overcome distractions in learning environments. Although signals and cues exist in many forms (e.g., verbal, text, or picture formats; see Mayer 2014a, b), we focus only on the effects of visual signals and cues in multimedia learning, for two reasons. First, verbal cues are often presented alongside visual information. However, processing both verbal and visual information in a novel learning context may be confusing for learners who are unable to associate the verbal cue with the visual information, thus hindering learning (van Gog 2014). Second, since technology (e.g., computers, the internet) is commonly used among students and instructors for educational purposes (Comi et al. 2017), investigating the effects of visual signals can inform practitioners on how to incorporate these signals into their practice. In sum, verbal signals are not examined in this meta-analysis.

Although research on signaling has been grounded in the cognitive theory of multimedia learning, the research on how and when to use these signals in the learning environment is mixed. In other words, the effectiveness of signals to improve learning depends on several factors that until now have not been systematically and comprehensively examined through a meta-analysis. Hence, educators, instructional designers, and multimedia scholars lack a robust understanding of effective types of signals and guidance for using signals in multimedia learning environments. By addressing these needs, the present meta-analysis will help develop multimedia instructional materials with effective signals using evidence-based research.

Next, this meta-analysis seeks to resolve the inconclusive results from published research on signaling. These mixed findings for learning benefits with signals are not surprising because signaling research has been conducted with numerous subject matters, different outcome measures, diverse types of presentations (e.g., animated and static images), and various comparison treatments. For example, with regard to outcome measures, a previous meta-analysis by Richter et al. (2016) examined the effectiveness of multimedia integration signals on comprehension performance alone. That meta-analysis extracted 45 effect sizes from 27 studies involving 2464 participants. Although it found that presentations with signals were more beneficial for comprehension performance than non-signaled presentations, it focused only on multimedia integration signals, not the manipulation of signals as an independent variable. More specifically, the included studies did not examine signaling as an experimental condition (i.e., a main effect). Due to this limitation, the present meta-analysis expands upon the previous one by identifying experimental studies that directly compared materials with signals against materials without signals.

The present meta-analysis seeks to resolve the mixed findings in signaling research and identify potential moderating variables. The results will provide empirical guidance toward a theory of how signaling influences learning processes. Specifically, the present meta-analysis seeks to answer the following research questions:

  1. What are the effects of signaling in multimedia learning environments?

  2. Are signaling effects moderated by participant and study features?

  3. Are signaling effects moderated by presentation features (i.e., pacing, animation, images, headphones, setting, and topic) of the learning materials?

  4. Are signaling effects moderated by methodological features of the research (e.g., methodological quality, randomization)?

Methods and data sources

Study selection criteria and search strategies

To be included in this meta-analysis, studies needed to meet the following selection criteria: (a) examined the effect of signaling within a multimedia learning environment; (b) presented signals in visual forms, such as arrows, pointing, text or image parts highlighted with color, or cued image parts moved to the foreground; (c) included both a signaling and a non-signaling group; (d) measured learning outcomes (i.e., retention, transfer, etc.); (e) reported sufficient data for calculating an effect size; and (f) were publicly available online or in library archives and written in English or available in an English translation. Dissertations were included to minimize the potential for publication bias (Orwin 1983; Rosenthal 1979; Slavin 1995).

Studies were excluded when: (a) reported statistics were insufficient for calculating an effect size, (b) no control group was present, (c) the methods presented many confounds, hindering isolation of the impact signaling had on the outcome(s), or (d) the signaling condition was not operationally defined. The selection period for articles was not restricted because, despite advancements in technology, the purpose and function of signals have remained the same.

We used the following electronic databases: (a) PsycINFO, (b) Dissertations & Theses at Washington State University, (c) ERIC, (d) ProQuest Dissertations & Theses A&I, (e) PsycARTICLES, and (f) Social Services Abstracts to search for eligible studies. The keywords used were Multimedia AND Signaling OR Cueing. The search produced a total of 1059 articles. A total of 82 studies were excluded because they (a) did not meet the selection criteria which required studies to be published in English or have an English translation (i.e., 59 out of 82 excluded studies) and (b) did not focus on learning outcomes (i.e., 23 out of 82 excluded studies). A total of 977 studies were included for screening.

Document retrieval, secondary review, and data extraction and analysis

Figure 1 shows the flowchart of how studies were filtered throughout the process of searching and selecting studies for this meta-analysis. Two filtering phases were implemented to evaluate whether the studies found should be included or excluded. In the initial filtering phase, two researchers screened the abstracts of the 977 identified studies, split roughly in half (rater 1 screened 489 studies while rater 2 screened 488). Both researchers independently screened their abstracts. Studies identified as duplicates or that did not meet the selection criteria were excluded. A total of 120 studies passed the first filtering phase, and full-text copies of those 120 articles were obtained. The second filtering phase began with the two researchers reading the full-text copies of the 120 studies and applying the selection criteria. Additionally, we examined the reference sections for additional published material suitable for inclusion. At the end of this second phase, 29 articles met all selection criteria and were included in the meta-analysis.

Fig. 1 Flowchart for selecting studies

We developed a coding form for entering the information extracted from selected articles. Once studies were selected, the two researchers extracted data from the 29 articles using the coding form. The researchers coded variables in the following categories: (a) study information (e.g., authors and country); (b) sample information (e.g., age group, prior knowledge); (c) treatment and control conditions (e.g., use of animation or pictures, system pacing or learner pacing); (d) research design (e.g., random assignment, covariates, or controls); (e) dependent variables (e.g., scoring, development and delivery method of measures); and (f) results (e.g., means, standard deviations, sample size per condition, and significance of findings).

These coded variables were developed and included in the analysis for four reasons. First, the coded variables aligned with the CTML. Specifically, all learning environments were multimedia learning environments that utilized both text and images intended to stimulate learning (Mayer 2014a, b). Such an instructional design supports learners as they attempt to form a mental representation of the material. Following previous practice in meta-analyses related to multimedia learning (Adesope and Nesbit 2012), presentation features included delivery of material (e.g., pacing), use of images, use of animation, duration, content of presentation, location of administration (e.g., labs), use of signals, and treatment and study duration. Furthermore, some of the coded variables (i.e., pacing) have been found to impact learning outcomes (e.g., Kalyuga et al. 2004). Hence, the coded variables related to presentations and treatments aimed to capture critical elements of the studies’ presentations. Second, the coded variables captured participant characteristics that align with the CTML. Research suggests that multimedia instruction may have different effects on learning for different learners (e.g., with different levels of prior knowledge; Kalyuga 2005). Hence, the coded variables sought to capture participant characteristics (e.g., level of prior knowledge, grade level) from the included studies.

Third, variables related to methods (e.g., control, reporting effect size, and reliability) were included to rate the methodological quality of the included studies (Bernard et al. 2004). Signaling studies are experiments that must follow rigorous methodological practices. Conducting such investigations requires scholars to determine the focus of the study and the methodological design needed to achieve the intended objectives. Borrowing from previous meta-analytic methods in the multimedia learning literature (Adesope and Nesbit 2012), most of the coded variables (i.e., “reported effect size,” “reported reliability,” “reported pretest,” “randomization,” and “controlling prior knowledge”) were dummy coded as ‘Yes’ for explicitly reported and ‘No’ for not explicitly reported. Such information was critical given that studies made claims based on their findings, but these claims varied depending on the degree of methodological quality. Documenting these variables provided a degree of control, allowing us to judge the methodological quality of the studies and investigate its impact on the effect sizes. Hence, these comparisons might help explain the variability among the studies included in this meta-analysis (Bernard et al. 2004). Specifically, coding whether primary studies explicitly reported effect sizes allowed us to determine whether our calculated effect sizes matched the author-reported effect sizes. Moreover, following a previous meta-analysis of the multimedia literature (Adesope and Nesbit 2012), we coded document type (i.e., dissertation or journal) under methodological features.

Finally, we also coded variables related to the studies’ features (e.g., location of study and continent). We examined these variables as moderators since they might inform readers about the generalizability of the findings. In sum, all these coded variables might provide valuable information to scholars and practitioners. The average inter-rater agreement between the two researchers on all coded variables was about 85%.

Extraction and calculation of effect sizes

Some of the 29 included articles contained multiple studies, resulting in the coding and extraction of 44 effect sizes with a total of 2726 participants. Specifically, a Hedges’ g effect size was calculated for each outcome. To calculate Hedges’ g, we first calculated Cohen’s d. Cohen’s d is a standardized estimate of the difference in mean scores between participants who received a presentation with signals and those who did not, divided by the pooled standard deviation of the two groups. Although Cohen’s d represents an effect size, small sample sizes may bias it (Hedges and Olkin 1985). Hence, Hedges’ g was estimated because it corrects for this small-sample bias (Hedges and Olkin 1985). After calculating Cohen’s d, Hedges’ g was estimated using Eq. (1):

$${\text{g}} = \left(1 - \frac{3}{4N - 9}\right) {\text{d}},$$
(1)

where d is Cohen’s d, and N is the total number of participants across the experimental condition (i.e., a presentation with signals) and the comparison group (i.e., a presentation without signals).
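This two-step computation can be sketched as follows (a minimal illustration of the standard formulas; the function names and example values are ours, not drawn from the included studies):

```python
import math

def cohens_d(mean_sig, sd_sig, n_sig, mean_ctrl, sd_ctrl, n_ctrl):
    """Standardized mean difference: signaled group minus non-signaled group,
    divided by the pooled standard deviation of the two groups."""
    pooled_sd = math.sqrt(((n_sig - 1) * sd_sig ** 2 + (n_ctrl - 1) * sd_ctrl ** 2)
                          / (n_sig + n_ctrl - 2))
    return (mean_sig - mean_ctrl) / pooled_sd

def hedges_g(d, n_total):
    """Small-sample correction of Eq. (1): g = (1 - 3 / (4N - 9)) * d."""
    return (1 - 3 / (4 * n_total - 9)) * d
```

For example, a signaled group scoring M = 10 (SD = 2, n = 20) against a control scoring M = 8 (SD = 2, n = 20) yields d = 1.00, which the correction shrinks slightly to g ≈ .98.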

To interpret the estimated Hedges’ g, positive effect sizes indicated better learning performance for signaling presentations over non-signaling presentations. Moreover, two included effect sizes were missing standard deviations (SD). Because these values were missing at random, imputation was recommended (Idris and Robertson 2009). Specifically, the missing standard deviations were derived by algebraically recalculating the omitted SD from the reported parameters, such as test statistics or p values (Wiebe et al. 2006). Additional analyses showed that findings from Hedges’ g were similar to findings from Cohen’s d in this meta-analysis. Hence, we report Cohen’s d throughout the rest of this paper.
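The algebraic recovery of a missing SD can be illustrated as follows (a sketch assuming the primary study reported an independent-samples t value and group means; the function name is ours):

```python
import math

def pooled_sd_from_t(mean_diff, t_value, n1, n2):
    """Invert the independent-samples t formula,
    t = mean_diff / (sd_pooled * sqrt(1/n1 + 1/n2)),
    to recover the pooled standard deviation."""
    return mean_diff / (t_value * math.sqrt(1 / n1 + 1 / n2))
```

The recovered pooled SD can then be fed directly into the effect size calculation above.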

Data analysis

Data were analyzed using Comprehensive Meta-Analysis 2.2.048 (Borenstein et al. 2008) and SPSS Version 22 for Windows. SPSS was employed for data screening and to obtain descriptive information. Comprehensive Meta-Analysis was used to conduct the meta-analysis. First, we examined the effects of outliers on the homogeneity output by analyzing the data with and without outliers and comparing the homogeneity results. The removal of the outliers changed the homogeneity test. Two outliers were therefore changed to the next highest scores of the distribution to minimize their influence on the findings. Forty-four independent effect sizes were included in all the following computations. A forest plot was employed to graph the overall results, and publication bias was examined via the fail-safe N test.
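As a rough sketch of the publication-bias check, Rosenthal’s fail-safe N estimates how many unpublished null-result studies would be needed to raise the combined one-tailed p above .05 (the implementation below is our own illustration, not the Comprehensive Meta-Analysis internals):

```python
def fail_safe_n(z_values, z_alpha=1.645):
    """Rosenthal's fail-safe N for k studies with per-study z values:
    N_fs = (sum of z)^2 / z_alpha^2 - k."""
    k = len(z_values)
    return (sum(z_values) ** 2) / (z_alpha ** 2) - k
```

A large fail-safe N relative to the number of included studies suggests the overall effect is robust to unpublished null findings.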

The current meta-analysis used the Q statistic to examine the homogeneity of variance of effect sizes. Because sample size might bias the Q statistic (e.g., Gavaghan et al. 2000; Higgins et al. 2003), the meta-analysis literature recommends the complementary use of the I2 statistic (Higgins and Thompson 2002). I2 represents the percentage of variability attributed to heterogeneity. Hence, both statistics (i.e., Q and I2) provide information on the homogeneity of variance across studies and are estimated by the Comprehensive Meta-Analysis software. A significant Q statistic (i.e., p < .05) indicates that the observed effect sizes are heterogeneous and that moderators might be impacting the studies’ effect sizes. Complementing the Q statistic, I2 is interpreted with the following recommended criteria (Higgins and Thompson 2002): (a) I2 = .25 (i.e., 25%) indicates low heterogeneity, (b) I2 = .50 (i.e., 50%) indicates medium heterogeneity, and (c) I2 = .75 (i.e., 75%) indicates high heterogeneity.
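These two statistics can be sketched under a fixed-effect model as follows (a minimal implementation of the standard formulas; Comprehensive Meta-Analysis computes them internally):

```python
def q_and_i_squared(effects, variances):
    """Cochran's Q over inverse-variance weights, and the I^2 percentage,
    I^2 = max(0, (Q - df) / Q) * 100, where df = k - 1."""
    weights = [1.0 / v for v in variances]
    # Inverse-variance weighted mean effect size
    mean_g = sum(w * g for w, g in zip(weights, effects)) / sum(weights)
    q = sum(w * (g - mean_g) ** 2 for w, g in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, i2
```

For instance, a Q of 163.07 with df = 43 gives I2 = (163.07 − 43)/163.07 ≈ 74%, consistent (up to rounding) with the heterogeneity reported in the results below.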

Results

A total of 29 articles yielding 44 independent effect sizes were analyzed on different features and outcomes. As mentioned above, positive effect sizes indicated enhanced performance on learning outcomes for signaling presentations. Figure 2 shows the distribution of effect sizes as a forest plot. The results section is organized around our research questions. The results from this meta-analysis do not imply causality.

Fig. 2 Forest plot of the 44 effect sizes obtained from 29 studies

Signaling literature

Table 1 shows the overall analysis, including the weighted mean of the 44 effect sizes. A positive weighted mean effect size (d+) indicated better performance on learning outcomes for signaling presentations over non-signaling presentations. Overall, Table 1 shows a moderate and statistically significant effect of learning with signaling presentations (d = .38). The distribution is heterogeneous, Q(43) = 163.07, p < .01, I2 = .73. Since significant and high heterogeneity is present in these studies, the current meta-analysis further examines potential moderators.

Table 1 Overall weighted mean effect size

Study features

Table 2 describes the weighted mean effect sizes for the study features. For types of signaling, the following codes were implemented: (a) color contrast signals occurred when presentations used opposite colors to highlight the to-be-learned material (e.g., red to highlight the learned material and grey for the remaining material); (b) focus by effect signals were for presentations that adjusted the visual presentation (e.g., zoomed in) so that the critical material was the main focus; (c) geometric signals were presentations that used shapes (e.g., arrows, circles) to localize the learned/essential material; (d) pedagogical agent signals occurred when presentations with an animated avatar used gestures (e.g., pointing) to highlight the learned material; (e) text signals were presentations that applied highlighted text to emphasize the important material; and (f) combinations of signals occurred when presentations used multiple types (e.g., an arrow with highlighted text) to highlight essential material. The majority of signaling type categories produced statistically significant effect sizes. The between-levels difference was not statistically significant [QB (5) = 8.24, p = .14], indicating that signaling was beneficial for learning outcomes regardless of type.

Table 2 Weighted average effect size for study and participant features

We also examined signaling levels as a moderator. Signaling levels were categorized as high for dynamic signals and low for static signals. A not reported category was used for studies that did not report signaling levels. The between-levels statistic [QB (2) = 6.36, p < .05] showed that the high signaling category had the highest weighted mean effect size (d = .50) and was significantly different from the low signaling category.

Participant features

Table 2 also describes the weighted mean effect sizes for participant features. Grade level was coded as middle school, high school, or undergraduate. Most of the participants were undergraduates. Results showed significant and positive effect sizes for middle school and undergraduate students. The between-levels statistic [QB (2) = 1.00, p = .61] was not significant, suggesting that signaling had positive effects regardless of the participants’ grade level. Next, we examined the continent where the studies were conducted. Effect sizes were significant and positive for Europe and North America. The between-levels difference was not statistically significant [QB (3) = 4.99, p = .17], again showing that signaling was beneficial regardless of the continent where participants completed the study.

We also examined prior knowledge as a moderator. Prior knowledge was categorized as low, high, or mixed. A not reported category was used for studies that did not report prior knowledge. The prior knowledge categories produced positive effect sizes, yet the effects were significant only for the low and not reported categories. The between-levels statistic [QB (3) = 8.72, p < .05] showed that prior knowledge was a moderator. This finding indicated that the low prior knowledge category had the highest weighted mean effect size (d = .47) and was significantly different from the high and mixed prior knowledge categories.

Presentation features

Table 3 describes the weighted mean effect sizes for pacing, media presentation, images, animation, and subject domain. For pacing, a study was coded as system-paced when no control was granted to participants during the presentation. In contrast, learner-paced was coded when the learner had control of the pace of learning during the presentation. Studies that did not report the type of pacing were coded as not reported. The pacing categories resulted in positive, statistically significant effect sizes. The between-levels statistic [QB (2) = 11.99, p < .05] suggested that learner pacing was associated with a higher weighted mean effect size that was significantly different from the system-paced category. Next, media presentation was coded as paper, computer, or mixed. Signaling studies with computer and mixed media produced positive and statistically significant effect sizes. The between-levels statistic [QB (2) = 18.62, p < .01] showed that mixed media presentations (i.e., computer/paper plus paper) had the highest weighted mean effect size (d = .63), and it was significantly different from paper and computer presentations.

Table 3 Weighted average effect size for presentation features

We also examined images and animation as moderators. Image categories were coded as yes for presentations using images, no for presentations not using images, and not reported for studies that did not report whether the presentation included images. The image categories produced positive and statistically significant effect sizes. Further results revealed that signaling studies without images produced a statistically significant weighted mean effect size (d = .50) that was higher than and different from studies with images [QB (2) = 15.62, p < .01]. Additionally, animation levels were coded as yes for presentations that included animation and no for presentations that did not. Both categories produced statistically significant weighted mean effect sizes. The between-levels difference was not statistically significant [QB (1) = 2.62, p = .11], showing that signaling was beneficial for learning outcomes regardless of the presence or absence of animation.

The subject domain variable indicates the topics or domains that participants learned in these studies. Topics were coded as (a) General/Human Sciences for human physiological topics (e.g., the cardiac system), (b) Physical Sciences for topics such as plane engines and piano mechanics, (c) Natural Sciences for topics such as lightning and rock formation, and (d) Other for topics that did not fit any of the above categories. All the studies conducted in science domains produced statistically significant effect sizes. The between-levels statistic [QB (3) = 14.64, p < .01] indicated that General/Human Sciences was associated with a higher weighted mean effect size (d = .57), significantly different from the weighted mean effect sizes for Physical Sciences, Natural Sciences, and the Other category. The present meta-analysis also examined the moderating effects of setting. Setting was coded as lab, classroom, or not reported. The setting categories produced positive weighted mean effect sizes across all groups. The between-levels statistic [QB (2) = 5.18, p < .05] indicated that laboratory (lab) studies were associated with a higher weighted mean effect size (d = .45), significantly different from the weighted mean effect sizes for the classroom and not-reported categories.

Finally, we examined the treatment duration and the entire study duration as moderators. Treatment duration was coded as less than 3 min, between 3 and 8 min, more than 8 min, or not reported. The treatment duration categories produced statistically significant weighted mean effect sizes. The between-levels statistic [QB (3) = 10.09, p < .01] indicated that treatments of less than 3 min were associated with a higher weighted mean effect size (d = .59), significantly different from the weighted mean effect sizes of the other categories (i.e., between 3 and 8 min, more than 8 min, and not reported). The entire study duration was coded as less than 1 h, more than 1 h, or not reported. The less than 1 h and not reported categories produced significant weighted mean effect sizes. The between-levels statistic [QB (2) = 33.27, p < .01] revealed that studies completed in less than 1 h were associated with a higher weighted mean effect size (d = .64), significantly different from the weighted mean effect sizes of the other levels (i.e., more than 1 h and not reported).
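The weighted mean effect sizes and between-levels statistics reported above follow standard fixed-effect subgroup computations: each study is weighted by its inverse variance, and QB sums the weighted squared deviations of each subgroup mean from the grand weighted mean. A minimal sketch with hypothetical values, not the study's actual data:

```python
def weighted_mean_es(effects, variances):
    """Fixed-effect weighted mean effect size; weights are inverse variances."""
    weights = [1.0 / v for v in variances]
    d_bar = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    return d_bar, sum(weights)

def q_between(groups):
    """Between-levels heterogeneity statistic QB (df = number of groups - 1).

    Each group is an (effects, variances) pair; QB sums the weighted squared
    deviations of each subgroup mean from the grand weighted mean.
    """
    all_effects = [d for effects, _ in groups for d in effects]
    all_variances = [v for _, variances in groups for v in variances]
    grand_mean, _ = weighted_mean_es(all_effects, all_variances)
    qb = 0.0
    for effects, variances in groups:
        group_mean, group_weight = weighted_mean_es(effects, variances)
        qb += group_weight * (group_mean - grand_mean) ** 2
    return qb
```

The resulting QB is compared against a chi-square distribution with (number of levels − 1) degrees of freedom, which is the df shown in brackets throughout the results.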

Methodological features

Table 4 shows the results under different methodological features of the research. We coded for methodological quality using the following criteria: (a) control for prior knowledge, (b) high treatment fidelity, (c) reporting of analysis/statistics, and (d) transparency of the methods for replication of the study. Studies were coded as high-quality if they met at least 3 of the 4 criteria, and as low-quality if they met only 1 or 2. The high-quality category had a moderate weighted mean effect size. The between-levels difference was statistically significant for study quality [QB (1) = 14.33, p < .05], indicating that high-quality studies produced a higher weighted mean effect size (d = .52) that was significantly different from that of low-quality studies.
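The coding rule above can be expressed as a small sketch; the criterion names below are our own shorthand for the four criteria, not labels taken from the study:

```python
# Shorthand names for the four quality criteria described above (hypothetical labels).
CRITERIA = ("prior_knowledge_control", "treatment_fidelity",
            "stats_reported", "replicable_methods")

def code_quality(study):
    """Code a study as 'high' if it meets at least 3 of the 4 criteria,
    and 'low' otherwise (i.e., if it meets only 1 or 2)."""
    met = sum(1 for criterion in CRITERIA if study.get(criterion, False))
    return "high" if met >= 3 else "low"
```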

Table 4 Weighted average effect size for methods

Reported effect size and reliability were also examined as moderators. Studies that reported effect sizes were coded as yes, and studies that did not were coded as no. Further analysis revealed that reporting effect sizes was associated with a higher weighted mean effect size (d = .71), significantly different from the non-reporting category [QB (1) = 35.59, p < .01]. Reliability was coded as yes for reporting reliability and no for not reporting it. Studies that reported the reliability of their measures produced a larger effect size (d = .65) that was significantly different from studies that did not report reliability [QB (1) = 27.26, p < .01].

Next, the moderators of pretest, randomization, and control for prior knowledge were evaluated. Studies that included a pretest were coded as yes, and studies that did not were coded as no. Follow-up analysis showed that using a pretest was associated with a higher weighted mean effect size (d = .69), significantly different from studies that did not use a pretest [QB (1) = 44.20, p < .01]. Studies that randomized their participants to conditions were coded as yes, and studies that did not randomize participants were coded as no. The between-levels difference was not statistically significant [QB (1) = .16, p > .05], suggesting that signaling was beneficial for learning outcomes regardless of randomization. Control for prior knowledge was coded as yes for studies that controlled for prior knowledge and no for studies that did not. Further analysis revealed that controlling for prior knowledge was associated with a higher weighted mean effect size (d = .57), significantly different from studies that did not control for prior knowledge [QB (1) = 23.72, p < .01].

The outcome format was also examined as a moderator, coded by the type of format used to measure the learning outcome: (a) Draw Diagram, (b) Mixed, (c) Multiple Choice, (d) Open-Ended, and (e) Problem-Solving. Positive and significant weighted mean effect sizes were obtained across all outcome formats. The between-levels difference was statistically significant [QB (1) = 18.698, p < .05], showing that the draw-diagram format was associated with a higher weighted mean effect size, significantly different from the rest of the outcome formats.

Finally, the document type was examined, coded as journal or dissertation. Most of the studies in this meta-analysis were published in journals. Signaling presentations were associated with statistically significant effects for studies reported in both dissertations and journals. The between-levels statistic [QB (1) = 6.36, p < .01] indicated that dissertations were associated with a higher weighted mean effect size, significantly different from the weighted mean effect size for journals.

Examining publication bias

The current meta-analysis also examined the presence of publication bias. The literature has reported that studies with non-significant findings are published less often than those with significant results (Franco et al. 2014). For this reason, the current study used a funnel plot to evaluate the presence of publication bias; Fig. 3 illustrates the funnel plot distribution. Complementing the funnel plot, publication bias was also statistically examined using the Classic Fail-safe N. This test revealed that 978 null-result studies would be needed to yield a statistically non-significant overall effect. Overall, these analyses suggest that the findings of this meta-analysis are not threatened by publication bias.
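The Classic Fail-safe N (Rosenthal's method) estimates how many unpublished null-result studies would need to exist to render the combined result non-significant. A minimal sketch, assuming the usual one-tailed α = .05 criterion and hypothetical z-scores rather than the study's actual data:

```python
import math

def fail_safe_n(z_scores, z_alpha=1.645):
    """Rosenthal's Classic Fail-safe N: the number of additional null
    (z = 0) studies needed to pull the combined Stouffer z-score below
    the one-tailed significance threshold z_alpha."""
    k = len(z_scores)
    z_sum = sum(z_scores)
    n = (z_sum ** 2) / (z_alpha ** 2) - k
    return max(0, math.ceil(n))
```

A fail-safe N far larger than the number of included studies, as reported above (978 versus 29 articles), is conventionally taken as evidence that publication bias is unlikely to overturn the result.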

Fig. 3

Funnel plot. The funnel plot summarizes the distribution of effect sizes used to assess publication bias

Discussion

The current meta-analysis examines the effects of signaling using 44 effect sizes extracted from 29 articles. It aims to resolve the mixed findings in signaling research and to identify potential moderating variables by answering the following research questions.

What are the learning effects of presentations with signaling compared with presentations without signaling?

The present meta-analysis found a significant learning benefit from using signaling in presentations. Our findings support previous research and recommendations that signaling can direct the learner's focus to target material and potentially improve learning outcomes (Horvath 2014; Mayer and Moreno 2003; Richter et al. 2016). Signaling is especially helpful in multimedia learning environments, and scholars, instructional designers, and teachers can use signals in their instructional and/or research presentations. Although research suggests that signaling is beneficial, our findings reveal the presence of heterogeneity. Hence, moderator analyses were conducted to examine the conditions under which signaling is more or less beneficial for learning.

Are signaling effects moderated by study features (e.g., type of signals)?

The present meta-analysis identified several study features as moderators. First, findings showed that the type of signaling (e.g., color contrast) was not a moderator. Given the multiple types of signals used in the literature, each could plausibly influence learning differently; however, because signal type did not moderate the signaling effect, the results suggest that incorporating any type of signal helps improve learning outcomes. Second, the present study found that presentations with dynamic signals (i.e., the high signaling category) produced significantly higher learning outcomes than those with static signals. Researchers have documented that dynamic systems lead to better learning than static systems (e.g., Boucheix and Schneider 2009). Likewise, our results suggest that dynamic signals can improve learning by using sequential cues to place added emphasis on essential aspects of the learning material.

Are signaling effects moderated by participant features (e.g., grade level, etc.)?

The present meta-analysis examined several participant features as moderators. Participants' grade level was not a significant moderator, suggesting that signaling has positive effects on learning regardless of grade level. This finding has direct implications for the field: since signaling is beneficial across age groups, we recommend that educators across grade levels consider incorporating signals in their teaching presentations to direct learners to critical information. At the same time, since most signaling research included in this meta-analysis was conducted with undergraduate college/university students, more empirical research is needed to examine the robustness of these findings with K-12 students.

The present meta-analysis also indicated that continent (i.e., where studies were conducted) was not a significant moderator, implying that presentations with signaling enhanced learning outcomes regardless of continent. However, given that studies conducted in Asia were relatively few compared to those conducted in Europe and North America, statistical power to detect meaningful differences between continents may have been low, and the obtained estimates may not be the most accurate (e.g., Rubio-Aparicio et al. 2019). Therefore, we recommend interpreting these results with caution.

Finally, the present meta-analysis showed that low prior knowledge learners benefited the most from the use of signals in learning materials. This finding is in line with Richter et al.'s (2016) signaling meta-analysis, which also indicated that presentations with integrated signals led to better learning outcomes for low prior knowledge learners. A plausible explanation, supported by existing research on signaling, is that low prior knowledge learners benefit from signals as guides to critical information, supporting the selection, organization, and integration of new knowledge with existing knowledge. This further indicates that signals serve as important instructional tools that support learning among low prior knowledge learners. On the other hand, the findings suggest that signals are less helpful for high prior knowledge learners. Although existing research has yet to examine why, one possibility is that presenting material with signals to high prior knowledge learners creates an expertise reversal effect (Kalyuga et al. 2003). Specifically, high prior knowledge learners who already possess sufficient knowledge to navigate the learning material efficiently may struggle to reconcile what they understand to be important with what the signals indicate as important. High prior knowledge learners may simply not need signals incorporated in their learning materials, unlike low prior knowledge learners, who are more likely to benefit from the added guidance provided by cues.

Are signaling effects moderated by presentation features (e.g., pacing)?

The present meta-analysis identified several presentation features as moderators. First, pacing was a moderator. The multimedia literature has documented that pacing impacts learning outcomes; more specifically, learner-paced materials are more effective than system-paced materials (e.g., Kalyuga et al. 2004). In line with these general findings, we found that learner-paced learning environments were more beneficial than system-paced ones. Second, media presentation (i.e., paper, computer, or mixed) was found to moderate the effects of signaling on learning outcomes; specifically, mixed presentations (i.e., computer/paper plus paper) were more beneficial than non-mixed presentations. Third, the inclusion of images influenced the effects of signaling on learning outcomes. Existing literature has recommended images as a means to attract the learner's attention (e.g., Mayer and Moreno 2003). Contrary to this recommendation, the present meta-analysis found that presentations without images were more beneficial than presentations with images. We note that the primary studies did not describe the type of images used, meaning they could have used decorative or relevant images. This distinction is particularly important for scholars, instructional designers, and teachers: decorative images may act as a distraction (e.g., seductive details) that increases cognitive load and impairs learning of the target topic, whereas relevant images related to the lesson can enhance learning. Based on this finding, we recommend that scholars, instructional designers, and teachers use images that reify and complement the learning material. Indeed, more research is needed to further explain the effects of image type (i.e., decorative or relevant) on learning outcomes. Fourth, animation was not a significant moderator. This does not support previous research documenting a positive association between the use of animation and learning outcomes (e.g., Lowe 2003).

Fifth, the topic domain of the presentation influenced the effect of signaling on learning outcomes. All science topics (i.e., general/human science, physical science, and natural science) produced positive effects, and presentations on general/human science topics were more beneficial than the other sciences (i.e., physical science and natural science). These results indicate that science topics, especially general/human topics, can benefit from presentations with signaling. Sixth, setting moderated the effects of signaling on learning outcomes: studies conducted in laboratory settings produced larger learning benefits than those conducted in classrooms. This result is not surprising considering that multimedia learning research has mostly been conducted in laboratory settings. Given that much of the evidence comes from laboratory studies, additional investigation is needed to examine the effects of signals on learning in the classroom; such information can provide a more in-depth understanding of how to use signals or cues there. However, based on the current findings and following previous recommendations (Mayer and Fiorella 2014), the signaling principle may be applied to presentations with complex information or situations in which learners are overwhelmed by extraneous processing. Using signals in these situations can guide students to the critical material of the lecture, class, or presentation.

Seventh, treatment duration and entire study duration also influenced the effects of signaling on learning outcomes. Short treatments (i.e., less than 3 min) and short study completion times (i.e., less than 1 h) led to more beneficial learning outcomes than longer treatment durations (e.g., more than 3 min) and longer study completion times (e.g., more than 1 h). Although short treatments produced a higher effect size, caution should be exercised in interpreting this result because longer presentations are more ecologically valid, as they mirror real-life studying activities.

How are effect sizes moderated by the methodological features of the research?

Overall, beneficial effects were found when studies were high in quality, reported reliability of outcomes, reported effect sizes, used a pretest, and controlled for prior knowledge.

These findings support previous evidence that rigorous experimental studies reporting the reliability of outcome measures produce robust effect sizes in multimedia research (e.g., Adesope and Nesbit 2012). Given the variability that might exist in designing future signaling studies, the current findings provide valuable guidance on methodological practices for scholars interested in investigating signaling and improving the quality of their results.

Interestingly, the meta-analysis found that dissertations had a larger weighted mean effect size than journal publications, contradicting previous findings in the multimedia literature (e.g., Adesope and Nesbit 2012). However, these results align with previous evaluations of published meta-analyses and systematic reviews (Ferguson and Brannick 2012; Hartling et al. 2017). Journal articles are expected to be more robust and to produce more meaningful effect sizes than dissertations, as the latter are not peer-reviewed, may have major methodological problems, or may use unreliable measurements to assess constructs (Ferguson and Brannick 2012; Thomas and Skinner 2012). We also note that relatively few dissertations were included compared with journal studies (i.e., 4 dissertations with 4 independent effect sizes versus 25 journal articles with 40 independent effect sizes), which might affect the accuracy of the estimated effect size for the dissertation category (e.g., Rubio-Aparicio et al. 2019). Thus, these results must be interpreted with caution.

Limitations and future direction

This meta-analysis contributes to documenting the benefits of using signaling in presentations. However, certain limitations of the present investigation need to be discussed. During the filtering phase of the comprehensive and systematic review, we applied strict inclusion/exclusion criteria. As a result, many studies did not qualify for the current investigation; for example, studies that were quasi-experimental or had confounding variables did not meet our criteria. The strict exclusion criteria made it difficult for several studies to qualify, posing a possible limitation to the meta-analysis. However, our decision to deploy strict selection criteria was driven purely by quality, so that findings from this meta-analysis would be robust. Another limitation concerns the available literature and publication bias; although publication bias is always possible, our statistical analysis indicated that the current findings are not influenced by it. Lastly, the generalizability of our findings is limited to visual cues or signals. Future studies should conduct a meta-analysis on the effectiveness of verbal cues; such an investigation would provide valuable information about the effects of verbal cues on learning outcomes and their use in practice. Additionally, future studies should implement longitudinal designs, given that such investigations are lacking in the existing signaling literature. A longitudinal design would evaluate the effects of signaling across time and its long-term effects.

Conclusion

The current meta-analysis indicates that presentations with signaling benefit learning, with implications for both theory and practice. Theoretical implications include examining the signaling principle more extensively with elementary and high school students; learners with high prior knowledge; subject domains underrepresented in the present meta-analysis; and learning materials that incorporate images and animation. While future research will deepen our understanding of the moderators that impact signaling effects, results from the present meta-analysis show preliminary support for the continued use of signaling techniques in multimedia learning environments. Specifically, signaling seems effective with middle school and undergraduate students, especially those with low prior knowledge. Results from this meta-analysis can guide educators and instructional designers in the effective use of signaling techniques to facilitate optimal learning outcomes, especially in complex multimedia environments.