Concept maps are diagrams used in many educational settings to represent verbal or conceptual information (Fig. 1). In this review, we consider a concept map to be any node-link diagram in which each node represents a concept and each link identifies the relationship between the two concepts it connects. For example, if two nodes “ocean acidification” and “growth of coral reefs” are connected with a link labeled “slows,” the assemblage can be read as the proposition “ocean acidification slows growth of coral reefs.” A concept map can include dozens of links and nodes, with each pair of linked nodes representing a proposition. Figure 1 shows a concept map created using CmapTools, one of the more widely used software tools for authoring concept maps (Cañas et al., 2004).

Fig. 1 Simple concept map about raptors. Note that this map is not meant to conform to any particular researcher’s standards
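The node-link reading described above can be illustrated with a small data structure. The following Python sketch is illustrative only: the type name `Link` and the function `propositions` are hypothetical names chosen here, not part of CmapTools or of any study in this review. The two example links are taken from propositions mentioned in this article.

```python
from typing import NamedTuple


class Link(NamedTuple):
    """A labeled, directed link between two concept nodes (hypothetical type)."""
    source: str
    label: str
    target: str


def propositions(links):
    """Read each link as a 'source label target' proposition."""
    return [f"{link.source} {link.label} {link.target}" for link in links]


# A two-link map; each link joins two concept nodes.
concept_map = [
    Link("ocean acidification", "slows", "growth of coral reefs"),
    Link("elk calf", "runs from", "bear"),
]

for sentence in propositions(concept_map):
    print(sentence)
```

Storing the map as a list of labeled links keeps each proposition recoverable on its own, which is the property the definition above relies on.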

Diagrams similar to concept maps have been used by philosophers and logicians for centuries (Nesbit & Adesope, 2013), but the term concept map and the modern idea of the concept map as a tool for learning originated with Joseph Novak and his colleagues in the 1970s (Novak & Gowin, 1984). Novak advocates concept mapping, the construction of concept maps by learners, as a way to promote meaningful learning. He explains that concepts should be hierarchically arranged with more general concepts placed higher on the map and linked to more specific concepts placed lower in the map (Novak & Cañas, 2008). Novak further recommends that concept maps include horizontal cross-links to depict relationships other than generality/specificity.

Knowledge maps, featured in research by Dansereau and colleagues (O’Donnell et al., 2002), are node-link diagrams we consider to be a type of concept map. Instead of representing a relation between concepts by a freely chosen word or phrase, links in knowledge maps must be selected from a fixed set of nine relational terms such as “type,” “example,” and “leads to.” Unlike Novak’s emphasis on concept mapping by learners, much of the research conducted by Dansereau examined the use of knowledge maps as a medium for presenting new information to learners.

This review investigated research on the instructional efficacy of using diagrams that met our definition of concept map regardless of how they were identified in the primary studies. The review examined all instructional uses including constructing, studying, and editing concept maps. In reviewing research, we gave particular attention to the mode of instruction used as a comparison treatment because we are interested in understanding the cognitive processes at work as learners use concept maps and how and under what conditions the use of concept maps should be selected instead of other means of learning.

Previous meta-analyses have found advantages for using concept maps in comparison with other modes of instruction. In a meta-analysis of 18 classroom-based studies, Horton et al. (1993) found that student concept mapping was associated with a mean effect size of 0.42 standard deviations and that studying teacher-prepared concept maps was associated with a mean effect size of 0.59 standard deviations, although the latter estimate was based on far fewer studies (three). In a comprehensive meta-analysis of 67 effect sizes that included classroom-based and laboratory-based studies, Nesbit and Adesope (2006) found that both concept mapping (g = 0.82) [Footnote 1] and studying concept maps (g = 0.37) were associated with statistically significant advantages. They found that the advantage for constructing and studying concept maps extended over all grade levels and almost all school subjects investigated.

Perhaps the most interesting results presented by Nesbit and Adesope (2006) were for treatment comparisons appearing in only a small number of studies. They found a large effect size associated with studying animated concept maps (g = 0.74, k = 2) [Footnote 2]. In addition, they found an advantage for studying concept maps rather than outlines or lists (g = 0.28, k = 10), and, when engaged in cooperative learning with a partner, no significant advantage for studying concept maps rather than other materials (g = 0.19, k = 8).

Concept Maps as Learning Tools

Scholars have posited various reasons to explain why constructing and studying concept maps may be effective learning strategies, and we propose these reasons can be broadly categorized as promoting meaningful learning, reducing extraneous cognitive load, or both.

Meaningful Learning

Novak and Cañas (2008) credit David Ausubel with the distinction between rote learning and meaningful learning. Rote learning can be regarded as focusing on verbatim memorization of presented information with little effort to connect it to prior knowledge and understand its meaning (Novak, 2002). In contrast, meaningful learning occurs when new knowledge is created or assimilated into existing interconnected knowledge structures through cognitive elaboration. Meaningful learning is sometimes referred to as knowledge elaboration (Kalyuga, 2009) whereby learners use strategies such as self-explanation (Chi et al., 1989) and elaborative interrogation (Dunlosky et al., 2013) to connect the new information to existing knowledge structures. Learning strategies that “make meaning” are more effective than rote learning, and abundant research evidence demonstrates that cognitive elaboration is the process on which their success depends (Dunlosky et al., 2013).

According to Karpicke and Blunt (2011, p. 772), “concept mapping bears the defining characteristics of an elaborative study method: It requires students to enrich the material they are studying and encode meaningful relationships among concepts within an organized knowledge structure.” In order for meaningful learning to occur, Novak and Cañas (2008) describe three preconditions: the learning material must be relevant and conceptually understandable to the learners, the learners must have appropriate and relevant prior knowledge, and the learners must make an effort to meaningfully learn the materials. They argue that constructing concept maps meets the first criterion of meaningful learning as concept maps can both identify and properly sequence knowledge in relation to the learners’ prior knowledge. However, it is also the instructor’s responsibility to be conscious of what information the students conceptually understand before preparing their concept maps, and they must use instructional techniques and evaluation procedures that encourage meaningful learning rather than rote learning (Novak & Cañas, 2008).

Studying concept maps instead of text can also be theorized to promote cognitive elaboration and meaningful learning. O’Donnell et al. (2002, p. 75) claimed that “knowledge maps make the macrostructure of a body of information more salient.” If students more easily recognize in learning materials superordinate concepts they already know, then they are better able to subsume new subordinate concepts within their existing knowledge structures.

Reducing Extraneous Cognitive Load

Using concept maps may lower some barriers students face compared with carrying out an equivalent learning task by writing or studying text. The grammatical structure of concept maps tends to be much simpler than sentences in natural language, and may require less extraneous processing to generate and interpret. Studying concept maps has greater benefits for students with lower domain knowledge or lower verbal ability (Haugwitz et al., 2010; Nesbit & Adesope, 2006), a pattern that would be expected if concept maps reduce extraneous load. Researchers (e.g., Amadieu et al., 2009; O’Donnell et al., 2002) have often claimed concept maps offer more salient presentation of key semantic features such as relationships between concepts, hierarchical relationships, and centrality of concepts, and students do less extraneous cognitive processing to extract these features from concept maps than they do from texts.

Purpose of the Present Meta-Analysis

In the decade since the meta-analysis by Nesbit and Adesope (2006), much new primary research has appeared. Researchers have continued to investigate the efficacy of using concept maps for learning and in doing so have extended the instructional contexts, map features, and comparison conditions represented in the research base. More recently, a few studies (Blunt & Karpicke, 2014; Karpicke & Blunt, 2011) found that retrieval practice by writing text is more effective than concept mapping while viewing source text as a means of studying text passages. Furthermore, new research has been published in categories represented by small sample sizes in the earlier meta-analysis. For example, research has since been published on the effects of animated concept maps (Adesope & Nesbit, 2013). A new meta-analysis is needed to determine whether the effects in such categories are overturned or strengthened by the more recent research.

It has been our impression that the cognate research is dominated by studies that evaluate the learning effectiveness of particular uses of concept maps rather than investigate theories explaining their effectiveness. While theoretical conclusions can certainly be drawn from some evaluation studies, research designed expressly to test theories is a far more efficient route. We hoped to assess through the meta-analysis whether studies comparing the use of concept maps to other modes of learning have advanced understanding of the cognitive processes that explain how concept maps can help students to learn.

The plan for this review was to combine the previously analyzed (see Nesbit & Adesope, 2006) and recent studies in a single database and conduct a new meta-analysis. The review incorporated all relevant studies from the inception of concept maps in 1972 (Novak, 1990) to 2014.

Research Question

The research questions investigated by the review stemmed from one overarching question:

What is the effect of using concept maps on learning?

We began by examining the overall influence of concept mapping and studying concept maps on learning outcomes compared to other instructional interventions. Next, we examined how this effect varied by (a) the mode of instruction used as the comparison treatment, (b) the subject area (science, technology, engineering, and math (STEM) or non-STEM), (c) the type of concept maps, (d) the duration of treatment, (e) whether the maps were studied or constructed, (f) where the research was conducted, (g) the level of schooling, and (h) whether learning was collaborative or individual. The examination of moderator variables can often indicate under which conditions an instructional treatment is effective or provide evidence for cognitive explanations of treatment effects, both of which have implications for designing more effective learning conditions. For example, it could be hypothesized that the visual complexity of static concept maps can be reduced by presenting the information in an animated format and that, given the well-known limitations of working memory, this reduction would increase learning in the animated condition. If the moderator analysis revealed that the effect size for studying animated concept maps is greater than the effect size for studying static concept maps, it would support the hypothesis that animating concept maps to reduce the visual complexity is an effective instructional strategy.

Method

Literature Search

Our literature search focused on studies published since 2005 when Nesbit and Adesope (2006) conducted their literature search. The keywords used to search the concept mapping literature for relevant studies were “concept map* OR knowledge map* OR node-link map*” (Nesbit & Adesope, 2006, p. 422). On February 28, 2014 we searched the titles and abstracts of papers presented at the American Educational Research Association (AERA, 2006–2014) [Footnote 3] and National Association for Research in Science Teaching (NARST, 2007–2013) [Footnote 4]. Together, these searches returned 107 studies (AERA, 54 studies; NARST, 53 studies).

On March 6, 2014 the following databases were searched for studies published from 2005 until the date the search was conducted (with the number of studies returned indicated in parentheses): Web of Science (1083), ERIC (815), Academic Search Complete (513), PsycINFO (292), Dissertation Abstracts (131), and PsycARTICLES (7). In sum, our search located 2966 new studies for consideration.

Inclusion Criteria

This study used the same methodology as Nesbit and Adesope’s (2006) meta-analysis. To be considered for inclusion in this meta-analysis, studies had to meet the following inclusion criteria:

(a) contrasted the effects of map study, construction, or manipulation with the effects of other learning activities; (b) measured cognitive or motivational outcomes such as recall, problem-solving transfer, learning skills, interest, and attitude; (c) reported sufficient data to allow an estimate of standardized mean difference effect size; (d) assigned participants to groups prior to differing treatments; (e) randomly assigned participants to groups, or used a pretest or other prior variable correlated with outcome to control for preexisting differences among groups. Studies reporting a pretest effect size outside the range −0.40 < d < 0.40 were excluded from the meta-analysis. (Nesbit & Adesope, 2006, pp. 421–422).

Coding Procedures

Phase I Coding

In the first phase of the coding process, the titles and abstracts of each study were examined. Two coders examined the studies for possible inclusion. To ensure consistency, they completed a 300-study training set in which they coded the same studies (as either retain or reject based on the abstract, i.e., a pseudo phase I coding) and then compared results. There were a few discrepancies, which the coders discussed until they reached consensus, and no major concerns arose during the process. The coders then proceeded to code the remaining studies. As shown in Fig. 2, this screening process identified 347 studies that met the inclusion criteria.

Fig. 2 Results of study review through the phases of the meta-analysis

Phase II Coding

The second phase of the coding process consisted of the full-text examination of these studies and subsequent coding process. This process eliminated 284 studies, leaving 63 studies that met the inclusion criteria. These 63 studies were randomly distributed between the two coders for coding on the coding form.

Final Coding Form

The coding form and process were identical to Nesbit and Adesope’s (2006), but in some cases the variables for menu items were changed (e.g., rather than the domains specified by Nesbit and Adesope, we categorized studies as either science, technology, engineering, or mathematics (STEM) or non-STEM [Footnote 5]). In these cases, the coding was updated for studies in the previous analysis. In order to extract effect sizes, consistent with the process used by Nesbit and Adesope, we coded the most delayed knowledge test present in the study when measures were reported from multiple points in time. For example, if test scores were available from a test taken immediately after learning with the instructional materials and also from 1 week later, we coded the results from the test that occurred 1 week after the intervention took place.
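The rule of coding the most delayed knowledge test can be expressed compactly. The sketch below is a hypothetical illustration: the function name `most_delayed_measure`, the dictionary keys, and the example study are assumptions made here and do not reproduce the actual coding form.

```python
def most_delayed_measure(measures):
    """Return the outcome measure taken longest after the intervention."""
    return max(measures, key=lambda m: m["delay_days"])


# Hypothetical study reporting an immediate posttest and a 1-week delayed test.
measures = [
    {"name": "immediate posttest", "delay_days": 0},
    {"name": "delayed posttest", "delay_days": 7},
]

selected = most_delayed_measure(measures)
print(selected["name"])  # delayed posttest
```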

Coding the 63 studies produced 75 independent effect sizes which were combined with the 67 independent effect sizes analyzed by Nesbit and Adesope (2006). Thus, the present meta-analysis examined 142 independent effect sizes.

Analyses

During the coding process, we found that descriptive statistics were sometimes not reported, in which case effect sizes were calculated from the results of the reported statistical tests (e.g., a t statistic). In addition, if a study contained more than one treatment or control group relevant to the meta-analysis, we calculated weighted means and pooled standard deviations across the two groups to maintain statistical independence. For example, if a study contained two experimental concept mapping groups and one non-mapping control group, the weighted means and pooled standard deviations would be calculated for the two experimental conditions and used to calculate an effect size compared to the non-mapping control condition. Finally, the effect sizes extracted from two studies (g = 3.82; g = 5.94) were determined to be outliers (|Z| ≥ 3.3; p < 0.001). The effect sizes were adjusted (g = 2.70; g = 2.75, respectively) to be closer to the next highest effect size (g = 2.67) as recommended by Tabachnick and Fidell (2013).
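The calculations described above can be sketched in Python. This is a minimal sketch under stated assumptions: it uses the common textbook formulas for the pooled standard deviation, for converting an independent-samples t statistic to d, and for the small-sample correction that yields Hedges' g. The source does not spell out the formulas used, so these may differ in detail from the actual computations. All function names and the example numbers are hypothetical.

```python
import math


def pooled_sd(n1, sd1, n2, sd2):
    """Pooled within-group standard deviation of two groups."""
    return math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))


def combine_groups(n1, m1, sd1, n2, m2, sd2):
    """Merge two groups using a weighted mean and a pooled SD (assumed formulas)."""
    n = n1 + n2
    mean = (n1 * m1 + n2 * m2) / n
    return n, mean, pooled_sd(n1, sd1, n2, sd2)


def small_sample_correction(n_t, n_c):
    """Approximate correction factor J that converts d to Hedges' g."""
    return 1 - 3 / (4 * (n_t + n_c - 2) - 1)


def hedges_g(n_t, m_t, sd_t, n_c, m_c, sd_c):
    """Hedges' g from group sizes, means, and standard deviations."""
    d = (m_t - m_c) / pooled_sd(n_t, sd_t, n_c, sd_c)
    return small_sample_correction(n_t, n_c) * d


def hedges_g_from_t(t, n_t, n_c):
    """Hedges' g from an independent-samples t statistic."""
    d = t * math.sqrt(1 / n_t + 1 / n_c)
    return small_sample_correction(n_t, n_c) * d


# Hypothetical study: two concept mapping groups and one non-mapping control.
n_map, m_map, sd_map = combine_groups(20, 14.0, 4.0, 20, 16.0, 4.0)
g = hedges_g(n_map, m_map, sd_map, 40, 12.0, 4.0)
print(round(g, 3))  # 0.743
```

Merging the two mapping groups before computing g means the shared control group contributes to only one effect size, which is the independence concern raised above.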

In the original analysis, Nesbit and Adesope (2006) reported an interrater agreement of 96.2%. In order to calculate the interrater reliability for the newly coded studies that met all of the inclusion criteria, IBM SPSS version 23 was used to calculate Cohen’s kappa. We randomly selected 20.6% of the sample to be coded by both coders. The kappa coefficient was found to be κ = 0.88 (p < 0.001), indicating a very strong consistency between coders.
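For readers unfamiliar with the statistic, Cohen's kappa compares the observed agreement between two coders with the agreement expected by chance from each coder's category frequencies. The sketch below implements the standard definition in plain Python; it is not the SPSS procedure used in the study, and the function name and the ten example codes are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who coded the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must code the same non-empty set of items.")
    n = len(rater_a)
    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n**2
    if expected == 1:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codes for ten studies; the raters disagree on one study.
coder_1 = ["static"] * 4 + ["animated"] * 3 + ["interactive"] * 3
coder_2 = ["static"] * 3 + ["animated"] * 4 + ["interactive"] * 3

kappa = cohens_kappa(coder_1, coder_2)
print(round(kappa, 2))  # 0.85
```

In this example the raw agreement is 90%, but kappa is lower because part of that agreement would be expected by chance.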

After all of the data were coded and the interrater reliability was found to be sufficient, we used the Comprehensive Meta-Analysis (version 2.2.064) software to analyze the data.

Results and Discussion

Overall Results

Analysis of the 142 independent effect sizes produced a moderate overall random effect of g = 0.58 (p < 0.001) across a large, diverse sample of participants (n = 11,814). Further analysis indicated that significant heterogeneity (Q_B(141) = 1127.73, p < 0.001) existed and there was high variability within the sample (I² = 87.50%). Accordingly, moderator analysis was conducted.
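The reported I² value follows from the heterogeneity statistic and its degrees of freedom through the standard relation I² = 100% × (Q − df) / Q. The sketch below checks that relation against the reported values. It also shows one common way to pool effect sizes under a random-effects model, the DerSimonian-Laird estimator. That choice of estimator is an assumption made for illustration, since the source names the software but not the estimator. The function names and the two-study example are hypothetical.

```python
def i_squared(q, df):
    """Percentage of total variation attributed to heterogeneity (floored at 0)."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


def random_effects_pool(effects, variances):
    """DerSimonian-Laird random-effects pooling (assumed estimator).

    Returns (pooled effect, Q statistic, between-study variance tau^2).
    """
    weights = [1.0 / v for v in variances]
    total_w = sum(weights)
    # Fixed-effect mean, needed to compute Q.
    fixed = sum(w * g for w, g in zip(weights, effects)) / total_w
    q = sum(w * (g - fixed) ** 2 for w, g in zip(weights, effects))
    df = len(effects) - 1
    c = total_w - sum(w * w for w in weights) / total_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to each study's sampling variance.
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * g for w, g in zip(re_weights, effects)) / sum(re_weights)
    return pooled, q, tau2


# Reported overall heterogeneity: Q = 1127.73 on 141 degrees of freedom.
print(round(i_squared(1127.73, 141), 2))  # 87.5

# Hypothetical two-study example (values invented for illustration).
pooled, q, tau2 = random_effects_pool([0.2, 0.6], [0.04, 0.04])
```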

Alternative Treatments

As we explained earlier, focusing on the different types of comparison conditions can provide insights into why learning with concept maps can be an effective instructional technique. Our analysis revealed significant differences among the non-concept-map comparison conditions (Q_B(5) = 28.40, p < 0.001). Learning with concept maps was found to be considerably more effective than learning through discussion or lecture-based treatment conditions (g = 1.05, p < 0.001) across a considerable number of studies (k = 37), and moderately more effective than creating or studying outlines or lists (Table 1). Learning with concept maps was also found to be more effective than both constructing and studying texts. Caution is warranted when interpreting the results in Table 1 because both the intervention and comparison conditions were highly varied across studies. The intervention conditions included constructing and studying concept maps, and within each type of control condition there were many variants. The discussion/lecture category was especially diverse as it included teacher-led discussions, non-interactive lectures, and indeed any kind of teacher-led, whole-class activity. What teaching styles were used and what students were actually doing in such whole-class activities was often not monitored or not reported.

Table 1 Effect of learning with concept maps compared to alternative treatments

Domain

Due to the consistent push towards improving teaching and learning in the STEM fields, we categorized the learning materials used within studies as either STEM or non-STEM relevant. Our analysis revealed no significant differences existed between groups (Q_B(2) = 3.00, p = 0.22). Studies investigating both STEM and non-STEM relevant learning materials produced effect sizes consistent with the overall effect size found in this study (Table 2).

Table 2 Effect of learning with concept maps by knowledge domain

We hypothesized that learning in the STEM domains, where the content often contains complex hierarchical, linear, cyclical, and interacting processes, would show more benefits from learning with concept maps than non-STEM domains. However, the results did not support this hypothesis, and the confidence intervals for the STEM and non-STEM domains are quite similar. Hence, it appears concept mapping can be used effectively in a wide variety of content domains.
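The p values attached to the between-group heterogeneity statistics in the moderator analyses can be checked by referring each Q_B value to a chi-square distribution with the stated degrees of freedom. This is the conventional test, although the source does not state it explicitly, so it is treated as an assumption here. The sketch below uses only the Python standard library, and the function name `chi2_sf` is hypothetical. It relies on the closed-form recurrence for the upper tail of a chi-square distribution with integer degrees of freedom.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable, integer df >= 1."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    y = x / 2.0
    # Start from the closed form for df = 2 (even) or df = 1 (odd) ...
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    # ... then step the shape parameter up by 1 until it reaches df / 2.
    while a < df / 2.0:
        q += y**a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


# Q_B values and degrees of freedom as reported in the moderator analyses.
print(round(chi2_sf(3.00, 2), 2))  # 0.22
print(round(chi2_sf(1.34, 3), 2))  # 0.72
print(chi2_sf(28.40, 5) < 0.001)   # True
```

Under this assumption the computed values agree with the reported p = 0.22 for knowledge domain and p = 0.72 for map type.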

Map Type

Due to the variety of interactive software platforms that facilitate concept mapping, we were interested in the relative benefits of working with static, animated, or interactive concept maps (Table 3). We operationalized these different types of concept maps as follows: static concept maps did not move, nor were they interactive in any way; animated concept maps typically had link(s) or node(s) appear as they were mentioned by accompanying narration or when the learner clicked a “next” button; and interactive concept maps required students to interact with the software in some way beyond a “next” button (e.g., the learner added or removed nodes or links). Our analysis revealed no significant differences between groups (Q_B(3) = 1.34, p = 0.72).

Table 3 Effect of learning with concept maps by map type

Based on Nesbit and Adesope’s (2006) analysis, we hypothesized that animated or interactive concept maps would be more effective than static concept maps. While the analysis of the results did not support this hypothesis, this could be due to unequal sample sizes. The vast majority (k = 105) of the sample learned with static concept maps, while considerably fewer (k = 24) utilized interactive concept maps. However, given that the confidence intervals are similar, one can infer that learning with interactive or animated concept maps may provide no significant benefits compared to static concept maps. We note that the animated concept mapping conditions had a wide confidence interval. Hence, additional research is needed to understand under what conditions animated concept maps are more or less effective.

Duration

In this meta-analysis, we examined the effect of learning with concept maps depending on the duration of the intervention. Our analysis indicated significant differences existed between studies in which students worked with concept maps for differing durations (Q_B(3) = 22.47, p < 0.001). As shown in Table 4, the longer a learner utilized concept maps, the more effective they were for learning outcomes. Concept mapping was found to have a large effect compared to non-mapping conditions when the study lasted for more than 4 weeks (g = 0.72, p < 0.001).

Table 4 Effect of learning with concept maps depending on the duration of concept map use

Nesbit and Adesope’s (2006) results showed that when learners constructed concept maps for less than 5 weeks the strategy was more effective than when learners constructed concept maps for longer durations. These results suggest a novelty effect, where concept maps become less effective over time as learners get more familiar with the technique and the novelty of using concept maps wears off. However, the results of the present meta-analysis do not support this notion. Rather, learning with concept maps was most effective when the intervention was longer than 4 weeks in duration. Due to similar sample sizes across all three duration groups examined (less than 1 week, one to 4 weeks, and greater than 4 weeks), we conclude that learning with concept maps retains its effectiveness as an instructional strategy for several weeks. However, it could be beneficial for future researchers to conduct longitudinal studies to examine the efficacy of using concept maps for longer durations (e.g., one semester, 1 year, etc.) and to examine how the effectiveness of using concept maps varies over the duration of the study.

Map Use

Nesbit and Adesope (2006) examined the results of their study by whether students constructed or studied concept maps. Analysis of our data revealed significant differences existed between studies in which students constructed concept maps compared to studies in which students studied concept maps (Q_B(1) = 7.06, p = 0.01). Table 5 shows that studies in which students constructed concept maps averaged significantly higher effect sizes than those in which students studied concept maps.

Table 5 Effect of learning with concept maps by map use

Examining the data, we can see there are likely two factors which can account for these differences. The two factors are (a) the process involved in constructing concept maps compared to studying concept maps, and (b) the nature of the comparisons examined in the primary studies.

When creating a concept map, as when constructing a text, the learner must engage in elaborative cognitive processing by means such as self-questioning, reflection, and summarization. For example, the learner must not only know the major conceptual ideas, but also how they are related and how to best visually and spatially represent them. This process of deciding how to spatially distribute the links and nodes (i.e., connections and relationships between conceptual ideas) plausibly requires high levels of elaborative processing.

When one studies a concept map, they typically see a series of noun-verb-noun propositions without the contextual details one may see in a text presentation. This lack of context requires the learner to invoke meta-cognitive prompts similar to those used when constructing a concept map, although perhaps not to the same extent. For instance, the proposition elk calf-runs from-bear requires the learner to think, why would an elk calf run from a bear? Through self-questioning and self-explanation, the learner can arrive at the rationale that bears occasionally prey on elk. This process may require more elaborative processing in order to create an accurate mental model than reading an elaborate expository text on the subject. However, it may not require the same level of elaborative processing of the content as constructing the concept map would.

The differences found between studying and constructing concept maps may also be partially explained through the control conditions in the primary studies. When constructing concept maps, the most prevalent control condition was discussion or lecture, compared to which concept mapping was especially effective (discussed later, see Table 9). Constructing concept maps, an activity which requires a learner to cognitively engage with the content, was most frequently compared to listening to a discussion or lecture, an activity which often does not. There is evidence that learners are more successful in active learning tasks than passive ones. Freeman et al.’s (2014) recent meta-analysis showed the benefits of active learning in STEM courses. It is noteworthy that in our data set only 10 comparisons had learners construct a text compared to constructing a concept map (i.e., both active learning activities), and in these cases, the concept map was still moderately more effective. When studying concept maps, the most prevalent control condition was studying a text, in which case those in the concept map conditions outperformed those studying text. Hence, we interpret our results as supporting recent literature in relation to the benefits of active learning compared to passive learning, and acknowledge that it is possible that the control conditions within the primary studies could have influenced our understanding of the overall effectiveness of constructing compared to studying concept maps.

To summarize, it appears that both constructing and studying concept maps are effective learning strategies. We surmise that, as hypothesized by Nesbit and Adesope (2013), this may be due to the types of processing invoked by learning with concept maps. However, in order to identify concretely the rationale why concept maps are effective, purposefully designed experimental studies would need to be conducted.

Effect of Moderator Variables by Use of Concept Map (Constructed or Studied)

Next, we examined the influence of studying or constructing concept maps across various conditions and settings.

Region

The first variable under investigation was the region of the world in which the study took place. Our analyses indicated that effects differed significantly across the regions of the world in which the studies were conducted, regardless of whether the concept maps were constructed (Q_B(5) = 26.71, p < 0.001) or studied (Q_B(4) = 26.46, p < 0.001).

As shown in Table 6, consistent with the findings of Nesbit and Adesope (2006), creating concept maps was found to be most effective in studies that took place in Africa; however, our literature search did not locate any additional studies where learners created concept maps in African countries. Yet, our search did increase the number of studies located for other world regions. For example, we found that constructing concept maps was associated with a large effect size in both Asian (g = 0.78, p = 0.01) and European countries (g = 0.82, p < 0.001). Similarly, creating concept maps was also associated with moderate to large effects in Middle Eastern countries (g = 0.75, p < 0.001). Our literature search nearly doubled the number of studies in which participants in the USA or Canada created concept maps, and the effect size was found to be nearly identical to that found in Nesbit and Adesope’s (2006) previous analysis (g = 0.49, p < 0.001).

Table 6 Influence of concept maps depending on the region of the world the study was conducted in

Our analysis also indicated a large effect for studying concept maps for learners in Asian countries (g = 1.04, p < 0.01). While the number of studies (k = 2) is small, large effects were also found for those studying concept maps in Middle Eastern countries (g = 0.96, p < 0.001). Studying concept maps was associated with a moderate effect in European countries (g = 0.46, p < 0.01), but again the number of studies was limited (k = 3). Finally, studying concept maps was associated with a modest effect in the USA and Canada (g = 0.25, p < 0.001); however, we note that this is still a small, positive effect robust across a large number of participants (n = 3667).

It is difficult to posit why concept mapping or studying concept maps may be more or less effective in different regions of the world based on this meta-analysis. Nesbit and Adesope (2006) reported personal communication which implies that, in some cases, the effectiveness may be due to the inherent advantage of comparing a constructive activity like concept mapping with the traditional method of teaching in a given location. However, we hesitate to make broad generalizations based on the results presented here. Rather, we believe it would be more fruitful for researchers to undertake systematic lines of research to examine why concept mapping may be more effective in some regions of the world. Is the impact of concept mapping in each region related to the typical types of instruction students experience? Do different languages integrate into the concept map format more efficiently or effectively due to word length or grammatical variation?

Field of Instruction

Nesbit and Adesope (2006) examined the differences between constructing and studying concept maps in different fields of study. In our analysis, we classified studies as either within the STEM fields or non-STEM fields. Interestingly, we found no significant differences in effectiveness depending on the field of study, regardless of whether concept maps were constructed (Q_B(1) = 0.33, p = 0.57) or studied (Q_B(2) = 1.29, p = 0.53). As shown in Table 7, and consistent with our overall analysis of constructing compared to studying concept maps, constructing concept maps was associated with moderate to large effects, while studying concept maps was associated with moderate effects.

Table 7 Influence of constructing or studying concept maps depending on the field of instruction

Due to the plethora of discipline-based education research fields, we did not examine the influence of constructing or studying concept maps by specific subfields. Hence, it is possible that particular subfields may find concept mapping more or less effective than others.

Type of Concept Map

Nesbit and Adesope (2006) found that studying animated concept maps was more effective than studying static concept maps; however, there were only two animated concept map studies included in their analysis. Animated concept maps use the signaling principle (van Gog, 2014) to guide learners through a complex map. They have been theorized to eliminate some of the extraneous processing demanded as learners navigate through a complex static map (Nesbit & Adesope, 2011). Alternatively, it is plausible that when constructing concept maps, an interactive format could be more complex, as the learner would need to know not only how to create the concept map but also how to operate a software program effectively. Hence, we sought to further investigate the differences among interactive, animated, and static concept maps, whether constructed or studied.

When learners constructed concept maps, there was no significant difference depending on whether the map was interactive, static, or a mixture of the two (Q B (2) = 0.02, p = 0.99). While small sample sizes are a limitation, examination of the effect sizes indicates moderate to large effects regardless of the type of concept map being constructed (Table 8). Similarly, when concept maps were studied by learners, there was no significant difference between groups depending on whether they were studying interactive, animated, or static concept maps, or any combination of these (Q B (3) = 0.77, p = 0.86).

Table 8 The influence of constructing or studying concept maps depending on the type of concept map used

These results stand in contrast to the hypotheses of cognitive load theory. Research is needed to understand the conditions in which, and for whom, static, animated, or interactive concept maps may be most effectively employed.

Comparison Treatment

An especially important comparison is that between concept mapping and non-mapping conditions. Nesbit and Adesope (2006) found that constructing concept maps was more effective than lecture (g = 0.74, p < 0.05) and creating texts or outlines (g = 0.19, p < 0.05). Similarly, studying concept maps was more effective than studying texts (g = 0.39, p < 0.05) and studying outlines or lists (g = 0.28, p < 0.05).

Including the new studies in our analysis presents a more sharply defined picture of these results and also shows the continuing strength of concept mapping as a learning strategy (Table 9). Significant differences were found between studies depending on the control condition when learners created concept maps (Q B (4) = 15.25, p < 0.01). Constructing concept maps was highly effective compared to attending a discussion or lecture (g = 1.05, p < 0.001), and was associated with moderate effects when compared to studying or constructing outlines (g = 0.40, p = 0.04), constructing texts (g = 0.48, p < 0.01), or other interventions (g = 0.47, p < 0.001). All of these effects are considerably stronger than those extracted in Nesbit and Adesope’s (2006) meta-analysis, which, in part, demonstrates the benefit of updating meta-analyses.

Table 9 The influence of constructing or studying concept maps depending on the control condition

Significant differences were also found between control conditions when learners studied concept maps (Q B (5) = 11.38, p = 0.04). Most of these studies compared studying concept maps to studying a text (k = 39), in which case studying concept maps was associated with a small, positive effect (g = 0.29, p = 0.001). When studying concept maps was compared to studying or constructing lists, we found a small to moderate effect favoring the concept map condition (g = 0.43, p = 0.02). While studying concept maps was considerably more effective than studying or constructing outlines (g = 0.72, p = 0.03) and discussion or lectures (g = 1.09, p < 0.001), the sample sizes were small (k = 2 and k = 5, respectively).

One classification of learning strategies that has recently been investigated in relation to concept mapping is retrieval practice. Retrieval practice can be defined as “having learners set aside the material they are learning and practice actively reconstructing it on their own” (Karpicke et al., 2014, p. 198). There are many types of retrieval practice, such as creating concept maps, free recall, or cued recall (Blunt & Karpicke, 2014; Karpicke et al., 2014). In our analysis, few studies contained retrieval practice activities in comparison to concept mapping. Those that did were coded into the appropriate comparison conditions based on the nature of the intervention (e.g., constructed text if they wrote a summary). Future research is needed to explore for whom and under what conditions concept mapping can be an effective form of retrieval practice compared to other retrieval practice activities.

Grade Level of the Learner

Nesbit and Adesope’s (2006) analysis showed that constructing concept maps was more effective for intermediate level students in grades 4 to 8 (g = 0.91, p < 0.05) and postsecondary students (g = 0.77, p < 0.05) than for secondary students in grades 9 to 12 (g = 0.17, p < 0.05). However, Nesbit and Adesope found that studying concept maps was more effective for intermediate level students (g = 0.52, p < 0.05) than for postsecondary students (g = 0.36, p < 0.05).

We sought to replicate this analysis with our expanded data set. Our analysis indicated that no significant differences existed between age levels when learners created concept maps (Q B (2) = 0.08, p = 0.96). As seen in Table 10, constructing concept maps was associated with moderate to large effects regardless of whether the learner was in intermediate, secondary, or postsecondary education. These results speak to the efficacy of creating concept maps as an instructional strategy.

Table 10 The influence of constructing or studying concept maps depending on the learners’ grade level

Our analysis also examined the effects of studying concept maps depending on the learners’ grade level. We found that significant differences existed between the grade levels (Q B (2) = 25.30, p < 0.001). Studying concept maps was found to be very effective for secondary students (g = 1.24, p < 0.001) and intermediate level students (g = 0.82, p < 0.001), although sample sizes were relatively limited for both groups (k = 4 and k = 7, respectively). We hypothesize that the spatially contiguous nature of concept maps may aid younger students by clearly delineating the relationships between concepts, without the need for selecting and organizing the information from an expository text. Based on this hypothesis, it makes sense that studying concept maps produced smaller, yet statistically significant effects (g = 0.32, p < 0.001) for postsecondary students since they are more experienced learners, as well as more experienced readers.

Level of Collaboration

In Nesbit and Adesope’s (2006) study, creating concept maps in a group while also having time to work individually produced a stronger effect (g = 0.96, p < 0.05) than only working alone (g = 0.12, p > 0.05). Interestingly, the opposite pattern was found when learners studied concept maps; individually studying was more effective than studying in dyads (Nesbit & Adesope, 2006).

Our analysis found the advantage of having learners construct concept maps was not moderated by collaboration (Q B (4) = 8.79, p = 0.07). As shown in Table 11, constructing concept maps was associated with moderate to large effects regardless of whether learners worked in groups, by themselves, or with some combination of the two.

Table 11 The influence of constructing or studying concept maps depending on the level of collaboration between learners

A similar pattern was seen when learners studied concept maps. Again, no differences were found between groups (Q B (3) = 1.47, p = 0.69). The data show that studying concept maps was associated with moderate effects regardless of whether learners learned alone or in groups.

Due to the nature of meta-analysis, it is not possible to fully explain why collaborative use of concept maps was not significantly more effective than using them independently, but these findings raise important research questions. For example, does learning with concept maps require meta-cognitive strategies similar to those that collaborative learning entails? Future research can explore this type of question.

Duration of the Study

According to Nesbit and Adesope’s (2006) analysis, constructing concept maps was found to be most effective (g = 0.70, p < 0.05) when the study lasted less than 5 weeks. Longer duration studies were associated with smaller effects (g = 0.36, p < 0.05). We sought to examine if this trend continued with the larger sample size.

We found that there were significant differences depending on the length of the study when learners constructed concept maps (Q B (3) = 22.32, p < 0.001). Interestingly, results from the present meta-analysis (see Table 12) contradicted those of Nesbit and Adesope (2006), who found that shorter duration interventions were associated with larger effects. With the larger sample size, our results show that studies 1 to 4 weeks in duration (g = 0.95, p < 0.001) and more than 4 weeks in duration (g = 0.72, p < 0.001) were considerably more effective than studies that lasted less than 1 week (g = 0.40, p < 0.01). We hypothesize that as learners gain experience with concept mapping, the cognitive load associated with the format of the activity itself decreases, thus allowing learners to focus their cognitive processing on the learning material rather than the format. This would explain why longer duration studies find greater benefits from concept mapping; however, it would not explain why the studies exceeding 4 weeks have a slightly lower effect size than studies that lasted from 1 to 4 weeks. More research is needed to understand whether the influence of concept mapping increases, decreases, or remains stable over longer periods of time.

Table 12 The influence of constructing or studying concept maps depending on the duration of the intervention

When learners studied concept maps, no significant differences were found depending on the duration of the study (Q B (2) = 5.22, p = 0.07). Overall, moderate effects were associated with studying concept maps. However, although only four studies examined the impact of studying concept maps for more than 4 weeks, these studies showed a notably high effect size (g = 0.70, p < 0.001), similar to that of studies where learners constructed concept maps. We question whether this effect size is true or simply an artifact of sample size. In other words, would the effect size be more consistent with the other studies in which learners studied concept maps had there been a larger sample?

Publication Bias

Publication bias is an ongoing concern in meta-analysis (Rosenthal, 1979). Accordingly, we conducted two statistical tests to examine the influence of publication bias on our results. First, we conducted the classic fail-safe N test to determine how many null-effect studies would be needed to raise the p value above 0.05. The results indicated that 8394 such studies would be needed, which is larger than the 5k + 10 limit suggested by Rosenthal (1995). Next, we examined the results of Egger’s linear regression test (Egger et al., 1997). The results showed that publication bias was not present in the sample (intercept = 0.48, t(140) = 0.63, p = 0.53) at a level that would influence the interpretation of our results.
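The logic of these two tests can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure used to produce the values above: the fail-safe N uses Rosenthal's Stouffer-based formula with a one-tailed critical value of 1.645, k = 142 is inferred from the degrees of freedom reported for Egger's test (df = k - 2), the z scores passed to `fail_safe_n` are hypothetical, and all function names are our own.

```python
import math


def fail_safe_n(z_scores, z_alpha=1.645):
    """Number of added zero-effect studies that would pull the combined
    (Stouffer) Z down to z_alpha: (sum Z)^2 / z_alpha^2 - k."""
    k = len(z_scores)
    return sum(z_scores) ** 2 / z_alpha ** 2 - k


def rosenthal_tolerance(k):
    """The 5k + 10 tolerance level for k effect sizes."""
    return 5 * k + 10


def egger_test(effects, std_errors):
    """Egger's regression: regress effect/SE on 1/SE by ordinary least squares.
    Returns (intercept, t statistic for the intercept, degrees of freedom)."""
    x = [1.0 / se for se in std_errors]                   # precision
    y = [g / se for g, se in zip(effects, std_errors)]    # standard normal deviate
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    df = n - 2
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_intercept = math.sqrt((rss / df) * (1.0 / n + mean_x ** 2 / sxx))
    return intercept, intercept / se_intercept, df


# Tolerance check with the reported fail-safe N; k = 142 is an inferred assumption.
k = 142
print(rosenthal_tolerance(k), 8394 > rosenthal_tolerance(k))

# Hypothetical z scores, only to show how the fail-safe formula behaves.
print(round(fail_safe_n([2.0] * 10), 1))

# Hypothetical effect sizes and standard errors, only to show the regression.
intercept, t_stat, df = egger_test([2.6, 1.4, 1.2, 1.0, 0.9], [1.0, 0.5, 1 / 3, 0.25, 0.2])
print(round(intercept, 2), round(t_stat, 2), df)
```

An intercept close to zero, judged against a t distribution with k - 2 degrees of freedom, indicates little funnel-plot asymmetry, which is the pattern reported above.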

Conclusion

Our meta-analysis, like those published earlier, supports the conclusion that constructing and studying concept maps are effective learning activities relative to a variety of other teaching and learning strategies. Constructing and studying concept maps are effective in group and individual activities, in STEM and non-STEM subjects, and at all levels of schooling. Although Nesbit and Adesope (2006) found differences in the efficacy of using concept maps in individual and cooperative tasks, and across different map types, the present review analyzed far more studies and found no such differences.

There are markers that distinguish theory-oriented and evaluation-oriented research. Theory-oriented research devotes space in the introductory section to alternative theories that explain an observed phenomenon. The design of theory-oriented research aims to ensure that all treatment conditions are the same except for the feature that distinguishes competing theories. In contrast, evaluation-oriented research typically discusses only a single theory thought to account for the efficacy of a treatment. The design of evaluation-oriented research often compares an intervention of interest to a control treatment presumed to represent common practice and has many unspecified features (often in the form of confounding variables) that differ from the intervention of interest. Much of the research included in this meta-analysis is evaluation-oriented research designed to investigate if using concept maps is effective under some set of conditions, and very little is theory-oriented research designed to investigate why using concept maps may be effective.

Nesbit and Adesope (2013) proposed seven cognitively oriented hypotheses that could explain the advantages of using concept maps for teaching and learning in comparison with reading text, listening to lectures, participating in discussions, writing summaries, and other instructional activities. First, using concept maps may enable dual coding of information in verbal and visual components of longer-term memory and thereby support more effective retrieval. Second, in comparison with text, they may allow cognitive load to be distributed across the visual and verbal channels of working memory, thus avoiding an overload of verbal working memory. Third, concept maps tend to consolidate multiple references to a concept at a single point in space, while in text, audio or other sequential formats the references would be spread over the sequence. Consolidating all relationships to a concept around a single point, a kind of spatial contiguity, may promote a more semantically integrated understanding of the concept. Fourth, in some types of concept maps, particularly those specified by Novak and Cañas (2008), superordinate and subordinate semantic relationships (e.g., mammal-squirrel) are signaled more strongly than they typically are in text. Fifth, the noun-verb-noun syntax used to express propositions in concept maps is much simpler and more accessible to poor readers and writers than the typical prose of expository text. Sixth, the decisions required to construct a concept map (e.g., determining which nodes should be placed close together) entail greater elaborative or germane processing than the decisions required to construct expository text. Finally, because concept maps take up more space than text, they may demand a greater degree of concision or summarization which in turn prompts greater elaborative processing.

All the foregoing hypotheses are amenable to investigation by theory-oriented research. For example, to investigate whether the simple syntax of concept maps accounts for their efficacy, researchers could compare studying (a) expository text, (b) a concept map representing the text, and (c) a list of simple noun-verb-noun propositions semantically equivalent to the concept map. The simple syntax hypothesis predicts that learners with lower verbal ability would benefit more from studying concept maps and lists of simple propositions than from studying expository text, and that they would receive equal benefit from these two representations.

Although the general principles of fostering meaningful learning and reducing extraneous cognitive load are often put forward to explain the beneficial effects of using concept maps, the empirical support for these explanations is sparse. There is very little research on the specific features of concept maps that promote cognitive elaboration or reduce extraneous load. To advance understanding of the cognitive processes underlying learning from concept maps, much more theory-oriented research that examines these features is needed. Our meta-analysis only reviewed research that compared learning with concept maps to learning without them, but possibly the most illuminating theory-oriented research would compare the effects of learning with different types of concept maps. For example, research could compare the effects of learning with (1) a typical map design that consolidates all connections to a concept at a single node and (2) a degraded map design that has a separate node for each reference to a concept. Learning outcomes favoring the first type of design may be the best evidence that consolidating all references to a concept at a single point is a crucial feature explaining the advantages of using concept maps.

Research comparing different map designs may also be the best way to create more advanced types of concept maps. For example, visually signaling learners to attend to immediately relevant information can aid in learning (Mayer & Fiorella, 2014; van Gog, 2014), and map comparison research may be able to demonstrate advantages for concept maps that signal content using color, animation, or other visual cues.

In summary, this meta-analysis synthesizes 42 years of research on the efficacy of learning with concept maps compared to other instructional interventions. Analysis of the data highlights the continuing strengths of learning with concept maps across a variety of instructional contexts and in comparison to many instructional conditions. Research is needed to better understand the cognitive processes involved in learning with concept maps, as well as how to design more effective concept maps and, in turn, more effective instructional interventions.