1 Introduction

People engage in comparison every day, whether it is comparing different routes to drive home from work or comparing different products to buy. Comparison is a fundamental cognitive process that can support learning in a variety of domains, including mathematics (Alfieri, Nokes-Malach, & Schunn, 2013; Gentner, Loewenstein, & Thompson, 2003; Gick & Holyoak, 1983). In fact, research and recommendations in mathematics education laud comparison as an important and effective learning process for mathematics (Common Core State Standards in Mathematics, 2010; Silver, Ghousseini, Gosen, Charalambous, & Strawhun, 2005). Additionally, expert teachers in multiple countries use comparison in mathematics classrooms (Lampert, 1990; Richland, Zur, & Holyoak, 2007). However, until relatively recently, experimental work had not been conducted in mathematics classrooms with school-age children, and the effects of comparison in the classroom on mathematics knowledge were unclear. The current paper presents empirical findings that support recommendations on using comparison of multiple strategies in mathematics classrooms.

The current paper outlines the proposed mechanisms of comparison and how those mechanisms could lead to improved mathematics learning. We then discuss a broad review of past empirical research on comparison, followed by a description of an illustrative line of research on comparing multiple strategies in mathematics classrooms. This research leads to recommendations for educators that are easy to use and a discussion of issues to be pursued in future research on comparing multiple strategies.

2 Explanation of the instructional design principle and its theoretical underpinnings

2.1 Defining comparison of multiple strategies

Comparison involves looking between multiple things (objects, examples, ideas, etc.) and noting the similarities and differences between them (Oxford English Dictionary, 2016). Comparison can improve learning in many domains and for many age groups, ranging from preschoolers learning new words (Namy & Gentner, 2002) to business school students learning contract negotiation strategies (Gentner et al., 2003). For instance, infants can learn the distinguishing characteristics of cats and dogs by comparing a picture of a cat and a dog side-by-side (Oakes & Ribar, 2005). Comparing examples can help people recognize more abstract, high-level commonalities between them, which can help learners better understand important principles (e.g., Kotovsky & Gentner, 1996). This kind of analogical reasoning has been studied for decades (e.g., Catrambone & Holyoak, 1989; Gentner, 1983), and suggests that comparison of examples can be useful for problem-solving in many domains because it highlights important common structures (Gentner, 1989). When studying examples sequentially, it can be difficult to determine what features are important to deeper structural aspects of the problem. The structural alignment that occurs between examples during comparison can resolve some of this difficulty. For example, students who were learning contract negotiation strategies were more than twice as likely to transfer an important principle to a novel test case when comparing two negotiation cases rather than studying those cases separately (Gentner et al., 2003). Recognizing similarities and differences between examples can help learners discern necessary aspects for learning, and this is a central tenet of variation theory. In fact, variation theory suggests that what someone learns is directly dependent on the kind of variation to which that person is exposed (Kullberg, Runesson, & Marton, 2017). 
By intentionally varying examples on important structural features, learners can better attend to targeted aspects of those examples that are important to learn (Watson & Mason, 2006).

While comparison is thought to be a useful practice for improving learning, people do not always compare spontaneously, and explicit prompts to compare greatly improve the benefits of studying multiple examples (Catrambone & Holyoak, 1989; Gentner et al., 2003). For example, college students who were told to compare two problems that differed in surface features but could be solved using the same procedure were more likely to notice the convergent procedure than students who were not prompted to compare the two problems (Catrambone & Holyoak, 1989). Together, these studies suggest that learners benefit from comparison, especially when explicitly prompted to compare.

In mathematics education, there are different kinds of comparison that can be used with problems. For instance, students can compare isomorphic problems (i.e., similar problems solved with the same strategy), compare problem types (i.e., problems with different underlying structure solved with the same strategy), or compare multiple strategies (i.e., the same problem solved with different strategies) (e.g., Rittle-Johnson & Star, 2009). In the current paper, we will focus on comparing multiple strategies, which is the type of comparison often used by expert mathematics teachers and recommended in mathematics education standards (Ball, 1993; Common Core State Standards in Mathematics, 2010; Lampert, 1990).

2.2 Proposed mechanisms of comparison

One of the primary benefits of comparison is that it allows learners to see the underlying structure of both examples (e.g., Loewenstein, Thompson, & Gentner, 1999). Students often focus on unimportant, surface features that are not relevant to the target concepts and procedures (Gentner, 1989). For example, when solving word problems, students often focused on the cover story as opposed to the underlying structure of the problems (Catrambone & Holyoak, 1989). Making direct comparisons between two examples can lead to important structural alignment that highlights the shared relational structure between two examples for students (Gentner, 1983). In turn, this can facilitate students’ transfer of knowledge to novel problems with the same underlying structure, but different surface features. Comparisons can lead students to notice important, deep structural aspects of examples instead of merely noticing surface features, and students can then more easily identify meaningful similarities and differences (Loewenstein et al., 1999).

While comparison helps students identify important similarities and differences, it can be taxing on working memory and requires a lot of cognitive processing (Cho, Holyoak, & Cannon, 2007; Morrison et al., 2004; Richland, Morrison, & Holyoak, 2006). Consequently, prior knowledge is important when considering the effectiveness of comparison (Rittle-Johnson, Star, & Durkin, 2009). If students have low prior knowledge, the concepts and procedures illustrated in examples they compare are often unfamiliar. It can be difficult for students to identify relevant similarities and differences in two unfamiliar examples because this unfamiliarity makes it hard for students to recognize the aspects to which they should attend (e.g., Gentner et al., 2003; Schwartz & Bransford, 1998). This can make it challenging for students to understand the importance of the similarities and differences. Consequently, comparison may be too overwhelming to significantly improve learning without sufficient prior knowledge or adequate scaffolding. On the other hand, comparison can be an effective instructional process if students have sufficient prior knowledge. Alternatively, teachers can provide more scaffolding to help students successfully compare examples, such as focusing on only one or two problem types in a lesson, beginning with easier explanation prompts, and/or highlighting some of the important similarities and differences (Rittle-Johnson, Star, & Durkin, 2012). In our work, we have focused on the potential benefits of comparing multiple strategies and when such comparison is helpful depending on prior knowledge. Comparing two or more strategies can help learners recognize important structural features of solutions, which in turn can help them more accurately solve problems, understand the concepts that explain why a strategy works, and identify which strategy is most efficient.

3 Review of empirical research on comparison in mathematics learning

Research on comparison in mathematics learning has often focused on comparing different problem types and comparing multiple solution methods. Experimental research indicates that comparing different problem types has the potential to promote mathematics learning. First, comparing easily confusable problem types helps learners distinguish between the problem types and solve more problems correctly (Cummins, 1992; VanderStoep & Seifert, 1993). For example, comparison of algebraic addition and multiplication examples supported better problem-solving accuracy than sequential study of addition examples, followed by multiplication examples (Ziegler & Stern, 2014, 2016). Second, comparing positive and negative examples of key ideas may improve conceptual knowledge. Students who compared problems that were positive and negative examples of each key idea (e.g., a line segment that was versus was not the altitude of a triangle) gained greater conceptual knowledge than students who studied only positive examples (Guo & Pang, 2011). Note that the control condition was not exposed to negative examples, making it impossible to know whether comparison was critical. Overall, comparing problems may be particularly useful in helping people recognize important problem features that differ between carefully selected problems.

In addition, research from case studies indicates that expert mathematics teachers recognize the importance of comparing multiple strategies (e.g., Ball, 1993; Fraivillig, Murphy, & Fuson, 1999; Lampert, 1990; Silver et al., 2005), including teachers from high-performing countries such as Hong Kong and Japan (Richland et al., 2007; Stigler & Hiebert, 1999). Comparing multiple strategies is considered a best practice in mathematics education. Having students share and compare multiple strategies is part of reform mathematics pedagogy in several countries (Australian Education Ministers, 2006; Common Core State Standards in Mathematics, 2010; Kultusministerkonferenz, 2004; Singapore Ministry of Education, 2006; Treffers, 1991). Further, a Practice Guide from the US Department of Education identified comparing multiple strategies as one of five recommendations for improving mathematical problem solving in the middle grades (Woodward et al., 2012).

Although research indicates that comparing multiple strategies is a practice used by expert mathematics teachers, there is also research to indicate that many teachers have trouble using comparison of multiple strategies effectively in their classrooms, particularly in the US (Richland, Holyoak, & Stigler, 2004; Richland et al., 2007; Stein, Engle, Smith, & Hughes, 2008). This difficulty is due in part to teachers struggling to make appropriate connections between students’ strategies and explanations and engaging students in a productive discussion (Stein et al., 2008). However, up until our research team’s studies on comparison of multiple strategies, little empirical work had examined the benefits of comparing multiple strategies in mathematics classrooms.

4 An illustrative line of research on comparison of multiple strategies

Our research team has conducted researcher-led short classroom studies as well as a year-long teacher-led classroom study, and each has provided insight into how and when comparison of multiple strategies promotes mathematics learning. We focused on comparing multiple strategies throughout this line of research because, as previously mentioned, it is thought to be a best practice in mathematics education but little empirical work had addressed why and when comparing multiple strategies could benefit learning.

4.1 Researcher-led studies

The initial goal of this line of research was to investigate the benefits of comparison in mathematics classrooms with school-age children experimentally, which had not been done previously. In five experiments, we redesigned 2 or 3 lessons on a topic and implemented these lessons during mathematics classes. Before and after the lessons, students’ knowledge in the relevant domain was assessed in three areas: conceptual knowledge, procedural knowledge, and procedural flexibility. Conceptual knowledge is “an integrated and functional grasp of mathematical ideas” (Kilpatrick, Swafford, & Findell, 2001, p. 118). Procedural knowledge is the ability to execute action sequences to solve problems (Rittle-Johnson, Siegler, & Alibali, 2001). Procedural flexibility is knowing how to solve a problem in multiple ways and when each way is most efficient (Kilpatrick et al., 2001; Star, 2005). In the created lessons, we used packets of worked examples and prompted students to compare and explain the different examples illustrated in pairs. Working with a partner provided a familiar context for students to generate comparisons, and students who collaborate with a partner tend to learn more than those who work alone (Johnson & Johnson, 1994; Webb, 1991). The studies differed in the domains used and in the types of examples and comparison presented for the condition manipulation. In all studies, the condition manipulation occurred at the partner level so that all conditions were present in all classrooms. We briefly describe the different studies below.

In our first study (Rittle-Johnson & Star, 2007), we worked with seventh-grade students (N = 70) learning about solving multi-step algebraic linear equations (e.g., 2(x + 1) + 3(x + 1) = 10). For these problems, a conventional strategy of distributing could be used or a shortcut, nonconventional strategy could be used. In the previous example, the nonconventional strategy could involve combining composites (5(x + 1) = 10), dividing both sides by 5, and then solving for x. During the lessons, students and their partners were randomly assigned to one of two conditions: comparing multiple strategies or sequentially viewing examples. Students who compared multiple strategies saw the same problem solved two different ways on each page (see Fig. 1). They were first prompted to fill in missing step labels in the worked examples, and they then answered questions prompting comparison (e.g., Describe two ways these students’ solution steps are similar; On a timed test, I would use ____’s way because…). Students who studied examples sequentially saw one example on each page and were prompted to explain that individual problem after filling in missing step labels. Students who compared strategies had greater procedural knowledge (d = 0.53) and procedural flexibility (d = 0.38) than students who studied examples sequentially. There was no difference between conditions on conceptual knowledge (d = −0.14), but the conceptual knowledge measure had poor reliability. Coding of students’ explanations revealed that students who compared multiple strategies often made explicit comparisons evaluating strategies’ efficiency and accuracy, and students who compared were also more likely to use alternative strategies when solving practice problems during the lesson. Both of these practices were predictive of improved mathematics knowledge.
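To make the two strategies concrete, the sample equation from the first study can be worked both ways. The derivation below is an illustrative sketch: only the equation and the combined form 5(x + 1) = 10 come from the text, and the remaining intermediate steps are our own reconstruction rather than a reproduction of the study materials.

```latex
% Requires \usepackage{amsmath}
% Conventional strategy: distribute first
\begin{align*}
2(x + 1) + 3(x + 1) &= 10 \\
2x + 2 + 3x + 3 &= 10 \\
5x + 5 &= 10 \\
5x &= 5 \\
x &= 1
\end{align*}
% Shortcut strategy: combine the composite term (x + 1)
\begin{align*}
2(x + 1) + 3(x + 1) &= 10 \\
5(x + 1) &= 10 \\
x + 1 &= 2 \\
x &= 1
\end{align*}
```

Both routes give x = 1, but the shortcut reaches it in fewer steps, which is the kind of efficiency difference students could weigh when answering the comparison prompts.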

Fig. 1 Sample worked example pair from Rittle-Johnson & Star (2007)

Our second study (Star & Rittle-Johnson, 2009) involved fifth- and sixth-grade students (N = 157) learning how to estimate answers to multiplication problems (e.g., About how much is 37 × 29?). The design was similar to the first study with pairs of students randomly assigned to either compare multiple strategies or study examples sequentially. Again, comparing multiple strategies led to greater procedural flexibility (d = 0.47), and in this case also led to better retention of conceptual knowledge two weeks later if students had above-average knowledge of estimation at pretest ($\eta_p^2 = 0.04$). These results were remarkably similar to the earlier study findings and provided support that comparing multiple strategies could improve mathematics learning in very different domains. This led our research team to investigate what factors might affect when comparing multiple strategies was an effective instructional tool.
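The d values reported in these studies are standardized mean differences between conditions. As a minimal sketch of how such a value can be computed, the Python snippet below uses the pooled-standard-deviation form of Cohen's d. The function name `cohens_d` and the scores are hypothetical and invented for illustration only; the original studies may have derived d differently (for example, from covariate-adjusted means).

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(treatment, control):
    """Standardized mean difference using the pooled sample standard deviation."""
    n1, n2 = len(treatment), len(control)
    # statistics.variance returns the sample variance (n - 1 denominator)
    pooled_var = (
        (n1 - 1) * variance(treatment) + (n2 - 1) * variance(control)
    ) / (n1 + n2 - 2)
    return (mean(treatment) - mean(control)) / sqrt(pooled_var)


# Hypothetical posttest scores, NOT data from any of the studies
compare_scores = [12, 15, 14, 10, 13, 16]
sequential_scores = [11, 13, 12, 9, 12, 14]

d = cohens_d(compare_scores, sequential_scores)
print(f"d = {d:.2f}")
```

A positive d means the first group scored higher on average, expressed in units of the pooled standard deviation, so values from different measures and studies can be compared on a common scale.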

In our third study (Rittle-Johnson & Star, 2009), we worked with seventh- and eighth-grade students (N = 162) who had prior knowledge about equation solving. The design was similar to the first study, but students were randomly assigned to one of three conditions: comparing multiple strategies, comparing problem types (i.e., seeing two different problems solved the same way), or comparing equivalent problems (i.e., seeing two similar problems solved the same way). Comparing multiple strategies led to greater conceptual knowledge and procedural flexibility than the other two comparison conditions ($\eta_p^2 = 0.07$ and $\eta_p^2 = 0.06$, respectively), and procedural knowledge was similar for all conditions ($\eta_p^2 = 0.01$). This evidence suggested that comparing multiple strategies could be the most beneficial of these comparison types for learning mathematics.
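For readers less familiar with partial eta squared, its standard definition is sketched below. The sums of squares and degrees of freedom are generic analysis-of-variance quantities used here as placeholder symbols; they are not values reported in these studies.

```latex
% Requires \usepackage{amsmath}
% Partial eta squared: share of variance attributable to an effect,
% after excluding variance explained by other effects in the model
\begin{align*}
\eta_p^2 &= \frac{SS_{\text{effect}}}{SS_{\text{effect}} + SS_{\text{error}}} \\
         &= \frac{F \cdot df_{\text{effect}}}{F \cdot df_{\text{effect}} + df_{\text{error}}}
\end{align*}
```

Under this definition, a value of 0.07 would mean that condition accounts for about 7% of the variance left unexplained by the other terms in the model.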

As previously mentioned, familiarity with examples can affect how easily students can align examples and recognize the aspects to which they should attend (e.g., Gentner et al., 2003). Consequently, our fourth study tested the importance of prior knowledge when using comparison (Rittle-Johnson et al., 2009). We worked with seventh- and eighth-grade students (N = 236) who had limited experience solving algebraic equations. The design was similar to the first study, although here students were assigned to one of three conditions: comparing multiple strategies, comparing problem types, or studying examples sequentially. Students who were novices (i.e., did not attempt algebraic methods at pretest) learned best from either comparing problem types or studying examples sequentially. This seemed to be because they found comparing multiple strategies to be more difficult—they got through fewer examples and were less accurate when using nonconventional strategies when prompted to do so. On the other hand, students who had some prior knowledge of algebraic methods learned best from comparing multiple strategies.

We next wanted to see whether slowing the pace of instruction to provide novices with additional support might improve learning from comparing multiple strategies. Our fifth study (Rittle-Johnson et al., 2012) involved eighth-grade students (N = 198) who also had little prior experience with equation solving. We adapted the materials from the fourth study to focus on fewer problem types, reduced the number of examples, and made the lessons 30 min longer. Students were randomly assigned to one of three conditions: immediate comparison of multiple strategies, delayed comparison of multiple strategies (i.e., students only studied one strategy on the first day and compared that method to alternative methods on the second day), or sequentially studying examples. Under these conditions, students who compared multiple strategies immediately had higher procedural flexibility than those who had a delayed comparison of strategies or who sequentially studied examples (d = 0.39 and d = 0.35, respectively), even one month later (d = 0.50 and d = 0.32, respectively), regardless of prior knowledge. Thus, when the pace of instruction was slowed, immediately comparing multiple strategies was beneficial for students with low prior knowledge. Aligning examples can be difficult for students with low prior knowledge, but comparing multiple strategies can be helpful if students are provided with appropriate support.

These five studies indicate that comparing multiple strategies can improve students’ procedural flexibility and sometimes conceptual knowledge (Rittle-Johnson & Star, 2009; Rittle-Johnson et al., 2009; Star & Rittle-Johnson, 2009) and procedural knowledge (Rittle-Johnson & Star, 2007; Rittle-Johnson et al., 2009), but prior knowledge must be taken into consideration. Novices can compare multiple strategies early in the learning process, but scaffolds and supports should be provided to aid in aligning examples. Throughout these studies, students who compared multiple strategies often spent more time discussing the relative efficiency of different strategies rather than discussing surface differences between examples, and they spent more time focusing on the methods and individual solution steps during the lesson (Rittle-Johnson & Star, 2007, 2009; Rittle-Johnson et al., 2009). This suggests that comparing multiple strategies did help students better attend to important structural features of examples (e.g., the relative efficiency of steps) rather than focusing on surface features of the problem (e.g., what variables were used). More time spent discussing these important aspects of the examples likely improved learning. Teachers may need guidance in deciding when to use comparison in their lessons and what to compare, as different types of comparison with multiple strategies can further different learning goals. Depending on the learning goal of a lesson (e.g., understanding a common error), different strategies to compare may be more beneficial than others.

4.2 Year-long teacher-led study

We next wanted to investigate whether materials could be developed for mathematics teachers that would prompt comparison of multiple strategies with explanations as a supplemental curriculum. We developed and evaluated a set of supplementary materials for supporting comparison in Algebra I instruction with a team of researchers, mathematicians, and Algebra I teachers. The materials were developed by identifying important concepts, common student difficulties, and key misconceptions throughout a typical Algebra I course, and then creating comparison materials to attempt to address them.

At the core of the supplemental curriculum were 141 worked example pairs (WEPs). Each WEP showed the mathematical work and dialogue of two hypothetical students, Alex and Morgan, as they attempted to solve one or more algebra problems (see Fig. 2). The curriculum contained four types of WEPs, with the types varying in what was being compared and the instructional goal of the comparison. Which is better? WEPs showed the same problem solved using two different, correct strategies, with the goal of understanding when and why one strategy is more efficient or easier than another strategy for a given problem (e.g., Rittle-Johnson & Star, 2007). Why does it work? WEPs showed the same problem solved with two different correct strategies, but with the goal of illuminating the conceptual rationale in one strategy that is less apparent in the other strategy (Newton, Star, & Lynch, 2010). Which is correct? WEPs showed the same problem solved with a correct and incorrect strategy, with the goal of understanding and avoiding common errors (e.g., Durkin & Rittle-Johnson, 2012). How do they differ? WEPs showed two different problems solved in related ways, with an interest in illustrating what the relationship between problems and answers of the two problems revealed about an underlying mathematical concept (Newton et al., 2010). Carefully chosen prompts for explanation accompanied each WEP, and we supplemented each WEP with an additional, “take-away” page. This take-away page was provided to make an explicit summary statement of the instructional goal of the WEP. Prior research suggests that direct instruction is needed to supplement student-generated comparisons (Schwartz & Bransford, 1998), and this additional scaffold of an explicit take-away page could help learners with varying prior knowledge benefit from our materials. 
We purposely gave teachers a large amount of freedom to choose which WEPs to use and when to use them, and simply asked them to use our materials at least once per week. With this freedom, teachers were allowed to adapt the time spent on each WEP, depending on the prior knowledge of the students in their class.

Fig. 2 Sample worked example pair of a Which is better? comparison

To prepare teachers to use our supplemental curriculum, they participated in a 1-week, 35-hour professional development institute that we designed and administered during the summer (Newton & Star, 2013). During this summer institute, teachers were given the opportunity to read through the supplemental curriculum materials, view videotaped exemplars of other teachers using the curriculum, and plan and teach sample lessons using the materials to their peers. Teachers were also given detailed guidance on the desired implementation model for the curriculum materials. During this training, teachers were encouraged to cover up one of the strategies when first introducing the WEP to have students explain one strategy before seeing the second strategy and comparing the two. This was suggested to reduce cognitive load and improve learning for students with varying prior knowledge. Furthermore, teachers evaluated their own and their peers’ sample lessons for adherence to the desired implementation model, using the instrument designed to assess implementation fidelity.

During the year, we explored the feasibility of implementation of our Algebra I supplemental curriculum and its impact on teachers’ instruction and students’ mathematical knowledge with a randomized controlled trial (Star, Pollack, et al., 2015b). Initially, 141 Algebra I teachers were randomly assigned to either implement the comparison curriculum as a supplement to their regular curriculum or to be a “business as usual” control. Attrition caused by a range of factors led to 76 teachers and their students completing the study. Professional reasons were the most common reasons for teacher attrition (e.g., teachers were no longer teaching Algebra I by the time the school year began). Some teachers also left the study due to personal reasons, such as extenuating family circumstances or life changes, although this was less common. There was also a small number of teachers who discontinued contact with the research team without specifying a reason. At the beginning and end of the school year, students completed a researcher-designed assessment of algebra knowledge based on items used in our researcher-led studies and items from national and state standardized assessments. The assessment consisted of 36 multiple-choice items: 12 procedural knowledge items, 11 procedural flexibility items, and 13 conceptual knowledge items.

Observations, surveys, and interviews indicated that the professional development was successful in familiarizing teachers with the supplemental curriculum (Lynch & Star, 2014a) and that students enjoyed and found valuable the emphasis on multiple strategies (Lynch & Star, 2014b). In addition, teachers implemented the materials with reasonable fidelity (Star, Pollack, et al., 2015). On average, teachers asked questions from all types of reflection prompts that we recommended (83% of the time), used the prompts in the correct order prescribed (96% of the time), engaged students in whole-class discussion (89% of the time), and displayed the learning objective (86% of the time). Also, control teachers rarely implemented important instructional practices related to comparison of multiple strategies when using their regular classroom materials. They exposed students to multiple strategies an average of only 38% of the time, presented multiple strategies side by side an average of 12% of the time, and explicitly compared multiple strategies an average of 9% of the time.

However, the results of the randomized controlled trial indicated that there was no main effect of condition on student achievement, in large part because use of the supplemental curriculum was much less frequent than requested (Star, Pollack, et al., 2015). Strikingly, almost half of the teachers used our materials on 5 or fewer occasions during the entire school year. A subsequent dose–response analysis, controlling for covariates such as students’ prior knowledge, suggested that the more often teachers used the curriculum, the higher students’ procedural knowledge at the end of the year (Star, Pollack, et al., 2015). This could be due to the fact that using our materials more frequently caused more learning to occur, or it could be due to more effective teachers using the curriculum more frequently. Getting the teachers to use the curriculum frequently was a challenge. Teachers likely need additional supports to integrate our comparison materials into their instruction, such as guidance on which worked example pairs to use when in conjunction with their curriculum. Based on this research, we have several recommendations for educators when comparing multiple strategies in their mathematics classrooms.

5 Recommendations for mathematics educators on using comparison of multiple strategies

Lab-based and classroom-based studies on comparison of multiple strategies suggest four recommendations educators can follow to improve learning with this instructional method.

5.1 Regular and frequent comparison of multiple strategies

To effectively use comparison in the classroom, educators should have students compare multiple strategies on a regular basis. This was the goal of our year-long supplemental curriculum (Star, Pollack, et al., 2015), but unfortunately many teachers did not use our materials very frequently. However, our dose–response analysis indicated that using comparison of multiple strategies more frequently was related to better student procedural knowledge. Comparing multiple strategies frequently can be important for several reasons. By regularly engaging in these comparisons, students may become more familiar with the method of comparison, making mutual alignment of strategies easier over time. This alignment is important for noticing the important underlying structure of problems (rather than surface features), and students may be able to recognize deeper structural similarities and differences between strategies with practice. Additionally, engaging students in effective discussions of the important take-away points of a comparison can be difficult (Stein et al., 2008) and may not be something students have engaged in before due to many teachers struggling to use comparison effectively (e.g., Richland et al., 2007). Students may need frequent opportunities to compare to reap the full benefits of improved understanding of underlying concepts and transferring procedures to novel problems.

Although regular comparison of strategies can improve learning, educators should be mindful of when during instruction they introduce alternative strategies. Previous evidence suggests that comparing two unfamiliar strategies may be overwhelming for students (Rittle-Johnson et al., 2009), although providing a slower instructional pace and additional scaffolding can help (Rittle-Johnson et al., 2012). It may be best to have students develop some fluency with a strategy and then compare that strategy to other, alternative strategies that might differ in efficiency or conceptual transparency. This can make the alignment of examples easier for students than if they are aligning two unfamiliar examples, which can lead them to notice more important structural concepts (Gentner et al., 2003). This can also help teachers use comparison of multiple strategies regularly in their classrooms as newly introduced strategies can be compared to strategies used in previous lessons.

5.2 Judicious selection of strategies and problems to compare

There are many different types of comparison educators can use when having students compare multiple strategies. As previously mentioned, we created WEPs that focused on four different types of comparison: Which is better?, Why does it work?, Which is correct?, and How do they differ?. Each of these types of comparison is thought to be particularly useful for improving specific knowledge types (Rittle-Johnson & Star, 2011). For instance, Which is better? comparisons, where students compare two different strategies to solve the same problem, emphasize the efficiency of procedures and can particularly improve procedural flexibility. On the other hand, Why does it work? comparisons, where students compare two strategies to solve the same problem but with the goal of illuminating the conceptual rationale in one strategy that is less apparent in the other, may particularly improve conceptual knowledge. Which is correct? comparisons show the same problem solved with a correct strategy and with an incorrect strategy to help students understand and avoid common misconceptions. This may particularly improve conceptual and procedural knowledge. Finally, How do they differ? comparisons show two different problems solved in related ways, with the relationship between the problems and answers revealing an important underlying concept. Consequently, educators should think carefully about the specific learning goals for their lessons when selecting the type of comparison and what prompts to use to encourage students to notice the aspects of most importance. In addition to considering learning goals, educators should think carefully about students’ prior knowledge when selecting strategies to compare. Educators may not want to select two unfamiliar strategies to compare without providing additional scaffolds for students, such as reducing the number of problem types used, providing more time to work with materials, and adding explicit take-away messages.

5.3 Carefully designed visual presentation of problems to compare

Educators should also be thoughtful in how they visually present examples to compare. First, examples should be presented side-by-side to help students align them (e.g., find the similarities in the examples) and to help students notice important structural features (Gentner, 1983). Also, the solution steps should be labeled using common labels because common labels facilitate alignment of examples and can improve learning from comparison (e.g., Namy & Gentner, 2002). Additionally, spatial cues can be used to help students map the appropriate solution steps onto one another, which will help them notice important similarities and differences (Richland et al., 2007). By making sure solution steps visually align, educators can make it easier for students to recognize the important comparison points. Finally, explicit, written explanation prompts asking students to identify similarities and differences and make connections should be included (e.g., What are the similarities and differences between the ways? Which strategy do you think is more efficient for this problem?). Such prompts are encouraged by expert mathematics teachers (Fraivillig et al., 1999; Lampert, 1990) and have been shown to increase comparison and improve learning outcomes (Catrambone & Holyoak, 1989; Gentner et al., 2003).

5.4 Using discussions around the comparison of multiple strategies

Explanation and discussion are important aspects of using comparison of multiple strategies to promote mathematics learning. In small groups or with the whole class, educators can guide discussions around the comparison of multiple strategies, particularly on the similarities, differences, affordances, and constraints of the different strategies. Encouraging students to generate explanations can improve integration of new information with prior knowledge, guide attention to relevant structural features, and help students resolve cognitive conflicts between new information and incorrect prior knowledge (e.g., Chi, 2000; Siegler & Chen, 2008). Generating explanations can help students actively process information, which can improve learning and transfer across many different mathematics topics (Aleven & Koedinger, 2002; Hodds, Alcock, & Inglis, 2014; Rittle-Johnson, 2006; Rittle-Johnson, Loehr, & Durkin, 2017; Rittle-Johnson, Saylor, & Swygert, 2008; Webb et al., 2014). In turn, teachers should facilitate discussion of different students’ explanations, helping students build upon each other’s reasoning (Lampert, 1990; Silver et al., 2005; Stein et al., 2008).

Such classroom discussions are important when learning from comparison, but teachers often struggle to facilitate high-quality discussion. Results from our year-long study indicated that teachers who used our materials engaged their students in comparison but usually did so in a teacher-centered way that involved little student discussion (Star, Newton, et al., 2015; Star, Pollack, et al., 2015). This is not unusual. Studies have found that teachers in the US often ask their students to explain the simple parts of a problem, such as identifying a similarity or difference, but then explain the difficult pieces themselves. In contrast, teachers from Asian countries often ask their students to explain difficult aspects when making comparisons (Hiebert et al., 2003; Richland, Stigler, & Holyoak, 2012). It is important to ask students to compare ideas and encourage students to think about their own and each other’s understanding (Webb et al., 2014).

Teachers need to learn to ask more open-ended, high-level questions. For example, teachers in our study sometimes asked open-ended questions such as “What did you just learn from this?” and “So what’s the general rule?” (Star, Newton, et al., 2015). The students of teachers who asked more open-ended questions emphasizing the main ideas of the comparisons showed the largest gains in procedural flexibility. While we tried to make student discussion an integral piece of our supplemental curriculum, we underestimated how difficult it would be for some teachers to effectively solicit and coordinate students’ explanations in class. Educators should create open-ended prompts to encourage comparison of multiple strategies, and they should anticipate, monitor, select, sequence, and make connections between students’ strategies and responses for effective discussions (Stein et al., 2008). Discussion prompts should promote making connections between the different strategies to direct students towards the main learning goal of that comparison type. For example, if students are explaining a Which is better? comparison, in addition to asking for similarities and differences, educators can prompt students to think about the takeaway message regarding the different strategies and when to use them. This should prompt students to think about the relative efficiency of each strategy and improve their procedural flexibility.

6 Issues for future research on comparing multiple strategies

Future research is needed to develop and rigorously evaluate curriculum materials and specific instructional techniques that effectively incorporate comparison of multiple strategies in mathematics classrooms.

6.1 Evaluating teacher-led comparison and its effect on student outcomes

Prior research has rarely experimentally evaluated the impact of using comparison during classroom instruction on student learning. Our one study that was teacher-led for an entire school year did not show an effect of our comparison curriculum on learning above “business-as-usual” practices; however, this was potentially due to the infrequency with which many teachers used the supplemental curriculum (Star, Pollack, et al., 2015). Consequently, future research should examine the potential effects of teacher-led comparison on student outcomes when comparison happens regularly throughout the school year. Under what classroom conditions might such comparison be most beneficial? For example, how often must comparison be used to improve learning over and above typical instructional practices? Does comparison help both low- and high-achieving students learn from classroom instruction? What scaffolds are needed to help low-knowledge students learn effectively from comparing multiple strategies?

6.2 Improving effectiveness of discussion when comparing multiple strategies

It is also clear from past research that discussion is an important piece of learning in mathematics classrooms. What is less clear is how to improve the effectiveness of discussions in classrooms when comparing multiple strategies. Recent work in the mathematics education and practice literatures has suggested methods for improving the quality of mathematical discussions in classrooms, such as developing teachers’ goal setting for a lesson and helping them connect those goals to planned prompts used during the discussion (Tyminski, Zambak, Drake, & Land, 2014). In the context of comparison, teachers need to build discussion prompts that match their instructional goals (e.g., improving procedural flexibility with prompts emphasizing efficiency). Additionally, teachers can improve the quality of discussions by increasing student participation in the more difficult step of making connections, and teachers should keep track of who is contributing reasoning in the class and be thoughtful about how they prompt for and respond to that reasoning (e.g., Conner, Singletary, Smith, Wagner, & Francisco, 2014). Teachers also need to ask appropriate follow-up questions that help students build on one another’s thinking (Franke et al., 2015). While these suggested methods for improving discussion quality show promise, future research will be needed to investigate whether and how these methods might be best used to improve discussion when comparing multiple strategies in mathematics classrooms.

6.3 Providing support to encourage frequent use of comparison in classrooms

As previously mentioned, a challenge in our teacher-led study was getting teachers to use our materials frequently and engage in comparison of multiple strategies on a regular basis (Star, Pollack, et al., 2015). These findings suggested that teachers needed more support and guidance to use these comparison materials regularly in their classrooms, and may need more help in determining when certain WEPs may be best used depending on students’ prior knowledge. There are several ways to provide such support that should be studied in future research (Star, Rittle-Johnson, & Durkin, 2016). Teachers may need clearer guidance on when certain kinds of comparison and WEPs may be best used in their lessons. Making certain WEPs an integral part of a lesson unit, rather than a supplement to the existing unit, may help teachers feel more comfortable because they would know that this is a recommended time to use that type of comparison. Incorporating explicit comparison of multiple strategies directly into existing lessons may be one way to increase teacher usage of comparison. Additionally, it is possible that ongoing professional development and frequent check-ins with teachers throughout the school year could help teachers feel more comfortable using comparison of multiple strategies in their classrooms and provide teachers with additional opportunities to ask questions about how to best use comparison with discussion in their classrooms. These opportunities could provide a time for teachers to ask about additional scaffolds to include in lessons to support students with lower prior knowledge. Future research should evaluate whether these supports would improve the quantity and quality of comparison in the classroom enough to justify their added resource cost.

7 Conclusion

Comparing multiple strategies can improve students’ mathematics learning, particularly if students have the appropriate prior knowledge and scaffolds to align examples and notice important structural components. Our empirical findings were some of the first to support the idea that comparing multiple strategies can benefit school-aged children in mathematics classrooms. While comparing multiple strategies has long been considered a best practice in mathematics education, little empirical work had supported this point and little research had been conducted in classroom settings. From these findings with our carefully designed materials, several recommendations emerged to leverage the benefits of structural alignment with comparison. Mathematics educators should use comparison of multiple strategies frequently, particularly after students have fluency in one strategy. One reason our teacher-led study may not have shown benefits for comparing multiple strategies is that teachers did not use the curriculum as frequently as requested. Educators should carefully select strategies to compare and design the visual presentation of these comparisons to encourage alignment and maximize attention to important structural features. Finally, educators should engage students in a discussion of not only the similarities and differences between examples, but also what those similarities and differences reveal about the benefits and constraints of the different strategies. Comparison can help students recognize important structural features and principles, and discussion of these ideas can further improve learning. Comparison of multiple strategies remains an important instructional process that can lead to learning gains in mathematics, but future research needs to be done to continue the development and evaluation of curriculum materials and techniques that can be realistically implemented by teachers to effectively incorporate comparison into their mathematics classrooms.