1 Introduction

Word-problem solving, which represents the interplay between mathematics and reality, is a staple of mathematical school tasks beginning in early grades, for several reasons. First, word problems can develop students’ understanding of the meaning of operations involved in the problem, and consequently, their proficiency with whole number arithmetic (Verschaffel et al. 2007). Second, word-problem solving promotes critical thinking skills (e.g., reasoning and analysis, argument construction), which are important for school success (e.g., Boonen et al. 2013; Depaepe et al. 2010; Hickendorff 2013).

Most theoretical models of word-problem solving suggest that it comprises two phases—problem representation/comprehension and problem solution (e.g., Riley et al. 1983; Mayer 1999). Problem representation requires understanding the text of the word problem to be able to identify and represent the problem situation (problem schemata knowledge), including specifying the relevant numerical and linguistic information and quantitative relations in the problem. Problem solution entails not only the planning and execution of the required mathematical computations (action schemata and strategic knowledge), but also aligning the solution with the original problem situation, determining the reasonableness of the mathematical outcome, and communicating the solution (Depaepe et al. 2010; Mayer and Hegarty 1996). The development of word problem solving skills and mathematical communication are considered problematic areas for many students as well as their teachers (EACEA/Eurydice 2011; OECD 2010).

There is evidence that some children, particularly those with mathematics disabilities (MD), struggle with word-problem solving despite being competent in computations required to solve word problems (Fuchs et al. 2008; Schumacher and Fuchs 2012). Most student difficulties may be due to failure to understand the problem text, in order to be able to construct a coherent representation of the problem situation (Boonen et al. 2013; Peake et al. 2015; van Garderen 2006). For students with MD, deficits in working memory capacity, processing speed, language, attentive behavior, and organizational skills are likely to interfere with their ability to “infer the correct relations between the solution-relevant elements from the text base of the word problem and integrate them into a coherent visualization of the problem situation” (Boonen et al. 2013, p. 16).

Research also indicates that many teachers in elementary grades are inadequately prepared to teach problem solving. Teachers often do not engage in a discussion of students’ possible solution strategies or have students justify their solutions (Woodward et al. 2012). Instead, conventional problem-solving strategies widely used in U.S. elementary classrooms include the key word approach (Bruun 2013; Riccomini et al. 2016), which teaches students to use a particular operation whenever a word problem contains certain words or phrases (e.g., addition is the operation whenever the question in the word problem includes “in all”). The key word approach does not promote mathematical reasoning, which is at the core of contemporary approaches to word-problem solving (Karp et al. 2014). Furthermore, this approach does not have empirical evidence to support its use (Riccomini et al. 2016). Another commonly emphasized instructional strategy involves having students draw a picture (Bruun 2013). However, many students with MD demonstrate difficulties in using or generating visual representations to express their mathematical thinking. In addition, teachers may not be aware that “visual-schematic representations should be used to support the first phase of the word problem solving process (i.e., problem comprehension) and that arithmetical representations are only appropriate in the problem solution phase” (Boonen et al. 2016, p. 60).

The purpose of the current study was to explore the effectiveness of a research-based intervention, schema-based instruction (SBI), designed to help elementary school students with MD make sense of their reasoning related to multiplicative word-problem solving. Building mathematical proficiency with whole number multiplication and division is a key focus of mathematics instruction in Grades 3–5 as reflected in their presence in the Operations and Algebraic Thinking domain in the Common Core State Standards for Mathematics [CCSS-M; National Governors Association Center for Best Practices (NGA) and Council of Chief State School Officers (CCSSO) 2010]. Mathematical knowledge involving whole number multiplication and division is integral to understanding multiplicative structures such as ratios, slope, rate of change, and proportions, which are important in subsequent mathematical study (Siemon et al. 2005).

2 Theoretical background

In the following, we examine literature on problem types involving the multiplicative structure to understand the differences among problem types and why they are important. We also review the theoretical framework for SBI in terms of the key components for improving word problem solving performance of students with MD.

Researchers and policy documents (e.g., CCSS-M) have classified multiplication and division problems based on their semantic structures (i.e., relations among quantities expressed in words) as either asymmetrical or symmetrical (Carpenter et al. 2015; Chapin and Johnson 2000). In asymmetrical multiplication problems, the role of quantities (i.e., factors) is not interchangeable, whereas quantities have interchangeable roles in symmetrical multiplication problems. Asymmetrical multiplication problems often involve equal grouping and rate situations, in which the two factors (i.e., 3 and 2 in a multiplication equation 3 × 2 = 6) refer to different entities (number of equal-size groups/number of units and the number of things in a group/unit rate) (see Greer 1992). The number of groups or units acts as a multiplier. When the number of groups/units or number in each group/unit rate is unknown, the situation results in a problem being classified as an equal groups division problem. Symmetrical multiplication problems often involve arrays where either the number of rows or number of columns can be the multiplier. The two factors also provide a decontextualized row-by-column composite to illustrate the commutative property of multiplication, thus providing the foundation for understanding the calculation of area (see Battista et al. 1998). Researchers note that helping students shift from additive to multiplicative thinking can be promoted by conceptualizing multiplication as a rectangular array (Cullen et al. 2018; Downton and Sullivan 2017).

SBI is an evidence-based practice for improving the word-problem solving performance of students struggling in mathematics (Jitendra et al. 2015a, 2016). It is a multicomponent intervention grounded in schema theory (see Marshall 1995). From schema theory, understanding arithmetic word problems involves recognizing the underlying problem (semantic) structure (e.g., change, compare). Knowledge of the semantic network structures, which consist of elements and relations between those elements, is critical to constructing a representation that is coherent and complete. The SBI approach to solving word problems is also guided by cognitive models of mathematical problem solving that focus on knowledge of procedures (e.g., problem representation, planning) for a given class of problems (see Marshall 1995; Mayer 1999). Teachers are guided to use instructional practices (e.g., guided questions to engage students in conversations about their thinking and problem solving) to help students recognize common underlying problem structures, represent problems using visual-schematic diagrams, plan how to solve problems, and solve and check the reasonableness of answers. In addition, SBI incorporates several instructional features (e.g., explicit instruction and opportunities for feedback and practice) that are known to promote problem solving for students with MD (Gersten et al. 2009). For example, through modeling and think-aloud, teachers draw students’ attention to important features of problems and aspects of proficient problem solving. Furthermore, integral to SBI is an emphasis on metacognitive strategy knowledge; students are prompted to “think about what they are doing and why they are doing it, evaluate the steps they are taking to solve the problem, and connect new concepts to what they already know” (Woodward et al. 2012, p. 17). These practices are highlighted in the What Works Clearinghouse’s (WWC) research synthesis on improving students’ mathematical problem-solving performance (Woodward et al. 2012) and the CCSS-M standards for mathematical practice (e.g., look for and make use of structure). The significance of these practices in developing students’ problem solving and mathematical reasoning skills is irrefutable (see Hulbert et al. 2017).

In this study, we leveraged our work on SBI that has focused primarily on teaching students to solve additive whole number word problems (e.g., Jitendra et al. 2013) and word problems involving ratios, proportions, and percentages (e.g., Jitendra et al. 2015b) to develop the SBI program for teaching multiplicative word-problem solving.

3 Prior intervention research on whole number multiplicative word-problem solving

The majority of research on multiplicative word-problem solving with elementary school students struggling in mathematics has been conducted by Xin and colleagues. They designed the Conceptual Model-Based Problem Solving (COMPS) program that emphasizes a mathematical model (factor × factor = product) for representing multiplicative word problems involving equal groups and multiplicative comparison. For both problem types, the COMPS instructional paradigm included a model-lead-test procedure with explicit modeling and explanation in conjunction with teacher-student interaction and on-going performance monitoring with corrective feedback. Introductory lessons used story situations without any unknowns to identify the problem type and represent the quantities and relations in each problem type (e.g., unit rate × number of units = product) using the conceptual model diagrams. Next, students were taught to use a problem solving checklist to represent and solve problems with unknowns. Word-problem story grammar questions (e.g., Which sentence or question tells about the unit rate, number of units, total or product?) cued students to focus on the three key features in each problem type.

Three single subject multiple probe across participants design studies by Xin and colleagues demonstrated largely positive results on multiplicative word problem solving measures for fourth- and fifth-grade students with learning problems after receiving the COMPS intervention (Xin 2008; Xin et al. 2008; Xin and Zhang 2009). Despite the positive outcomes for students in these single-case design (SCD) studies, findings are limited for several reasons. First, given that the intervention phase in all three studies included fewer than three data points, this phase cannot be used to demonstrate existence or lack of an effect (Kratochwill et al. 2013). Second, Xin et al. (2008) included only two baseline conditions and therefore did not establish three attempts to demonstrate an intervention effect at three different points in time, which is essential to meet the WWC SCD standards (2014). Third, the inclusion criterion for study eligibility in Xin and Zhang (2009) was a score below the 30th percentile on a standardized measure of problem solving; however, one participant performed within the average range (50th percentile), making it difficult to determine whether the participant needed remedial instruction. Fourth, maintenance of skills was measured only 4 days to 1 week after intervention, and maintenance of skills was not consistently demonstrated on maintenance measures. Last, none of the studies documented experimental control for the dependent variable (e.g., no inter-rater agreement was assessed for each case on each outcome variable).

By contrast, the next two studies employed randomized controlled trials (RCTs) to address the external validity of findings that were limited in the SCD studies. Xin et al. (2011) randomly assigned 29 students with learning disabilities or those at risk for mathematics difficulties to a control or a COMPS intervention condition. Intervention students improved significantly more than control students on an experimental word problem-solving measure at immediate posttest and on 1- to 2-week follow-up tests. On a standardized measure of mathematics problem solving, the difference between conditions was not significant.

Xin et al. (2017) extended prior work on COMPS by designing an intelligent tutor-assisted intervention program, Please Go Bring Me (PGBM), which provided the foundation for understanding whole number multiplication and division before students received word-problem solving instruction. They randomly assigned 17 students to one-on-one PGBM-COMPS tutoring or whole group control (traditional mathematics instruction) conditions. Results indicated that students in the COMPS condition outperformed students in the control condition on immediate and delayed posttests (1–2 weeks later). On a standardized measure of problem solving, only intervention students’ scores improved significantly from pretest to posttest. These RCT studies provide the basis for tentatively concluding that COMPS enhances multiplicative word problem-solving performance of students with MD. However, the sample size in each study was small; researchers did not address differential attrition across conditions; and they assessed maintenance of problem-solving effects only 1–2 weeks following the end of the intervention.

4 The present study

Given the growing evidence base for interventions such as SBI that incorporate practices (identifying the problem structure and using visual representations) articulated in contemporary approaches to word-problem solving (Boonen et al. 2013; Carpenter et al. 2015), the goal of this study was to advance this line of research for students with MD. We highlight four features of our study that merit justification. First, our intervention was remedial in nature and targeted fifth-grade students with MD, who have persistent and intractable difficulties in multiplicative word-problem solving. Evidence indicates that the multiplication skills of middle school students with MD are similar to those of typically achieving third-graders (Mabbott and Bisanz 2008) and that without instruction focused on both conceptual and procedural understanding, these students’ difficulties will persist into later years.

Second, in addition to participants in the study being district-identified as having specific learning disabilities in mathematics using the IQ-achievement discrepancy criteria, we operationalized MD as scores below the 10th percentile on a standardized mathematics achievement test. We chose this lower cut score to include only students with MD who have persistent underachievement in mathematics. This operationalization of MD is more reliable than the sole reliance on the IQ-achievement discrepancy that might result in including students with heterogeneous abilities in mathematics in general (Mazzocco and Myers 2003) and word-problem solving in particular (Tolar et al. 2016). Third, unlike Xin and colleagues, we focused on both asymmetrical (i.e., equal groups, rates) and symmetrical multiplicative word-problems (i.e., arrays). These problem types are foundational multiplicative structures emphasized in the CCSS-M in early grades that participants in this study had not yet mastered. Fourth, we used single-case research design to focus on individual differences given the severity of MD for the three students in this study.

Through the use of a multiple probe across participants design, which focuses on individual performance and provides “strong evidence of causal relations between variables” (Barlow and Nock 2009, p. 20), we aimed to answer the following four questions: (a) Is there a functional relation between the SBI intervention and multiplicative word problem-solving performance of fifth-grade students with MD? (b) Do fifth-grade students with MD maintain their improved word problem-solving performance 2–3 weeks after the end of the intervention? (c) To what extent do fifth-grade students with MD apply SBI representational strategies (e.g., drawing diagrams, writing a number sentence)? (d) To what extent do fifth-grade students with MD view SBI as beneficial in learning to solve multiplicative word problems?

5 Method

5.1 Participants

Following approval from the University Institutional Review Board, parental written consent, and student verbal assent, we recruited students from an elementary school located in a suburban area outside a major Upper Midwest city in the United States. A special education teacher nominated 10 students with specific learning disabilities in mathematics. Of these students, we selected four fifth-grade students who met the following criteria: (a) low mathematics performance as evidenced by scores below the 10th percentile on the mathematics subtest of the state assessment and (b) demonstrated competency on a multiplication facts preassessment (scores of 80% or higher), but scored 50% or lower on a multiplicative word problem-solving preassessment created by researchers. One student withdrew from the study prior to receiving the intervention due to behavioral issues. Table 1 provides a summary of student demographic information.

Table 1 Student demographic information

5.2 Setting

All sessions of this study were conducted in a resource classroom and occurred during the students’ mathematics instructional time. The classroom was partitioned into three semi-private teaching spaces. Three special education teachers provided instruction in these spaces to small groups of 2–4 students, whereas the first author provided instruction to each participating student in the larger, open space of the classroom.

5.3 Materials

5.3.1 Preassessment

Prior to participating in the study, students completed a preassessment that consisted of 20 multiplication facts (i.e., multiples of 2s through 10s) and nine word problems involving whole number multiplication and division (see Assessments). If a student scored 80% or more on the multiplication facts test and scored 50% or less on the word-problem solving test, they qualified to participate in the study.
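The screening rule above can be written as a short check. This is only an illustrative sketch: the function name and the idea of passing raw counts are assumptions made here, and the source does not say how the word-problem preassessment was converted to a percentage.

```python
def is_eligible(facts_correct, facts_total, wps_correct, wps_total):
    """Return True when both preassessment criteria described in the text are met."""
    facts_pct = 100 * facts_correct / facts_total   # multiplication facts test
    wps_pct = 100 * wps_correct / wps_total         # word-problem solving test
    # Criterion: 80% or more on facts AND 50% or less on word problems.
    return facts_pct >= 80 and wps_pct <= 50


# Hypothetical student: 17 of 20 facts and 4 of 9 word problems correct.
print(is_eligible(17, 20, 4, 9))
```

A student who knows the facts but also solves most word problems would not qualify, which reflects the intent of selecting students whose difficulty lies in problem solving rather than computation.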

5.3.2 Assessments

We developed multiple alternate forms of a word problem-solving test (WPS) that included nine one-step single- and two-digit multiplication and division word problems consisting of multiples of 2s through 10s as factors, which varied in terms of problem type (i.e., equal groups, unit rate, and array) and the unknown position in a problem (see Table 2). Baseline, intervention, and maintenance tests included the same three problem types, but were unique in that the story contexts and quantities were not repeated across or within phases.

Table 2 Sample problems on the word problem-solving tests

5.3.3 Instructional lessons

The SBI intervention program used in this study includes a 6-lesson instructional unit on solving equal groups, unit rate, and array word problems. Although the three types of word problems share the same underlying problem structure (factor × factor = product), they differ in their semantic features. Equal groups word problems include the number of groups (factor), size of a group or number of objects in one group (factor), and total number of objects (product). Unit rate word problems include the following features: number of units (factor), unit rate (factor), and total (product). Specifically, the unit rate (i.e., composed unit) involves recognizable quantities (e.g., miles per hour). Array problems represent a model for equal groups structure, in which a set of objects is arranged in a rectangular grouping (equal groups of rows and columns). Key features of array problems include the number of rows (factor), number of columns or number of objects in each row (factor), and total number of objects (product).
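To make the shared structure concrete, the following sketch models the three problem types as one factor × factor = product relation with different role labels and a single unknown. The class, the dictionary, and the example quantities are hypothetical illustrations and are not part of the SBI materials.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Role labels (factor, factor, product) paraphrased from the lesson description.
ROLES = {
    "equal groups": ("number of groups", "size of each group", "total number of objects"),
    "unit rate": ("number of units", "unit rate", "total"),
    "array": ("number of rows", "number of objects in each row", "total number of objects"),
}


@dataclass
class MultiplicativeProblem:
    problem_type: str
    factor_a: Optional[int]   # None marks the unknown quantity
    factor_b: Optional[int]
    product: Optional[int]

    def solve(self) -> Tuple[str, int]:
        """Return the role and value of the single unknown quantity."""
        roles = ROLES[self.problem_type]
        values = (self.factor_a, self.factor_b, self.product)
        if sum(v is None for v in values) != 1:
            raise ValueError("exactly one quantity must be unknown")
        if self.product is None:            # multiplication problem
            return roles[2], self.factor_a * self.factor_b
        # Division problems; assumes the product is an exact multiple of the known factor.
        if self.factor_a is None:
            return roles[0], self.product // self.factor_b
        return roles[1], self.product // self.factor_a


print(MultiplicativeProblem("equal groups", 5, 4, None).solve())
print(MultiplicativeProblem("array", None, 6, 42).solve())
```

The same `solve` logic serves all three types; only the labels attached to the quantities change, which mirrors the point that the problems differ in semantic features rather than in mathematical structure.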

The lessons in the SBI program are highly specified such that a detailed teacher guide supports teachers in implementing tasks to develop students’ problem-solving skills (see Appendix A in the online supplementary materials for a sample excerpt of script from Lesson 2 for solving an equal groups problem). Furthermore, the relations among problem types in terms of semantic variations are made explicit across lessons (see Appendix A for sample excerpts of scripts from introductory Lessons 3 and 5 focusing on the meaning of rates and arrays). Student materials include a workbook with whole number multiplicative story situations (with no unknown information) and word problems, visual schematic diagrams (one for each problem type) to help with organization of information in the word problem, and problem-solving checklists (FOPS; F—Find the problem type, O—Organize the information in the problem using a diagram, P—Plan to solve the problem, S—Solve the problem) to prompt students to monitor and reflect on the problem-solving process.

5.4 Dependent variable

The dependent variable was percentage accuracy on the WPS test, measured by the percentage of correctly solved problems in each testing session. Using an answer key, responses were scored for both a correct number sentence or equation (e.g., 5 × 4 = n) and an answer that included the numerical value and appropriate unit (n = 20 brownies), with one point awarded for each response. Partial credit (0.5 points) for the answer was possible if the numerical value was correct and the unit was not appropriate (or not included) and vice versa. A total of 18 points were possible in each testing session across nine word problems. The criterion for mastery of word-problem solving for a student was a score of at least 80% correct on the WPS test.
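The scoring rule can be sketched as follows. The function names and the session data are hypothetical, and the sketch assumes the number sentence is scored all-or-nothing, which the description implies but does not state outright.

```python
def score_problem(equation_correct, value_correct, unit_correct):
    """Score one word problem out of 2 points (equation + answer)."""
    equation_points = 1.0 if equation_correct else 0.0   # assumed all-or-nothing
    if value_correct and unit_correct:
        answer_points = 1.0
    elif value_correct or unit_correct:
        answer_points = 0.5     # partial credit: only value or only unit is right
    else:
        answer_points = 0.0
    return equation_points + answer_points


def percentage_accuracy(problems):
    """Percentage of available points earned in one testing session."""
    max_points = 2 * len(problems)          # 18 points for nine problems
    earned = sum(score_problem(*p) for p in problems)
    return 100 * earned / max_points


def meets_mastery(pct, criterion=80.0):
    return pct >= criterion


# Hypothetical session: 6 fully correct, 2 missing the unit, 1 entirely incorrect.
session = [(True, True, True)] * 6 + [(True, True, False)] * 2 + [(False, False, False)]
pct = percentage_accuracy(session)
print(round(pct, 1), meets_mastery(pct))
```

In this made-up session the student earns 15 of 18 points, which is just above the 80% mastery criterion.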

We also examined students’ written work on tests in all phases to determine the extent to which they employed representational strategies such as drawing a diagram and writing a number sentence. Percentage of each type of representation was measured by recording the number of times the specific representation was used in each phase divided by the total number of possible times. Furthermore, we examined the types of diagrams used.

5.5 Experimental design

A single subject multiple probe across participants design was used to examine the functional relation between the SBI intervention and students’ word-problem solving performance. We chose a single-case research design because the approach is methodologically well suited to investigate single cases with respect to understanding how each student with MD responds to the intervention (Gast and Ledford 2014). The implementation of the design adhered to the SCD standards for methodological rigor established by the WWC (2014). The staggered introduction of participants to the intervention is a defining feature of the multiple probe design (Gast and Ledford 2014). The study included three experimental phases of baseline, intervention, and maintenance. Following a stable baseline of problem-solving performance, one student was introduced to the SBI intervention and the other two students remained in baseline condition (see Sect. 5.6.2 Baseline), thus serving as a control for the first student. The second student was introduced to the intervention after the first participant showed an accelerating trend for the percentage of correctly solved problems in the intervention phase, whereas the third student continued in baseline. The same procedure was followed until the last student entered the intervention condition. The design ended with a maintenance phase.

5.6 Procedures

5.6.1 Testing

Test sessions followed identical procedures during the baseline, intervention, and maintenance phases. During the test sessions, students were provided with a test booklet and pencil. The test directions and items were read to the students. Students were asked to show their work for each item and instructed to write not only the number sentence or equation but also the answer (see dependent variable). All test sessions were conducted individually and no prompting or feedback was provided. Each student had the opportunity to receive a sticker as a reward at the end of each test session and select a reinforcer from a menu (e.g., gel pens) in exchange for five stickers.

5.6.2 Baseline

Participants received core mathematics instruction in their inclusive fifth-grade classroom using the district-mandated textbook, Everyday Mathematics 5 (Bell et al. 2015) and completed supplementary mathematics activities delivered by the special education teacher during their resource-room time. During baseline, each participant completed the WPS tests and did not receive any instruction related to this study. All participants completed a minimum of five baseline testing sessions.

5.6.3 Intervention

The first author implemented the lesson plans with each participant individually four times a week, with each instructional session lasting approximately 30 min. Each lesson took a minimum of two instructional sessions to complete so that students received a total of 12–14 instructional sessions. The SBI approach in this study included explicit instruction with the teacher modeling a think-aloud procedure, as well as interactive discussions with the student. The first lesson that introduced each problem type focused on the meaning of equal groups, unit rate, or arrays, which are critical to understanding and applying these concepts to solving word problems in subsequent lessons (see Appendix A). This lesson also used story situations with no unknown information with the aim of helping students understand and identify the problem structure by highlighting the key elements and the mathematical relations between these elements (factor × factor = product). Multiple representations were used to help students understand the mathematical relations between quantities in the story, with the Equal Groups Diagram illustrating how to model the mathematical equation by representing the three quantities in the diagram (see Appendix B, Fig. 1 in the online supplementary materials). Next, students learned to check whether the equation was true using a two-pan balance scale as an artefact related to the meaning of an equal sign.

Fig. 1 Percentage correct of WPS test scores across the baseline, intervention, and maintenance phases for participants

The goal of subsequent SBI lessons was applying problem-solving procedures for a given class of problems, including checks to monitor and reflect on the problem-solving process. Through discussions and questions to scaffold a solution process, students learned to do the following: (a) identify the type of problem (e.g., equal groups, array) by reading, retelling, and examining information in the problem as well as thinking about how problems within and across types are similar or different, thus connecting the problem to already solved problems, (b) represent critical information in the problem using an appropriate representation that illustrates the relations between relevant quantities in the problem, (c) determine what strategies to use to solve the mathematical equation, (d) solve and check the reasonableness of the solution as well as check whether the equation was true (see Appendix A for a sample excerpt of script for Lesson 2 for solving an equal groups problem). During the intervention phase, a WPS test was administered after students completed a lesson on solving word problems with unknowns. No testing was done at the end of the first lesson (two instructional sessions) when students learned to identify the problem type in story situations with no unknowns.

5.6.4 Maintenance

Procedures in maintenance sessions were the same as those in baseline. All students completed two testing sessions in the maintenance phase, with Luke and Sean administered the WPS tests at 2 and 3 weeks after the termination of the intervention. Dustin was administered the WPS tests 1 and 2 weeks after the final intervention phase assessment.

5.7 Interrater agreement

Two researchers collected interrater agreement (IRA) data for the scoring of WPS tests in all three phases of the study. IRA was established based on the percentage accuracy and percentage representations (diagrams and number sentences) for 30% of all baseline, intervention, and maintenance sessions. We calculated IRA using Cohen’s kappa, which adjusts for chance agreement between raters (Cohen 1960). For percentage accuracy, Cohen’s kappa was 0.90 for baseline, 0.93 for intervention, and 0.86 for maintenance. For the two types of representations, Cohen’s kappa was 1 for all three phases.
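For readers unfamiliar with the statistic, Cohen's kappa compares observed agreement with the agreement expected by chance from each rater's category frequencies. The sketch below computes it from scratch on made-up item scores; treating each scored response as the unit of agreement is an assumption, since the source does not specify it.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who scored the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(rater_a)
    # Proportion of items on which the raters gave the same score.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from the product of the raters' marginal proportions.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1:
        return 1.0   # both raters used a single identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical scores (0, 0.5, or 1 point) assigned by two raters to ten responses.
rater_1 = [1, 1, 0.5, 0, 1, 0.5, 1, 0, 1, 1]
rater_2 = [1, 1, 0.5, 0, 1, 1, 1, 0, 1, 1]
print(round(cohens_kappa(rater_1, rater_2), 2))
```

Here the raters agree on 9 of 10 items, yet kappa is lower than 0.90 because some of that agreement would be expected by chance alone.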

5.8 Treatment integrity

Treatment integrity was collected for 30% of the instructional sessions during the intervention phase. The third author assessed the instructor’s adherence to the SBI intervention using a checklist that consisted of 13 items corresponding to critical elements of SBI (e.g., identifies the problem type, discusses how the problem is similar to or different from previously solved problems) and seven items measuring general instructional behaviors (e.g., sets the purpose for the lesson, provides feedback). The mean treatment integrity was 94% (range = 80%–100%).

5.9 Social validity

Students completed a survey following the completion of the intervention to indicate their level of agreement with statements about the intervention and materials (e.g., use of diagrams, problem solving checklists) in helping them understand and solve word problems and also whether they would recommend the intervention to others and continue to use it. Students responded to eight items using a 4-point scale (1 = strongly disagree, 2 = somewhat disagree, 3 = somewhat agree, 4 = strongly agree). In addition, two open-ended questions provided an opportunity for students to report what they liked the most and least about the intervention.

5.10 Data analysis

We used visual analysis and two effect size indices to analyze the data. We visually inspected the graphed data for level and trend (Gast and Ledford 2014). For effect size measures, we calculated the percentage of nonoverlapping data (PND; Scruggs et al. 1987) and Tau-U (Parker et al. 2011) to determine the strength of the intervention. PND was calculated as the number of data points in the intervention phase that exceeded the highest data point in the baseline phase divided by the total number of data points in the intervention phase and multiplied by 100. PND scores of 90%–100% represent a very effective treatment; 70%–89% an effective treatment; 50%–69% a questionable treatment; and below 50% an ineffective treatment (Scruggs et al. 1987). Because PND does not assess trend between phases, we also computed Tau-U, a nonparametric effect size index that combines nonoverlap between two phases with intervention phase trend. Tau-U scores between 0.93 and 1 are interpreted as a large effect; 0.66–0.92 a medium effect; and 0–0.65 a small effect (Parker et al. 2011).
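The PND calculation described above is simple enough to sketch directly. The second function below computes only the basic between-phase nonoverlap component (all baseline-intervention pairs compared), not the full Tau-U with its trend term, so it should not be read as the exact statistic reported here. All names and data are hypothetical.

```python
def pnd(baseline, intervention):
    """Percentage of intervention points above the highest baseline point."""
    ceiling = max(baseline)
    exceeding = sum(x > ceiling for x in intervention)
    return 100 * exceeding / len(intervention)


def tau_nonoverlap(baseline, intervention):
    """Basic A-vs-B nonoverlap: (improving pairs - worsening pairs) / all pairs.

    Simplification: omits the trend component that Tau-U adds.
    """
    pos = sum(b > a for a in baseline for b in intervention)
    neg = sum(b < a for a in baseline for b in intervention)
    return (pos - neg) / (len(baseline) * len(intervention))


def interpret_pnd(value):
    """Bands from Scruggs et al. (1987) as summarized in the text."""
    if value >= 90:
        return "very effective"
    if value >= 70:
        return "effective"
    if value >= 50:
        return "questionable"
    return "ineffective"


# Hypothetical percentage-accuracy scores for one student.
baseline_scores = [30, 40, 50]
intervention_scores = [45, 60, 70, 80]
print(pnd(baseline_scores, intervention_scores))
print(round(tau_nonoverlap(baseline_scores, intervention_scores), 2))
print(interpret_pnd(pnd(baseline_scores, intervention_scores)))
```

With these made-up scores, one intervention point falls below the baseline maximum, so PND is 75% rather than 100%, and the pairwise nonoverlap is correspondingly below 1.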

6 Results

6.1 Mathematical problem solving performance

Figure 1 shows the effect of SBI on word problem-solving performance for the three students. All three students demonstrated markedly higher performance solving multiplicative word problems when compared to their baseline scores. Visual analysis of the graph shows a functional relationship between SBI and the percentage accuracy for solving multiplicative word problems with intervention staggered across students in a typical multiple probe design.

6.1.1 Luke

During baseline, Luke showed low levels of responding and earned an average accuracy score of 34.7% (range = 29.1%–37.5%). After exhibiting a stable baseline, Luke’s intervention scores increased above his baseline levels, illustrating an immediate change in level between conditions, and also showed an increasing trend. The average accuracy was 91.9% (range = 72.2%–100%), an increase of 57.2 percentage points from baseline. The PND between Luke’s baseline and intervention performance was 100% and Tau-U for the intervention was 1.0, CI90 [0.54, 1.45], indicating that the intervention was highly effective. Luke scored 90% of problems correct on two consecutive sessions of maintenance following intervention (2 and 3 weeks later).

6.1.2 Sean

Sean’s performance during baseline was relatively low, with a mean accuracy of 43% (range = 34.7%–55.5%). Upon beginning intervention, his scores rose above baseline levels, indicating an immediate change in level between the two conditions, and showed an increasing trend. Sean scored 88.7% average accuracy (range = 80.5%–97.2%) across five sessions of intervention, an increase of 45.7 percentage points from baseline. The PND between Sean’s baseline and intervention scores was 100% and Tau-U was 1.0, CI90 [0.58, 1.45], classifying the intervention as highly effective. Sean earned an average accuracy of 92.4% across two consecutive sessions of maintenance (2 and 3 weeks following the final intervention assessment), illustrating improved performance from his intervention sessions.

6.1.3 Dustin

During baseline, Dustin earned an average accuracy of 10.2% (range = 0%–16.6%). Following baseline, Dustin’s intervention scores increased above his baseline scores, illustrating an immediate change in level between conditions, and also showed an increasing trend. On average, his accuracy score was 72.4% (range = 40.2%–93%) during intervention, an increase of 62.2 percentage points from baseline. The PND between Dustin’s baseline and intervention scores was 100% and Tau-U was 1.0, CI90 [0.54, 1.45], indicating that the intervention was highly effective. Dustin earned an average accuracy score of 84.7% across two consecutive sessions of maintenance, illustrating improved performance from his intervention sessions.
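The baseline-to-intervention gains reported for the three students in Sect. 6.1 are differences between two mean percentages, that is, percentage points. The short Python sketch below reproduces them from the reported means. The variable and function names are illustrative only.

```python
# Mean accuracy (%) as reported in Sect. 6.1: (baseline, intervention)
reported_means = {
    "Luke": (34.7, 91.9),
    "Sean": (43.0, 88.7),
    "Dustin": (10.2, 72.4),
}


def gain_in_points(baseline_mean: float, intervention_mean: float) -> float:
    """Difference between phase means, in percentage points (one decimal)."""
    return round(intervention_mean - baseline_mean, 1)


gains = {
    name: gain_in_points(baseline_mean, intervention_mean)
    for name, (baseline_mean, intervention_mean) in reported_means.items()
}
print(gains)  # {'Luke': 57.2, 'Sean': 45.7, 'Dustin': 62.2}
```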

6.2 Representational strategy use

Table 3 displays the mean percentage of representational strategy use during baseline, intervention, and maintenance phases. Data on drawing a diagram showed an increase from an average of 0% for Luke, 4.4% (range = 0%–11%) for Sean, and 20.0% (range = 0%–44%) for Dustin during baseline to 95.6% (range = 89%–100%) for Luke, 62.2% (range = 0%–100%) for Sean, and 89.0% (range = 67%–100%) for Dustin by the end of intervention. During two consecutive sessions of maintenance following the intervention, Luke and Dustin scored 100% and 83.5% respectively, whereas Sean’s mean score decreased from intervention to 11%.

Table 3 Percentage of representational strategy use across phases by student

Data on writing a number sentence showed an average score of 100% for Luke in the three phases. Notably, only about one-third of his number sentences were accurate during baseline, compared to an average accuracy of 93% and 89% during intervention and maintenance sessions, respectively. In contrast, Sean and Dustin each earned an average score of 2.2% during baseline. Sean’s average intervention performance increased to 100%, and he scored 100% on two consecutive sessions of maintenance. Dustin’s mean intervention and maintenance scores also increased to 93.5% (range = 67%–100%) and 100%, respectively.

6.3 Social validity

Results of the social validity survey indicated that students rated the diagrams item as strongly agree or somewhat agree (M = 3.8), indicating they felt the diagrams were very helpful in organizing information and solving multiplication and division word problems. The mean rating for acceptability of the diagrams used in the intervention (i.e., continue to use the diagrams, recommend using the diagrams when teaching word-problem solving to other students) was 3.3. With regard to the problem-solving checklists, the mean rating was 3.2, indicating that they were helpful in checking students’ understanding of how to solve word problems. The mean rating for acceptability of the problem-solving checklists was 3.0.

On the open-ended questions, two students reported liking the diagrams and problem-solving checklists the most and another reported liking learning to solve word problems. Responses to what they liked the least varied from having to complete multiple WPS tests in the study, to discriminating between the size of a group and the number of groups in equal groups problems, and learning about array problems.

7 Discussion

In this study, we evaluated the effectiveness of SBI to teach whole number multiplication and division within the context of word problems. Through visual analysis and two effect size measures, we demonstrated a functional relation between the SBI intervention and the three students’ percentage accuracy scores solving multiplicative word problems. All three students also demonstrated retention of these skills (scoring approximately 85% or greater average accuracy across two sessions of maintenance) 1–3 weeks following the termination of the intervention. Moreover, percentage of representational strategy use increased over the course of the study. All students began with a limited understanding of how to use a diagram to represent the problem situation. After intervention, all students were regularly using diagrams and writing number sentences correctly to solve multiplicative word problems.

The results of this study are consistent with previous research regarding the benefits of SBI for teaching students with MD (see Jitendra et al. 2015b), and offer evidence demonstrating the value of SBI for fifth-grade students with MD in learning to solve multiplicative word problems following one-on-one instruction. In addition, our findings extend the work of Xin and colleagues (e.g., Xin et al. 2008, 2011), suggesting that SBI can produce outcomes comparable to those of the COMPS intervention for older elementary students with MD who struggle with whole number multiplication and division word-problem solving. Students in our study successfully demonstrated the ability to represent the problem situation and model the mathematics on the assessments, and maintained their problem-solving performance 1–3 weeks after the end of the intervention; these outcomes are consistent with the design principles of the SBI approach used successfully with students at risk for MD and their not at-risk peers (Jitendra et al. 2013, 2015b).

7.1 Relation to theory, research, and practice

This exploratory study provides preliminary evidence of the potentially positive effects of SBI for students with MD based on elements of best practice from mathematics education and special education. What are some possible reasons for improved problem-solving performance of students with MD? First, SBI with its focus on the problem structure required students to categorize problems into a few problem types by discerning the relevant quantities and their relations, which possibly reduced working memory load, allowing for more efficient and effective learning (Kalyuga 2009). Second, visual-schematic diagrams in SBI may have helped students organize information in the problem, with the outcome of further reducing the cognitive memory demands and enabling the learner to focus on problem solution. Evidence suggests that visual representational approaches improve mathematical problem solving (Rellensmann et al. 2017). Third, SBI promoted meaningful learning in that explicit instruction and appropriate guidance (e.g., teacher think-aloud procedures illustrating how the problem can be solved), which were provided throughout the learning process, may have enabled the learner to understand and solve multiplicative word problems. Prior research also supports the importance of making mathematical practices explicit to promote student learning (e.g., Clarke et al. 2016; Selling 2016). Integrating these mathematical practices in SBI with mathematics content (whole number multiplication and division) is also supported by the CCSS-M (NGA and CCSSO 2010).

An examination of representation data in Table 3 indicates a marked improvement in participants’ use of diagrams and number sentences. Prior to SBI, participants either did not use visual models to represent the problem situation or used a grouping model, in which multiplication is conceptualized as the joining of equal groups. However, this model was often incomplete or applied inaccurately, especially for situations involving division problems, indicating that students did not have complete understanding of the grouping model. When students modeled the mathematics using number sentences, the majority of the sentences were not accurate. Following the intervention, students showed increased use of effective and accurate visual-schematic representations. These findings are encouraging in light of past research indicating that students with MD are less likely to construct viable representations on their own (van Garderen and Montague 2003). Although all three participants maintained the strategy of writing number sentences to model the mathematics, only two students continued to use visual-schematic diagrams. One possible explanation for the substantial decrease (about 50%) in visual representations during the maintenance phase for Sean is that he internalized the problem schema so that generating a diagram was not necessary as he was able to solve the problem accurately and efficiently using the arithmetical representation. This finding was corroborated by his accurate word-problem solving average score of 92.4% in the maintenance phase. An alternative explanation is that the strategy of using a diagram when solving a problem in early grades may be perceived as inappropriate in later grades, which is a common disposition among older students with MD (van Garderen and Scheuermann 2014).

SBI encouraged students to make sense of their reasoning related to whole number arithmetic problem solving. Reasoning required understanding the quantities and their relationships in problem situations and representing the information using coherent visual-schematic diagrams, as well as “considering the units involved; attending to the meaning of quantities, not just how to compute them; and knowing and flexibly using different properties of operations” (NGA and CCSSO 2010, p. 6). In this study, the instructor provided the necessary scaffolding (e.g., explanations and prompts, visual-schematic diagrams to reduce cognitive memory load) to support these students as they independently solved word problems. This type of instruction warrants further validation as possible instructional practice for students with MD served in special education, where supplemental instructional supports for this population are not well defined (Fuchs et al. 2012).

The improved student outcomes in this study provide encouraging indications of the feasibility and potential effectiveness of teaching problem solving with SBI. Implementation of SBI in this study required relatively few resources. Although Dustin progressed more slowly than the other two participants and took 14 intervention sessions of 30 min each, all three students reached mastery (80% accuracy on WPS tests) in a relatively short period of time (four to eight sessions following SBI implementation). These findings suggest that SBI provided an effective means for students with MD to learn problem solving and that 12–14 sessions of 30 min each is a feasible goal for these students in typical school contexts. Findings also suggest that implementation of SBI does not require additional resources beyond the teacher and student materials used in the study. Furthermore, participants’ perceptions indicated a high acceptability of the SBI approach, supporting its feasibility.

7.2 Limitations and future directions

There are several limitations associated with this study that may affect its implications for research and practice. One limitation of this study has to do with the nature of the student participants and the types of problems taught. The three participants were fifth-grade students with MD, who received instruction in solving word problems involving equal groups, unit rate, and array problems due to their novice skill level in solving these problems at the start of the study. However, these problems are typically taught in third grade in the CCSS-M. Thus, the results in this study may be limited to this subgroup of students. Second, the researcher rather than the classroom teacher provided all of the instruction, which limits evidence of the effects of the intervention when implemented by other instructors. Third, this study did not document the transfer of skills to problems found in typical textbooks (multistep problems) or standardized tests. Fourth, some word problems in the study were not realistic, and the scoring of their intended solutions may have ignored realistic considerations. For example, in the problem about determining the total number of feet of wood used when making 7 shelves that are 2 feet long each, the accepted answer of 14 feet may not be realistic when considering the loss of wood that occurs with the process of sawing. Future research studies should address these limitations by ensuring that word problems and scoring of their solutions account for realistic considerations, teaching developmentally appropriate content to students with MD, replicating the positive effects in this study with several different instructors, and ensuring that students transfer skills to more complex problems and advanced mathematics. Future research should also collect data on student–teacher interactions during instruction and analyze student answers to provide insights into students’ mathematical reasoning and sense-making.

In summary, SBI is a promising approach for enhancing the word problem-solving performance of students with MD. Although this study shows promise that students with MD can learn to solve word problems, more research is needed to replicate the findings and to evaluate the intervention for other mathematical skills.