Introduction

Longitudinal evidence suggests mathematics performance upon entering formal schooling often predicts mathematics performance later in the curriculum (Duncan et al. 2007; Geary 2013; Gersten et al. 2005; Jordan et al. 2003). The National Assessment of Educational Progress (NAEP) (2018) reports that by fourth grade, only 40% of students scored at or above proficient in mathematics. By eighth grade, the number of students considered at or above proficient decreased to 34% and further declined to 23% by the end of grade 12 (NAEP 2015). Students with disabilities fared much worse. Only 16% of fourth-grade students and 9% of eighth-grade students scored at or above proficient (NAEP 2018). By 12th grade, only 6% of students scored at or above proficient (NAEP 2015). Fittingly, a significant number of adults have difficulty performing basic quantitative and problem-solving tasks that incorporate whole numbers, fractions, and simple algebra required to compete in today's economy (Geary 2013; US Department of Education 2007).

Computational Fluency

Computational fluency is a significant factor contributing to overall mathematics achievement and sustaining long-term general mathematical knowledge (e.g., Bahrick and Hall 1991; Bryant et al. 2008; Jordan et al. 2003; National Mathematics Advisory Panel (NMAP) 2008; Siegler and Shrager 1984). National initiatives note appropriate practice to build computational fluency should occur in earlier grades to support conceptual understanding, procedural fluency, and problem-solving processes later in the mathematics curriculum (NMAP 2008; National Research Council 2001). The Common Core State Standards Initiative for Mathematics (CCSS; National Governors Association Center for Best Practices and Council of Chief State School Officers 2010) recommends students fluently add two 1-digit numbers by the end of second grade and know all products of two 1-digit numbers by the end of third grade. Fluency standards continue through seventh grade, ranging from complex multidigit computations to computations with rational numbers.

Elementary students who do not meet earlier fluency standards for simple computation often rely on inferior counting strategies when attending to more complex computations (Rivera and Bryant 1992; Lin and Kubina 2005; Stocker et al. 2018a, b). For example, finger-counting to solve one-digit plus one-digit addition facts (e.g., 8 + 3 = ?) persists while solving multidigit computations (e.g., 158 + 273 = ?). Observable consequences include taking “too much time” on tasks, incomplete problems and assignments, and falling behind the pace of instruction (Biancarosa and Shanley 2016; Clarke et al. 2016). The problem further compounds when learning and computing procedures involving decimals, fractions, and integers (Rivera and Bryant 1992; Stocker et al. 2018a). By the end of middle school, students must apply standard algorithmic knowledge and procedures to engage ratios, proportional relationships, expressions and equations, and functions that support algebraic readiness (CCSSI 2010).

In a survey completed for NMAP (2008), high school algebra teachers expressed concern over nonfluent performance with simple computation and standard algorithms. The teachers recommended earlier grades place more instructional emphasis on (a) basic skills including arithmetic versus prematurely moving onto higher-level math concepts, (b) rational numbers including order of operations, positive and negative numbers, and fractions and decimals, and (c) limiting, if not eliminating, the use of calculators, especially fraction calculators (Hoffer et al. 2007). The results of the survey reflect similar concerns articulated in the precision teaching literature indicating students move too quickly to next-level concepts without reaching fluency or mastery and thus remain in the acquisition phase of learning (e.g., Binder 1996; Kubina and Yurich 2012). The implications suggest more attention to fluency should occur in elementary and middle school to better prepare students for high school algebra, postsecondary employment, and college matriculation (Adelman 2006).

Role of Fluency in Quantitative Reasoning

Quantitative reasoning refers to the capacity to analyze numerical information and make decisions on which skillsets and procedures to apply to problem solutions (National Council of Teachers of Mathematics (NCTM) 2000). And although many factors contribute to quantitative reasoning, fluent and flexible application of whole numbers plays a fundamental role in the mathematics curriculum (Cirino et al. 2016; NMAP 2008; National Research Council (NRC) 2001). At the primary level, students develop conceptual understanding with whole numbers and practice through a progression of different computing approaches such as plus one, counting on, counting backward, skip counting, missing number, and part–part–whole (NMAP 2008; Powell and Fuchs 2012). Purposeful and varied learning activities provide a deeper conceptual understanding of how numbers operate and interact (Baroody 2006; Canobi 2005; Gilmore and Papadatou-Pastou 2009).

As early as first grade, students learn the conceptual nature of inverse operations (e.g., 3 + 5 = 8; 8 − 5 = 3) and the commutative property through concrete and pictorial representations (CCSS 2010; Gilmore and Papadatou-Pastou 2009). Additional shortcut strategies (e.g., a + b − b = a) support flexible procedural knowledge and deepen conceptual understanding (Baroody et al. 2009; Powell and Fuchs 2012). Focused instruction that incorporates memorization of number bonds (e.g., fact families) plays an important role in primary and elementary mathematics curriculum (CCSS 2010). However, as students advance and computational accuracy increases, conceptual understanding of the commutative property often lags behind (Canobi 2005). Evidence suggests struggling with the inverse concept of addition and subtraction extends to multiplication and division, which in turn hinders conceptual understanding and performance in problem-solving situations and topics (Baroody et al. 2009). The inability to automatically and flexibly recall whole number combinations compounds the problem (NMAP 2008). Nonfluency and lack of conceptual understanding can have a deleterious effect on number sense (e.g., Jordan et al. 2010, 2006), estimation (e.g., Booth and Siegler 2006; Dowker 2003), number patterns and relationships (Gilmore and Papadatou-Pastou 2009), and more advanced reasoning involving proportions and rational numbers in middle school that affects algebraic readiness for high school (e.g., Hecht et al. 2003; Jordan et al. 2013).

“Real-world” consequences of mathematics illiteracy manifest in problem-solving procedures and reasoning when calculating interest on a loan, miles per gallon on a trip, and gratuity for services (Phillips 2007). The impact extends into skillsets required to navigate the complexities of personal finance and health care (Price and Ansari 2013). Research stresses that many students lose interest in careers that require mathematics competency, such as in STEM fields and health care, by the end of middle school, which coincides with the first significant drop in national mathematics scores (NAEP 2018). Citizens who do not acquire essential skills in mathematics have an increased probability of experiencing difficulty contributing to and benefiting from an advanced knowledge-based, data-driven society (Atkinson and Mayo 2011).

Fluency Intervention Research

Evidence disseminated through the Institute of Education Sciences (IES) recommends teachers provide “about 10 min each session” of fluency instruction to build computational fluency (Gersten et al. 2009, p. 38). Despite the importance of automaticity, fluency instruction is often overlooked, underutilized, or implemented ineffectively (NMAP 2008). The quality of instructional materials, appropriate leveling, and issues that relate to skill generalization have also raised concerns (Daly et al. 2007). With the significant decrease in mathematics performance occurring as students move through the mathematics curriculum and the majority of fluency research taking place at the primary and elementary levels, research suggests computational knowledge at the middle grades level should not be minimized or overlooked (Gersten et al. 2009).

Traditional fluency intervention activities typically involve timed practice where students rapidly compute from one isolated and unrelated math fact to the next (e.g., 5 + 3 = ___ then 4 + 2 = ___). Researchers critical of timed practice suggest traditional methods inhibit conceptual understanding, number sense, number flexibility, problem-solving, and quantitative reasoning (e.g., Baroody 2006; Boaler 2015). A review panel for IES (Gersten et al. 2009) reported that fluency practice activities with fact families, which simultaneously teach inverse operations (e.g., 3 × 4 = 12, 4 × 3 = 12, 12/4 = 3, and 12/3 = 4) and the commutative property, show promise in addressing critical appraisals of conventional fluency building methods. However, the studies in the review employed fact family practice as one element within an intervention package (Beirne-Smith 1991; Bryant et al. 2008; Fuchs et al. 2005, 2006; Fuchs, Seethaler, et al. 2008a, b; Woodward 2006). The panel recommended more empirical research to isolate the effect(s) fact family fluency has on mathematics achievement (Gersten et al. 2009).

In the most recent component analysis of mathematics fluency interventions, Codding, Burns, and Lukito (2011) concluded that interventions having either a drill or practice with modeling component produce the largest effect sizes. The meta-analysis also determined that interventions with three or more treatment components produce a large effect size. Practice operates most effectively when the student has multiple opportunities to respond in timed trials to increase the frequency of responding (Daly et al. 2007). As frequency of accurate responding increases, so does the level of reinforcement which increases the likelihood of maintenance and long-term retention (Binder 1996; Cooper et al. 2007; Stocker et al. 2018b; Sutherland et al. 2003).

In the precision teaching literature, frequency building refers to a systematic method of repeated practice that uses modeling, explicit timings, and immediate feedback to increase the speed and accuracy of responding to the presented stimuli (Binder 1996; Johnson and Layng 1996; Kubina and Yurich 2012). Modeling can include flashcards or other evidence-based interventions used for acquisition and accuracy. Explicit timings occur in specified time periods and yield performance measurements in digits or correct responses per minute. Immediate feedback reinforces correct responding versus incorrect responding (Daly et al. 2007; Fuchs et al. 2008a, b; Hattie and Timperley 2007; Rivera and Bryant 1992). Studies that have incorporated frequency building as the primary intervention have also reported an increase in retention and application of component math skills in new complex skills (e.g., Brady and Kubina 2010; Bullara et al. 1993; Chiesa and Robertson 2000; McTiernan et al. 2016; Stocker et al. 2018a, b; Stromgren et al. 2014).

Present Study

Considering the paucity of fluency research at the middle school level, critical appraisals of practicing math facts in isolation, and the call for examining the effects of fluency instruction with inverse operations, the present study investigated the application of frequency building with fact families to determine its effects on simple computational fluency and quantitative reasoning. We predicted that approximately ten minutes of fluency instruction as recommended by Gersten et al. (2009) that focused on repeated and rapid practice of inverse operations could ameliorate some of the learning concerns associated with the commutative property, quantitative reasoning and math facts fluency. The research questions included:

  1. What are the effects of a fact family frequency building intervention on quantitative reasoning?

  2. What are the effects of a fact family frequency building intervention on mathematics fact fluency?

Method

Context of the Study

The study occurred at a charter middle school (grades 5–8) in suburban Pennsylvania. The school staff identified fluency as an area of critical need and had not adopted a fluency instruction program. The teachers considered the proposed fact family fluency intervention as a pilot for future curricular programming. Hence, the study would assist in programmatic decision making for the following academic year. Concerns over the sacrifice of instructional time and duration of the intervention influenced the number of days of implementation and the order of teachers who wanted to participate in the investigation (Johnson et al. 2012; Long et al. 2016).

The school relied on a variety of computer applications, teacher-created materials, and older school district mathematics curriculum instead of following one specific curriculum. The school assigned students struggling or with disabilities to small class sizes for mathematics instruction (< 10 students per group). Teachers provided additional support to individual students at different time points during the day; however, the school did not establish a formal response-to-intervention framework.

Participants

The school enrolled 95 students in Fall of 2016. One eighth grade teacher declined to participate in the investigation or use the intervention which decreased the possible pool of participants to 83 students. Sixty-seven students submitted consent forms; four students left the school during the academic year. Thus, sixty-three students were included in the analysis. The participants included 35 male and 28 female students. Fifty-six students identified as Caucasian (89%), three African American (5%), two students Hispanic or Latino (3%), and one student Asian American (2%). School records identified fourteen students with either a specific learning disability (SLD; n = 8), autism spectrum disorder (ASD; n = 5), or other health impairments (n = 1). Four of the eight students with an SLD had a math disability. Another 15 students received small group instruction in regularly scheduled mathematics instruction.

Scores from Wave 1 of the Woodcock–Johnson IV Math Facts Fluency and Number Matrices subtests informed decision making on which classes received treatment first. Per research team request, the teachers independently assigned individual, intact classes (nine classes, nine math teachers) to create two similar, nonequivalent groups with comparable means and standard deviations. Group 1 (five classes) and Group 2 (four classes) consisted of 35 students and 28 students, respectively.

Group 1

Group 1 had 20 students who did not receive additional support or have a disability, four students with ASD, two students with an SLD, and nine students receiving additional support in mathematics but no disability. Group 1 had a mean age of 12 years.

Group 2

Group 2 had 14 students who did not receive additional support or have a disability, one student with ASD, six students with an SLD, one student with an other health impairment (OHI), and six students receiving additional support in mathematics but no disability. Group 2 had a mean age of 11.8 years.

Materials

The researchers adapted the intervention from the Morningside Mathematics Fluency: Math Facts curriculum (Johnson 2008). Students received a stapled packet of practice sheets and two sharpened pencils with erasers. Teachers led the activity with a digital timer. After fluency-building practice concluded, the students entered scores into a digital computer application on their laptop to self-monitor progress. Due to intermittent outages and slow speed of the Internet connection, the researchers decided to abandon the digital application midway through Wave 1 and Wave 2.

Curriculum

The Morningside Mathematics Fluency: Math Facts (Johnson 2008) curriculum uses frequency building with additive and multiplicative fact families. The curriculum consists of two sets of practice materials: one set has two volumes for additive fact families, and the other has three volumes for multiplicative fact families. Each set has 16 levels or “slices” with each slice introducing two or three new fact families at a time. The student learns the number combinations of one family (e.g., 3, 4, 12) and the four related math facts (i.e., 3 × 4 = 12, 4 × 3 = 12, 12 ÷ 4 = 3, 12 ÷ 3 = 4). Separate practice sheets allow students to (a) write and recite fact families accurately and fluently, (b) write in the missing numbers to fact family number combinations without the symbols (e.g., 3 __ 12), and (c) solve for the traditional display of the targeted math fact families. The curriculum also provides review practice sheets embedded in each slice that accumulate fact families learned from previous slices. The Morningside curriculum incorporates peer-assisted learning (e.g., Fuchs and Fuchs 2001), and each practice session lasts approximately 20 min. Two recent studies have applied the Morningside mathematics curriculum to investigate (a) learning outcomes that stem from behavioral fluency (McTiernan et al. 2016) and (b) math facts fluency, complex computation, and competing social behavior (Greene et al. 2018).
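To make the structure of a fact family concrete, the following minimal Python sketch expands two numbers into the four related facts described above. The function name `fact_family` and the string format are hypothetical and chosen only for illustration; they are not part of the Morningside materials.

```python
def fact_family(a, b, kind="multiplicative"):
    """Return the four related facts for the family built from a and b.

    Hypothetical helper for illustration only; not part of the curriculum.
    """
    if kind == "multiplicative":
        total, op, inverse_op = a * b, "×", "÷"
    elif kind == "additive":
        total, op, inverse_op = a + b, "+", "−"
    else:
        raise ValueError("kind must be 'multiplicative' or 'additive'")
    return [
        f"{a} {op} {b} = {total}",          # forward fact
        f"{b} {op} {a} = {total}",          # commuted fact
        f"{total} {inverse_op} {b} = {a}",  # inverse fact
        f"{total} {inverse_op} {a} = {b}",  # second inverse fact
    ]


for fact in fact_family(3, 4):
    print(fact)
```

For the family (3, 4, 12), the sketch lists the same four facts given in the text. A family built from a double (e.g., 2, 2, 4) yields repeated facts under this simple scheme.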

Intervention Components

The present study deviated from the Morningside protocol in that students did not engage in peer-assisted learning and applied a different structure of explicit timings to meet the feasibility of the intervention fitting in the 15 min allocated toward fluency instruction. The intervention consisted of the following evidence-based components: modeling, timed practice, feedback, and positive praise (Codding et al. 2011). The teacher-led modeling occurred at the beginning of each session. Teachers introduced a new “slice” or level that included two or three new fact families from each set every three to five days, depending on difficulty level (e.g., 2 × 2 = 4 versus 9 × 6 = 54). The researchers made the decision to omit the first two slices (× 0, × 1; + 0, + 1) to ensure practice occurred in the 50-day window with more challenging fact families in later slices, leaving 14 slices in each set for the intervention. Frequency building represented the timed practice component that occurred every intervention day. Between each timing, the students self-managed feedback and counted correct responses. The teacher provided performance feedback on accuracy. Teachers and students provided positive praise to encourage better performance on the next timing.

To investigate the effects of the intervention on quantitative reasoning, students learned fact family combinations from both additive (e.g., 2, 3, 5; 5, 3, 2) and multiplicative (e.g., 3, 4, 12; 4, 3, 12) groups to support knowledge of inverse operations. The students participated in two types of timed practice activities included in the Morningside curriculum. The first consisted of fact families in a missing number format without symbols (e.g., 3 ___ 12; __ 4 12) and the other a traditional display of math facts related to the fact families (e.g., 3 × 4 = ___; 4 × 3 = ___). Fact families and traditional display of math facts had separate practice sheets. Students practiced a combination of 72 additive (sums to 20) and multiplicative (products to 81) fact families over a 50-day intervention window that corresponded with the middle school’s academic calendar. Of the 15 min designated toward daily intervention (e.g., transitions, organizing materials, modeling), the students engaged in nine minutes of timed practice and up to three minutes of self-managed feedback to meet the daily fluency practice recommendations approximated by Gersten et al. (2009). Modeling new fact families on 14 of the 50 intervention days typically added three to five minutes. Teachers reported completing activities on most days within 15 min.

Procedures

The university institutional review board approved the study. For research purposes, the teachers sent home consent forms to parents and assent forms for students.

The lead researchers and trained research assistants, which included doctoral students in special education and educational psychology, administered the WJ-IV Number Matrices subtest to individual students in conference rooms and unused classrooms. The lead researchers and a trained special education doctoral student conducted the WJ-IV Math Facts Fluency subtest as a group independent of teachers during the first 10 min of mathematics instruction.

The teacher wrote the additive fact family (e.g., 5, 4, 9) by explicitly and concisely modeling the patterns (i.e., 5, 4, 9; 4, 5, 9; 9, 5, 4; 9, 4, 5) that mirrored additive operations (e.g., 5, 4, 9 represent 5 + 4 = 9). The group then engaged in unison responding. Students then demonstrated written accuracy (only the first day of introducing a new fact family) using the designated practice sheet included in each slice of the curriculum. After teacher-led modeling, the students engaged in timed practice with the missing number activity in a fact family pattern (e.g., 4, ___, 9; 9, ___, 5) on a practice sheet divided into columns for three 30-s timed trials. Each practice sheet had a fully worked fact family at the top of the page for students to reference during timed practice. After each 30-s timed trial, the students self-managed feedback for 20 s by checking their responses against the fully worked fact families at the top of the practice sheet and tallying correct answers.

Teachers walked around the room to provide positive praise, give feedback on accuracy, and ensure the integrity of the intervention. Students engaged in verbal praise, pats on the back, and high-fives. Students then completed two 60-s timings with mixed addition and subtraction sentences representing the target fact families (e.g., 4 + 5 =). Students again referred to the worked solution at the top of the page during practice and then proceeded to self-managed feedback and counted correct responses for 20 s. Next, the students attended to the multiplicative fact families, replicating the same process. After timed practice, the class completed two 60-s timings of mixed additive and multiplicative mathematics facts where the students did not have access to the worked solutions at the top of the page.
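The timed trials described in this section can be tallied to check that they sum to the nine minutes of daily timed practice reported for the intervention. The short Python sketch below is illustrative only: the labels and the function name `total_timed_seconds` are hypothetical, and it assumes the multiplicative sequence repeats the additive sequence exactly (three 30-s trials followed by two 60-s timings). Self-managed feedback time is not modeled.

```python
# Timed trials per session: (activity label, number of trials, seconds per trial).
# Assumption: the multiplicative sequence mirrors the additive sequence.
SESSION_TIMINGS = [
    ("additive, missing number", 3, 30),
    ("additive, traditional display", 2, 60),
    ("multiplicative, missing number", 3, 30),
    ("multiplicative, traditional display", 2, 60),
    ("mixed additive and multiplicative", 2, 60),
]


def total_timed_seconds(timings):
    """Sum trials x seconds over every timed activity in a session."""
    return sum(trials * seconds for _, trials, seconds in timings)


total = total_timed_seconds(SESSION_TIMINGS)
print(f"{total} s = {total / 60} min")  # prints "540 s = 9.0 min"
```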

Dependent Variables

The research team selected the Woodcock–Johnson IV Tests of Achievement for Mathematics (WJ-IV) (Schrank et al. 2014). The WJ-IV is a widely used achievement test nationally standardized on over 7,000 individuals ranging in age from 2 years to 90+ years (Schrank et al. 2014). The researchers employed three different forms (A, B, and C), making the WJ-IV compatible with the three waves of measurement required by the switching replications design. For the present study, the investigation collected standard scores from the WJ-IV Math Facts Fluency and Number Matrices subtests. The WJ-IV Math Facts Fluency subtest consists of a three-minute timed test. Examinees compute a series of math facts in a response booklet and complete as many as possible within the time limit. The WJ-IV Number Matrices subtest measures quantitative reasoning: the test administrator presents a matrix of numbers, and the student must identify missing numbers based on observed number patterns. Both subtests include problems that increase in difficulty.

Data Collection

To measure the effects of fact family practice on quantitative reasoning, the students needed to have exposure to a wide range of number combinations over the 50-day intervention window to engage the WJ-IV Number Matrices subtest. Thus, the researchers followed a dosage model that included daily practice with a set number of practice opportunities and timings. Each student completed the WJ-IV Math Facts Fluency and Number Matrices subtests on three separate occasions (i.e., Wave 1, Wave 2, and Wave 3).

Procedural Integrity

For procedural integrity, the first and second author trained teachers and doctoral students on the fluency intervention protocol. All trainees received scripts, procedural checklists, and intervention packets for rehearsal. Each packet included highly detailed visual prompts to avoid procedural drift. For instance, on the missing number fluency activity, the page had "3 × 30 s" and "no skipping" written at the top of the page as well as vertical arrows placed above each column to note the direction students would compute. On the traditional display fact family fluency activity, the page had "2 × 60 s" and horizontal arrows written at the top of the page, and "no skipping" printed on the bottom of the page. Student intervention packets included the same visual prompts.

The research team observed and supported each teacher for the first five days of intervention to reinforce procedural integrity using checklists. Teachers continued to independently use the procedural integrity checklist and follow the visual prompts in the intervention packet to scaffold daily implementation. During the treatment phase, the first author gathered completed materials from the teachers on a daily basis to receive feedback on implementation. Every two weeks, the first author visited and observed each classroom using the same procedural checklist. A total of ten procedural integrity checks occurred, and calculations from the checklists indicated 92% adherence to the intervention protocol.

Interscorer Agreement

Accuracy denotes the quality to which experimental values represent a precise measure of behavior that occurred during an experiment. With accuracy, researchers provide more evidence than the interobserver agreement by calculating and recording the exact values of investigational data (Johnston and Pennypacker 2009; Kostewicz et al. 2016). In the present experiment, the researchers used answer keys to correct the WJ-IV subtests. Two teachers and two evaluators checking and rechecking each assessment resulted in 100% interscorer agreement.

Experimental Design

The researchers employed a quasi-experimental, switching replications design (SRD) to compare the intervention to standard business-as-usual math practice. A two-group experiment occurs with a treatment group and a control group. After the first group receives treatment, the roles reverse with the control becoming the treatment group and the initial treatment group becoming the control. Considered a robust interrupted time-series design, each time series operates similar in length and controls for preexisting individual differences and extraneous variables through the use of within-subject analyses (Trochim and Donnelly 2008). Since one group starts without the intervention and then later receives the intervention, a between-group comparison serves as a secondary analysis (Cook et al. 2002). Intervention replication with the second group provides additional evidence of treatment efficacy. Researchers suggest a switching replications design has advantages over a randomized design by (a) controlling for threats to internal validity, (b) enhancing external and construct validity, and (c) ensuring all participants have an opportunity to receive treatment (Cook et al. 2002; Edmonds and Kennedy 2013; Trochim and Donnelly 2008).

For the present study, two pre- and post-assessment designs overlap with three waves of assessment (see Fig. 1). For Wave 1, the researchers delivered the WJ-IV Math Facts Fluency and Number Matrices subtests to all the participants to (a) establish baseline scores and (b) assign individual math classes to create two equivalent groups. Between the Wave 1 and Wave 2 assessments, Group 1 receives treatment while Group 2 does not receive treatment. Between Wave 2 and Wave 3 assessments, Group 2 receives treatment, while Group 1 does not receive treatment. After Group 1 completes the intervention cycle, Wave 2 assessment serves as a post-intervention assessment for Group 1 and a pre-intervention assessment for Group 2, the new treatment group. Wave 3 functions as a delayed post-assessment to measure retention for Group 1 and a post-intervention assessment for Group 2.
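The wave-by-group logic of the switching replications design can be summarized in a small Python sketch. The names `TREATED_BETWEEN` and `within_group_comparisons` are hypothetical, and the structure simply restates the schedule described above rather than any analysis code used in the study.

```python
# Which group receives the intervention in the interval between two waves.
TREATED_BETWEEN = {
    ("Wave 1", "Wave 2"): "Group 1",
    ("Wave 2", "Wave 3"): "Group 2",
}


def within_group_comparisons():
    """List each group's wave-to-wave comparison and its condition."""
    rows = []
    for group in ("Group 1", "Group 2"):
        for (start, end), treated_group in TREATED_BETWEEN.items():
            condition = "treatment" if treated_group == group else "no treatment"
            rows.append((group, start, end, condition))
    return rows


for row in within_group_comparisons():
    print(row)
```

Each group contributes one treatment interval and one no-treatment interval, which yields four within-group comparisons in total.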

Fig. 1 Switching replications design

Experimental Analysis

SPSS, version 26, was used for the analyses. Descriptive statistics are presented for the measures. To evaluate the impact of the intervention on math facts fluency and quantitative reasoning, the researchers used a paired-samples t test. The paired-samples t test determines whether the mean difference between paired observations differs significantly from zero. For each t test, the assumptions were examined (Meyers et al. 2006). A Welch ANOVA test was used to confirm the equality of means (Alekseyenko 2016; Jan and Shieh 2014; Ruxton 2006) per observation in Waves 1, 2, and 3. For the Math Facts Fluency subtest, the results were F (1, 49.91) = 0.48, p = 0.49 for Wave 1, F (1, 57.99) = 3.94, p = 0.052 for Wave 2, and F (1, 54.97) = 1.11, p = 0.296 for Wave 3. For the Number Matrices subtest, the results were F (1, 54.88) = 0.46, p = 0.50 for Wave 1, F (1, 44.61) = 1.17, p = 0.28 for Wave 2, and F (1, 44.88) = 0.33, p = 0.57 for Wave 3. In each case, equal variances were assumed.
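For readers unfamiliar with the Welch test, the following Python sketch computes the Welch F statistic and its degrees of freedom using only the standard library. It is not the SPSS procedure used in the study. The function name `welch_anova` is hypothetical, the scores are invented for illustration, and the p value is omitted because the standard library has no F distribution.

```python
from statistics import mean, variance


def welch_anova(*groups):
    """Return (F, df1, df2) for Welch's one-way test of equal means."""
    k = len(groups)
    sizes = [len(g) for g in groups]
    means = [mean(g) for g in groups]
    # Precision weights n_i / s_i^2 (sample variance, n - 1 denominator).
    weights = [len(g) / variance(g) for g in groups]
    weight_sum = sum(weights)
    weighted_mean = sum(w * m for w, m in zip(weights, means)) / weight_sum
    lam = sum((1 - w / weight_sum) ** 2 / (n - 1)
              for w, n in zip(weights, sizes))
    numerator = sum(w * (m - weighted_mean) ** 2
                    for w, m in zip(weights, means)) / (k - 1)
    denominator = 1 + 2 * (k - 2) / (k ** 2 - 1) * lam
    return numerator / denominator, k - 1, (k ** 2 - 1) / (3 * lam)


# Hypothetical standard scores for two groups (not data from the study).
group_1 = [90, 95, 100, 105, 110]
group_2 = [92, 98, 104, 110]
f_stat, df1, df2 = welch_anova(group_1, group_2)
print(f"F({df1}, {df2:.2f}) = {f_stat:.3f}")
```

With two groups, the first degrees of freedom equal 1 and the second are typically fractional, which matches the form of the F values reported above.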

For each paired-samples t test, the test statistic, degrees of freedom, and probability were reported (Meyers et al. 2006). An alpha level of 0.05 was used to determine statistical significance. To measure the magnitude of the difference between the two means, the effect size, the researchers used Cohen's d (Cohen 1988). Paired-samples analyses provided the following within-groups comparisons: Group 1 pre-intervention assessment versus Group 1 post-intervention assessment (treatment); Group 1 post-intervention assessment versus Group 1 delayed post-intervention assessment (no treatment, retention); Group 2 pre-assessment versus Group 2 pre-intervention assessment (no treatment); and Group 2 pre-intervention assessment versus Group 2 post-intervention assessment (treatment).
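As a rough illustration of the paired-samples analysis, the Python sketch below computes the t statistic, its degrees of freedom, and a paired-samples effect size. The function name `paired_t_and_d` and the scores are hypothetical. The effect size shown is the mean difference divided by the standard deviation of the differences, which is only one common variant of Cohen's d; the text does not state which variant the researchers computed, so this choice is an assumption.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_and_d(pre, post):
    """Return (t, df, d) for paired scores; d is mean_diff / sd_diff."""
    if len(pre) != len(post):
        raise ValueError("pre and post must contain the same students")
    diffs = [after - before for before, after in zip(pre, post)]
    n = len(diffs)
    mean_diff = mean(diffs)
    sd_diff = stdev(diffs)                 # sample SD of the differences
    t_stat = mean_diff / (sd_diff / sqrt(n))
    effect_size = mean_diff / sd_diff      # assumed variant of Cohen's d
    return t_stat, n - 1, effect_size


# Hypothetical standard scores for six students (not data from the study).
pre_scores = [88, 92, 95, 101, 85, 97]
post_scores = [93, 95, 99, 108, 86, 104]
t_stat, df, d = paired_t_and_d(pre_scores, post_scores)
print(f"t({df}) = {t_stat:.3f}, d = {d:.3f}")
```

Under this variant, t equals d multiplied by the square root of the number of pairs, so the same effect size yields a larger t as the sample grows.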

Results

Table 1 provides a summary of the paired samples t test analyses and associated descriptive statistics for the WJ-IV distal measures. Group 1 pre-intervention assessment versus Group 1 post-intervention assessment analysis produced a large effect size (d = 0.814) on the WJ-IV Math Facts Fluency subtest and a medium effect size (d = 0.578) on the WJ-IV Number Matrices subtest for quantitative reasoning. Group 1 met the assumption of normality for the WJ-IV Math Facts Fluency (p = 0.053) and Number Matrices (p = 0.641) subtests as assessed by Shapiro–Wilk test and did not display outliers upon visual inspection of boxplots.

Table 1 WJ-IV results of paired samples t tests and descriptive statistics

The Group 1 post-intervention assessment versus Group 1 delayed post-intervention assessment comparison produced statistically insignificant effect sizes on the WJ-IV Math Facts Fluency (d = − 0.147) and Number Matrices (d = 0.153) subtests, suggesting students retained gains in performance. Group 1 did not display outliers and met the assumption of normality on the WJ-IV Math Facts Fluency subtest (p = 0.881). However, Group 1 violated the assumption of normality (p = 0.009) and displayed six outliers upon visual inspection of boxplots for the WJ-IV Number Matrices subtest. Paired-samples analyses conducted with and without the outliers both produced statistically insignificant results.

Group 2 pre-assessment versus Group 2 post-assessment comparison in the nontreatment condition produced insignificant effect sizes on the WJ-IV Math Facts Fluency (d = 0.129) and Number Matrices (d = 0.106) subtests. Group 2 met the assumption of normality for the WJ-IV Math Facts Fluency (p = 0.304) and Number Matrices (p = 0.316) subtests and visual inspection of boxplots did not indicate outliers. After receiving treatment, the Group 2 pre-intervention assessment versus Group 2 post-intervention assessment yielded a high-medium effect size (d = 0.700) on the WJ-IV Math Facts Fluency subtest and a medium effect size (d = 0.523) on the WJ-IV Number Matrices subtest. Group 2 met the assumption of normality on both the WJ-IV Math Facts Fluency (p = 0.605) and Number Matrices (p = 0.127) subtests and did not display outliers.

Discussion

The investigation provides evidence that systematic fluency instruction with fact families delivered on a daily basis over a 50-day window of intervention can have a statistically significant effect on both quantitative reasoning and math facts fluency. The interrupted time series design (i.e., switching replications) demonstrated experimental control where level of performance increased contingent upon the group receiving treatment. Replication across groups strengthened confidence that a confound did not account for the effect of the intervention. The three waves of assessment permitted the researchers to collect within-subject data from 63 students at each time point. Outcomes also confirm the research team’s hypothesis that systematic practice with fact families can plausibly serve as a valid and reliable alternative to fluency instruction with isolated, unrelated math facts.

The results of the current investigation support the findings of the Codding et al. (2011) meta-analysis of fluency intervention components, in which practice with modeling and three treatment components generated significant effect sizes. Although students did start the intervention highly accurate (M = 32 DCPM and 98% correct), the teachers continued to reinforce accuracy by modeling new fact families, observing written evidence, and leading in unison responding. The lead researchers and teachers also attributed accuracy to availability of the fully worked fact families displayed at the top of the practice sheets. Members of the research team confirmed accuracy by cross-checking responses on assessments against answer keys when recording data.
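The two baseline figures reported above, rate in digits correct per minute (DCPM) and percent correct, can be derived from a single timed trial as in the sketch below. The scoring details are assumptions: the text does not describe exactly how digits were counted, and the function names and counts are hypothetical.

```python
def rate_per_minute(count, seconds):
    """Convert a raw count from a timing of any length to a per-minute rate."""
    return count * 60 / seconds


def timing_summary(digits_correct, digits_incorrect, seconds):
    """Summarize one timed trial as DCPM and percent of digits correct."""
    attempted = digits_correct + digits_incorrect
    accuracy = 100 * digits_correct / attempted if attempted else 0.0
    return {
        "dcpm": rate_per_minute(digits_correct, seconds),
        "accuracy_pct": accuracy,
    }


# Hypothetical trials (NOT data from the study).
short_trial = timing_summary(16, 0, 30)  # a 30-s timing
long_trial = timing_summary(49, 1, 60)   # a 60-s timing
```

Expressing every timing as a per-minute rate lets 30-s and 60-s trials be compared on the same scale, while the accuracy percentage separates students who are accurate but slow from those who are both accurate and fluent.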

The frequency-building intervention contained high levels of responding which in turn increased performance accuracy, speed, and retention (Codding et al. 2011; Daly et al. 2007; Stocker et al. 2018b). The researchers observed momentum and resistance to distraction through repetitive movement cycles initiated by the timed trials (Lee 2006). The Morningside curriculum strategically levels materials by introducing two or three fact families at a time, which prevented frustration and reinforced accurate and rapid responding. The practice sheets contained more problems than a student could answer before time elapsed, ensuring the intervention would not place a ceiling on student performance or halt opportunities to respond (Johnson and Layng 1996; Kubina and Yurich 2012). As a result, the students completed a total of nine minutes of uninterrupted, intensive timed practice per intervention day.

Performance feedback available during and after each timing reinforced correct versus incorrect responding (Hattie and Timperley 2007; Siegler and Shrager 1984). Self-managing feedback for 20 s between each timing allowed the students to take responsibility for their own learning versus relying on teacher mediation. Teachers reported students rarely relied on the fully worked fact families at the top of the page beyond the first days of introducing the new fact family, indicating a large number of students entered the intervention accurate but had not reached a level of fluency. Teachers also found that self-checking and scoring took less than the 20 s built in between each timing. Findings suggest the visual representation of problems and solutions involved in the self-managed feedback cycle reflects the effectiveness reported in mathematics fluency research for interventions such as cover, copy, compare and detect–practice–compare (Codding et al. 2009; Poncy et al. 2006; Stocker and Kubina 2017).

A novel approach to the fact family intervention involved the students self-monitoring speed when counting up correct responses. Feedback occurred a total of six times between the 30-s timed trials and another four times between the 60-s timings for a total of ten opportunities. Therefore, the students had six opportunities to “beat their previous score”. Students and teachers delivered verbal praise and discouraged sharing who had the “highest score”. The researchers and teachers observed a high level of motivation with students “trying to beat their last score” versus competing for attention that occurs when bragging, comparing, and ranking performances. Allowing students to focus on their own short-term achievement provided a context for continued motivation to engage content while still meeting the needs for attention and competition (Burnett 2002; Kubina and Yurich 2012). Students did express some boredom on the easier fact families (e.g., 2, 2, 4; 2, 5, 10) and wanted to move to the next slice.

Students practiced 72 fact families within an intervention window of 50 academic calendar days. The final analysis did not account for student absences and tardiness to provide an evaluation of performance realistic of a school environment. Nevertheless, statistically significant outcomes occurred on the distal measures. The authors hypothesize that guaranteeing each student received 50 daily “doses” of intervention could have plausibly increased the effect size on the WJ-IV Math Facts Fluency subtest (Burns et al. 2015; Skinner 2010).

The significant growth across a wide range of fact families and quantitative reasoning could also, in part, be attributed to the efficiency of practicing inverse relationships on the missing number practice activity, which signaled students to bring the four separate stimuli into a larger equivalence class (Cooper et al. 2007; Sidman 2000). Only nine minutes of timed practice occurred each day to increase the frequency of correct responding with additive and multiplicative families. The practice of inverse relationships may also have had a positive effect on retention. When Group 1 completed the WJ-IV subtests in Wave 3, the students decreased by 5.48 correct problems over 3 min on the Math Facts Fluency subtest and increased by 2.03 standard score points on the Number Matrices subtest. Neither change reached statistical significance, which suggests Group 1 demonstrated a high level of retention.
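To make the idea of four related stimuli concrete, the sketch below generates the facts in one family and then blanks each position in turn to form missing-number items. It is only an illustration: the layout of the actual practice sheets is not described in the text, and the function names, the `x` and `/` symbols, and the `__` blank marker are assumptions.

```python
def fact_family(a, b, op="x"):
    """Return the related facts for one family as (left, symbol, right, result).

    op="x" builds a multiplicative family with its division inverses;
    op="+" builds an additive family with its subtraction inverses.
    """
    if op == "x":
        total, inverse = a * b, "/"
    elif op == "+":
        total, inverse = a + b, "-"
    else:
        raise ValueError("op must be 'x' or '+'")
    facts = [
        (a, op, b, total),
        (b, op, a, total),
        (total, inverse, a, b),
        (total, inverse, b, a),
    ]
    # Families such as 2, 2, 4 repeat facts; keep each fact once, in order.
    return list(dict.fromkeys(facts))


def missing_number_items(a, b, op="x"):
    """Blank each position of each fact, returning (prompt, answer) pairs."""
    items = []
    for left, symbol, right, result in fact_family(a, b, op):
        for blank in range(3):
            terms = [str(left), str(right), str(result)]
            answer = int(terms[blank])
            terms[blank] = "__"
            items.append((f"{terms[0]} {symbol} {terms[1]} = {terms[2]}", answer))
    return items


# The 2, 5, 10 family mentioned in the text.
items = missing_number_items(2, 5)
```

Because every prompt in `items` is answered from the same three numbers, a student who can fill any blank is practicing the multiplication fact and its division inverse together rather than as unrelated items.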

The WJ-IV Number Matrices subtest assessed quantitative reasoning by having students solve for the missing number in a pattern within a matrix. A successful performance hinges on the ability of the student to inductively and deductively reason with numbers. Students had to apply strategic mathematics thinking and conceptual learning through seeking out patterns and relationships and apply reasoning to emit an accurate response (Schrank et al. 2014). Emphasis placed on the missing number fact family practice strategy suggests frequency building stimulated response generalization to the more complex, analogous number pattern recognition task on the Number Matrices subtest (Cooper et al. 2007). Thus, the medium effect size yielded by both groups provides growing evidence of a more direct link to quantitative reasoning for practice with fact families than for practice with isolated and unrelated math facts.

To record a correct response on the Number Matrices subtest, a student has to problem-solve for the missing pattern vertically and horizontally. Automatic recall of fact family combinations plausibly allowed students to allocate cognitive resources toward analyzing the relationship among numbers and map or project the equation to complete the analogy on the matrix versus relying on inferior counting strategies which would divert attention from problem-solving (Baddeley 2003, 2012; Donahoe and Palmer 2004; Schrank et al. 2017). Perhaps frequency building can function as a tool that simultaneously combines a number of smaller skillsets to leverage a wider range of problem-solving capabilities.

Social Validity

The researchers communicated with teachers at least twice per week to check in and receive feedback on the intervention. The teachers reported that students enjoyed participating and liked the concept of bettering their last score. Students reported to the teachers that the intervention “made them better at math” and increased their confidence. Students recognized the value of daily practice, and teachers continued the intervention in the next school year.

Since each group had a 50-day window to receive the intervention, the researchers made the decision to deliver fact families in doses. Notably, because of inclusive environments and a range of ability and achievement levels, a dosage approach may increase the appeal of the intervention at the middle school level because sharp discrepancies in student performance are less likely to become pronounced. The smaller 50-day window versus conducting a full-year intervention on math facts fluency also addressed concerns of middle school teachers who recognized fluency as a critical need but entered the investigation skeptical about whether the 15 min per day sacrifice would represent the best use of instructional time.

After engaging the intervention protocol with and without guidance over the first couple of weeks, teacher direction started to decrease and students took more control of the process. As a result, the daily pace of intervention increased. Because the intervention packets included multiple scaffolds (i.e., arrows, “no skipping”, timings) that supported procedural integrity, the teachers reported that the intervention became highly routinized and found the students anticipating the subsequent step(s) of the daily process. Teachers reported easier transitions and commented on the interest students took in mathematics. The school adopted the fact family intervention for the following school year.

Limitations and Future Research

A number of limitations warrant attention. First, the small sample did not allow for a cross-categorical analysis. Second, although the teachers made the decision on which classes would first receive intervention based on researcher request for equivalent groups, future research should include a larger randomized sample to closely evaluate the impact of the intervention on performance of students across tiered systems of support and disability groups. As with any intervention research, study replication and refinement increase the body of evidence to support the reliability and validity of the findings. From a search conducted in the literature, the current study may represent one of the first peer-reviewed investigations that isolates the effects of a fact family fluency intervention on quantitative reasoning using a group design with middle school students.

The researchers recognized the value of a school piloting a new math fluency program. Given the possible consequences of using only a novel approach (i.e., missing number fact family) that would encumber the first 15 min of math instruction over 50 days, the protocol included both missing number fact families and traditional display of fact families to increase fluency as well as quantitative reasoning. Future research would benefit from an investigation that isolates the effects of fact family practice activities versus traditional methods on fluency and quantitative reasoning.

The research team collected procedural integrity data for 20% of the intervention days and did not collect systematic data from the nontreatment group to document business as usual, as the researchers allocated resources to ensuring fidelity of treatment through regular classroom visits and accuracy of assessment in the treatment group. The two lead authors had a daily presence in the school and, through general observation and regular discussion with teachers, confirmed that the students in Group 2 did not receive fact family fluency practice in the nontreatment condition.

During the nontreatment condition, six students in one Group 2 class who either struggled in mathematics or had a disability received 10 min of computer-assisted fluency practice on most days, which may have led to treatment diffusion on the distal measures between Wave 1 and Wave 2 assessments. For example, the statistically nonsignificant increase in standard scores on the WJ-IV Math Facts Fluency subtest for Group 2 in the nontreatment condition would suggest minimal impact on its own; however, the practice may have had a delayed impact on the effect size on the distal measure in the Wave 3 assessment. To further evaluate, future research could compare performance of students between fact family practice versus computer-based practice. The current investigation did not factor in absences and tardiness of students. A natural extension to the research can include an evaluation of dosage necessary for students with various educational needs as recognized by Fuchs et al. (2017) to modulate the intensity of intervention.

Unfortunately, the intermittent outages and slow connection to the Internet did not permit an analysis of the effects of the computer application, nor did students have the opportunity to self-monitor growth with fidelity. Hence, the researchers collected social validity data through anecdotal notes from daily visits and end-of-study interviews with teachers. Future research should still consider the use of a digital charting application with the intervention because the students did express high interest when the Internet worked and then disappointment and frustration over the inability to access, add scores, and evaluate visual displays.

Conclusion

Many students find mathematics difficult, which becomes problematic considering the importance of mathematics achievement for long-term outcomes. Mathematics is progressive and hierarchical (Geary 2013), requiring that students master requisite whole-number skills before learning more advanced skills and problem-solving processes. Yet, many students with mathematics difficulties have persistent challenges with base-10 number systems (Bryant et al. 2008), and curricula do not provide enough sequential practice (Daly et al. 2007; Witzel and Riccomini 2007) to adequately support student mastery of foundational mathematics skills, such as math facts. Failure to effortlessly and accurately recall mathematics facts inhibits cognitive efficiency when engaging more complex procedures. The research addresses a fundamental area of need by evaluating a supplemental mathematics frequency-building intervention that can improve quantitative reasoning. Furthermore, evidence suggests fact family practice plausibly addresses some of the criticisms of practicing mathematics facts in isolation.

The findings from the intervention support Codding et al.'s (2011) meta-analysis indicating that practice with modeling and three treatment components effectively supports fluency. The researchers systematically tested evidence-based treatment components in a novel way to increase both fluency and quantitative reasoning. The robust findings from the study also add new information to discussions in the field about mathematics fact fluency proficiency levels and growth rates. By targeting discrete skills and providing systematic timed practice, the researchers demonstrate generalizable benefits of targeting fact family fluency, suggesting the fact family intervention has promise as a high-leverage, evidence-based practice (Cook and Cook 2013).