Introduction

The co-occurrence rate between reading difficulties and attention-deficit/hyperactivity disorder (ADHD) is approximately 15–40% (Goldston et al., 2007; Sexton et al., 2012; Willcutt et al., 2005). Students with co-occurring reading difficulties and ADHD tend to have lower reading outcomes than students with only reading difficulties and more severe behavioral difficulties than students with only ADHD (Lyon, 1996; Mayes & Calhoun, 2007). According to the Diagnostic and Statistical Manual of Mental Disorders (5th ed.; American Psychiatric Association, 2013), the criterion for ADHD is having both inattention (e.g., disorganized, easily distracted, difficulty following directions) and hyperactivity (e.g., impulsivity, difficulty waiting, talking out). In terms of the co-occurrence between ADHD and reading, the inattention subtype is more strongly associated with reading difficulties than the hyperactivity subtype (Massetti et al., 2008; McGrath et al., 2011).

Students with co-occurring reading difficulties and inattention tend to have lower pretest scores and are at a greater risk of inadequate response to intensive reading interventions than students with a reading difficulty without inattention (Cho et al., 2015; Friedman et al., 2019; MacDonald et al., 2020). Due to the negative impact of inattentive behaviors on reading performance, many researchers have called for the integration of behavioral supports into academic instruction to simultaneously address reading and inattention (i.e., low engagement; Burns et al., 2012; Kuchle & Riley-Tillman, 2019; MacDonald et al., 2020; Roberts et al., 2021). Unfortunately, reading intervention research that concurrently investigates student attention (or engagement) is limited (MacDonald et al., 2020; Roberts et al., 2020; Stewart & Austin, 2020; Tannock et al., 2018). Thus, it is not surprising that schools too often address reading and behavior difficulties separately, when it may be more efficient to address them simultaneously within a given reading intervention (Burns et al., 2012; Friedman et al., 2019; Roberts et al., 2020, 2021).

Upper Elementary Reading Intervention Research for Students with ADHD

The focus of the current study is on students with co-occurring reading difficulties and inattention. However, this section reviews the literature on reading interventions for students with co-occurring reading difficulties and ADHD for two reasons. First, students with ADHD also have inattention. Second, to the best of our knowledge, the current study is the first reading intervention study to use co-occurring reading difficulties and inattention as an inclusion criterion.

In the last decade, two group design studies investigated the impact of small group reading interventions on samples which included upper elementary students (grades 4–5) with co-occurring reading difficulties and ADHD (Tamm et al., 2017; Tannock et al., 2018). In Tamm et al. (2017), grade 3–5 students were randomized to one of three treatment conditions: (a) reading-only, (b) combined reading with parent behavioral training and medication, and (c) parent behavioral training and medication (without reading). Findings suggested that students in the reading-only and combined (reading with parent behavioral training and medication) conditions outperformed students in the parent behavioral training and medication condition (without reading) on word reading outcomes, but not on reading comprehension (Denton et al., 2020). In Tannock et al. (2018), the authors found that grade 2–5 students who received one of two different word reading treatments outperformed a cognitive training group (without reading instruction) on word reading and reading comprehension, but not on indirect teacher measures of inattention and hyperactivity. Both Tamm et al. (2017) and Tannock et al. (2018) used modified commercially available reading interventions that primarily targeted phonics instruction, and neither included a direct measure of behavior (i.e., both used only surveys). Findings from these studies suggested that reading interventions can support reading outcomes but do not lead to improved behavioral outcomes.

Additionally, five recent single-case design reading intervention studies have been conducted with upper elementary students with ADHD (Stewart & Austin, 2020; i.e., Cullen et al., 2013, 2014; Flores & Ganz, 2007, 2009; Jozwik & Douglas, 2016). Two studies by Cullen et al. (2013, 2014) implemented a computer-based reading intervention and measured reading outcomes. Cullen et al. (2013) found an increase in the percentage of words read correctly, and Cullen et al. (2014) found improvements in reading comprehension. Neither study measured behavior outcomes. Flores and Ganz (2007, 2009) and Jozwik and Douglas (2016) utilized instructor-delivered reading instruction and found an increase in comprehension and vocabulary. Across these five studies, a total of six upper-elementary students with ADHD were included, and no study included behavior support or measured behavior outcomes.

Overall, across the two group and five single-case design reading intervention studies, the groups which received the reading instruction outperformed the groups that did not receive reading instruction on at least one reading outcome (i.e., word reading, reading fluency, reading accuracy, vocabulary, reading comprehension). For behavior, the two group design studies (Tamm et al., 2017; Tannock et al., 2018) measured behavior indirectly through a survey and the single-case design studies did not measure behavior. Across the seven studies reviewed, no study tested the impact of a behavioral component embedded into the reading instruction and no study used a direct observation measure of behavior to evaluate student behavior (e.g., engagement, disruptive behavior) during reading instruction. Findings from these studies point to a need to both measure engagement and develop efficient and effective methods to improve engagement during reading instruction to support students with co-occurring reading difficulties and inattention.

Current School-Based Behavior Intervention Research to Support Engagement

Classroom-based behavioral strategies to support students' engagement can be grouped into two categories: antecedent-based (i.e., a manipulation of events before the behavior) and consequence-based (i.e., a manipulation of events that occur after the behavior). Both antecedent- and consequence-based strategies are considered critical components of classroom behavior management and have led to improvements in the engagement of students with inattention or low engagement (e.g., Collier-Meek et al., 2019; DuPaul & Weyandt, 2006; DuPaul et al., 2011; Gaastra et al., 2016; Harrison et al., 2019; Sayeski & Brown, 2011; Simonsen et al., 2015). Based on the Office of Special Education Programs’ technical assistance document on supporting and responding to student behavior (Simonsen et al., 2015) and other reviews for students with ADHD (e.g., DuPaul & Weyandt, 2006; DuPaul et al., 2011; Gaastra et al., 2016; Harrison et al., 2019; Zaheer et al., 2019), highly effective antecedent-based strategies include establishing and teaching classroom expectations with explicit instruction, reviewing the class expectations, providing students frequent opportunities to respond to instruction, and pre-correcting behaviors that do not meet expectations. Recommended highly effective consequence-based strategies include brief behavior-contingent error corrections, differential reinforcement (i.e., appropriate behaviors are reinforced while inappropriate behaviors are ignored), and behavior-specific praise (DuPaul & Weyandt, 2006; DuPaul et al., 2011; Gaastra et al., 2016; Harrison et al., 2019; Simonsen et al., 2015). In addition to the consequence-based strategies identified by the Office of Special Education Programs (Simonsen et al., 2015), positive reinforcement can also include token economies, in which the tokens (e.g., stickers, points) are later exchanged for a tangible item or a desired activity (e.g., DuPaul & Weyandt, 2006; DuPaul et al., 2011).
When token economies are used in a classroom setting, rewards can be earned through independent group contingencies (i.e., a reward is earned by individual students based on their own behavior), dependent group contingencies (i.e., a reward is earned for the group based on the behavior of an individual or subset of students in the group), or interdependent group contingencies (i.e., a reward is earned for the group based on the behavior of all students in the group; Simonsen et al., 2008).

Several studies have combined antecedent- and consequence-based strategies to support student behavior in general education (e.g., Kamps et al., 2015; Sutherland et al., 2020; Wills et al., 2018) and more restrictive settings (e.g., Harris et al., 2009; Oakes et al., 2010). However, research on embedding both antecedent- and consequence-based strategies into reading instruction is currently limited (McKenna et al., 2017, 2019; Stewart & Austin, 2020). In two related studies, Harris et al. (2009) and Oakes et al. (2010) embedded antecedent- and consequence-based behavioral supports into small group reading instruction to support students with co-occurring reading and behavioral difficulties. Both studies taught expectations to students, used a token economy with response cost, and provided a choice of reward. In both studies, oral reading fluency and phonics outcomes varied by student, and direct measures of behavior were not included.

Relative to small group instruction, more research is available on embedding the combination of antecedent- and consequence-based strategies into the general education setting. One such program with evidence of efficacy using antecedent- and consequence-based strategies is Class-wide Function-related Intervention Teams (CW-FIT; e.g., Kamps et al., 2015; Wills et al., 2009, 2018). CW-FIT includes (a) teaching specific behaviors (i.e., class expectations) over three to five sessions, (b) reviewing expectations and providing precorrections, (c) establishing group contingencies via teams of students, (d) creating point goals and delivering points (i.e., token economy), and (e) delivering rewards. Based on findings suggesting that CW-FIT leads to improved student engagement (e.g., Kamps et al., 2015; Wills et al., 2018), embedding antecedent- and consequence-based strategies, such as those presented in CW-FIT, has the potential to improve the engagement of students with co-occurring reading difficulties and inattention during reading instruction (Caldarella et al., 2018; Kamps et al., 2015).

Current Study, Purpose, and Research Questions

For upper elementary students with reading difficulties and inattention, it is critically important to identify mechanisms to improve student engagement during reading instruction (Cho et al., 2015; MacDonald et al., 2020; Roberts et al., 2021). To address this need, this study aims to improve the engagement of upper elementary students with co-occurring reading difficulties and inattention through embedding antecedent- and consequence-based behavior supports into a commercially available, evidence-based reading intervention shown to be effective at improving the reading outcomes of upper elementary students. After a careful review of all reading interventions on the What Works Clearinghouse and recently published peer-reviewed research articles for upper elementary students with reading difficulties and inattention, Voyager Passport (Voyager Sopris Learning, 2008) was selected as the evidence-based curriculum for the following reasons: (a) there was a randomized controlled trial demonstrating the effectiveness on reading comprehension outcomes for grade 4 students with reading difficulties (Wanzek et al., 2017), (b) the program could be implemented as a stand-alone supplemental intervention to the core reading instruction, (c) the program had a targeted reading comprehension component, which is particularly important for upper elementary students (e.g., Vaughn et al., 2019), and (d) the program could be implemented daily in 30–45 min sessions. To date, this was the first study to utilize Voyager Passport for students with reading difficulties and ADHD or inattention.

By comparing a baseline condition with the delivery of the evidence-based reading curriculum (Voyager Passport) to an intervention condition of Voyager Passport with embedded antecedent- and consequence-based behavior supports, this study will answer the following research question: What are the effects of integrating behavior supports into a reading intervention on student engagement relative to a reading intervention without behavior supports for fourth-grade students with co-occurring reading difficulties and inattention?

Method

Setting

This study was conducted in an urban elementary school in the Rocky Mountain region of the United States. The elementary school had approximately 600 students with 56% White, not Hispanic, 22% Hispanic, 11% Black, not Hispanic, and the remaining 11% were either multiple races, Asian, Native American, or Native Alaskan. Additionally, 12% of the students were English language learners, 36% qualified for free or reduced lunch, and 15% received special education services. The study was conducted in the period after recess in a resource room setting across the hall from the general education classroom. During the intervention, two other small groups of students were receiving small group academic support in the same room. The school principal chose this setting to allow for a quiet space to conduct the intervention.

Selection Procedure and Participants

In partnering with a local school, fourth-grade teachers were asked to nominate five students they believed to have both reading difficulties and low engagement. All five teacher-nominated students received the baseline and intervention conditions. The next sections describe screening procedures to confirm the presence of reading difficulties and inattention and provide participant information for those who met the inclusion criteria.

Screening Procedure and Measures

For the five nominated students, the presence of, or risk for, reading difficulties and inattention was confirmed through a double-gating screening procedure with the Gates-MacGinitie Reading Test (GMRT) reading comprehension subtest (MacGinitie et al., 2000) and the Strengths and Weaknesses of ADHD Symptoms and Normal Behavior (SWAN; Swanson et al., 2012), respectively. The GMRT is a timed, group-administered assessment that measures reading comprehension and targets inference making, summarization, literal understanding, and vocabulary. The SWAN (Swanson et al., 2012), completed by the English language arts (ELA) teacher, is a timed ADHD screening measure.

The double-gating procedure criteria included students having a GMRT standard score of less than or equal to 85 (16th percentile), a similar criterion to previous studies with upper elementary students with reading difficulties (e.g., Vaughn et al., 2019), and a SWAN inattentive raw score of 6 or greater (representing a likelihood of having ADHD-inattention type). Three of the five nominated students met both inclusion criteria. The remaining two students only met the criterion of having a reading difficulty.
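The double-gating rule described above can be expressed as a short check. The following Python sketch is illustrative only: the record type, field names, and example scores are hypothetical and are not the study's data or scoring software. Only the two cut points (a GMRT standard score of 85 or below and a SWAN inattentive raw score of 6 or above) come from the text.

```python
from dataclasses import dataclass

# Cut points taken from the screening description above.
GMRT_MAX_STANDARD_SCORE = 85   # reading gate: at or below this score
SWAN_MIN_INATTENTIVE_RAW = 6   # inattention gate: at or above this score


@dataclass
class ScreeningRecord:
    """Hypothetical container for one nominated student's screening scores."""
    student: str
    gmrt_standard_score: int
    swan_inattentive_raw: int


def meets_reading_gate(record: ScreeningRecord) -> bool:
    return record.gmrt_standard_score <= GMRT_MAX_STANDARD_SCORE


def meets_inattention_gate(record: ScreeningRecord) -> bool:
    return record.swan_inattentive_raw >= SWAN_MIN_INATTENTIVE_RAW


def meets_inclusion_criteria(record: ScreeningRecord) -> bool:
    """A student is included only when both gates are passed."""
    return meets_reading_gate(record) and meets_inattention_gate(record)


# Made-up scores for illustration; these are not the participants' scores.
example_both = ScreeningRecord("Student A", gmrt_standard_score=82, swan_inattentive_raw=7)
example_reading_only = ScreeningRecord("Student B", gmrt_standard_score=80, swan_inattentive_raw=3)

print(meets_inclusion_criteria(example_both))          # True
print(meets_inclusion_criteria(example_reading_only))  # False
```

Under this rule, a student like the hypothetical Student B would meet only the reading criterion, which mirrors the two nominated students who were not counted as participants.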

Participants

Participants were ten-year-old, fourth-grade students who did not receive special education services or English language services. Annette was a Black female, Elijah was a White male, and DeMarcus was a Black male. At the time of the study, no students were diagnosed with ADHD. The second behavior measure delivered at pretest, the Behavior Assessment Scale for Children, third edition, Teacher Rating Scale (BASC-3 TRS; Reynolds & Kamphaus, 2015), confirmed that the students displayed inattentive behaviors in the general education classroom setting. Additional information on each pretest measure is presented in the following section. Table 1 presents the three students’ pretest reading and behavior scores.

Table 1 Reading and behavior pretest scores

Pretest Measures

Reading

Assessment team members delivered the reading measures following a one-hour training and obtaining 100% reliability with the lead trainer on the research team. In addition to the screening measure, three reading measures were delivered at pretest. The Test of Word Reading Efficiency, Second Edition, Sight Word Efficiency subtest (TOWRE-SWE; Torgesen et al., 2012) is a 45 s, individually administered measure of word reading fluency. This measure prompts students to read a list of increasingly difficult words. The easyCBM passage reading fluency measure is a 1 min timed reading passage assessment with benchmark and progress monitoring forms (Alonzo et al., 2006). The Test of Sentence Reading Efficiency and Comprehension (TOSREC; Wagner et al., 2010) is a group-administered test of reading fluency and comprehension. Students are given three minutes to read and verify the accuracy (circling ‘yes’ or ‘no’) of as many sentences as possible.

Behavior

In addition to the SWAN screening measure, the assessment team delivered one additional behavior measure, the BASC-3 TRS (Reynolds & Kamphaus, 2015). The survey takes approximately 15 min to complete. For each scale and subscale, standard scores of 115–129 and greater than or equal to 130 meet the categorical criteria for at-risk and clinically significant, respectively. Table 1 reports the BASC-3 externalizing and internalizing behavior composite scores and ADHD and attention problems scores.

Direct Measure of Engagement

Each session was video recorded, and a direct observation measure of student engagement was coded. Student engagement was defined similarly to other single-case design studies monitoring engagement during class-wide and reading instruction (e.g., Harris et al., 2005; Wills et al., 2018) and included (a) having eyes oriented toward a given assignment or the teacher during instruction, directions, or on-topic comments or questions, (b) working on an assigned task, (c) using the materials appropriately (e.g., writing on a paper with a pencil, reading a book, opening a binder), and (d) interacting with teachers or peers about academic topics relevant to completing assignments. Using the video-recorded lessons, engagement was coded using a 10-s momentary time sampling recording system. Engagement was recorded at the end of each 10-s interval. On the coding sheet, each 10-s interval was scored with a one if the student was engaged or a zero if the student was not engaged. To calculate the percentage of time on task, the number of intervals with engagement was divided by the total number of intervals and multiplied by 100.
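The engagement calculation above reduces to a proportion of scored intervals. The Python sketch below is a minimal illustration under stated assumptions: the function name and the example codes are hypothetical, and the input is assumed to be one 0/1 code per 10-s interval as described in the text.

```python
def percent_engaged(interval_codes):
    """Return the percentage of momentary time sampling intervals coded as engaged.

    interval_codes: a sequence of 1 (engaged) or 0 (not engaged), one per 10-s interval.
    """
    if len(interval_codes) == 0:
        raise ValueError("At least one coded interval is required.")
    if any(code not in (0, 1) for code in interval_codes):
        raise ValueError("Each interval must be coded 0 or 1.")
    return 100 * sum(interval_codes) / len(interval_codes)


# Made-up codes for one minute of a session (six 10-s intervals).
example_codes = [1, 1, 0, 1, 0, 1]
print(percent_engaged(example_codes))
```

At one code every 10 s, a full 30 min session would yield 180 intervals, so each interval contributes a little over half a percentage point to the session total.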

Procedures

Design

The study used an ABAB withdrawal design to examine the extent to which a functional relation was present between the reading with behavior supports intervention condition and student engagement for three fourth-grade students with co-occurring reading difficulties and inattention. All sessions were scheduled to be delivered daily for 30 min in a small group format with five students. In the ABAB withdrawal design, the following procedures were implemented sequentially: (a) professional development for the interventionist to deliver the reading intervention in the baseline phase, (b) baseline phase, (c) professional development for the interventionist to deliver the reading intervention with behavior supports in the intervention phase, (d) three student training sessions to learn the intervention, (e) intervention phase, (f) baseline phase, and (g) intervention phase. Additionally, all baseline and intervention phases had a minimum of three data points to meet the What Works Clearinghouse Design Standards with reservations (Institute of Educational Sciences, 2020). The timing of the phase changes was based on visual analysis of engagement data to identify stability in level or trend.

Interventionist Professional Development Prior to the First Baseline Phase

The interventionist was a female graduate student enrolled in an Early Childhood Special Education program. She was hired by the research team to deliver the intervention. Outside of the intervention, she had no other relationship with the school or its students. Prior to the intervention, the interventionist was informed that she would be delivering a reading intervention with and without behavior support, and that the aim was to improve student behavior. The interventionist was unaware of the operational definition of dependent variables or the study’s single-case design methodology. Prior to baseline, the interventionist participated in a four-hour reading intervention delivery professional development. The professional development included reviewing the components of the reading intervention, modeling the delivery of the instructional components, and the interventionist delivering the lesson to a member of the intervention team with a minimum of 90% fidelity. During the baseline phases, weekly meetings to review fidelity to the reading intervention were conducted by a member of the research team.

Baseline Phases

During the baseline phases, the interventionist delivered the Voyager Passport curriculum. Lessons from Voyager Passport consisted of two parts: word study and connected text. During word study, students participated in a 2 min warm-up and advanced word study activities. The warm-up included reading and spelling practice with vowel combinations and sight words. Advanced word study focused on concepts that included prefixes, suffixes, compound words, root words, antonyms, and synonyms. The connected text component included previewing text, introducing vocabulary, engaging in repeated readings at students’ instructional levels, and utilizing different formats to check for understanding. Throughout the connected text activities, vocabulary and comprehension strategies were explicitly taught and practiced. This program included common research-based practices designed to support active student learning in the reading lesson, such as discussing questions, utilizing graphic organizers, and making connections both orally and through written practice. During the baseline phase, the active learning supports remained in place to allow for comparison between the intervention phase and typical small group reading instruction for students with reading difficulties and inattention.

Interventionist Professional Development Prior to the Intervention Phase

Prior to the intervention phase, the interventionist received a two-hour professional development on how to embed behavior supports into the reading instruction delivered in the baseline phase. The behavior supports included evidence-based antecedent- and consequence-based classroom management strategies identified as effective for students with ADHD or low engagement (DuPaul et al., 2011; Evans et al., 2014; Harrison et al., 2019; Simonsen et al., 2015; Wills et al., 2009, 2018). The four behavior supports (more thoroughly described in the intervention phase section) included (a) identifying, teaching, and reviewing group rules, (b) behavior specific praise and precorrections (i.e., a reminder of an expected behavior before the behavior should occur), (c) token economy (i.e., awarding points contingent on appropriate behavior), and (d) point goals with a reward for obtaining the point goal. The professional development included modeling the expected interventionist behavior support practices, guided practice, and independent demonstration of mastery through the delivery of one lesson to a member of the intervention team with 90% or greater fidelity. During the intervention phase, a member of the research team facilitated weekly meetings to review fidelity to the reading and behavior intervention procedures.

Student Training Prior to the First Intervention Phase

Following the first baseline phase and prior to the first intervention phase, all students received three sessions that began with a 10 min training on the behavior support intervention. The purpose of the training sessions was to teach the group rules and token economy system (further described in the intervention phase section). Each of the three trainings aligned to CW-FIT-based procedures for introducing classroom rules (e.g., Kamps et al., 2015; Sutherland et al., 2020; Wills et al., 2018). At the beginning of each training session, one new group rule was introduced using the following sequence: (a) post the new rule of the day (and previously taught rules) in a visible location, (b) introduce the new group rule, (c) describe why the rule is important (i.e., rationale), (d) provide opportunities to role-play the rule, and (e) review the previously taught rules (second and third training sessions only). The following rules (also from CW-FIT) were introduced sequentially across the three training sessions: raise your hand to get the teacher’s attention, follow directions the first time, and ignore peers’ inappropriate behaviors.

Following each of the three 10 min trainings, students completed modified lower-demand reading tasks (e.g., respond to read-alouds, read a story with a partner, draw a picture about a story) to allow more opportunities to access positive feedback, points, and the reinforcer. These reading tasks did not follow the reading lesson design during the baseline or intervention conditions. Therefore, student engagement is not reported during the training sessions. There was not a student mastery criterion required prior to beginning the intervention phase, although skills were reviewed daily in the intervention phase sessions. Procedural fidelity of the training sessions, as delivered by the interventionist, is reported in the procedural fidelity section.

Intervention Phase

During each intervention phase session, the following five-step routine was implemented, based on procedures outlined in CW-FIT (e.g., Kamps et al., 2015; Sutherland et al., 2020; Wills et al., 2018) for reviewing expectations and reinforcing appropriate behavior. First, upon student entry to the group, a 3 min timer was set and started. When the timer sounded, points were awarded to individual students who were following all group rules at that moment, followed by behavior-specific praise. If a student did not earn a point, a precorrection was given prior to restarting the timer. The timer was then reset and started again. Three-minute intervals have been shown to be an acceptable interval for delivery of points in both general education and special education classroom settings (e.g., Kamps et al., 2015; Orr et al., 2020). Second, group rules were reviewed. Third, the interventionist established a point goal for each student. Fourth, the reading lesson began. Finally, at the conclusion of the reading lesson, points were tallied and the students received a reward if they met their point goal. Throughout the lesson, behavior-specific praise and precorrections were delivered.

Additionally, to fit the context of the small group reading setting and support student behavior and buy-in, in this study, students discussed and agreed upon the point goal with the interventionist at the beginning of the lesson. Furthermore, an independent group contingency reward system (as compared to a group contingency in CW-FIT) was implemented (i.e., a reward is earned by individual students based on their own behavior). Therefore, points were earned on an individual basis. Students who met their individual point goal at the end of the lesson engaged in a 3 min game with the instructor (e.g., Uno, Go Fish) until the session ended. Students who did not meet their goal continued with independent reading work.
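The point tally and reward decision under an independent contingency can be sketched as follows. This Python example is a hypothetical illustration, not the study's materials: the function names, student labels, timer-check records, and point goals are invented, and it assumes that a goal is met when points earned are equal to or greater than the goal.

```python
def points_earned(timer_checks):
    """Count the timer checks at which the student was following all group rules.

    timer_checks: a sequence of booleans, one per sounding of the 3 min timer.
    """
    return sum(1 for followed_rules in timer_checks if followed_rules)


def earns_reward(timer_checks, point_goal):
    """Independent contingency: the decision depends only on this student's own points.

    Assumption: meeting the goal means points earned >= point goal.
    """
    return points_earned(timer_checks) >= point_goal


def session_rewards(checks_by_student, goal_by_student):
    """Return a reward decision for each student, evaluated separately."""
    return {
        student: earns_reward(checks, goal_by_student[student])
        for student, checks in checks_by_student.items()
    }


# Made-up records for one session with eight timer checks per student.
example_checks = {
    "Student A": [True, True, False, True, True, True, True, True],
    "Student B": [True, False, False, True, False, True, False, True],
}
example_goals = {"Student A": 6, "Student B": 6}

print(session_rewards(example_checks, example_goals))
```

Because each student is evaluated against only their own goal, one student missing the goal does not change another student's access to the reward, which is the distinction from an interdependent contingency.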

Procedural Fidelity

Research team members evaluated procedural fidelity (Ledford & Gast, 2018) on all sessions using direct observation methods (Lane et al., 2004) from video-recorded sessions. The procedural fidelity protocol measured the extent to which the behavior supports were present during the baseline and intervention phases. The expectation was low behavior support procedural fidelity during the baseline phases and high procedural fidelity during the intervention phases. The behavior support procedural fidelity protocol had nine components, each scored as 0 (not present) or 1 (present). The behavior support procedural fidelity checklist, modified from Wills et al. (2018), included the following components: (a) class expectations were posted and reviewed, (b) points were in sight of students, (c) point goals were discussed and posted, (d) timer was used with 3 min intervals, (e) points were delivered when the timer sounded, and (f) points were calculated and desired activity was provided (when applicable) at the end of the session. To calculate procedural fidelity, the number of points earned was divided by the total possible points and multiplied by 100. Behavior support procedural fidelity for all baseline sessions was 0% (SD = 0%, range 0–0%). Behavior support procedural fidelity across all intervention sessions was 100% (SD = 0%, range 100–100%).

The procedural fidelity of the training sessions was also measured using a 15-component procedural fidelity protocol form. Identical to the procedural fidelity ratings during the baseline and intervention sessions, each component was scored as a 0 (not present) or 1 (present). The procedural fidelity across all three training sessions averaged 82% (SD = 3%, range 80–87%).

Inter-Observer Agreement

Graduate students were trained to reliably code engagement and procedural fidelity. Coders were blind to the study purpose. Two teams of coders (i.e., engagement coding team, procedural fidelity coding team) participated in separate one-hour trainings. In the engagement coding training session, coders reviewed the basics of behavior coding, identified examples and non-examples of engagement based on the operational definition provided, and independently reached an IOA of 85% or above with the lead IOA trainer. In the behavior support procedural fidelity training session, coders (a) reviewed the fidelity form items, (b) identified examples of not present, partially present, or fully present for each item, and (c) independently reached an IOA of 85% or above with the lead IOA trainer. Interval-by-interval comparisons were used to calculate IOA by summing the number of intervals with agreements, dividing the sum by the total number of intervals (i.e., agreements plus disagreements), and converting the result to a percentage. All engagement and procedural fidelity IOA data were collected on one session per phase.
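The interval-by-interval IOA calculation described above can be written in a few lines. The Python sketch below is illustrative: the function name and the two coders' records are hypothetical, and both coders are assumed to have scored the same intervals in the same order.

```python
def interval_by_interval_ioa(primary_codes, secondary_codes):
    """Return interval-by-interval agreement as a percentage.

    Agreements are intervals where both coders recorded the same value.
    The denominator is agreements plus disagreements, i.e., all intervals.
    """
    if len(primary_codes) != len(secondary_codes):
        raise ValueError("Both coders must score the same number of intervals.")
    if len(primary_codes) == 0:
        raise ValueError("At least one interval is required.")
    agreements = sum(1 for a, b in zip(primary_codes, secondary_codes) if a == b)
    return 100 * agreements / len(primary_codes)


# Made-up codes from two coders for ten 10-s intervals.
primary = [1, 1, 0, 1, 1, 0, 1, 1, 1, 0]
secondary = [1, 1, 0, 1, 0, 0, 1, 1, 1, 1]
print(interval_by_interval_ioa(primary, secondary))  # 80.0
```

In this made-up record the coders disagree on two of ten intervals, which would fall below the 85% training criterion stated in the text.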

An average of 29% (SD = 8%, range 20–50%) of the sessions per phase per student were coded for engagement and procedural fidelity IOA. The average engagement IOA across all students and phases was 86% (SD = 4%, range 81–93%). Table 2 disaggregates the engagement IOA by student and phase. Procedural fidelity IOA on all baseline and intervention sessions was 100% (SD = 0%, range 100–100%). Procedural fidelity IOA for the single training phase session on which it was collected was 93%.

Table 2 Engagement and IOA data

Analysis

Results were analyzed using visual analysis of within- and across-phase characteristics, based on the What Works Clearinghouse Standards Handbook (Institute of Educational Sciences, 2017), along with the percentage of nonoverlapping data (PND; Scruggs et al., 1987) and Tau-U effect sizes (Parker et al., 2011). Visual analysis of within-phase characteristics included the level (i.e., mean), trend line (i.e., slope), and variability of data from the trend line. Visual analysis of across-phase characteristics included immediacy of effect and the extent to which data overlapped across phases. Overlap was measured using two effect size calculations: PND (Scruggs et al., 1987) and Tau-U (Parker et al., 2011). PND is represented as a percentage from 0 to 100%, with a larger percentage indicating a larger effect. PND is best used for stable data without a trend or outliers (Vannest & Ninci, 2015). To calculate PND for the intervention phases, the number of intervention phase data points that were greater than the largest data point in the preceding baseline phase was divided by the number of data points in the intervention phase. An identical process measured PND in the second baseline phase, except that the number of data points in the second baseline phase that were less than the lowest data point in the first intervention phase was divided by the number of data points in the second baseline phase. Tau-U (Parker et al., 2011) was chosen as the second nonoverlap effect size because, unlike PND, Tau-U accounts for and adjusts for within-phase trends and is well-suited for handling small data sets (Vannest & Ninci, 2015). Tau-U has a widely agreed-upon effect size categorization of small (0.20 or less), moderate (0.21–0.59), and large (0.60 or greater; Vannest & Ninci, 2015; Harrison et al., 2019; Stewart & Austin, 2020).
Another benefit of Tau-U is that it is commonly used in syntheses and meta-analyses (e.g., Harrison et al., 2019; Stewart & Austin, 2020), which allows an effect size to be interpreted in context. In the current study, effect sizes were compared against a Tau-U effect size of 0.67, which represents the mean behavior intervention effect size from a highly relevant meta-analysis of single-case design classroom-based interventions for students with ADHD (Harrison et al., 2019). Tau-U was calculated with an online Tau-U calculator (Pustejovsky, 2017), which compared the baseline and intervention phases for each case (i.e., student) and controlled for the trend in each phase (see Parker et al., 2011; Vannest & Ninci, 2015). To obtain a Tau-U effect size for each case, the data from the baseline phases were combined and compared with the combined data from the intervention phases.
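The PND and Tau-U calculations described above can be sketched in a few lines of Python. This is a minimal illustration and not the online calculator used in the study: the function names are hypothetical, the sample values are invented for demonstration and are not study data, and the Tau-U function assumes the baseline-trend-corrected form (S_AB - S_A) / (n_A * n_B), which may differ in detail from the trend adjustment the calculator applies.

```python
from itertools import combinations, product


def pnd_increase(baseline, intervention):
    """Percentage of intervention points above the highest baseline point."""
    ceiling = max(baseline)
    return 100 * sum(x > ceiling for x in intervention) / len(intervention)


def pnd_decrease(intervention, second_baseline):
    """Percentage of second-baseline points below the lowest intervention point."""
    floor = min(intervention)
    return 100 * sum(x < floor for x in second_baseline) / len(second_baseline)


def kendall_s(pairs):
    """Number of increasing pairs minus number of decreasing pairs (ties count 0)."""
    return sum((later > earlier) - (later < earlier) for earlier, later in pairs)


def tau_u(baseline, intervention):
    """Assumed form: (S_AB - S_A) / (n_A * n_B), correcting for baseline trend."""
    s_ab = kendall_s(product(baseline, intervention))  # every baseline-vs-intervention pair
    s_a = kendall_s(combinations(baseline, 2))         # time-ordered pairs within baseline
    return (s_ab - s_a) / (len(baseline) * len(intervention))


# Hypothetical engagement percentages, NOT data from this study.
baseline_1 = [70, 75, 72, 80]
intervention_1 = [74, 85, 90, 88]
baseline_2 = [78, 76, 73, 75, 70]

print(pnd_increase(baseline_1, intervention_1))
print(pnd_decrease(intervention_1, baseline_2))
print(tau_u(baseline_1, intervention_1))
```

With these invented values, three of the four intervention points exceed the highest baseline point (PND = 75%), two of the five second-baseline points fall below the lowest intervention point (PND = 40%), and the rising baseline reduces Tau-U from 0.75 (uncorrected) to 0.50.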

Social Validity

The social validity survey was aligned to Wolf’s (1978) social validity factors, including goal importance, intervention procedures acceptability, and outcome importance. Students responded to statements on a 5-point Likert-type scale ranging from strongly disagree (score = 1) to strongly agree (score = 5). Students responded to the following statements: (a) improving my reading skills is important to me, (b) I enjoy coming to this reading group, (c) this reading group is helping me improve my reading skills, and (d) I will be able to use the strategies I am learning in this reading group when I read on my own. The interventionist responded to the following statements on a 7-point Likert-type scale ranging from strongly disagree (score = 1) to strongly agree (score = 7): (a) the intervention phase was easy to implement, (b) the intervention training prepared me to fully implement the intervention, (c) the intervention was effective at improving reading comprehension, and (d) the intervention was effective at improving student behavior. The students and interventionist completed the social validity survey at the conclusion of the study.

Results

Table 2 presents the mean engagement, standard deviation, PND, Tau-U, and the engagement IOA outcomes for each student and phase. Figures 1, 2, and 3 visually display the percentage of intervals of engagement for each session for Annette, Elijah, and DeMarcus, respectively. The following sections provide and describe the visual analysis, effect sizes, and social validity data.

Fig. 1 Percentage of engaged time for Annette

Fig. 2 Percentage of engaged time for Elijah

Fig. 3 Percentage of engaged time for DeMarcus

For Annette, in the first baseline, engagement began at 88% followed by a downward trend to 67%. In the first session of the first intervention phase, engagement remained at 67%, followed by a positive trend and stabilization at 90% for the last two sessions. When the intervention was withdrawn, engagement decreased to 75%, with a stable, slightly downward trend to 72% in the final session of this phase. After the re-introduction of the intervention, engagement increased to 95% with a gradual downward trend to 83%. The PND was 75%, 20%, and 100% during the first intervention, second baseline, and second intervention phases, respectively. Variability around the trend line in all phases was slight, except for the first intervention phase’s first data point and second baseline phase’s third data point. The Tau-U effect size was large at 0.83.

For Elijah, during the first baseline, engagement began at 70% followed by a downward trend to 41%. In the first intervention phase, engagement began at 69%, decreased to 56%, and concluded with an upward trend to 87%. When the intervention was withdrawn, engagement decreased to 64% and continued to decrease to 31% and 29%, before increasing and stabilizing at 55% and 58%. The second baseline also had high variability around a slightly downward trend. When the intervention was re-introduced, engagement increased to 77% with a flat slope and minimal variability. PND was 25%, 40%, and 100% during the first intervention, second baseline, and second intervention phases, respectively. The Tau-U effect size was large at 0.93.

For DeMarcus, during the first baseline, engagement began high at 88% and had a downward trend to 73%. During the first intervention phase, engagement increased to 95% and remained high with a flat trend line and minimal variability. When the intervention was withdrawn, engagement decreased to 88% and had a slightly downward trend to 82%, with minimal variability. When the intervention was re-introduced, engagement decreased to 74% and then 48% (both of which overlapped with Baseline 2 data), followed by an increase in engagement to 93% for the last three sessions. The PND was 100%, 80%, and 60% during the first intervention, second baseline, and second intervention phases, respectively. For DeMarcus, the second intervention phase had the smallest PND effect size, and the second session of this phase had the lowest engagement of any session. Further analysis of the second session of the second intervention phase showed that DeMarcus was engaged for 16% of the time during the first 8 min and 20 s and then re-engaged with the group for 97% of the remaining intervals. It was unclear why DeMarcus displayed initial low engagement followed by high engagement. The Tau-U effect size was moderate at 0.54.

Social Validity

Based on a 5-point Likert-type scale, student social validity outcomes were as follows: “improving my reading skills is important to me” averaged 4.67 (SD = 0.58, range 4–5), “I enjoy coming to this reading group” averaged 5 (SD = 0, range 5–5), “this reading group is helping me improve my reading skills” averaged 4 (SD = 0, range 4–4), and “I will be able to use the strategies I am learning in this reading group when I read on my own” averaged 4.67 (SD = 0.58, range 4–5). Across all items and students, student social validity averaged 4.58 (SD = 0.51, range 4–5). Based on a 7-point Likert-type scale, the interventionist somewhat agreed (i.e., 5) that the intervention phase was easy to implement. The interventionist agreed (i.e., 6) that the intervention training prepared them to fully implement the intervention, that the duration and frequency of the intervention were appropriate, and that the intervention was effective at improving reading comprehension. Finally, the interventionist strongly agreed (i.e., 7) that the intervention was effective at improving student behavior. Across all items, the overall interventionist social validity averaged 6 (SD = 0.71, range 5–7).
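The overall interventionist summary can be checked directly from the five item ratings reported above (5, 6, 6, 6, 7). The short sketch below assumes the reported SD is the sample standard deviation (n - 1 denominator); the text does not state which formula was used, so that choice is an assumption.

```python
from statistics import mean, stdev

# Interventionist item ratings as reported in the text (7-point scale).
ratings = [5, 6, 6, 6, 7]

overall_mean = mean(ratings)           # arithmetic mean across the five items
overall_sd = round(stdev(ratings), 2)  # sample SD (n - 1 denominator), an assumption

print(overall_mean, overall_sd, min(ratings), max(ratings))
```

Under that assumption, the ratings give a mean of 6 and an SD of 0.71 with a range of 5 to 7, matching the values reported.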

Discussion

Engagement is a critical component of student learning. Therefore, it is not surprising that students who display co-occurring reading difficulties and inattention tend to make smaller gains from reading instruction than students with reading difficulties without inattention. Even though previous reading intervention research for upper elementary students with co-occurring reading difficulties and ADHD found that reading interventions can improve reading outcomes (e.g., reading fluency, vocabulary, reading comprehension; Jozwik & Douglas, 2016; Roberts et al., 2020; Tamm et al., 2017), current evidence does not suggest that research-based reading interventions lead to a collateral improvement in attention. Given that inattention is a key predictor of inadequate response to reading interventions for upper elementary students with reading difficulties (MacDonald et al., 2020), this study builds on previous reading intervention research by embedding antecedent- and consequence-based behavioral supports found to be effective at improving student engagement (e.g., DuPaul et al., 2011; Gaastra et al., 2016; Kamps et al., 2015) into an evidence-based reading curriculum.

Study findings from visual analysis suggest that the level, slope, and variability around the slope were indicative of a functional relation between the intervention and engagement for Annette and Elijah. Furthermore, for these two students, the second intervention phase had the largest PND effect size of any phase at 100%, due to the flat trend lines with high engagement in the final intervention phase. Because ascending and descending trend lines can lower PND effect sizes, PND effect sizes for these two students were reduced in the first intervention phase and second baseline phase. Finally, for both of these students, Tau-U engagement effect sizes were large and greater than the 0.67 mean effect size found in school-based behavior interventions for students with ADHD (Harrison et al., 2019). Findings for DeMarcus did not support a functional relation, in large part because of the high engagement during baseline phases. For DeMarcus, the Tau-U effect size was moderate and less than the school-based behavior intervention mean effect presented in Harrison et al. (2019). It is possible that the small group instruction and the active learning supports built into the reading curriculum were sufficient to support DeMarcus’s engagement during the baseline phase. This hypothesis could explain why the ELA teacher experienced higher rates of off-task behavior in the general education setting than what was observed during all phases of the study.

All components of the behavior support were fully present for all intervention sessions and not present during the baseline sessions. Findings from the behavior support procedural fidelity suggest that the behavior intervention was implemented with fidelity. Finally, social validity data indicated that the students and the interventionist either agreed or strongly agreed that the intervention was important and acceptable. The interventionist also somewhat agreed that the intervention was easy to implement.

Limitations and Future Research

This study has several limitations. The first limitation was that the screening measure for inattentive behaviors was based on the general education ELA teachers’ responses on the SWAN. This may have led to some students, such as DeMarcus, being included based on displaying inattentive behaviors in a larger general education setting, but not in a small group reading setting. Unfortunately, direct observation measures of student behavior during small group reading were not possible as part of the double-gating procedure. Therefore, a commonly used indirect measure of inattention (i.e., SWAN; e.g., MacDonald et al., 2020) was used to screen for inattention. Future researchers could consider how to use direct measures of engagement as part of a gating procedure for study participation, to ensure that identified students display inattentive behaviors during small group reading instruction, particularly when a reading program embeds practices that support active student learning. Second, reading outcomes were not measured in this study. Given its brevity, this intervention was unlikely to have led to an increase in reading outcomes. However, considering that our goal in increasing engagement was to improve reading outcomes, future research with longer durations could consider measuring reading outcomes in addition to engagement outcomes. Third, the procedural fidelity was not at 100% during the student training sessions (M = 82%, SD = 3%, range 80–86%). This finding suggests a need for additional professional development and coaching before, and possibly during, the training sessions. Fourth, even though this study had an inclusion criterion for students with reading difficulties and inattention, it is worth noting that, in addition to screening for inattention, the SWAN identified the students as being at risk for hyperactivity (and thus at risk for ADHD).
Furthermore, BASC-3 data suggested that all students presented hyperactive and externalizing behaviors in the classroom. Therefore, this study’s sample represented students with or at risk for co-occurring reading difficulties, ADHD, and externalizing behaviors. It was not unexpected that the sample of students had co-occurring hyperactive and externalizing behavior, as students with reading and behavior difficulties often have more than one co-occurring behavior (Lyon, 1996; Mayes & Calhoun, 2007). The final limitation was that student social validity focused solely on the students’ perceptions of the reading intervention as a whole (i.e., reading with behavior support) and did not address student perceptions of the behavior support component or their engagement with the use of the behavior support. Future research should address student perceptions of these topics and administer the student social validity measure during both the baseline and intervention conditions to identify differences in perceptions of these two conditions.

Implications for Practice

In this study, the aim was to identify the extent to which an a priori set of behavior support techniques can be implemented and embedded into an efficacious reading curriculum to improve student engagement during instruction. Results suggested that it is feasible, and effective at improving engagement, to systematically integrate well-researched behavioral techniques (e.g., Simonsen et al., 2015; Wills et al., 2018), such as teaching instructors to (a) identify, teach, and review group rules, (b) deliver behavior-specific praise and precorrections, and (c) implement a token economy. Furthermore, even though this study used a single reading curriculum (Voyager Passport), this curriculum shares many characteristics with other small group reading interventions, such as explicit systematic instruction and active learning strategies. Therefore, it is probable that the behavior supports embedded into this curriculum could be used with similar reading curricula.

Overall, students with co-occurring reading difficulties and inattention need support to maintain engagement during reading instruction. Research has suggested that the best method to address both reading and engagement may be to do so simultaneously (Burns et al., 2012; Kuchle & Riley-Tillman, 2019; MacDonald et al., 2020; Roberts et al., 2021). Findings from this study suggested that embedding behavior support into an evidence-based reading curriculum led to large Tau-U effect sizes in engagement for two of the three fourth-grade students with co-occurring reading difficulties and inattention. Therefore, the primary implication for practice from this study is that adding behavioral support principles to a standardized reading curriculum has the potential to increase the percentage of time students are engaged in reading instruction.

In conclusion, this study included novel methods to support students with reading difficulties and inattention by embedding antecedent- and consequence-based behavioral supports into reading instruction and by measuring behavior through direct observation. To the best of our knowledge, this is the first study to combine these characteristics. Moreover, since procedural fidelity was high and Tau-U effect sizes were large for two of the three students, this study’s intervention shows promise for usability and future research.