Procedural fidelity is a construct with widely varied terminology across studies, including treatment integrity (e.g., Codding et al. 2005), implementation integrity (e.g., DiGennaro Reed et al. 2007), procedural reliability (e.g., Sanetti and Kratochwill 2009), curriculum fidelity (Vartuli and Rohs 2009), and other terms. The construct has historically been defined as the extent to which a treatment is implemented as planned (Gresham 1989; Yeaton and Sechrest 1981); a more recent conceptualization refers to it as the degree to which a trained interventionist consistently and comprehensively implements essential intervention components (Sanetti and Kratochwill 2009). Although helpful in guiding research to date, these definitions may not adequately capture the full range of procedures and dimensions embodied by this construct.

Current definitions ignore the implementation of procedures that may not constitute a treatment or an intervention per se, though the intended spirit of the definitions possibly suggests otherwise. For example, public school teachers frequently implement instructional programs adopted at the school- or class-wide level that would not be classified as a formal treatment or intervention. Correct implementation of assessment protocols by clinic staff or other paraprofessionals is also a matter of great importance and may affect treatment decisions that stem directly from the outcomes of an assessment (e.g., a functional analysis). Moreover, the degree to which consultants carry out components of the consultation process may influence consultation outcomes (Noell 2008). Even these few examples make clear that fidelity or integrity, as a broad construct, encompasses much more than implementation of an intervention. Thus, restricting ourselves to terminology that narrows the range of procedures captured by this broader construct could (1) limit access to relevant research during literature searches, (2) affect the decisions practitioners make when developing assessment and treatment protocols, and (3) add to the confusion surrounding the construct’s definition and measurement. As a result, we prefer the term procedural fidelity to refer to the degree to which a trained individual implements a procedure (including a treatment, assessment, consultation, or other protocol) as designed. However, in this special issue, we have deferred to authors’ preferences for the terms they use in their own papers.

A related issue surrounds the varied dimensions of fidelity across conceptual models that may affect the definition of fidelity, its measurement, and subsequent decisions informed by fidelity data. In our experience, researchers and practitioners often measure fidelity by designing a task analysis, recording data on adherence to a procedure, and reporting the percentage of correctly implemented steps. These activities tend to assess two dimensions: content (which steps of the intervention were implemented) and quality (how well the intervention steps were implemented), though quality is sometimes erroneously confounded with content (i.e., identifying which steps were implemented becomes a proxy measure for quality). Procedural fidelity encompasses much more than these two dimensions (e.g., exposure, Dane and Schneider 1998; dosage, Jones et al. 2008) with some dimensions likely playing a more important role than others. Sanetti and Kratochwill (2009) provide an elegant summary of the issues and variables contributing to our understanding of fidelity as a multi-dimensional construct. As such, we will not summarize them here, but refer readers to this valuable resource. We agree with their conclusion that empirically derived dimensions of fidelity will play an important role in the utility and feasibility of fidelity measurement and, as a result, research should address this topic more formally.
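To make the adherence-only measurement approach described above concrete, the following minimal Python sketch computes the percentage of correctly implemented steps from a task-analysis checklist. The step names and scores are hypothetical, not drawn from any cited study, and the resulting score reflects content (which steps occurred) only.

```python
# Hypothetical sketch: an adherence-only fidelity score from a task-analysis
# checklist. Step names and scores are illustrative.

def adherence_percentage(checklist):
    """Return the percentage of task-analysis steps scored as implemented."""
    implemented = sum(1 for done in checklist.values() if done)
    return 100 * implemented / len(checklist)

session = {
    "deliver instruction": True,
    "wait 5 s for a response": True,
    "praise a correct response": False,     # omitted step
    "implement error correction": True,
}

print(f"Adherence (content) fidelity: {adherence_percentage(session):.0f}%")
# Note: this score captures WHICH steps occurred, not how well they were
# delivered (quality), nor exposure or dosage.
```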

Despite the varied terminology, narrowed definitions, and emerging empirical support for numerous dimensions, interest in treatment integrity—and, more broadly, procedural fidelity—is growing rapidly. A PsycINFO keyword search of “treatment integrity” (quotes included) for the years 1982–2012 yielded 249 results. The search parameters were restricted to these years to depict publication trends following the seminal paper on treatment integrity published by Yeaton and Sechrest (1981). The results included scholarly journals and books (theses and dissertations were excluded). Interestingly, but not surprisingly, the same search procedures for “procedural fidelity” yielded only seven results. To better document trends over time, Fig. 1 depicts a cumulative record of publications associated with the phrase “treatment integrity” by year. The data suggest a rapid increase in the number of publications beginning in 2001, with a dramatic change in rate since 2008. These findings reflect a growing interest in our field, perhaps due to studies documenting the relevance of procedural fidelity to educational outcomes, an increased emphasis on fidelity when drawing conclusions about research, or positions of professional organizations and grant-funding agencies on the necessity to consider and collect these data. Quite possibly, the growing interest reflected in these publication trends has itself contributed to those developments (or perhaps the influence is bi-directional).

Fig. 1 Cumulative record of publications in the PsycINFO database under the keyword phrase “treatment integrity” from 1982 to 2012
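As an illustration of how such a cumulative record can be constructed, the sketch below accumulates per-year publication counts into a running total. The counts shown are invented placeholders, not the actual PsycINFO results.

```python
# Illustrative sketch of building a cumulative record (as in Fig. 1) from
# per-year publication counts. Counts are hypothetical placeholders.

yearly_counts = {1985: 1, 1993: 2, 2001: 4, 2005: 6, 2009: 11, 2012: 15}

running_total = 0
cumulative_record = []
for year in range(1982, 2013):                 # 1982 through 2012
    running_total += yearly_counts.get(year, 0)
    cumulative_record.append((year, running_total))

for year, total in cumulative_record[-3:]:     # print the last few years
    print(year, total)
```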

Regardless of the cause of our field’s growing interest, procedural fidelity has received greater attention and emphasis in recent years, with research spanning a wide range of topics in education, psychology, behavior analysis, public health, and other areas. This special issue is devoted to research on advances in procedural fidelity assessment and intervention, particularly in the area of education broadly defined. The purpose of this article is to (1) summarize the relevance of procedural fidelity to education, (2) summarize the relevance of procedural fidelity to research, and (3) propose priority research areas, some of which overlap with articles in this special issue.

Relevance to Education

One may conceptualize procedural fidelity as a mediating variable (Baron and Kenny 1986) between educational practices and learner outcomes; that is, the degree to which an educator implements a practice or procedure as designed explains the relationship between that practice and learner outcomes. Evidence for this conceptualization comes from real-world examples of the deleterious effects of lapses in procedural fidelity as well as from research documenting the relationship between fidelity and outcomes (DiGennaro Reed and Reed 2014).

Both correlational and causal evidence supports the claim that educator fidelity influences learner behavior. For example, Gresham et al. (1993) analyzed studies of school-based behavioral interventions published between 1980 and 1990 in seven journals and documented significant correlations between effect size calculations (i.e., Cohen’s d, percentage of nonoverlapping data points) and percent treatment integrity. Other researchers have produced similar findings when calculating the correlation between treatment integrity and learner outcomes (e.g., Dib and Sturmey 2007; DiGennaro et al. 2005, 2007). These findings suggest that more pronounced treatment effects are associated with higher fidelity. Research has also shown a causal relation between educator fidelity and learner outcomes (e.g., Northup et al. 1997; Rhymer et al. 2002), though these findings are mixed (Vollmer et al. 1999; Worsdell et al. 2000). Experimental preparations typically involve an evaluation of the effects of programmed treatment integrity levels (e.g., a parametric analysis; DiGennaro Reed et al. 2011; Wilder et al. 2006) on learner behavior, such as acquisition of a skill, problem behavior, or appropriate behavior. St. Peter Pipkin et al. (2010) showed that treatment integrity errors influenced outcomes to a greater degree when they followed a baseline phase than when they occurred after some period of perfect or near-perfect implementation. Their findings suggest that high fidelity is most critical during early stages of an intervention. Collectively, correlational and causal evidence indicates that educator fidelity plays an important role in short-term and possibly long-term learner outcomes, which highlights its relevance to educational practices.
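The correlational analyses described above amount to pairing each study’s reported percent treatment integrity with its effect size and computing a correlation coefficient. The sketch below uses invented values solely to illustrate the computation; it does not reproduce any cited data set.

```python
# Invented illustration of correlating per-study treatment integrity with
# effect size. All values are hypothetical.

from math import sqrt

percent_integrity = [40, 55, 70, 85, 95]        # mean fidelity per study
effect_size = [0.2, 0.4, 0.6, 0.9, 1.1]         # e.g., Cohen's d per study

def pearson_r(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# In this toy data set, higher fidelity accompanies larger effects.
print(f"r = {pearson_r(percent_integrity, effect_size):.2f}")
```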

A responsibility of school-based practitioners and consultants involves supporting educator implementation of individual behavior support plans, instructional practices, and class-wide interventions. Although fidelity errors are common (e.g., Carroll et al. 2013), practitioners may successfully facilitate educator procedural fidelity by providing high-quality training (e.g., Sarokoff and Sturmey 2004), delivering performance feedback (e.g., Noell et al. 2000), and implementing other support techniques (e.g., Iwata et al. 1976). Depending on the type and timing of the fidelity error as well as the procedure implemented, the impact on learner outcomes may not be substantial (e.g., St. Peter Pipkin et al. 2010), though other researchers have documented otherwise (e.g., Wilder et al. 2006). In some instances, fidelity errors can produce devastating outcomes that lead to litigation. For example, New Jersey families brought a class-action lawsuit in federal court after Newark public school personnel were significantly delayed in conducting psycho-educational evaluations of their children with disabilities. The schools subsequently failed to provide special education services to which the children were entitled (M.A. v. Newark Public Schools, New Jersey Department of Education 344 F.3d 335). Other procedural fidelity violations have yielded lawsuits resulting from injuries incurred during incorrect implementation of a school-based procedure, such as omission of components of a behavior support plan (e.g., Ferraro v. Board of Education of the City of New York, 14 A.D.2d 815, 1961; DiGennaro Reed and Reed 2014).

Taken together, these findings highlight the important role of educator fidelity in promoting beneficial outcomes for learners with and without disabilities. Although our understanding of the conditions under which the effects of fidelity are most pronounced is still emerging, procedural fidelity is, without question, relevant to educational outcomes.

Relevance to Research

Johnston and Pennypacker (1980) aptly and simply suggested that, in the absence of clear evidence that an independent variable was implemented as intended, no definitive conclusions can be drawn about its effects on the dependent variable. Although such manipulation checks seem an obvious way to ensure internal validity in experimental studies, intervention research across various applied fields has only recently attended to this consideration. Historically, analyses of published intervention studies in the fields of applied behavior analysis (McIntyre et al. 2007), autism (Wheeler et al. 2006), and psychotherapy (Perepletchikova et al. 2007) have illustrated a persistent lack of attention to collecting and reporting procedural fidelity data, with the percentage of published articles reporting such data ranging from 3 to 30 %, depending on research area. Although an evaluation of the school psychology treatment literature yielded a higher rate of researcher-reported quantitative procedural fidelity data (50 %), 29 % of studies failed to provide an operational definition of or a reference for the intervention, and the majority of studies evaluated only treatment adherence (Sanetti et al. 2011). Researcher-reported barriers to evaluation of procedural fidelity are broad and include (a) limited knowledge of procedural fidelity and its theoretical basis, (b) lack of consistent guidance on procedural fidelity procedures, (c) the time, cost, and labor required to collect and evaluate procedural fidelity data, and (d) lack of accountability on the part of editorial boards (Perepletchikova et al. 2009; Sanetti and DiGennaro Reed 2012).

It is plausible that failure to consider and report procedural fidelity has resulted in faulty empirical support for or against treatments that may or may not have been administered accurately. A lack of procedural fidelity assessment data also creates missed opportunities for researchers to evaluate the influence of factors such as treatment dose, treatment adaptations, interventionists’ skills, intervention complexity, and the match between the treatment implemented and the client problem identified (Fixsen et al. 2005; Gresham 2009). Consequently, experts have proposed quality indicators to guide researchers when planning studies that employ group and single-case research designs. For example, Gersten et al. (2005) generated a list of essential features for special education research using group and quasi-experimental designs, which included the need to ensure and assess treatment implementation as part of the description of study procedures. Likewise, one of the six standards for single-case research is documentation of procedural fidelity (Horner et al. 2012).

Procedural fidelity assessment is essential to the ability to interpret treatment outcome data and is central to the evidence-based practice and prevention science movements. In 2002, the American Psychological Association posited that evaluating an intervention for evidence-based status includes documentation of feasibility, defined as the extent to which a treatment can be implemented in an applied setting. Similar recommendations have since been adopted by various groups devoted to evaluating and disseminating information on evidence-based interventions. For the What Works Clearinghouse (2010), an initiative of the Institute of Education Sciences, one criterion for excluding a study when reviewing the available evidence on an intervention is that the intervention was not implemented as designed. Similarly, the Center for the Study and Prevention of Violence’s Blueprints for Healthy Youth Development program includes provision of quantitative fidelity-of-implementation data as one criterion for considering a prevention or intervention program a promising or model practice (http://www.blueprintsprograms.com/resources/Blueprints_Checklist.pdf).

Other organizations have emphasized the multidimensional nature of the procedural fidelity construct (Sanetti and Kratochwill 2009) by requiring that multiple measures be employed to demonstrate fidelity. Both the Task Force on Evidence-Based Interventions in School Psychology (Kratochwill et al. 2002) and the Substance Abuse and Mental Health Services Administration’s National Registry of Evidence-based Programs and Practices (NREPP) recommend that research studies include documentation of (a) treatment adherence data, (b) use of treatment manuals or detailed protocols, (c) adaptations made and procedures used, (d) systematic collection of treatment dose and training time, (e) relationship between interventionist and participant, and (f) evidence that fidelity was evaluated using an instrument with adequate psychometric evidence.

In summary, failure to carefully attend to and measure the degree of implementation of the independent variable results in poor science and may have a profound impact on the applied work of practitioners who attempt to use research evidence in their clinical decision making. Numerous individuals and organizations have accordingly provided a range of procedural fidelity standards for designing research and reporting findings; these standards will likely engender increasing support from editors and editorial boards and have already affected access to federal funding through grant programs such as those of the Institute of Education Sciences (2013). Clearly, procedural fidelity is relevant to the design, reporting, and dissemination of research.

Priority Research Areas

The past 30 years of procedural fidelity research have yielded procedures to guide research and practice across a wide variety of areas, including definitional and conceptual issues, measurement considerations, assessment technologies, effective training procedures, and provision of ongoing educator support. Fortunately, our understanding of procedural fidelity has improved since Yeaton and Sechrest’s (1981) seminal paper, which we hope translates into better educational practices and services. Despite these valuable findings and a greater emphasis on procedural fidelity, much remains unknown. The articles published in this special issue align with the research areas that we believe should be a priority focus. Thus, we conclude this paper by briefly summarizing the importance of these particular topics and hope that this special issue not only contributes to our understanding, but also sparks future research. These are exciting times for both researchers and practitioners interested in this topic.

Parametric Analysis

A handful of studies have experimentally manipulated levels of procedural fidelity and evaluated the effects on learner outcomes. That is, the researchers conducted parametric analyses in which fidelity errors were purposefully programmed into the teaching procedure or intervention plan, and the degree to which these errors affected learner performance was then measured. The research in this area spans basic and applied experimental preparations, but may be best conceptualized as a translational or use-inspired basic research approach (Mace and Critchfield 2010). Translational fidelity research attempts to answer questions of applied significance in a controlled manner by conducting research in analog settings, adopting human operant arrangements, carefully manipulating levels of the independent variable (i.e., level of procedural fidelity), or teaching arbitrary tasks (rather than current instructional materials) to minimize detrimental effects on performance outside of the experimental settings. In a novel demonstration of the basic-to-applied continuum of research, St. Peter Pipkin et al. (2010) conducted three studies, one involving college student participants responding to a computerized task and two involving individuals with disabilities who emitted problem behavior in school. In the human operant preparation with college student participants, problem behavior was arbitrarily defined as clicking a black circle and appropriate behavior as clicking a red circle on the computerized task. Problem behavior for participants in the remaining studies included off-task behavior or aggression. St. Peter Pipkin and colleagues evaluated the effects of differing levels of fidelity of a differential reinforcement procedure on problem behavior. Their preparation manipulated both errors of omission (i.e., a missed intervention component) and errors of commission (i.e., an added intervention component). In addition, the researchers assessed the influence of condition order on participant problem behavior. They showed (1) greater increases in problem behavior during conditions containing errors of commission, which involved reinforcement of problem behavior, than during conditions containing errors of omission, which involved a failure to reinforce appropriate behavior, and (2) that the sequence of conditions influenced participant problem behavior. Specifically, participant problem behavior was less affected by fidelity errors during conditions that were preceded by perfect implementation of the procedure. Extending this line of research, Hirst et al. (2013) provide another example of a human operant preparation involving programmed fidelity errors. They conducted a parametric analysis of feedback errors during a computerized instructional task and documented slower short- and long-term task acquisition during conditions with more errors. In a more applied parametric analysis, Wilder et al. (2006) evaluated the effects of differing levels of prompting fidelity on preschoolers’ compliance with instructions and showed greater compliance under conditions with higher fidelity. Researchers have adopted similar approaches to examine the effects of fidelity levels on appropriate toy manipulation (Groskreutz et al. 2011), arbitrary response chains (Grow et al. 2009), functional communication (Worsdell et al. 2000), responses during discrete trial instruction (Carroll et al. 2013; DiGennaro Reed et al. 2011), and other behavior (e.g., St. Peter Pipkin et al. 2010).
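To illustrate what programming fidelity levels can look like in such parametric analyses, the simplified simulation below implements a differential reinforcement rule correctly on only a specified proportion of trials, yielding errors of omission for appropriate behavior and errors of commission for problem behavior. The responses, trial counts, and probabilities are hypothetical; this sketch is in the spirit of, but does not reproduce, any cited procedure.

```python
# Simplified simulation of "programmed" fidelity levels in a parametric
# analysis of differential reinforcement. All details are hypothetical.

import random

def run_session(programmed_fidelity, n_trials=20, seed=1):
    """Simulate one session; return the number of implementation errors.

    Correct implementation: reinforce appropriate behavior and withhold
    reinforcement for problem behavior. On incorrectly implemented trials,
    appropriate behavior goes unreinforced (omission error) and problem
    behavior is reinforced (commission error).
    """
    rng = random.Random(seed)
    errors = 0
    for _ in range(n_trials):
        response = rng.choice(["appropriate", "problem"])
        implemented_correctly = rng.random() < programmed_fidelity
        # Reinforcement is delivered when the "correct" rule says so,
        # or mistakenly delivered/withheld on error trials.
        reinforced = (response == "appropriate") == implemented_correctly
        if not implemented_correctly:
            errors += 1
    return errors

for level in (1.0, 0.8, 0.2):
    print(f"programmed fidelity {level:.0%}: "
          f"{run_session(level)}/20 trials implemented in error")
```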

These and other studies serve as excellent models of translational fidelity research and have contributed greatly to our understanding of the effects of procedural fidelity on learner outcomes. The challenge with translational research, however, lies in designing studies that maximize ecological validity without compromising experimental control. Carroll et al. (2013) addressed this issue by measuring the types of fidelity errors commonly made in instructional settings and incorporating these errors into their parametric analysis. Following their example, future research should strive to enhance the external validity of experimental findings by conducting manipulations of the independent variable that reflect errors occurring in practice, using representative participants and/or settings, and incorporating tasks of applied relevance (Fryling et al. 2012). In this issue, Leon, Wilder, Majdalany, Myers, and Saini provide an elegant extension of previous research by building on the work of St. Peter Pipkin et al. (2010) and Wilder et al. (2006). They addressed an issue of clinical importance (i.e., noncompliance emitted by young children with disabilities) across two studies in which errors of omission and commission were manipulated in an experimental analysis. In addition, Leon et al. examined the role of sequence effects on participant noncompliance.

Telehealth

Telehealth refers to the delivery of health care or health education across distances via technology (McGeary et al. 2012) and may include electronic written communications, consultation using remote audio and visual equipment, and teleconferencing (US Department of Health and Human Services, n.d.). Although early applications were limited by the capabilities of the technology available at the time, recent advances have resulted in telehealth becoming the norm in health care (Brennan et al. 2008). This service model has been used to address a wide variety of clinical concerns, including compliance with treatment for sleep apnea (Smith et al. 2006), pain management (McGeary et al. 2012), weight maintenance (Haugen et al. 2007), post-traumatic stress (Sloan et al. 2011), and many others. Home telehealth and remote monitoring are innovations growing in popularity, with estimates suggesting a doubling of use from 2008 to 2010 (Brennan et al. 2008). Advantages of telehealth include delivering services to patients living in rural communities, eliminating barriers associated with access to care, and reducing costs while improving quality (Speedie et al. 2008).

These advantages are particularly relevant to the delivery of services to individuals with autism and their caregivers, a population for whom the demand for services often far exceeds availability in any given area. Emerging research has documented the effective delivery of behavioral services via telehealth (e.g., Barretto et al. 2006; Wacker et al. 2013). Following a systematic review of the literature, Boisvert et al. (2010) concluded that telehealth (or telepractice) is a promising approach to delivering services to individuals with autism, yet it requires additional research. They recommended several areas for future research, including conducting cost-benefit analyses, evaluating the efficacy and effectiveness of numerous clinical procedures across disciplines, assessing social validity, and addressing implementation issues. Although we concur that research addressing each of these areas is valuable, identifying ways to tackle implementation issues seems especially important. An effective, socially valid, and affordable treatment is not helpful to families if it is delivered with low fidelity (e.g., because of scheduling mishaps, technological disruptions, or consultant or consultee errors). Implementation issues involving degradations in procedural fidelity during telehealth will likely influence treatment effectiveness and social validity. Given the growing interest in delivering autism services via telehealth, additional research is sorely needed. In this issue, Suess et al. evaluate the fidelity with which parents implement functional communication training in their homes with their children with autism. They measured parents’ adherence to the treatment protocol, noting the types of errors made as well as the children’s problem behavior and communication. Their findings support previous research (e.g., Barretto et al. 2006; Wacker et al. 2013) and contribute to our understanding of fidelity under conditions in which the consultant is present only remotely via technology.

Fidelity Assessment

One documented challenge in determining how to measure procedural fidelity is the lack of consensus on which dimensions are necessary to measure. Common dimensions include treatment adherence, interventionist competence, treatment differentiation, and interventionist–student interactions; however, treatment exposure, participant responsiveness, and quality of treatment delivery have also been emphasized as central features of the procedural fidelity construct (Gresham 2009; Sanetti and Kratochwill 2009; Southam-Gerow and McLeod 2013). Research is needed on the value added by measuring each of these dimensions, as well as on whether that value is determined by the intervention itself, the context within which the intervention is implemented, or both. Given the paucity of research, a common and consistent suggestion is to measure multiple dimensions of procedural fidelity (e.g., Barnett et al. 2011; Hirschstein et al. 2007; Sanetti and Fallon 2011).

The extant literature has examined how to measure procedural fidelity through a number of methods, including (a) direct observation (i.e., using checklists/scripts, time sampling, or response-dependent behaviors), (b) interventionists’ self-report, and (c) permanent products generated from intervention implementation. In this issue, Barnett and colleagues documented, through a systematic review of a large body of child intervention literature, that some methods are utilized more than others. Sanetti and Mollier-Creek demonstrated that the accuracy of the methods employed varies with the dimension of procedural fidelity assessed and with student responsiveness. Both Barnett et al. and Sanetti and Mollier-Creek draw readers’ attention to assessment issues pertaining to the representativeness of procedural fidelity assessment. Interestingly, findings from each study support collecting multiple procedural fidelity measures via multiple methods.

Procedural fidelity measures with adequate psychometric properties are also lacking. In part, this challenge is related to determining what should be measured and how to measure it. For example, a frequency count of the praise statements teachers deliver to students might simply require inter-observer reliability estimates, whereas evaluating implementation of evidence-based classroom management procedures is more complex (e.g., Jeffrey et al. 2009). Assessment of convergent and divergent validity among methods is one essential avenue for future research (Greenwood 2009; Gresham 2009). Equally compelling is the question of when or how often to evaluate procedural fidelity. That is, should procedural fidelity be assessed (a) more frequently following initial intervention training and then less often over time, (b) intermittently or on a prespecified schedule (e.g., Reed et al. 2010), or (c) contingent on performance, that is, based on a criterion of accuracy (Barnett et al., this issue)? Another challenge worth exploring is whether measurement of procedural fidelity is equivalent from session to session and, if not, whether it should be, or how fidelity might be reported given the dynamic nature of treatment implementation. Barriers to procedural fidelity data collection also need to be considered (see Sanetti and DiGennaro Reed 2012) so that the type of method employed and its psychometric qualities are matched to the purpose of data collection (see Perepletchikova 2011).
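For the simplest case mentioned above, inter-observer agreement on a fidelity checklist can be computed point by point. The sketch below uses hypothetical observer records; it is intended only to show why richer dimensions, such as quality ratings, demand additional psychometric evidence.

```python
# Hedged sketch: point-by-point interobserver agreement (IOA) for a fidelity
# checklist scored independently by two observers. Scores are hypothetical.

observer_a = [1, 1, 0, 1, 1, 0, 1]   # 1 = step scored as implemented
observer_b = [1, 1, 0, 1, 0, 0, 1]

agreements = sum(a == b for a, b in zip(observer_a, observer_b))
ioa = 100 * agreements / len(observer_a)
print(f"Point-by-point IOA: {ioa:.1f}%")

# Simple agreement may suffice for straightforward counts; rating-scale
# quality measures would call for additional psychometric evidence
# (e.g., convergent validity across methods).
```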

Fidelity Promotion

Assessment of procedural fidelity is inextricably linked to how integrity failures are addressed by consultants, coaches, and administrators. Conceivably, if sensitive measures of procedural fidelity are developed (Gresham 2009), the idiosyncratic and systemic reasons for integrity failure can be identified and directly addressed. For example, if adherence to the treatment protocol is adequate but quality of implementation is suspect, then training on quality of treatment delivery, but not adherence, is needed. Perhaps adherence and quality are excellent but student behavior is not changing, suggesting that the intervention needs to be revisited in terms of its match to skill needs, dose, frequency, or type. It is also possible that treatment adherence is low, student responding is in the expected direction, and adaptations made by the interventionist appear to have improved treatment effectiveness. Or one step of the treatment protocol may never be implemented and, consequently, student behavior remains unchanged, suggesting that further consultation around the barriers to implementing that step should be scheduled. These examples represent only a small sample of the hypotheses that could be generated, illustrating that procedural fidelity data are essential to the problem-solving process and necessary for intervention planning and modification. Unfortunately, despite the recognized importance of evaluating procedural fidelity among school-based practitioners, it is rarely measured in applied settings due to time constraints as well as a lack of knowledge and skills among teachers and administrators (Cochrane and Laux 2008). This reality underscores the need to promote knowledge, understanding, and skill development to improve procedural fidelity.
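A highly simplified rendering of this problem-solving logic is sketched below. The decision rules are hypothetical; actual consultation decisions depend on many more variables than the three considered here.

```python
# Hypothetical decision sketch of the problem-solving logic described above.

def next_consultation_step(adherence_ok, quality_ok, student_improving):
    if adherence_ok and not quality_ok:
        return "Train quality of treatment delivery (adherence training unnecessary)."
    if adherence_ok and quality_ok and not student_improving:
        return "Revisit the intervention's match to skill needs, dose, frequency, or type."
    if not adherence_ok and student_improving:
        return "Examine interventionist adaptations; they may have improved effectiveness."
    if not adherence_ok:
        return "Consult on barriers to implementing the omitted or incorrect step(s)."
    return "Continue the current plan and keep monitoring fidelity and outcomes."

print(next_consultation_step(adherence_ok=True, quality_ok=False, student_improving=False))
```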

Prevention of procedural fidelity errors begins with professional development pertaining to treatment protocols and procedures. In their seminal meta-analysis, Joyce and Showers (2002) demonstrated that training consisting solely of describing theory and facilitating discussion was inadequate for promoting increases in knowledge and skills among teachers in public schools. However, providing skill demonstrations along with opportunities for practice and immediate feedback yielded considerable gains in both areas. Best practices in professional development training require resources, such as time, personnel, and money, that may strain what is available in applied settings. Furthermore, applied settings need to focus on building system-wide capacity, a goal for which ongoing reliance on external experts may be a less efficient and effective means of maintaining procedural fidelity.

Consequently, computer-based and video modeling training procedures have been employed to improve procedural fidelity. Video modeling and instruction have led to immediate and adequate, albeit not perfect, levels of procedural fidelity when participants were required to implement discrete trial teaching protocols (Catania et al. 2009; Nosik and Williams 2011), backward chaining procedures (Nosik and Williams 2011), functional analysis techniques (Moore and Fisher 2007), and behavior interventions (DiGennaro Reed et al. 2010). Another resource-efficient alternative may be to employ a train-the-trainers model, also known as pyramidal training, in which experts train local individuals who, in turn, are responsible for training other members of their own institutions or organizations. In this issue, Pence, St. Peter, and Giles describe a pyramidal training procedure in which teachers trained other teachers in functional analysis procedures. This paper contributes to a growing body of literature supporting the use of pyramidal training for improving procedural fidelity (e.g., Kuhn et al. 2003; Page et al. 1982) and offers an effective and creative solution to training that has the potential to improve use of evidence-based practices and reduce frequently noted resource-related barriers. Continued efforts to develop innovative training methods that support evidence-based practices, while limiting associated barriers and promoting system-capacity building, are rich areas for future research.

Although essential, training alone is insufficient for promoting application of newly taught skills in actual classroom settings (Joyce and Showers 2002; Sterling-Turner et al. 2002), and consistent declines in procedural integrity have been documented after as few as three to five sessions of treatment implementation following training (Solomon et al. 2012). The addition of coaching support and other specific follow-up interventions subsequent to effective training procedures is therefore necessary to promote procedural fidelity (Joyce and Showers 2002; Kratochwill et al. 2007; Noell et al. 2005). In this issue, Reinke, Stormont, Herman, and Newcomer analyzed the overall impact and amount of coaching, along with the value of particular aspects of coaching, on use of proactive classroom strategies. This paper offers direction on the utility of coaching and can spawn future research questions, such as determining the core components of coaching necessary for efficient and effective outcomes as well as how much coaching to offer, when, and under what circumstances.

The extant literature has identified a number of follow-up strategies with potential utility and effectiveness for promoting procedural fidelity. One of the most widely examined is performance feedback, for which a recent meta-analysis yielded a moderate effect size, demonstrating a positive impact on procedural fidelity across preschool, elementary, and secondary settings with special and general education teachers (Solomon et al. 2012). Other available options include regularly scheduled follow-up meetings, directed rehearsal of treatment components that are implemented incorrectly or omitted, self-monitoring, and negative reinforcement strategies such as cancelation of consultant–consultee meetings contingent on accurate treatment implementation. In this issue, Noell and colleagues conduct a meta-analysis of the extant single-case design literature on follow-up strategies, using mixed linear growth models to evaluate changes in level and trend. This paper offers readers comprehensive evidence for a number of effective strategies to promote treatment implementation. As noted by the authors, these findings should encourage future researchers to extend the literature beyond simply what works.

Given the complexity of the procedural fidelity construct, future research should consider the interactions among the numerous variables across the implementation environment, intervention, and interventionist that influence integrity failures (Gresham 2009; Noell 2008; Sanetti and Kratochwill 2009). Practitioners and researchers alike might consider how to better match interventionist needs with the interventions employed to promote fidelity. For example, brief experimental analysis could be used to determine whether interventionists are operating under positive or negative reinforcement contingencies (DiGennaro et al. 2005). It is also possible that different supports need to be provided to interventionists depending on whether their implementation skills fall in the acquisition or fluency stages of development (Barnett et al. 2011). Some level of continued support may need to be provided to most interventionists throughout the implementation process; however, the type and nature of those services may need to be adjusted according to individual or system-level needs.

Maintenance

An important but oft-overlooked component of any program is ensuring maintenance of desired behavior change (Stokes and Baer 1977). Procedural fidelity promotion includes not only provision of professional development training and ongoing follow-up and support, but also the design of consultation activities that facilitate fidelity maintenance (Fiske 2008). Failing to address the latter represents a terrible waste of valuable resources and may result in low fidelity and poor learner outcomes. Unfortunately, few researchers have systematically programmed for fidelity maintenance. After improving educator fidelity through brief daily performance feedback meetings with teachers, Noell et al. (2000) thinned the schedule of feedback to every other day contingent on teachers meeting a criterion of 100 % fidelity across four consecutive days. Noell et al. (2002) also used fading to decrease the frequency of data review meetings. In both studies, variability in fidelity was temporarily observed for some teachers following fading. DiGennaro et al. (2005, 2007) adopted a similar procedure of dynamic fading, in which the schedule of a performance feedback and directed rehearsal (i.e., repeated practice of missed intervention steps) package was progressively thinned contingent on educator fidelity. Although they successfully faded the feedback/rehearsal schedule, their findings also demonstrated increased variability in fidelity during dynamic fading. Clearly, additional research is needed, particularly conceptually systematic research that yields effective fading procedures, reduced variability in fidelity once fading is introduced, and maintenance over longer periods. In this issue, Gross and Duhon evaluated a fading procedure that incorporated indiscriminable contingencies to help combat the variability in educator fidelity documented in previous research. They also carried out this applied investigation in the context of a multidisciplinary prereferral team conducting business as usual, which increases the generality of their findings.
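One way to formalize criterion-based fading of the sort described above is sketched below. This is an illustrative rule, not the algorithm used by Noell et al. or DiGennaro et al.: the feedback schedule is thinned only after fidelity meets a criterion across consecutive sessions, and support is tightened when fidelity drops.

```python
# Illustrative criterion-based fading rule for performance feedback.
# Thresholds and the doubling/halving scheme are hypothetical choices.

def next_feedback_interval(current_days, recent_fidelity,
                           criterion=100, consecutive_required=4):
    """Return the number of days until the next feedback meeting."""
    streak = 0
    for score in reversed(recent_fidelity):        # most recent session last
        if score >= criterion:
            streak += 1
        else:
            break
    if streak >= consecutive_required:
        return current_days * 2                    # thin the schedule
    if recent_fidelity and recent_fidelity[-1] < criterion:
        return max(1, current_days // 2)           # tighten support again
    return current_days

print(next_feedback_interval(1, [100, 100, 100, 100]))  # -> 2 (every other day)
print(next_feedback_interval(2, [100, 100, 80]))        # -> 1 (back to daily)
```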

Conclusion

The number of empirically supported technologies available to researchers, practitioners, and educators has grown in recent decades, particularly in the past 10+ years. Yet, a great deal of work remains. In addition to the priority research areas identified above, other worthwhile lines of inquiry (e.g., promoting generalization, evaluating dimensions) will yield productive careers and help many educators and students. Quite possibly, our field is entering a new and exciting era of procedural fidelity assessment and intervention. The evidence supporting the importance of fidelity has accumulated to a sufficient degree to warrant addressing this topic more formally in our graduate training programs. We propose a two-pronged approach: (1) expand or develop graduate-level lecture and practicum courses to better train practitioners in best and/or current practices in fidelity assessment and intervention and (2) ensure that research courses and experiences not only teach students research methodologies appropriate to their fields of study, but also expose students to behavioral science approaches more broadly.

We encourage graduate programs to develop a stand-alone course that tackles the complexities of procedural fidelity assessment and intervention. Although these topics may be embedded in other coursework, an upper-level graduate course will expose students to advanced topics presented in this special issue and elsewhere, generate discussions surrounding the nuances that could influence outcomes for learners, create opportunities to brainstorm future research studies, and generally immerse students and faculty in these important issues. Practicum courses could be revised to require students to show evidence of procedural fidelity and a rationale for the measurement system used.

With respect to the second approach, we believe that future work will require researchers to embrace methodologies that may not entail simple research designs and rapid data collection. We do not mean to imply that the research published to date is simplistic; as researchers in this area, we disagree with that characterization. Rather, we are advocating that, to advance the literature, researchers may need to embrace methodologies that answer each question using the best scientific approach for that question. We encourage researchers to avoid being locked into particular research designs and philosophical approaches. Instead, embrace the wise words of Murray Sidman, who wrote, “There are no rules of experimental design” (1960, p. 214). What will this mean? Addressing the important and complex questions of our time will require accumulated evidence gathered from a series of studies conducted over years using myriad techniques that bridge scientific approaches.