Implementing procedures that lack empirical support, are not evidence based, or are pseudo- or antiscientific carries risks for individuals diagnosed with autism spectrum disorder (ASD) and their families (Freeman, 2008). These risks include, but are not limited to, wasting educational time and money, interfering with proven effective interventions, and making a negative emotional impact on caregivers when an intervention proves ineffective (R. Leaf, McEachin, & Taubman, 2008). Behavior analysts involved in the treatment of ASD are committed to efficient and effective service delivery and view it as within their professional duties to advise clients about the extent to which interventions are likely to be effective. Behavior analysts in this context of service provision may be confronted with requests to implement, provide advice, or work collaboratively with professionals on interventions that fall outside the range of behavior analysis (Brodhead, 2015). Therefore, it is important for behavior analysts to familiarize themselves with common interventions that lack a strong evidence base, have a shortage of empirical support, and share hallmarks of pseudoscience.

Social thinking (ST) is an approach to intervention with several manuals and materials (e.g., Winner, 2005a, b, 2007a, b, 2008, 2013) available to professionals and families of individuals diagnosed with ASD, is the topic of presentations for professionals at national and international conferences, and is implemented across a variety of settings for many individuals diagnosed with ASD. In our experience, an increasing number of individuals have begun to receive interventions based on ST methodology. After an extensive review of the literature (empirical and nonempirical), we found that empirical support was almost nonexistent and that ST materials include many hallmarks of pseudoscience. J. B. Leaf et al. (2016; note many authors in common with the current article) provided a brief overview of ST, discussed the limited research evaluating ST, and assessed ST based on Green’s (1996) criteria for scientific evidence. Their conclusion was that ST was not evidence based or empirically supported and met Green’s criteria for pseudoscience.

Crooke, Chief Strategy Officer of Research, Content, Clinical Services and the Director of Social Thinking Training & Speakers Collaborative, and Winner, founder and CEO of Social Thinking, published a response to J. B. Leaf et al. (2016), the purpose of which was to highlight perceived misconceptions about ST and supposed inaccuracies in J. B. Leaf et al.’s article, and to clarify how ST could be considered evidence based. Crooke and Winner’s response essentially ignored the question of whether ST was scientific—as opposed to pseudoscientific or antiscientific—and instead redirected the discussion to what constitutes evidence-based practice (EBP). Crooke and Winner discussed a few references that they argued represent a body of evidence justifying the use of ST for individuals diagnosed with ASD.

The current authors appreciate the need for dialog and welcome the discussion of this issue. Debate is essential in science, and differing opinions can be expressed in the analysis of claims, especially about the effectiveness of interventions. We were especially pleased to be permitted to rebut this reply to the original article, as it constitutes professional dialog and provides an appropriate forum for the analysis of both sides of this issue.

We did not find the counterarguments convincing and would like to provide our rationales. Thus, the purpose of the current article is to (a) examine Crooke and Winner’s (2016) argument that ST meets EBP standards; (b) examine the empirical research, commentaries, and dissertations cited as evidence in support of ST; (c) address the purported inaccuracies in J. B. Leaf et al. (2016); (d) identify concerns about the conceptual basis of ST as it relates to behavior analysis; and (e) address concerns with an eclectic approach, which Crooke and Winner advocated.

Is ST an Evidence-Based Procedure?

J. B. Leaf et al. (2016) concluded, “Based on this information [the authors’ evaluation], ST, to date, cannot be considered evidence based, empirically supported, or a scientific approach” (p. 154). One of the major points in Crooke and Winner’s (2016) response was a definition of EBP that requires little experimental rigor, thereby setting the stage for the claim that ST could be considered an EBP. Although Crooke and Winner acknowledged that ST does not meet the more stringent standard of empirically supported therapies (ESTs), they contended that clinical expertise and stakeholder input are part of the “evidence” that can allow ST to be regarded as an EBP and followed this claim with several citations in support of their definition (i.e., American Psychological Association Presidential Task Force on Evidence-Based Practice, 2006; American Speech-Language-Hearing Association, 2005; Dollaghan, 2007; Kazdin, 2008; La Roche & Christopher, 2009; National Autism Center [NAC], 2011; Wong, Odom, Hume, Cox, Fettig, Kucharczyk, & Schultz, 2015). We agree with Crooke and Winner that ST does not meet EST standards. The critical question, however, is what standard of research should be expected for an intervention to be considered an EBP. Among the articles that Crooke and Winner cited, there are widely varying standards, some of which are and some of which are not met by ST. In order to evaluate whether ST meets EBP standards, it is necessary to compare the various definitions of EBPs, including those in the citations listed above, as well as other commonly used and established definitions.

Due to the detailed criteria of these standards and for the sake of space, we will not provide a complete description of each. Instead, for each standard we evaluated whether ST meets the criteria provided, does not meet them, or whether this is unclear. If ST met the specific criteria of EBP, we provided the complete criteria and discussed how. If ST did not meet the criteria, we discussed how. We evaluated only the three studies published in peer-reviewed journals with an empirical evaluation (Crooke, Hendrix, & Rachman, 2008; Koning, Magill-Evans, Volden, & Dick, 2013; Lee, Crooke, Lui, Kan, Mark, van Hasselt, & Tong, 2016).

Citations Provided by Crooke and Winner (2016)

American Psychological Association (APA, 2006)

APA’s Presidential Task Force on Evidence-Based Practice defined evidence-based practice in psychology (EBPP) as “the integration of the best available research with clinical expertise in the context of patient characteristics, culture, and preferences” (APA, 2006, p. 273). Further, “The purpose of EBPP is to promote effective psychological practice and enhance public health by applying empirically supported principles of psychological assessment, case formulation, therapeutic relationship, and intervention” (APA, 2006, p. 273). Although APA defined EBPP in part by the “best” available research, it also stated that multiple research designs could contribute to EBPP. Other research designs considered acceptable include clinical observation, qualitative research, and case studies. Given this definition and purpose, ST meets the standard of EBPP. However, the standard of evidence is far from rigorous. Clinical observation, case studies, and qualitative research are considered preexperimental at best (Campbell & Stanley, 1968). There are several individual procedures (e.g., video modeling, script fading, the teaching interaction procedure) and comprehensive programs (e.g., those based on the principles of applied behavior analysis [ABA]) with far more empirical evidence than ST (e.g., Charlop-Christy, Le, & Freeman, 2000; Krantz & McClannahan, 1993; J. B. Leaf et al., 2017; J. B. Leaf, Oppenheim-Leaf, et al., 2012; J. B. Leaf, Tsuji, et al., 2012).

American Speech-Language-Hearing Association (ASHA, 2005)

ASHA’s (2005) EBP criteria include the integration of clinical expertise and expert opinion, external scientific evidence, and client perspectives to provide services that meet the values, choices, interests, and needs of the client. ST meets ASHA’s EBP criteria. However, some additional factors must be considered. First, ASHA’s criteria do not specify what constitutes external scientific evidence or its quality. Although the research on ST has critical methodological flaws, it could be considered external scientific evidence per ASHA’s standards. Second, ASHA’s criteria do not define the parameters of clinical expertise nor clarify whether this expertise or opinion should be guided by clinical data. Therefore, by this definition, it is possible that no or poor data could allow ST to be considered an EBP.

Dollaghan (2007)

According to Dollaghan (2007), EBP is considered the best available external evidence from research, best clinical evidence, and best evidence concerning the preferences of an informed client. Dollaghan suggests evaluating research using the Critical Appraisal of Treatment Evidence (CATE), which proposes 15 questions for consumers of scientific literature to ask about any given study. CATE questions include: “Was the evidence from an experimental study?”; “Was there a control group or condition?”; “Was the treatment described clearly and implemented as intended?”; and “Was the outcome evaluated with blinding?” (Dollaghan, 2007, p. 153). When evaluating ST literature using the CATE, many questions must be answered “no” (i.e., evidence does not meet the criteria), leading to the conclusion that the literature supporting ST “can be debated on so many points that unbiased experts might reach opposite conclusions about its validity or importance” (Dollaghan, 2007, p. 152).

Kazdin (2008)

Kazdin (2008) defined “evidence based” as a larger concept of how clinical practice is informed by evidence for interventions, practitioner expertise, and patients’ needs, values, and preferences. ST meets Kazdin’s criteria because what constitutes evidence is not clearly specified. Therefore, an intervention could be considered an EBP according to studies with poor or nonexistent experimental control.

La Roche and Christopher (2009)

La Roche and Christopher’s (2009) definition of EBPP aligns with APA’s (2006) standards. That is, La Roche and Christopher define EBPP as the “integration of the best available research with clinical expertise in the context of patient characteristics, culture and preferences” (p. 397). As with APA, ST meets La Roche and Christopher’s criteria; however, their standard is less rigorous with respect to scientific research when compared to other definitions of EBP because, according to APA, acceptable levels of research include clinical observations, qualitative research, and case studies.

NAC (2011)

Crooke and Winner (2016) discussed NAC’s 2011 guide to EBP in schools, though not the updated version (2015a). Also relevant are NAC’s National Standards Project, Phase 1 (2009) and Phase 2 (2015b). ST was not mentioned in the 2009 standards or 2011 guide, but in both the 2015 standards and guide, ST was listed as an unestablished intervention, one with “little or no evidence in the scientific literature that allows us to draw firm conclusions about [its] effectiveness with individuals with ASD” (NAC, 2015a, p. 63).

Wong et al. (2015)

ST does not meet Wong et al.’s (2015) criteria because it does not include two high-quality experimental group designs conducted by two different research groups or five high-quality single-subject designs by three different research groups.
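
The two-path rule paraphrased above can be written as a short decision function. What follows is only a minimal sketch in Python: the `Study` record, the function name, and the team labels are hypothetical, the quality flag stands in for a reviewer's judgment rather than anything computed, and any further conditions in Wong et al.'s (2015) full criteria are not modeled. The example inputs are placeholders, not real studies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Study:
    design: str          # "group" or "single_subject"
    research_group: str  # label for the team that ran the study (hypothetical)
    high_quality: bool   # reviewer's judgment; not computed here


def meets_wong_rule(studies):
    """Return True if either path paraphrased in the text is satisfied."""

    def qualifies(design, min_studies, min_teams):
        # Only high-quality studies of the given design count toward a path.
        pool = [s for s in studies if s.design == design and s.high_quality]
        teams = {s.research_group for s in pool}
        return len(pool) >= min_studies and len(teams) >= min_teams

    # Path 1: two high-quality group designs from two research groups.
    # Path 2: five high-quality single-subject designs from three groups.
    return qualifies("group", 2, 2) or qualifies("single_subject", 5, 3)


# Hypothetical evidence bases (placeholder labels, not real studies).
sparse = [
    Study("single_subject", "team_a", False),
    Study("group", "team_b", False),
    Study("group", "team_c", False),
]
sufficient = sparse + [
    Study("group", "team_d", True),
    Study("group", "team_e", True),
]

print(meets_wong_rule(sparse))      # no high-quality studies in the pool
print(meets_wong_rule(sufficient))  # two high-quality group designs, two teams
```

The point of the sketch is that the count of published studies alone does not decide the outcome: studies judged not to be of high quality drop out before either path is checked.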

Citations Not Provided by Crooke and Winner (2016)

Cook et al. (2014)

Cook et al.’s (2014) standards allow numerous paths toward establishing an intervention as evidence based, including the combination of single-subject and group designs. ST fails to meet these criteria due to an insufficient number of peer-reviewed studies, group designs, single-subject designs, and participants across studies.

Horner et al. (2005)

Horner et al. (2005) provided five domains to evaluate whether a procedure or methodology with a single-subject design framework should be considered an EBP. To date, ST has been evaluated once using a single-subject design (Crooke et al., 2008) and twice using group designs (Koning et al., 2013; Lee et al., 2016). Thus, only one study could be evaluated using Horner et al.’s criteria.

ST does not meet these EBP criteria (Horner et al., 2005) for several reasons: Crooke et al. (2008) (a) did not provide a complete operational definition for ST; (b) did not collect treatment fidelity measures (nor was treatment fidelity measured in Koning et al., 2013, or Lee et al., 2016); and (c) used a weak experimental design (i.e., pretest–posttest), resulting in a failure to demonstrate a functional relationship. What is more, there have yet to be enough (as defined by Horner et al., 2005) studies using a single-subject research design (at least five needed), with enough participants (at least 20 needed), across different research groups (at least three needed).
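
The quantity thresholds in the preceding paragraph lend themselves to a simple arithmetic check. The sketch below is illustrative only: the function and constant names are hypothetical, and it covers only the three counts named above (studies, participants, research groups), not the quality indicators that Horner et al. (2005) also require. The input counts come from this article: one single-subject study (Crooke et al., 2008) with six participants, and therefore one research group.

```python
# Quantity thresholds as stated in the text (Horner et al., 2005).
MIN_STUDIES = 5
MIN_PARTICIPANTS = 20
MIN_RESEARCH_GROUPS = 3


def horner_quantity_shortfalls(n_studies, n_participants, n_research_groups):
    """Return how far each count falls below its threshold (0 if met)."""
    return {
        "studies": max(0, MIN_STUDIES - n_studies),
        "participants": max(0, MIN_PARTICIPANTS - n_participants),
        "research_groups": max(0, MIN_RESEARCH_GROUPS - n_research_groups),
    }


# Counts from this article: one single-subject study with six participants.
st_shortfalls = horner_quantity_shortfalls(
    n_studies=1, n_participants=6, n_research_groups=1
)
print(st_shortfalls)
# True only when every quantity threshold is met.
print(all(gap == 0 for gap in st_shortfalls.values()))
```

Under these counts, the single-subject literature on ST would need four more studies, 14 more participants, and two more research groups before the quantity thresholds alone were satisfied.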

Conclusions About Whether ST Is Evidence Based

Of the nine definitions of EBP discussed previously, ST meets only four (APA, 2006; ASHA, 2005; Kazdin, 2008; La Roche & Christopher, 2009), all less rigorous than the five it does not meet—including three cited by Crooke and Winner (2016). By comparison, there are many interventions that meet higher standards of scientific rigor and have been demonstrated to be effective in improving social behavior. Proponents of ST may argue that they have the best scientific evidence; however, ST meets only the definitions of EBP that incorporate judgments of professionals rather than taking into account only scientifically rigorous controlled studies.
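
The tally in the paragraph above can be reproduced from the section-by-section conclusions. The sketch below simply encodes this article's own verdicts; the variable names are hypothetical, and the True/False flags restate the conclusions reached in the preceding sections rather than any independent assessment.

```python
# (definition, met by ST per this article, cited by Crooke & Winner, 2016)
DEFINITIONS = [
    ("APA, 2006", True, True),
    ("ASHA, 2005", True, True),
    ("Dollaghan, 2007", False, True),
    ("Kazdin, 2008", True, True),
    ("La Roche & Christopher, 2009", True, True),
    ("NAC, 2011", False, True),
    ("Wong et al., 2015", False, True),
    ("Cook et al., 2014", False, False),
    ("Horner et al., 2005", False, False),
]

met = [name for name, is_met, _ in DEFINITIONS if is_met]
unmet = [name for name, is_met, _ in DEFINITIONS if not is_met]
unmet_cited = [
    name for name, is_met, cited in DEFINITIONS if not is_met and cited
]

print(len(DEFINITIONS), len(met), len(unmet), len(unmet_cited))
print(unmet_cited)
```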

Given the tremendous impact that intervention can have for an individual diagnosed with ASD and his or her family, it is imperative to have high standards for what constitutes EBP. It may be that this will become clearer as the field adopts a more rigorous and universally applied definition of EBP, which is likely to happen over time. However, even in the current context, in which multiple and differing definitions exist, one cannot conclude that ST meets most definitions. Instead, it is only marginally sufficient according to the weakest of these.

A Review of the ST Literature

The peer review process acts as a safeguard for consumers of scientific literature. Publishing in a peer-reviewed journal is an impressive accomplishment in any scientific field; however, peer review does not guarantee a study is well designed. A well-designed study includes methods that help assure the consumer that the intervention, or independent variable (and not some extraneous or confounding variable), is solely responsible for the change in the dependent variable (Campbell & Stanley, 1968). Other important components include sufficiently detailed information about the methods to allow for replication and measures of consumer satisfaction with the results and procedures implemented (Wolf, 1978).

Further components of a well-designed behavior–analytic research study include (a) describing the participants, independent variable(s), and dependent variable(s); (b) including objective and observable measures as opposed to descriptions of subjective or unobservable behavior; (c) using an experimental design that minimizes threats to internal and external validity and controls for extraneous variables; (d) collecting interobserver agreement (IOA) data on at least 25% of sessions and finding an acceptable level of agreement; (e) collecting treatment fidelity data on at least 25% of sessions and finding acceptable treatment fidelity; and (f) taking social validity data to ensure that consumers are satisfied with the results and the procedures implemented.

With these six components in mind, we reevaluated the research on ST cited in Crooke and Winner (2016), as well as studies on their peer-reviewed research list (see https://www.socialthinking.com/research). Eleven articles were excluded from this reevaluation, seven of which were commentaries that did not include empirical data or were not published in peer-reviewed journals (Crooke & Olswang, 2015; Crooke, Winner, & Olswang, 2016; Volkmar et al., 2014; Winner, 2002; Winner & Crooke, 2009a, b, 2014). Four were dissertations or theses (Bolton, 2010; Clavenna-Deane, 2010; Taylor, 2011; Yadlosky, 2012), which were excluded because they do not go through the same rigorous peer review process as studies published in peer-reviewed journals. Thus, three empirical studies first evaluated in J. B. Leaf et al. (2016) remained for reevaluation.

Crooke et al. (2008)

Crooke et al. (2008) evaluated ST with six participants diagnosed with Asperger syndrome or “high-functioning” autism. The authors evaluated whether implementation of ST methodology resulted in desired changes in expected verbal behavior, initiations, and listening with one’s eyes. The authors used a pretest–posttest design (similar to an AB case design), though they stated that this study was part of a larger multiple baseline design. The results showed significant increases in expected verbal behavior, initiations, and listening with one’s eyes for all six participants after the implementation of ST. The authors clearly described their dependent variables. Additionally, IOA data were collected in at least 33% of sessions, and the average IOA was above 80% for each dependent variable.

Despite the promising findings of Crooke et al. (2008), several components of a well-designed empirical evaluation were missing. First, although the authors stated the study was part of a larger multiple baseline design, they evaluated the data using a pretest–posttest design that has been described as weak (Bailey & Burch, 2002) and preexperimental (Campbell & Stanley, 1968). This design does not control for many threats to internal validity (e.g., history, maturation, testing, instrumentation, and interactions among these) and threats to external validity (e.g., interaction of testing and the treatment and of selection and the treatment). As a result, it remains unclear whether the ST methodology was responsible for the observed changes, as the design did not control for potentially confounding variables.

Second, Crooke et al. (2008) did not include key demographics of the participants, such as formal language or social assessments that could have provided information about the behavior of each participant. Failure to report this and other important information makes it difficult for professionals to determine whether ST would be effective with their clients.

Third, although Crooke et al. (2008) provided an appendix of the independent variable (ST), they did not provide enough detail for replication. Fourth, treatment fidelity measures were not included, so it is impossible to know if the treatment was implemented as planned or if other techniques were inadvertently used. Finally, measures of social validity were not assessed, making it impossible to identify whether the participants or their parents were satisfied with the results or the procedures. Taken together, failure to include these components results in critical flaws that seriously weaken the impact of the study.

Koning et al. (2013)

Koning et al. (2013) included seven participants who received treatment and eight assigned to a control group. The authors implemented an intervention package that included prompting, reinforcement, coaching, and ST worksheets (Winner, 2002, 2005a). The authors used a randomized control group design to evaluate the effects of the intervention package, and results were significant. However, many components of a well-designed study were missing. First, Koning et al. did not describe the treatment with detail adequate for replication. Second, they did not collect treatment fidelity data to ensure that the teaching methodology was implemented as intended. Finally, and most importantly, they evaluated a treatment package with multiple components, only one of which was unique to ST. Because a component analysis was not conducted, it is impossible to know whether the ST worksheets were critical or effective, or whether some other variable (e.g., prompting or reinforcement) was responsible for the change.

Lee et al. (2016)

Lee et al. (2016) evaluated ST with 39 individuals diagnosed with ASD or who had other social communication problems. The authors used several components consistent with ST and evaluated behavior change using the ILAUGH model (Winner, 2005b). Using a one-group, pretest–posttest design without a control group, the authors found significant differences in five of the six domains of the ILAUGH model.

Despite these positive findings, Lee et al. (2016) lacked several features of a well-designed study. First, descriptions of the participants were not detailed and did not include measures of communication or social deficits. Second, the design, which did not include a control group, was only preexperimental (Campbell & Stanley, 1968), as it did not include controls for the same threats to internal and external validity as the AB case design in Crooke et al. (2008). Third, no measures of treatment fidelity or social validity were included. Fourth, the dependent measures were subjective, involving only a nonstandardized assessment, the ILAUGH model.

Conclusions About ST Research

Crooke and Winner (2016) contended that the three aforementioned studies, along with dissertations (Bolton, 2010; Clavenna-Deane, 2010; Taylor, 2011; Yadlosky, 2012) and conceptual papers (Crooke et al., 2016; Crooke & Olswang, 2015; Volkmar, Siegel, Woodbury-Smith, King, McCracken, & State, 2014; Winner, 2002; Winner & Crooke, 2009a, b, 2014) provide “preliminary data about the potential benefits of individual components of ST” (p. 404). However, dissertations are not subjected to peer review, and conceptual papers lack empirical data.

Although Crooke and Winner (2016) acknowledged the available data are preliminary, our observation is that excitement generated in workshops far exceeds the data’s rigor. Again, intervention approaches are available whose effectiveness has been demonstrated in improving the social functioning of individuals diagnosed with ASD, and the suggestion that ST may improve on these is purely speculative. We appreciate Crooke and Winner’s acknowledgement of the need for more empirical evidence and look forward to reading peer-reviewed empirical studies in the future. One may recognize some promise anecdotally, but ST awaits empirical support.

Purported Misconceptions and Inaccuracies

Crooke and Winner (2016) stated that their purpose was to “address several misconceptions and inaccuracies that were advanced in the article ‘Social Thinking: Science, Pseudoscience, or Antiscience?’” (p. 403). Some of these purported misconceptions and inaccuracies require clarification.

One criticism of J. B. Leaf et al. (2016) was that the authors stated that ST proponents claim “they can produce high levels of success quickly across a variety of disorders” (p. 152). J. B. Leaf et al. cited Green’s (1996) definition of pseudoscience, which does address rates of success over time. However, this is the one characteristic of pseudoscience J. B. Leaf et al. did not discuss with respect to ST, and the authors accept that ST does not make such a claim. Crooke and Winner (2016) did not dispute any other characteristics of pseudoscience that J. B. Leaf et al. found to be associated with ST. In fact, the authors critiqued ST only in areas in which there was clear evidence of pseudoscience, and rate of acquisition was not part of this critique.

Crooke and Winner (2016) also claimed that the assertion that people associated with ST make negative statements about proven therapies is unfounded. But they pivoted their position, arguing that treatment should be individualized, and providing recommendations of an array of teaching procedures, some of which are not evidence based (e.g., the Greenspan Floortime Approach; see NAC, 2015b) or have serious methodological flaws (e.g., Social Stories; see J. B. Leaf et al., 2015).

Advocating for individualized teaching does not address the negative statements ST proponents have made about EBP, particularly procedures based on the principles of ABA. Although Crooke and Winner (2016) wrote that “ABA has proven to be successful in helping children with autism develop increased basic social competencies” (p. 406), this implies that ABA and the principles of reinforcement are not successful at teaching more advanced social behaviors, as does Winner’s (n.d.) charge that “complex behavior change is not based on a simple reinforcement system” (para. 3). This is not consistent with the research (e.g., Dotson, Leaf, Sheldon, & Sherman, 2010; Ferguson, Gillis, & Sevlever, 2013; Kamps et al., 1992; Kamps, Barbetta, Leonard, & Delquadri, 1994; Kassardjian et al., 2014; Koegel & Frea, 1993; J. B. Leaf et al., 2009; J. B. Leaf, Dotson, Oppenheim, Sheldon, & Sherman, 2010; J. B. Leaf et al., 2017; J. B. Leaf, Oppenheim-Leaf, et al., 2012; J. B. Leaf, Tsuji, et al., 2012; Oke & Schreibman, 1990; Schrandt, Townsend, & Poulson, 2009).

These comments, plus additional negative comments about ABA in other writings (cited in J. B. Leaf et al., 2016), and the ST website’s promotion of links (Social Thinking, n.d.) to the website of another organization that has called ABA interventions “inappropriate or even harmful” (ASAN, n.d.) and urged individuals to seek out interventions other than ABA for ASD treatment (ASAN, 2015), result in uncertainty, at best, about Crooke and Winner’s stance on interventions that meet not only EBP criteria but also the more stringent EST criteria.

A third area in which Crooke and Winner (2016) claimed J. B. Leaf et al. (2016) misrepresented their position is science, as in, for example, “Winner’s statements about scientific methods are also mixed” (p. 155). In particular, Crooke and Winner objected to the citation of Winner’s (2013) statement that “we have put the proverbial cart before the horse in being asked to provide scientifically rigorous evidence for an area that remains highly subjective and open to interpretation in every facet of its application” (p. 229), suggesting that this quotation in context tells a different story:

How can we assess an area—social thinking and related skills—that’s never been clearly defined, in a population of individuals—those with ASD and related disabilities—that have no common grouping upon which research can be based? We have put the proverbial cart before the horse in being asked to provide scientifically rigorous evidence for an area that remains highly subjective and open to interpretation in every facet of its application [emphasis added in Crooke & Winner, 2016]. Nevertheless, many of us continue to pursue the development of treatment methodologies that can be shown to be effective through research methods developed for more individualized instruction, such as single subject designs. (Winner, 2013, pp. 229–230)

Although Winner (2013) does acknowledge single-subject designs, the essence of the argument is that the complexity of social skills necessitates interventions that are so subjective and open to interpretation that they are not researchable, and that practitioners should settle for professional judgment in lieu of a scientific approach. We disagree that professionals cannot empirically evaluate procedures or interventions in “everyday settings,” as numerous such studies have been conducted (e.g., Gould, Tarbox, O’Hora, Noone, & Bergstrom, 2010; Kamps et al., 1992, 1994; Kassardjian et al., 2014; J. B. Leaf et al., 2017; Persicke, Tarbox, Ranick, & St. Clair, 2013). Furthermore, Crooke and Winner (2016) stated that they are not ready to group individuals to evaluate social skills, but behavioral researchers have published numerous articles on group design experiments to teach children diagnosed with ASD basic and advanced social behavior (e.g., Dotson et al., 2010; Ferguson et al., 2013; Kamps et al., 1992, 1994; Kassardjian et al., 2014; Koegel & Frea, 1993; J. B. Leaf et al., 2009, 2010, 2017; J. B. Leaf, Oppenheim-Leaf, et al., 2012; J. B. Leaf, Tsuji, et al., 2012; Oke & Schreibman, 1990; Schrandt et al., 2009).

As another example, J. B. Leaf et al. (2016) quoted Winner (2008): “If our goal is to determine the best or most promising practices, we need to consider more than best scientific evidence” (p. 107). Crooke and Winner (2016) provided further context:

If our goal is to determine the best or most promising practices, we need to consider more than best scientific evidence [emphasis added in Crooke & Winner, 2016]. Social skills play out in the “real world,” one that involves family/client values, cultural differences, economic backgrounds, not to mention the clinician’s experience in the field itself, and any preconceptions and perceptions that the clinician brings to the experience. (Winner, 2008, p. 107)

This is another statement that is difficult to interpret, even within context, and it remains unclear how important research is to ST methodology. Are considerations such as clinical expertise, family values, or cultural differences more important than research? These should not be impediments and should in fact be embraced as part of conducting applied research. Per J. B. Leaf et al. (2016), Winner’s statements about science and research are mixed at best, with a preponderance that questions the value of the scientific method. We suggest consumers carefully evaluate these statements.

The Conceptual Basis of ST

Behavior Change Principles

ST creates several problems for practicing behavior analysts, and Crooke and Winner’s (2016) response highlighted some of these. For example, it is not clear which techniques (e.g., reinforcement, punishment) are responsible for behavior change, though Crooke and Winner provided examples of empirically supported practices included in ST methodology. Specifically, they stated that “ST is a methodology upon which empirically supported research-based practices (e.g., modeling, naturalistic intervention, reinforcement, visual supports) can aggregate into specific strategies (e.g., establishing reciprocity, initiating social contact, utilizing problem-solving), via lessons, and activities for implementation” (p. 403).

However, earlier articles on ST include conflicting statements about the potential mechanisms. For example, a study cited in Crooke and Winner (2016) argued the following:

Unlike previous studies reported in the literature, this approach [ST] does not use reinforcement to increase desired social behaviors, nor does it use tangible consequences or punishment to decrease less desirable behaviors [emphasis added]. Instead, children were taught to understand that others had “thoughts” separate from their own and that “social” is based on understanding and regulating others’ thoughts via their own individual behaviors. (Crooke et al., 2008, p. 586)

Recall that the study by Koning et al. (2013), one of the three empirical studies cited by Crooke and Winner (2016), reported using prompting and reinforcement. Also, Crooke et al. (2016), cited in Crooke and Winner (2016), discussed how ST aligns with cognitive behavioral therapy (CBT): “Behavior-based approaches center on the power of an identified ‘reinforcer’ to motivate new learning of memorized social skills. Cognitive frameworks, such as CBT, focus on creating awareness and motivation to change performance” (p. 289). Given the inconsistency within and across ST studies, it is unclear what procedures are included or not included in ST and what mechanisms are responsible for any observed change.

Based on Crooke and Winner’s (2016) response and other works, it is clear that the methodological components of ST differ from study to study; the three empirically based articles Crooke and Winner cited did not provide descriptions detailed enough to determine which components were responsible for behavior change. Reasons for these varied portrayals of the techniques involved in ST could include that the methodology has evolved over time and different techniques have been accounted for through clinical observation. If proponents of ST want to equate their methodology with components of others, as they often do, it behooves them to be more descriptive of their behavior change techniques in practice and research.

Endorsing Eclectic Approaches

Crooke and Winner (2016) wrote that “regarding the value of other therapies, Winner and ST have consistently stated the importance of individualizing practices to the needs of the individual, rather than the adoption of one therapy or approach for all or based solely on diagnosis” (p. 406). This is followed by implications that individuals diagnosed with ASD may benefit from interventions employing several different strategies, behavioral and cognitive behavioral, as well as the Greenspan Floortime Approach (a procedure without established evidence according to the National Standards Project, Phase 2; see NAC, 2015b) and Social Stories (a procedure shown to have serious methodological flaws; see J. B. Leaf et al., 2015).

In a non-peer-reviewed commentary, Winner and Abildgaard (n.d.) stated that ABA and ST should merge to meet learners’ individual needs: “So rather than argue whether a student should receive ABA or Social Thinking, we try to explore how we can merge the best ideas from both treatments into one intervention approach for our higher functioning students” (p. 1).

These and other statements suggest ST proponents believe that individualized treatment for those diagnosed with ASD is appropriate, that behavior analysts should work collaboratively with professionals from other teaching philosophies, and that effective treatment may include multiple teaching approaches. We agree with Crooke and Winner (2016) that programming needs to be individualized and have discussed this in numerous publications (e.g., J. B. Leaf et al., 2015, 2016; Taubman, Leaf, & McEachin, 2008). Curricular goals need to be individualized to help ensure therapists are targeting functional, meaningful, and culturally responsive skills. Additionally, the teaching methodology should be individualized based on what has been demonstrated to be most effective. However, all interventions should consist only of EBPs with empirical support. By using empirically supported, evidence-based methodologies, practitioners help ensure provision of interventions with documented effectiveness as opposed to those that may or may not lead to meaningful behavior change.

We also agree with Crooke and Winner (2016) that behavior analysts should work collaboratively with other professionals. If a child has an oral-motor disorder and a behavior analyst does not have expertise in this area, that behavior analyst should work with others with more expertise, such as speech–language pathologists. If an adolescent demonstrates signs of depression, a behavior analyst should work with a social worker or licensed psychologist who has appropriate expertise. Collaboration is essential, but it does not mean behavior analysts should endorse, recommend, or implement procedures that lack empirical support. Many behavior analysts have provided suggestions and flow charts for working collaboratively with professionals who may promote non-evidence-based practices (e.g., Brodhead, 2015; Chok, Reed, Kennedy, & Bird, 2010).

Furthermore, methodologies that are non-evidence based, pseudoscientific, or antiscientific should not be considered behavior–analytic procedures. Labeling them as such goes against our guiding principles (Baer, Wolf, & Risley, 1968; Green, 1996). Behavior Analyst Certification Board (BACB) certificants who encounter professionals recommending ST or procedures based on ST must not endorse, recommend, or implement these procedures, per the BACB’s Professional and Ethical Compliance Code for Behavior Analysts, Section 8.01 (Behavior Analyst Certification Board, 2017). Additionally, we encourage behavior analysts to discuss the importance of implementing strategies that are empirically supported, the necessity of avoiding pseudoscience or antiscience, and the potential harm of implementing procedures not based on empirical support.

We also agree that there are instances when it is appropriate to use multiple techniques during treatment (J. B. Leaf et al., 2016). For example, quality intervention could include discrete trial teaching, the teaching interaction procedure, systematic desensitization, and token economies. However, we vehemently disagree with an eclectic approach (multiple interventions combined) when some components are not empirically validated or evidence based, or if the evidence is questionable. For example, because the Greenspan Floortime Approach is not evidence based (NAC, 2015b), we would not recommend it be used with ABA-based procedures. Nor would we recommend Social Stories in conjunction with ABA-based procedures, as the research on Social Stories is questionable (Kokina & Kern, 2010; J. B. Leaf et al., 2015; Reynhout & Carter, 2011; Sansosti, Powell-Smith, & Kincaid, 2004). There have yet to be any empirical studies comparing ABA combined with ST to an approach solely based on the principles of ABA, and ST is considered an unestablished intervention (NAC, 2015a, b). Until such research occurs, we must be guided by the current research, which demonstrates that an eclectic approach is not as effective as an approach implementing only ABA-based procedures (Howard, Sparkman, Cohen, Green, & Stanislaw, 2005; Howard, Stanislaw, Green, Sparkman, & Cohen, 2014; Lovaas, 1987). Given the importance of treatment efficacy, behavior analysts should be reluctant to implement practices that dilute the effectiveness of interventions based on the principles of behavior analysis. Thus, we should not embrace or promote an eclectic approach to intervention for individuals diagnosed with ASD.

Conclusion

The purpose of this article was to address the response from Crooke and Winner (2016) regarding ST, which we did by clarifying and expanding upon perceived misconceptions and inaccuracies they outlined. Although there are three published studies evaluating interventions based on ST, each has serious methodological flaws, leaving us with minimal confidence in ST’s effectiveness. The current article also revealed that ST does not meet EBP criteria, per several established definitions (e.g., Cook et al., 2014; Dollaghan, 2007; Horner et al., 2005), as well as the National Standards Project, Phase 2 (NAC, 2015b). This article also demonstrated that the conceptual basis for ST is not aligned with a behavior–analytic worldview; ST may be described as a mentalistic approach. Furthermore, the principles that guide ST have not been clearly operationally defined and have been described differently across publications. Finally, we discussed how proponents of ST endorse eclecticism and the problems with such an endorsement.

As J. B. Leaf et al. (2016) argued, ST has many hallmarks of pseudoscience, which Crooke and Winner (2016) did not dispute. As ST is a nonempirically validated procedure, it is still our recommendation that behavior analysts not endorse, recommend, or implement it, either alone or in conjunction with procedures based on principles of ABA, as it may dilute interventions’ effectiveness and worsen outcomes. This is true especially given the empirical evidence and clinical support that show procedures based on the principles of ABA and comprehensive ABA interventions have been effective in teaching social behaviors to more impacted and “higher functioning” individuals diagnosed with ASD (Chung et al., 2007; Dotson et al., 2010; Kamps et al., 1992, 1994; Kassardjian et al., 2014; Koegel & Frea, 1993; J. B. Leaf et al., 2009, 2010, 2017; J. B. Leaf, Oppenheim-Leaf, et al., 2012; J. B. Leaf, Tsuji, et al., 2012; Oke & Schreibman, 1990).

Further, we hope that behavior analysts will help educate parents and other professionals on the importance of implementing procedures that are evidence based, empirically supported, and not pseudoscientific (e.g., Chok et al., 2010), and will critically analyze all procedures, including those based on the principles of ABA. Although J. B. Leaf et al. (2016) and the current article focused on ST, ST is not the only procedure implemented for individuals diagnosed with ASD that is not evidence based, not empirically supported, or pseudoscientific. Much of this analysis of ST could be applied to other controversial and non-evidence-based procedures (see Jacobson, Foxx, & Mulick, 2005, for a detailed discussion of these). Many of these procedures, including ST, can have serious unintended negative consequences (e.g., wasting valuable teaching time and money, making negative emotional impacts) for those diagnosed with ASD and their families. It may be that ST will scale the ladder of EBP, and if that happens, recommendations regarding its inclusion in treatment for ASD could be altered. Until then, the reluctance to dilute effective interventions with procedures not yet verified must stand.