At the end of World War II, Vannevar Bush wrote a treatise on the role of science in technological innovation called Science: The Endless Frontier (Bush, 1945). The term basic research was coined in this report (p. 1). Bush distinguished basic from applied research on the basis of goals. He claimed that basic research was conducted for the sake of knowledge and applied research conducted for the sake of practical use. The products of basic research were transferred across domains, generating technological innovation. Technology transfer was Bush’s conceptual framework depicting how the products of basic research benefited society at large. This paradigm of the polar distinction between basic and applied research remains dominant in science (Stokes, 1997). In the last decade, the U.S. National Institutes of Health (NIH) have promoted a bridge between the basic and applied poles called “translational research.” Concepts such as the “translational pipeline” (Cheeran et al., 2009) guide basic and applied research in the behavioral and biomedical sciences toward technological innovation.

Some of the earliest writing on knowledge–practice transformation in behavior analysis described the experimental analysis of behavior and applied behavior analysis operating at opposite ends of a basic–applied continuum (Hake, 1982). Parallel to Bush’s (1945) distinction based on goals, we contend that there is a necessary trade-off between internal validity and ecological validity that distinguishes basic research from applied research and practice. Experimental analysis of behavior is an efficient way of identifying behavioral mechanisms and making causal inferences about functional environment–behavior relations. By definition, applied research involves socially important problems (Baer, Wolf, & Risley, 1968, 1987) and enables the development of strategies and interventions that can be modified for implementation in practice.

The term translational is popular, but it is not well defined (Woolf, 2008). Within behavior analysis, translational research is sometimes defined as any research that occupies the space between “pure” basic and “pure” applied research, irrespective of the long-term objectives of the researchers (Hake, 1982; McIlvane, 2009) or as any programmatic attempt to produce research informed by practice or vice versa (Lerman, 2003; Mace, 1994). Translational has been used to describe collaborative studies across fields (Ogilvie, Craig, Griffin, Macintyre, & Wareham, 2009), the examination of naturally occurring events with quantitative models (Critchfield & Reed, 2009), animal models of applied phenomena (Mace & Critchfield, 2010), tests of the generality of phenomena with human populations (Adams, 2008), applications of basic findings in clinical settings (Goldblatt & Lee, 2010), and the testing of the products of applied research in the community (McIlvane, 2009). Furthermore, the process of translation is conceptualized as a unidirectional (Zucker, 2009), bidirectional (Cheeran et al., 2009; Hörig, Marincola, & Marincola, 2005), or multidirectional or dynamic relation between basic and applied domains and scientific fields (Marr, 2017; Ogilvie et al., 2009).

Ultimately, developing a consensus about what constitutes translational behavior analysis may alleviate some barriers to effective transformation of knowledge into practice and vice versa within the discipline and across other sciences. However, any attempt to describe precise boundaries among basic, translational, and applied research that is not predicated on a clear taxonomy, encompassing the complete spectrum of possible behavior analysis research strategies, is unlikely to produce a consensus. In this article, we propose a taxonomy inspired by a biomedical model of translational research that organizes behavior analysis as a spectrum divided into five tiers. We hope that systematic criteria for classification can provide a foundation for developing and promoting the continued advancement of behavior analysis.

Biomedical Translational Pathways

The term translational first appeared in the scientific literature in 1993 to describe basic research in genetics that uncovered factors relevant to the fight against cancer (Butler, 2008). Since that time, the term has increased in frequency across the biomedical sciences in general and cancer research in particular (Cambrosio, Keating, Mercier, Lewison, & Mogoutov, 2006). In the United States, the NIH created the NIH Roadmap to guide biomedical science (Collins, Wilder, & Zerhouni, 2014; Zerhouni, 2003). The first initiative in the Roadmap was the “bench to bedside and back” model. This model is focused on translating the products of basic research from the laboratory into clinical practice. In bench-to-bedside translation, the onus is on basic and applied researchers, including scientist practitioners, to address issues of the other domains in their research and practice. The second initiative, called “to the community and back,” encourages implementation of the products of basic and applied research in the community and places the burden of translation on public health scientists. Mold and Peterson (2005) added a third initiative, “from dissemination to practice,” to the Roadmap. The goal of dissemination-to-practice translation is to focus on how to incorporate, rather than simply implement, the various products of science into the landscape of clinical practice. The burden is on practitioners to build and restructure best practices based on science.

Biomedical researchers adapted and refined the Roadmap in service of particular objectives. For example, Blumberg, Dittel, Hafler, Von Herrath, and Nestle (2012) built on the work of others (Finkbeiner, 2010; Sung et al., 2003; Westfall, Mold, & Fagan, 2007; Woodcock & Woosley, 2008) to specify a translational pathway for autoimmune disease research with five tiers. Tiers were defined in terms of the operational challenges to be overcome. For Blumberg and colleagues, Tier 0 (T0) is research that defines cellular mechanisms. Tier 1 (T1) begins the translation of bench to bedside by establishing proof of concept in (healthy) human subjects. Both Tier 2 (T2) and Tier 3 (T3) involve clinical trials but are distinguished by their objectives: controlled studies leading to effective treatment (translation to patients; T2) and operationalization of procedures that optimize the use of the treatment with patients who stand to benefit from it (translation to practice; T3), respectively. Population-level outcome effectiveness research is described separately as Tier 4 (T4).

In autoimmune disease research, activities associated with each tier are distributed across academia, government, industry, and community, with limited interaction across sectors. Blumberg et al. (2012) argued that this lack of effective integration has impeded drug discovery for autoimmune disorders. They proposed several potential solutions, including an increased emphasis on preclinical T0 research and the creation of centers where collaborators work across tiers to integrate experimental research, drug manufacturing, and patient care. Success in implementing these solutions has been limited to date, and the solutions themselves are rapidly becoming economically and practically unfeasible in biomedical research (Fernandez-Moure, 2016). In comparison, identifying and addressing problems in behavioral science may require less infrastructure and can involve more flexibility in time frame and scale, making a similar approach to classifying research in behavior analysis viable and relatively expedient.

A Tiered Spectrum of Behavior Analysis Research

The theoretical underpinnings of basic and applied behavior analysis are more alike than different. Both domains are pragmatic in the sense that they judge research on the basis of effective action (Lattal & Laipple, 2003), but the goals of basic and applied research differ based on whether the fundamental question relates to understanding behavior or curing behavior problems (Azrin, 1977). Basic researchers tend to choose subjects, responses, stimuli, and settings for the purposes of maximizing experimental control. Applied researchers typically choose responses, stimuli, and settings that are socially significant or immediately important to the subject (Baer et al., 1968).

Pathways for effective translational medical research were first established before behavior analysis existed as a field (Boynton & Elster, 2012), so it seems reasonable to learn from medical approaches. We propose a behavior analysis spectrum that is divided into five tiers, adapted from the translational pathway described by Blumberg et al. (2012). Figure 1 is a schematic of this pathway. Rather than emphasize operational challenges that are necessarily tied to potentially ambiguous research objectives, as Blumberg and colleagues did, our application of this approach to behavior analysis defines tiers in terms of relative priority placed on internal validity versus social importance of several aspects of the research. Specifically, we consider the selection of research subjects, target behaviors, relevant stimuli, and data collection settings. Each of these criteria was discussed by Baer et al. (1968) as being important to one or more dimensions of applied behavior analysis. Our aim with this tiered model is to provide a scheme for categorizing across the basic–applied behavior analysis continuum in a systematic and meaningful way.

Fig. 1

A basic–applied spectrum for behavior analysis research adapted from a biomedical translational pathway (Blumberg et al., 2012). Each tier has a label (e.g., T0), a description, a potential starting point for research, and an expected outcome. This tier system classifies research according to (a) whether the research subjects were selected for experimenter convenience (T0) or as representatives of those who stand to benefit from research outcomes (T1–T4) and (b) whether the target behaviors, stimuli, and settings were selected based on convenience to the researcher (T0 and T1), social importance (T3 and T4), or a mixture (T2). Research in lower tiers prioritizes internal validity and experimental control, whereas research in higher tiers emphasizes ecological validity and societal relevance. T0 = Tier 0; T1 = Tier 1; T2 = Tier 2; T3 = Tier 3; T4 = Tier 4

The question may arise as to what constitutes a problem of social significance or social importance in the selection of subjects, behaviors, stimuli, and settings. There will necessarily be some variability in judgment of social validity in behavioral goals, procedures, and effects (Wolf, 1978), which makes the borders of our tiers less than rigid. We do not attempt to provide a comprehensive definition of social significance that addresses the many and varied ways in which behavior analysts use the term. For our purposes, research that involves socially important elements is expected to have a reasonably direct impact on the world that is independent of its contribution to science. Moreover, social importance in research does not necessarily equate to urgency or scope of impact under naturalistic circumstances; problems of social significance can range from simple and idiosyncratic to vast epidemics.

To be pragmatic, our model adopts a standard of substitutability. A subject, behavior, stimulus, or setting is socially important if replacing it with something else changes the nature of the contribution. When a subject, behavior, stimulus, or setting is convenient, the researcher could theoretically replace it with another and still answer the same research question, although doing so might take longer or cost more. Convenience does not imply carelessness in the selection of research methods, nor does it imply that the subject, behavior, stimulus, or setting selected is not important outside the research context. Descriptions of the specific research tiers in our model all include several examples in which convenience and social significance are contrasted.

We can consider a behavior socially important if it affects the physical, social, or occupational functioning of an individual, population, or community. Examples of socially important behavior span from challenging behaviors related to developmental disabilities (e.g., self-injurious behavior) to positive behaviors that promote health and longevity (e.g., exercise) and well-being (e.g., developing hobbies) in individuals and in society at large. In research that does not involve socially important behavior, target responses and dependent variables tend to be selected for the investigators’ convenience. Head entry into food magazines, key pecks, lever presses, and computer mouse clicks are often selected as responses because they are easy to establish and modify, are easy to record, or avoid potential logistical or ethical problems. The critical distinction between an experimentally convenient behavior and a socially important one is not its topography but its potential to affect a target population or client. In principle, lever pressing could be a socially important behavior, and exercise could be experimentally convenient (see Baer et al., 1968).

Socially important stimuli are materials (e.g., discriminative cues, task materials, academic targets, potential reinforcers and punishers) relevant to the research question under naturalistic conditions. For example, in one study, researchers interested in medication adherence measured whether participants opened pill bottles and provided incentives if participants took their pills or provided text-message prompts if participants did not take their pills (e.g., Raiff, Jarvis, & Dallery, 2016). In this case, the incentives, prompts on a phone, and medication were socially important stimuli. Convenient antecedent stimuli are often simple, abstract stimuli (e.g., geometric primitives, different frequencies of flashing lights or white noise, Gabor patches) that are relatively easily perceived by the subject (e.g., visual stimuli for pigeons and auditory stimuli for rats) and are sometimes created and used exclusively in a research context (e.g., greebles, nonwords). Convenient antecedents may be specifically selected because they are unfamiliar and therefore initially meaningless to research subjects or inconsequential to the subject’s daily functioning. By contrast, consequences tend to be convenient because their efficacy can be assumed within the context of the experiment. For example, journals do not generally expect or require authors to include a functional analysis in a study where food was used to reinforce responding in food-restricted laboratory animals.

Socially important settings are places in which the environment–behavior relations of interest may be observed or manipulated under naturalistic circumstances outside of the research context. For the purposes of our model, a socially significant setting may be the subject’s home, a classroom, a residential facility, a treatment center, or a clinic that the subject usually attends for services outside of the research context or a setting where a treatment for a treatment-seeking population is usually applied. For example, a small, quiet room in which to conduct one-on-one teaching trials for a skill acquisition intervention may be a socially important setting for the evaluation of an individually delivered teaching intervention but may not be a socially important setting for a classwide intervention. Similarly, laboratories and other contrived environments would only be considered socially important settings if generalization of the training to a naturalistic context is measured and if one or more elements of the setting were specifically designed to facilitate that generalization. A laboratory or other convenient setting minimizes potential extraneous variables and permits observation of behavior–environment relations in an unadulterated, if synthetic, form.

Behaviors, stimuli, and settings cannot be socially significant unless they are important to someone, something, or some group in particular. A socially important research subject is one that represents the individuals for whom the research outcomes are socially significant. These subjects might be the “clients” themselves but can also include human proxies and animal models that share key characteristics with clients. In principle, research might involve a socially important subject but convenient behaviors, stimuli, and settings, but if any behavior, stimulus, or setting is socially important, there must be a client to whom it is socially important. For this reason, the use of socially important research subjects is the key feature that distinguishes T0 from T1 in our model, and the use of any other socially important features distinguishes T2 from T1. T3 and T4 both feature socially important subjects, behaviors, stimuli, and settings but are distinguished by whether the study reports new applications or evaluates the impact of established applications on populations, communities, or societies.
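
The classification logic described above can be sketched as a small decision procedure. The following Python is only an illustration under stated assumptions, not part of the model itself: the Study record, its field names, and the classify_tier function are hypothetical, and raising an error when a convenient subject is paired with a socially important behavior, stimulus, or setting is an assumption about how to handle a combination that the model treats as incoherent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Study:
    """Hypothetical record of how each element of a study was selected.

    True means the element was selected for social importance;
    False means it was selected for experimenter convenience.
    """
    subject_important: bool
    behavior_important: bool
    stimuli_important: bool
    setting_important: bool
    # Only consulted when every element is socially important:
    # True if the study evaluates the impact of an established application
    # on populations, communities, or societies (T4) rather than reporting
    # a new application (T3).
    evaluates_established_application: bool = False


def classify_tier(study: Study) -> str:
    """Return the tier label (T0 to T4) implied by the selection criteria."""
    others = (
        study.behavior_important,
        study.stimuli_important,
        study.setting_important,
    )
    if not study.subject_important:
        if any(others):
            # Assumption: the text says a socially important behavior,
            # stimulus, or setting requires a client, so this combination
            # is rejected here rather than assigned a tier.
            raise ValueError("socially important element without a client")
        return "T0"
    if not any(others):
        return "T1"  # socially important subject, everything else convenient
    if not all(others):
        return "T2"  # mixture of convenient and socially important elements
    return "T4" if study.evaluates_established_application else "T3"
```

For instance, a pigeon key-pecking experiment would be entered as Study(False, False, False, False) and classified as T0, whereas a study with client subjects and a socially important behavior but a convenient setting and stimuli would be classified as T2.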

Although we suspect that most researchers would consider nearly all research classified in the lowest (T0) and highest (T3, T4) tiers as basic and applied, respectively, those labels do not necessarily neatly map onto all tiers. There are T1 experiments that are published in applied journals that many behavior analysts would consider “applied” and T2 studies that many might call “basic.” We propose this model as a classification scheme in the hope that it will advance discussion of translational behavior analysis specifically and discussion of behavioral research more broadly. The tiered model is not designed to rank research output from most basic to most applied; instead, it organizes the research by aspects of the method. One potential advantage of the model is that it provides an alternative to the basic–applied dichotomy, which might be fairly construed to be overly reductionist and likely to discourage meaningful efforts toward transforming knowledge into practice and practice into new knowledge.

In the following sections, we identify recent research articles that are examples of each tier in the pathway and provide some clarification about the characteristics of subjects, behaviors, stimuli, and settings that exemplify research in that tier. Because research on impact assessment is large in scale and often synthesizes results from several studies or interventions, we looked for research published over a longer time period that exemplified T4 scholarship.

Tier 0: Blue Sky Basic Science

Tier 0 research contributes new knowledge about the fundamental nature of behavior–environment relations. Internal validity and experimental control are prioritized in the selection of research subjects, behaviors, stimuli, and settings.

Pigeons were the most frequently used research subject in articles published in the Journal of the Experimental Analysis of Behavior between 1958 and 2013 (Zimmermann, Watkins, & Poling, 2015), so a laboratory experiment in which pigeons pecked lighted discs for access to food is probably the quintessential T0 study. For example, Andrade and Hackenberg (2017) studied pigeons’ preference for generalized tokens. Pecking one key earned tokens that were indicated by the illumination of small lights. Pecking a second key exchanged tokens. Red tokens could be exchanged for food, green tokens could be exchanged for water, and white tokens were generalized tokens that could be exchanged for either food or water. Preference for generalized tokens was sensitive to their price relative to the available alternatives. Fox and Kyonka (2016) evaluated how changing the location and color of an illuminated response key affected the pigeons’ ability to use their own behavior as temporal cues in interval schedules of food reinforcement. They found that signaled responses were more effective time markers than unsignaled responses but less effective time markers than stimulus changes that were not response contingent. In these studies, the pigeons, pecks, key light and token characteristics, and food and water were not essential to answering the research question. Instead, the subjects, responses, stimuli, and settings were used to ensure that the research was conducted with as much experimental control as possible. The contribution of these studies was to define functional relations related to generalized reinforcers and time marker efficacy. In principle, different subjects, responses, stimuli, and settings could have been used to obtain the same results.

Purpose-bred captive laboratory animals are convenient research subjects for several reasons. Best practices for the husbandry and welfare of pigeons, rodents, and small primates are well established. In general, standard operating procedures surrounding extra-experimental housing, feeding, and enrichment need not vary much from one experiment to the next, which can make training laboratory personnel and acquiring ethics approval relatively straightforward. Researchers typically have more control and flexibility over the structure of experiments with purpose-bred captive laboratory animals than with human subjects (Baron, Perone, & Galizio, 1991; Vanderveldt, Oliveira, & Green, 2016).

Although purpose-bred captive laboratory animals tend to be convenient research subjects, not all T0 research involves laboratory animals, and not all laboratory animal research is T0. In psychology, the term convenience sample is often synonymous with undergraduate students enrolled in introductory psychology classes. Students are convenient because they are geographically proximate and presumably have the time and motivation to participate in research. Compared to the whole of humanity, they are also disproportionately Western, educated, industrialized, rich, and democratic (Henrich, Heine, & Norenzayan, 2010). Nonrepresentative samples are problematic when the behavior studied correlates with demographic variables. Behavior–analytic research generally bypasses the issue of demographic generalization, either by using single-subject designs or by investigating basic behavioral processes that are presumably preserved across human subjects (if not across species). If biomedical researchers are able to study autoimmune disease functioning without demonstrating empirically that it does not vary with level of education or political origin, surely behavior analysts interested in the fundamental laws of behavior can do the same with selection by consequences.

Some T0 behavior analysis with human subjects involves traditional single-subject designs with replication. In an experiment that compared estimates of sensitivity obtained from generalized matching analyses under negative versus positive reinforcement conditions, college students earned money by clicking with a mouse cursor on moving rectangles displayed on a computer monitor (Magoon, Critchfield, Merrill, Newland, & Schneider, 2017). Participants were recruited through notices posted on a college campus. Data collection occurred in a laboratory setting designed to minimize distractions but maintain interest in the task (participants were permitted to listen to music during experimental sessions). The response was straightforward and relatively abstract; investigators made no attempt to make the task resemble video game play, for example. Participants earned money for responding so that their behavior was sensitive to differential contingencies: Money served as a demonstrably effective, convenient reinforcer, in much the same way as food and water do in experiments with pigeons and other laboratory animals, rather than as a socially important consequence.

Human-subject T0 behavior analysis also includes experimental research in which results are primarily analyzed at the group level. One such study (Jimenez & Pietras, 2017) evaluated whether manipulating social variables affected choice. According to models of risk reduction, individuals should choose a variable income in negative-energy budget conditions but a fixed income in positive-energy budget conditions. Jimenez and Pietras (2017) reported no effect of social variables on college students’ choices in any energy budget. In different conditions, participants were told that working with others involved cooperating with a computer or with a participant at another university (Experiment 1) or were shown that the earnings of the cooperating partner were less than, equal to, or greater than their own (Experiment 2). Analyses of variance showed that choices were consistent with risk reduction predictions and that neither social manipulation had any systematic effect. Jimenez and Pietras provided minimal instructions, used simple stimuli, and arranged earnings to be consequential within the experimental context but without value outside the laboratory so that choices were unlikely to be affected by extraneous variables or participants’ extra-experimental histories.

Comparative research involving cross-species analyses or species-typical behavior sometimes also falls within the scope of T0 behavior analysis. Wright, Magnotti, Katz, Leonard, and Kelly (2016) found that concept formation in Clark’s nutcrackers was similar to that of rhesus monkeys and accelerated compared to that of pigeons. Russell and Burke (2016) conducted five experiments that demonstrated conditional same–different learning in a short-beaked echidna. Sargisson, Lockhart, McEwan, and Bizo (2016) reported that the scalar property—a psychophysical feature of responding—occurred in brushtail possums’ lever pressing for food in a peak-interval procedure. Each of these reports is an example of T0 behavior analysis because the studies used experimentally controlled settings, abstract or unfamiliar stimuli, and convenient reinforcers. Critically, they all contributed to the scientific understanding of behavior–taxa relations in animal learning and operant conditioning. For example, Wright et al.’s contribution would have been similar had the authors studied blue jays instead of Clark’s nutcrackers.

Tier 1: Use-Inspired Research

In our model, T1 research addresses a critical need—that is, the knowledge produced is necessary to solve a specific problem of social significance for a particular subject or population. It is possible to conceive of a “client” who stands to benefit from T1 research, and research subjects are selected because they are or share key characteristics with clients. Like T0 research, T1 research investigates the fundamental nature of behavior–environment relations, but in T1 the results are relevant to a particular population or situation. T1 research involves convenient behaviors, stimuli, and settings so that the functional relation can be specified as clearly as possible.

In some T1 research, the clients themselves are the research subjects. For example, in a study of pregnant women who smoked, demand for cigarettes was greater among heavier smokers than among lighter smokers (Higgins et al., 2017). Higgins and colleagues recruited women undergoing prenatal care and confirmed self-reported smoking biochemically. In an obstetric clinic or in their own homes (places they would not actually be able to purchase cigarettes), participants reported the number of cigarettes they would purchase and smoke at a variety of prices. In this case, using socially important behaviors, stimuli, or settings would have presented ethical challenges. In spite of the necessary artificiality of the cigarette purchase task, its demand characteristics predicted subsequent quit attempts better than conventional predictors.

Tier 1 behavior analysis includes research that targets many different populations. Target populations might be defined by clinical or preclinical diagnosis, demographic characteristics, occupation, or activity, among other factors. Recent examples of T1 in which researchers used clients within the population of interest as research subjects include studies of delay discounting of disordered gamblers (Dixon, Buono, & Belisle, 2016), implicit relations of Northern Irelanders regarding Catholics and Protestants (Hughes, Barnes-Holmes, & Smyth, 2017), conditional discrimination by typically developing children (Bergmann, Kodak, & LeBlanc, 2017), and persistence in a repetitive task by employees (Henley, DiGennaro Reed, Reed, & Kaplan, 2016). These studies can accomplish different types of goals. The examples mentioned herein all demonstrate that a previously validated experimental procedure can be used with a novel target population or establish a proof of concept for studying specific behavioral phenomena in contrived settings.

When the target clients of T1 behavior analysis are not available (e.g., because it would be unethical to use them as research subjects or because recruiting a sufficient sample is not possible), researchers sometimes selectively recruit individuals from a more accessible population to serve as proxies. For example, McEnteggart, Barnes-Holmes, Egger, and Barnes-Holmes (2016) assessed whether the Implicit Relational Assessment Procedure might be useful as a tool for studying auditory hallucinations. Rather than attempt to recruit patients with clinically diagnosed auditory hallucinations, a relatively unusual symptom, McEnteggart and colleagues identified “nonclinical voice hearers” based on responses to several different questionnaires (p. 614). Using proxy subjects allowed the researchers to detect delusional ideation as a predictor of hearing voices, even at subclinical levels.

Researchers also use proxy subjects when the target population is not defined by clinical characteristics. For example, behavioral interventions can be designed to increase the ability of parents and other caregivers of newborns to tolerate inconsolable infant crying. In designing these interventions, researchers might be interested in testing the intervention under contrived circumstances with individuals from a more accessible population because infants’ caregivers might have difficulty traveling to a university campus to participate in experimental research. Glodowski and Thompson (2017) selected participants from a pool of university students who quickly escaped from a recording of an infant crying by clicking a button that terminated the noise. Those participants were then re-exposed to the recording of the infant crying but had distracting activities available. Some participants were able to tolerate the recorded infant cry longer when those distracting activities were available than when activities were not available (Glodowski & Thompson, 2017).

Our examples may have led to the impression that clients and proxy subjects are exclusively human. Nonhuman subjects are also used in T1 research. Any experiment that investigates the behavior of a nonhuman animal model of human disordered behavior or disease processes is T1 provided that the research question is related to the disorder or disease. For example, compared to a “standard” diet, a diet that is high in fat resulted in increased food consumption and body mass (among other results) in rats in both free-feeding (Boomhower & Rasmussen, 2014) and effort-based (Robertson, Boomhower, & Rasmussen, 2017) contexts. Demonstrating similar effects conclusively in humans has been slow, challenging, and notoriously controversial (e.g., Taubes, 2013). Rasmussen and colleagues’ experiments belong in T1 because the rats that were fed high-fat diets are proxies for humans who consume similar high-fat diets.

Some T1 research is critical to the development of techniques for using service animals in novel ways (e.g., Mahoney et al., 2014). In this type of research, the service animals that are used as research subjects are not substitutable with other subjects, but it is human clients who benefit from the research. However, clients do not need to be human. Use-inspired applied animal behavior research can play an important role in animal welfare (e.g., Macpherson & Roberts, 2013). Work that has the potential to benefit service animals, production animals, and wildlife is well worth behavior–analytic attention and investment.

Tier 2: Solution-Oriented Research

On the spectrum of behavior analysis research, T2 studies test and refine behavioral technology under relatively simplified, controlled circumstances. Take, for example, studies by Cox, Virues-Ortega, Julio, and Martin (2017; Studies 1 and 2), who noted that individuals with developmental disabilities have trouble complying during medical procedures such as magnetic resonance imaging (MRI) scans that require the patient to tolerate loud sounds and to lie still for a period of time. The research question was of immediate importance to the subject or society (individuals with disabilities must undergo medical procedures to maximize longevity), but some aspects of the study were selected for convenience (i.e., to preserve internal validity), whereas others were selected for social significance (i.e., to maintain external validity). In Studies 1 and 2, Cox et al. trained children and adolescents with autism spectrum disorder (ASD) to comply with directions and remain still in a simulated MRI scan. They used MRI-related stimuli, including a mock scanner and an audio recording of MRI scanner sounds. The setting—a plain room without technicians and other characteristics of a scanning facility—was selected for experimental control during the training procedure. Thus, the authors targeted a socially important behavior but conducted the study using a convenient setting outside of the MRI room with stimuli created for training purposes.

In a footnote in their classic paper on applied research, Baer et al. (1968) stated:

Research may use the most convenient behaviors and stimuli available, and yet exemplify an ambition in the researcher eventually to achieve application to socially important settings. For example, a study may seek ways to give a light flash a durable conditioned reinforcing function, because the experimenter wishes to know how to enhance school children’s responsiveness to approval. Nevertheless, durable bar-pressing for that light flash is no guarantee that the obvious classroom analogue will produce durable reading behavior for teacher statements of “Good!” Until the analogue has been proven sound, application has not been achieved. (p. 92)

T2 research like Cox et al.’s (2017) simulated MRI scan studies is not purely T3 because one or more of the behaviors, stimuli, or settings is not socially important to the subject or to the problem. The training sessions were conducted outside of the scanning setting, and the stimuli were not necessary for the medical procedure. For example, a blanket could be used to create the feeling of being in a small, dark space and therefore could substitute for the tube, and recordings of other loud noises could substitute for the MRI-like sounds (although more training sessions and a range of stimuli may be necessary to obtain generalization to actual MRI scans).

As in the research by Cox et al. (2017), one iteration of T2 research is to study socially important behavior with convenient stimuli in a controlled setting. Metz, Kohn, Schultz, and Bettencourt (2017) evaluated the use of behavioral technology to improve the accuracy with which young adults poured a standard serving of beer. They recruited a sample of participants for whom the treatment would be of social significance: adult college students who were unable to estimate a standard serving of beer within 10%. Instead of using beer, however, the researchers used water mixed with food coloring and conducted the research in a room at the university rather than in the participants’ typical drinking environment.

Another iteration of T2 research is to study socially important behavior using socially important stimuli but in a convenient setting. For example, Critchfield and Howard (2016) recruited students who were at risk for melanoma due to their skin color and taught them to discriminate between skin lesions with and without melanoma symptoms. Participants were presented with socially important stimuli: images of skin with and without melanoma symptoms. However, rather than having a doctor present the stimuli to participants at the doctor’s office, the images were presented virtually in a computer lab or at a networked computer. In another study, Critchfield and Reed (2016) recruited adults at risk for melanoma, presented them with pictures of lesions with and without melanoma symptoms, and either included or excluded cancer language in the discrimination task. As in the previously described study, participants completed the task on their computers rather than at their doctor’s office.

Other T2 research targets socially important behavior in a socially important setting but uses convenient stimuli to maximize experimental control. Witts, Arief, and Hutter (2016) were interested in techniques for teaching graduate students how to identify verbal operants in their academic setting. The stimuli they used—lyrics to Lady Gaga’s song “Applause”—were selected for convenience rather than social importance in the lives of the graduate students. Socially important stimuli might have been a relevant course reading, a research article, or other written works related to the students’ graduate training.

Another iteration of T2 research is conducted in a socially important setting but targets a convenient behavior and uses convenient stimuli to evaluate a problem of social significance. For example, Mullane, Martens, Baxter, and Steeg (2017) conducted an evaluation of risky choice behavior in children in the school setting. The primary dependent variable was preference for mixed versus fixed reinforcer schedules. Rather than measuring time allocation to actual risky activities that may lead to ethical issues in the research, they selected a convenient response of math problem completion under different reinforcement schedules. They selected convenient stimuli—different-colored construction paper—to signal those schedules.

Some T2 studies are designed using socially important stimuli but are conducted in a convenient setting and target convenient behavior. Becirevic, Reed, and Amlung (2017) noted that the use of indoor tanning devices puts individuals at risk for melanoma and other skin cancers. They recruited participants who reported using an indoor tanning device at least twice in the past year. The study was an evaluation of effects of socially important stimuli—cues related to tanning—in a laboratory setting and on measures of choice that were theoretically interesting but were not actual tanning behaviors. Specifically, participants responded to behavioral economic questionnaires measuring demand and craving for tanning.

Studies that include socially important stimuli presented in a socially important setting but that select a convenient behavior would fit into T2 of our tiered model. For example, Feuerbacher and Wynne (2016) applied functional analysis methodology to identify whether access to owners could serve as a reinforcer in dog training. The stimuli were socially important to the dogs (e.g., owner access, access to toys), and the sessions were conducted in the home setting. However, the study targeted an arbitrary behavior, button pressing, to demonstrate the efficacy of owner access as a reinforcer.
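The combinations described in this section can be made concrete with a minimal sketch. It is an illustration only, not a scoring procedure proposed in the text: the names `StudyFeatures` and `classify` are hypothetical, each dimension is reduced to a yes/no judgment of social importance, and the sketch assumes that the research question is socially important and that the subjects are, or represent, the clients. A study in which the behavior, stimuli, and setting are all convenient is not addressed in this section, so the sketch leaves it unclassified.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StudyFeatures:
    """Hypothetical coding of one study.

    True means the dimension was socially important;
    False means it was selected for convenience.
    """
    behavior: bool
    stimuli: bool
    setting: bool


def classify(features: StudyFeatures) -> Optional[str]:
    """Return "T3", "T2", or None for the case this sketch does not cover."""
    flags = (features.behavior, features.stimuli, features.setting)
    if all(flags):
        # Every dimension is socially important.
        return "T3"
    if any(flags):
        # At least one dimension is important and at least one is convenient.
        return "T2"
    # All three dimensions are convenient: outside the scope of this sketch.
    return None


# Codings follow the study descriptions in this section.
EXAMPLES = {
    "Metz et al. (2017)": StudyFeatures(behavior=True, stimuli=False, setting=False),
    "Critchfield and Howard (2016)": StudyFeatures(behavior=True, stimuli=True, setting=False),
    "Witts et al. (2016)": StudyFeatures(behavior=True, stimuli=False, setting=True),
    "Mullane et al. (2017)": StudyFeatures(behavior=False, stimuli=False, setting=True),
    "Becirevic et al. (2017)": StudyFeatures(behavior=False, stimuli=True, setting=False),
    "Feuerbacher and Wynne (2016)": StudyFeatures(behavior=False, stimuli=True, setting=True),
}

for study, features in EXAMPLES.items():
    print(study, classify(features))
```

Each of the six example studies occupies a different combination of convenient and socially important dimensions, and all six fall within T2 under this coding. In practice, judgments of social importance are rarely binary, so a coding like this is best treated as a starting point for discussion rather than a decision rule.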

Tier 3: Applied Behavior Analysis

In our tiered model, T3 could be considered pure applied behavior analysis research. Tier 3 research subjects are or represent those individuals who will benefit directly from research outcomes. As already well described in Baer et al.’s (1968) work, applied behavior analysis research is “constrained to examining behaviors which are socially important … in their usual social settings [using] stimuli [that were] chosen because of their importance to man and society, rather than their importance to theory” (p. 92). Although internal validity remains an important consideration, T3 research emphasizes ecological and social validity in all respects. Some might argue that these criteria for T3 classification are slightly more restrictive than Baer et al.’s “applied” criteria for applied behavior analysis; however, we contend that they are appropriate because our tier model is relatively complex and attempts to encompass the spectrum of research in the field of behavior analysis.

In the previous section on T2, or solution-oriented, research, we described two studies conducted by Cox et al. (2017) in which children with ASD were given a behavioral treatment to increase compliance and stillness during a simulated medical procedure in a laboratory setting. These T2 studies were followed by a third study in which criteria for T3 were clearly fulfilled. Three individuals with ASD who participated in the T2 studies also completed an actual MRI scan in a clinical neuroimaging laboratory in a hospital. Researchers measured a socially important behavior (the participants’ head movements) and determined whether the real MRI scans were completed successfully.

In addition to evaluations of behavioral assessment and intervention on challenging behavior related to disorders, T3 research can focus on adaptive and habilitative behavior such as safety and daily living skills. Ledbetter-Cho et al. (2016) used behavioral skills training to teach abduction prevention skills to children with ASD. In the study, socially important stimuli such as abduction lures and confederates posing as strangers were presented to children in a variety of socially relevant settings. Aldi et al. (2016) used point-of-view video modeling to teach important daily living skills such as cooking, cleaning, setting the table, and folding jeans to young men with ASD in a residential setting.

Problems of potentially criminal behavior and social deviance can be assessed and treated using applied behavior–analytic technology in socially important settings. Reyes, Vollmer, and Hall (2017) studied whether paired stimulus preference assessments could be used to identify arousing stimuli in adult male alleged sexual offenders with intellectual disabilities (ID). The study was conducted in a residential treatment facility for offenders with ID. The researchers measured sexual arousal using penile plethysmographs while participants observed deviant and nondeviant video clips. Researchers also measured participants’ choices for video clips using a paired stimulus preference assessment in which participants selected still images from the video clips used in the assessments. Beyond its potential to benefit would-be victims of sexual offenses, the Reyes et al. study involved subjects who represented a population of individuals with ID who could benefit from effective assessment, treatment, and prevention of sexual offending.

Behavior–analytic research fitting within T3 of the spectrum has often targeted health behaviors with socially important stimuli in naturalistic settings. Food refusal is one example of a health behavior assessed and treated using T3 research. Borrero, England, Sarcia, and Woods (2016) examined the relation between results of descriptive assessment and functional analysis in identifying functional reinforcers in populations with food refusal enrolled in inpatient or day treatment. Ewry and Fryling (2016) applied an antecedent intervention in which multiple bites of highly preferred food were presented with a single bite of less preferred food to increase bites of food taken by an adolescent with ASD in the home setting. Medication nonadherence and substance use are health behaviors that have also been targeted using T3 behavioral interventions. For example, Stitzer et al. (2017) evaluated a multitarget contingency management intervention for HIV-positive substance users that used financial incentives to promote behaviors related to treatment of HIV (e.g., patient navigator meetings, doctor visits, medication checks, lab visits, viral load suppression, substance use disorder treatment, drug abstinence). Raiff, Arena, Meredith, and Grabinski (2017) evaluated a smartphone-delivered group contingency management incentive intervention with pairs of smokers (who had a relationship outside of the research context) on biologically verified smoking abstinence targets. Applications of interventions to decrease sedentary behavior and promote physical activity can also be examined within T3. For example, Krentz, Miltenberger, and Valbuena (2016) used token reinforcement to increase walking behavior in adults with ID in an adult day training center.

Applied behavior analysis has found a niche within school settings. Some T3 studies use instructor-delivered interventions targeting problem behavior using classroom-related stimuli. Cariveau and Kodak (2017) used a randomized dependent group contingency to promote academic engagement in second-grade students with varying levels of engagement in the classroom. In a study of a classroom behavior management strategy called the Good Behavior Game (GBG), Pennington and McComas (2017) evaluated whether effects of the game on positive behaviors generalized across different classroom contexts. Applied behavior analysis in school settings is not limited to grade schools. For example, Carroll and St. Peter (2017) evaluated the parameters of point availability necessary to increase course attendance in adult college students.

In addition to targeting challenging behavior in school settings, school-based T3 research might use interventions to improve poor academic performance using academically relevant stimuli. Dixon et al. (2017) used a teaching curriculum (PEAK–E) to establish derived categorical responding in students with disabilities who were selected for poor picture identification skills. Shillingsburg, Gayman, and Walton (2016) used textual prompts to teach children with ASD to mand for information in their classrooms. Denton and Meindl (2016) tested whether colored overlays changed reading fluency (vs. supported treatments for reading) in people with dyslexia in a home or school setting. In the college classroom, Varelas and Fields (2017) tested a clicker-training procedure to induce equivalence classes of names, time periods, and characteristics of stages of prenatal development in adult college students enrolled in a life span development course.

Interventions that use socially important stimuli to build and maintain occupational skills, improve job-seeking behavior, and increase efficiency in behaviors in the workplace are classified within T3. Cohrs, Shriver, Burke, and Allen (2016) evaluated a teacher training procedure using instructions and performance feedback on the use of behavior-specific praise by teaching staff in the classroom. Spieler and Miltenberger (2017) used an awareness training intervention with socially important stimuli (modeling and feedback from an instructor) to decrease nervous habits in four college students who wanted to improve their public speaking skills. O’Neill and Rehfeldt (2016) tested the efficacy of selection-based instruction to teach adult men with disabilities who were enrolled in a vocational development center to respond to interview questions accurately while sounding unscripted. Subramaniam, Everly, and Silverman (2017) evaluated whether different payment strategies affected productivity in a job skills training program for unemployed, substance-abusing adults.

T3 research can also advance the science and practice of behavior analysis itself when its subjects are researchers, clinicians or practitioners, or students of behavior analysis. For instance, Diller, Barry, and Gelino (2016) asked board-certified behavior analysts and editorial board members for behavioral journals to rate experimental control in graphs of behavioral interventions and evaluated the level of agreement within and across groups. Ratings were more consistent for some items (e.g., questions about the presence or absence of trends) and less consistent for others (e.g., questions about stability), highlighting aspects of visual analysis where additional explicit training is likely to be particularly beneficial.

Tier 3 research need not focus solely on problems of the human condition. Applied animal behavior interventions that aim to improve species-specific behavior in nonhumans and that aim to inform husbandry practices of human handlers can also fall under the umbrella of T3 research. For example, Pereira-Figueiredo, Costa, Carro, Stilwell, and Rosa (2017) evaluated different handling time frames on fear responses and learning in Lusitano yearling horses. Another example of applied animal behavior research that affects practices of animal husbandry is a study by Casal-Plana, Manteca, Dalmau, and Fàbrega (2017). Casal-Plana and colleagues evaluated the effects of environmental enrichment (natural hemp ropes, sawdust, rubber balls, and an herbal compound) on the stereotypy and exploratory behavior of growing pigs.

Tier 4: Impact Assessment

In the closing remarks of his book About Behaviorism, B. F. Skinner (1974) stated that “problems can be solved, even the big ones, if those who are familiar with the details will also adopt a workable conception of human behavior” (p. 276). To this end, behavior analysts who are familiar with the interaction between organism and environment should be conducting the research that evaluates behavioral interventions for impact and scalability. This research fits into T4 of our tiered spectrum and consists of program evaluation that calculates the utility and cost-effectiveness of behavioral technology and specifies how it could be optimized for widespread use and incorporated into policy. This research might involve prospective analysis into the potential for an effective intervention to benefit society (as is sometimes seen in grant applications) or retrospective assessment of the positive impact already accrued.

Impact assessment is usually carried out using research strategies such as systematic reviews, meta-analyses, large-scale program evaluation, and multisite clinical trials. These research strategies are not typically used in behavior–analytic work (for typical experimental designs, see Cooper, Heron, & Heward, 2007; Sidman, 1960). Furthermore, some may argue that the experimental designs used in T4 research are not behavior analysis per se because they do not conform to the analytic approach described by Baer et al. (1968) or the kind of scientific approach endorsed by Sidman (1960). We include T4 research in our spectrum because we argue that behavior–analytic work must be implemented and evaluated on a larger scale than that of individual clients or participants as the best way to establish generality, demonstrate scalability, and better define social validity. In addition, T4 research facilitates the growth and future impact of behavior analysis in a society where the intellectual or financial capital to conduct scientific research is often based on actual or potential significance and impact.

A focused line of research in applied behavior analysis has led to an evidence base for early intensive behavioral intervention (EIBI) as a treatment for ASD. In their review of this evidence base, Reichow, Barton, Boyd, and Hume (2012) showed that the intervention package had moderate to large effects on communication and language skills, socialization, and daily living skills in nonrandomized trials. EIBI is also cited as a treatment that can lead to typical functioning in a subset of children with ASD. However, the treatment package is intensive and time consuming; the recommended dose is 20 to 40 h/week for 3 years. In their T4 study, Peters-Scheffer, Didden, Korzilius, and Matson (2012) compared the cost-effectiveness of EIBI with that of usual care for children with ASD in the Netherlands. They found that EIBI can lead to long-term savings for the individual and the Dutch ASD population compared to treatment as usual, at least when considering education, security, and living or working expenses offset by gains in reduced dependency.

Research on the GBG provides another example of T4 work in which behavioral technology has been packaged, disseminated, and evaluated on a large scale. The GBG was first described by Barrish, Saunders, and Wolf (1969) in a classic example of T3 research. They targeted off-task behaviors in a fourth-grade classroom using a game in which each instance of the response decreased the chances of earning privileges commonly used in classrooms (e.g., extra recess, special projects, tangible items). T4 research on the GBG included a review by Embry (2002), who nominated the GBG as a low-cost “behavioral vaccine” because of its impact on classroom behavior and generalization to other impulsive behavior over the long term. In another T4 study, Kellam et al. (2014) described the impact of the GBG implemented in first-grade classrooms in 19 schools in the Baltimore City Public School System. They described the impact of the GBG on high-risk sexual behaviors and drug abuse and dependence in young adults who received the GBG intervention in first grade.

Research in T4 can demonstrate how behavioral technology is incorporated into programs that improve people’s access to resources. Holtyn, Jarvis, and Silverman (2017) reviewed how behavior analysts can help solve the problem of poverty using contingency management interventions that harness the power of operant conditioning through financial incentives to improve education, job skills, and income. They reviewed existing government-implemented incentive interventions for people living in poverty and discussed how those interventions could be improved using principles of operant conditioning. These suggestions included decreasing the delay between meeting a contingency and receiving an incentive payment, delivering incentives more frequently, decreasing response requirements to qualify for an incentive, increasing the magnitude of the incentives, and providing training to improve the skills necessary to qualify for the incentives.

Research in T4 can also demonstrate how behavioral technology is incorporated into programs that improve public health. Poling et al. (2017) described a program of research that used behavioral intervention to train African pouched rats to detect tuberculosis (TB) in sputum samples in Tanzania and Mozambique, areas where sensitive diagnostic tools for the disease are not widely available. Rats were trained in a discrimination task to respond to samples with mycobacteria and to not respond to other sputum samples (see Poling et al., 2011, for a review of the T1–T3 research contributing to this program). Patients who visited clinics for respiratory problems provided sputum samples that were evaluated through microscopy by clinicians and sent to the rats for screening. In 2014, samples from Tanzania and Mozambique were evaluated and the rats increased the detection rate (compared to microscopy) by 39% and 53%, respectively. This research was used to demonstrate that the screening program using TB-detecting rats was effective and worked in different countries within the continent.

Scope of the Tiered Spectrum Model

The utility of a taxonomy of behavior analysis is similar to the utility of any taxonomy. Within biology, taxonomists are responsible for discovering and naming new species, generating hypotheses about shared evolutionary history based on phylogenetic relationships, and constructing guidelines that allow nontaxonomists to identify species (Wheeler, 2015). The subject matter of our taxonomy is behavior analysis research output rather than living organisms, but it can be used in analogous ways. Applying the organizational structure of a spectrum of research in behavior analysis has the potential to enable recognition of emerging research areas. It can allow behavior analysts to catalog the diversity—or lack thereof—in the discipline, thereby identifying knowledge and practice gaps and highlighting potential avenues of collaboration. It may generate hypotheses about the history and historiography of behavior analysis that can be tested through bibliometric analysis (Critchfield & Reed, 2004). Consensus regarding nomenclature and classification of research can only help behavior analysts communicate the significance of their results to nonexperts (Freedman, 2016).

The five-tiered spectrum of behavior analysis we propose as a taxonomy has specific advantages. Its similarity to biomedical translational pathways (Blumberg et al., 2012) may facilitate communication with biomedical researchers. As a taxonomy based on task characteristics (Fleishman, 1975), the spectrum has the potential to enable dependable predictions about the generalization of results. As a spectrum, it may ultimately enable clearer delineation of basic, translational, and applied behavior analysis.

The initial impetus for developing this taxonomy of behavior analysis was grounded in our wish for an operational definition of translational research. We believe that such a definition would be useful in the development of research designed to have an impact and may be critically important for effectively communicating research outcomes. Interest in translational behavior analysis has resulted in increasingly frequent publication of explicitly translational output in the Journal of the Experimental Analysis of Behavior and the Journal of Applied Behavior Analysis (Mace & Critchfield, 2010). This trend has been supported by efforts from recent editors to encourage translational submissions (Madden, 2012; Mazur, 2010; Odum, 2015). Unfortunately, at present, neither journal specifies what constitutes translational research in its mission, potentially leading to editorial disagreements over whether the research falls within either journal’s scope. A taxonomy could provide some structure for journals and guidance to authors on the appropriateness of submissions that are neither blue sky basic research nor pure applied behavior analysis.

The segregation of basic and applied behavior analysis (Rider, 1991) may function to discourage research that does not fit unambiguously in either category. The basic–applied dichotomy is easily observed in funding mechanisms offered through professional behavior analysis organizations. Currently, many behavior analysis organizations recognize and support students through separate student research grants for basic and applied research. In 2018, the Association for Behavior Analysis International (https://www.abainternational.org/about-us/organizational-chart/membership/student-committee/details.aspx) listed 14 student award opportunities with a total of 18 monetary awards. Of those awards, three were for students conducting basic research, five mentioned applied behavior analysis research, and several of the others provided funding for regional conferences with an applied behavior analysis focus. If behavior–analytic organizations adopt a classification system that extends beyond basic and applied, such as our model, they might embolden students and other researchers to pursue work that is not neatly classified as either basic or applied.

Behavior analysts have written a great deal to encourage more translational behavior analysis (e.g., Critchfield, 2011; Critchfield & Reed, 2009; Mace & Critchfield, 2010; Poling, 2010), but researchers interested in this pursuit could benefit from additional guidance on how to do it well or where to start. Indeed, we suggest that some 35 years after behavior analysts began seriously discussing the existence of hybrid niches, our scholarly community still lacks a good sense of what the niches are and what functions they serve. How much contemporary behavior analysis research is translational? What types of research should behavior analysts be doing that we are not? Answering these questions is critical for training the next generation of researchers, but doing so requires much more operational specificity than the terms basic, applied, and translational currently convey. Our tiers could potentially provide the necessary specificity while breaking away from the basic–applied dichotomy.

Of course, a spectrum defined by the convenience or social importance of research subjects, behaviors, stimuli, and settings will not fulfill all taxonomic requirements. For example, it does not include a mechanism for distinguishing T3 research on smoking cessation from T3 research on language acquisition. Individual studies may have more in common with research from other tiers than with other research from the same tier. Some behavior analysis will not be easily classified using this system. Conceptual research may be particularly difficult to classify due to ambiguity in determining the subjects, stimuli, settings, and behaviors of interest. Ambiguity in determining whether research subjects, behaviors, stimuli, and settings are socially important (Critchfield & Reed, 2017; Wolf, 1978) is likely to lead to disagreement over the classification of some empirical studies. However, a taxonomy with no ambiguity is likely not a realistic goal at present. Furthermore, other dimensions and aspects of research (cf. Hayes, Rincover, & Solnick, 1980) might also be useful as criteria for categorizing behavior analysis. Our hope is that the present five-tiered spectrum model can serve as a starting point—if it does not work for behavior analysis, let it be refined or replaced with something better.