In the first of two personal epilogues to his analysis of Verbal Behavior, Skinner (1957) wrote:

I have assumed a common interest in the field of verbal behavior. It is my belief that something like the present analysis reduces the total vocabulary needed for a scientific account. . . . In many ways, then, this seems to me to be a better way of talking about verbal behavior. (p. 456)

Fed up with “comma-counting,” Skinner sought to develop a functional account of language based on his science of human behavior. But despite the parsimony of Skinner’s analysis, a natural science of verbal behavior has not been widely adopted.

Historically, language has been conceived to be an innate human quality (Ding, Melloni, Zhang, Tian, & Poeppel, 2016; Wolfe, 2016) or a tool to be used at will by the autonomous agent (Everett, 2012). A common misconception of language is that once a word has been learned receptively, it can readily be used expressively (Petursdottir & Carr, 2011). Skinner (1957), however, described the functional independence of verbal relations, noting six distinct environmental variables that exert control over verbal behavior. Like all behavior, verbal behavior is susceptible to the reinforcing contingencies of postcedent stimulation, although in the case of verbal behavior, reinforcement is mediated by a listener.

The notion of functional independence means that any given verbal response must be either explicitly reinforced or under the abstracted control of commonalities across various environmental relations. Fluent speakers readily acquire a response under one set of conditions and automatically emit it under others. This behavioral phenomenon easily obscures the functional independence of verbal operants, as the greater part of a typically developing speaker’s repertoire is not explicitly reinforced. Sundberg (2007) noted this when he compared the verbal behavior profiles of typically developing children with those of children diagnosed with autism spectrum disorder (ASD). Neurotypical children displayed relatively proportional levels of responding across measures of verbal operant strength—a function of the emergence of untrained operants.

For children with ASD, however, the verbal community frequently fails to establish a functional verbal and listener repertoire (Drash & Tudor, 2004; Malott, 2004). In contrast to their neurotypical peers, whose verbal behavior was assessed under relatively equal degrees of strength, the participants with ASD in Sundberg’s (2007) study displayed disproportionate levels of responding across the primary verbal operants. The discrepancies in the verbal repertoire of individuals with ASD help to elucidate the functional distinction of verbal relations. Consequently, the fluency with which typically developing children demonstrate transfer of control across operants may be used as a standard against which we can measure the severity of autism and other verbal behavior disorders.

Researchers in the field of verbal behavior have long since validated Skinner’s notion of functional independence (Hall & Sundberg, 1987; Lamarre & Holland, 1985; Sigafoos, Doss, & Reichle, 1989; Sigafoos, Reichle, Doss, Hall, & Pettitt, 1990). Sixty years after its initial publication, Skinner’s (1957) analysis still provides a revolutionary framework for categorizing verbal behavior—not according to structural composition but to the environmental variables under which language is shaped and maintained.

Definitionally, verbal behavior requires an audience to mediate reinforcement for the speaker; consequently, verbal behavior is particularly strong in the presence of that audience. Mands primarily benefit the speaker by specifying their reinforcer. In contrast, tacts are maintained by the verbal community, which ultimately benefits from an extra set of eyes, ears, and other exteroceptors to further expand the environment for the benefit of the group. Generalized reinforcement also conditions echoic, intraverbal, and textual behavior under their respective sources of verbal stimulus control, whereas autoclitic behavior is evoked by (or acts upon) the other verbal behavior of the speaker.

Although we may discretely categorize verbal behavior according to any of the six aforementioned relations, more often than not the variables of which language is a function are complex (Axe, 2008; Eikeseth & Smith, 2013; Sundberg & Sundberg, 2011), multiple (Bondy, Tincani, & Frost, 2004; Michael, Palmer, & Sundberg, 2011), or otherwise indistinct (Schlinger, 2008). Does the purest mand not inherently tact the motivating operations in effect? Does a genuine tact not ever so softly mand the attention of the listener? Although Skinner’s functional distinction affords scientific analysis, verbal operant strength is not mutually exclusive.

Palmer (2009) observed that “the apparent unity of emitted behavior masks a bedlam of concurrent fluctuations in strength of responses in the repertoire but below the threshold of emission” (p. 49). At any given moment, multiple sources of varying strength are in competition with one another. Skinner (1953) noted that when two responses are close to equal in strength, we may observe an oscillation in the individual’s behavior. Multiple control of verbal behavior can be observed when two distinct potentiated minimal units recombine or converge to create a novel, if not ineffective, response.

The prepotency of one response over another is described by the matching law (Herrnstein, 1961; Rachlin & Baum, 1969), which states that the response emitted among a convergence of those potentiated is a power function of the relative value of the reinforcers among those in competition with one another (Shahan & Podlesnik, 2008). With respect to verbal behavior, however, the reinforcer is often held constant (i.e., generalized reinforcement maintains tacts, echoics, and intraverbals), whereas the preceding stimulus varies. Consequently, two stimuli that signal the availability of the same reinforcer seemingly exact the same amount of control over a particular response. For instance, a spherical object should, in accordance with the generalized matching law, evoke the response “ball” just as much as a verbal stimulus (i.e., “Say ball”). This is not the case, however, for many individuals with language disorders such as autism (Drash & Tudor, 2004).
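For reference, the generalized matching law is commonly written in the form below (Baum, 1974). The symbols are standard notation rather than notation taken from this article: B1 and B2 are the rates of two responses, R1 and R2 are the rates of reinforcement they produce, a is sensitivity, and b is bias.

$$ \frac{B_1}{B_2}=b{\left(\frac{R_1}{R_2}\right)}^a $$

Read this way, when both responses are maintained by the same generalized reinforcer, the reinforcer ratio is 1 and the equation reduces to B1/B2 = b. Absent bias (b = 1), the two sources of control would be expected to evoke the response equally, which is the prediction the preceding paragraph describes.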

Michael et al. (2011) explained that in the case of convergent multiple control, “the target response is not usefully considered an operant. Rather, it is a response whose topography is common to a variety of verbal operants, each of which contributes to the emission of the response” (p. 7). Accordingly, the individual verbal operants must exist at some strength to effectively converge. However, the language deficits of speakers with autism may be a function of the prepotent strength of one operant over the others. The disproportionate responding demonstrated by individuals with autism is frequently described as stimulus overselectivity (Reed, Stahmer, Suhrheinrich, & Schreibman, 2013; Rieth, Stahmer, Suhrheinrich, & Schreibman, 2015). Rather than converging, a dominant operant supersedes other sources of control.

This article proposes a model for examining the complexity of a dynamic verbal repertoire according to the relative value of its component parts. As noted previously, the verbal behavior of fluent speakers is demonstrated by the automatic transfer of control across verbal operants. Underlying this ease of transfer is an assumption of relative strength across each of the primary verbal operants. However, verbal behavior assessments frequently measure response strength independently across the verbal operant domains. We posit that absolute values may be less informative than ratios for measuring verbal operant strength and propose the use of a familiar formula for quantifying the relative value of the verbal repertoire.

As a General Rule, in Order to Identify Any Type of Verbal Operant, We Need to Know the Kind of Variables of Which the Response Is a Function. (Skinner, 1957, p. 36)

Verbal behavior researchers have time and again shown that individuals with autism are susceptible to disproportionate sources of strength over their verbal responding (e.g., Goldsmith, LeBlanc, & Sautter, 2007; Kodak & Clements, 2009). This disproportionality is primarily demonstrated through the use of criterion-referenced tests (CRTs) of verbal behavior, such as the Verbal Behavior Milestones Assessment and Placement Program (VB–MAPP; Sundberg, 2008), which determine present levels of functional verbal performance. Arranged in increasing complexity, these task analyses of language are designed to assess verbal operant strength through either observation or the direct manipulation of environmental variables.

The results of these CRTs may be used to inform individualized treatment plans to increase language by strengthening responding under the antecedent and postcedent variables specific to the operant class (Sundberg & Michael, 2001). However, the emergence of untaught verbal behavior is only sporadically assessed within the greater literature base (Grow & Kodak, 2010; Kelley, Shillingsburg, Castro, Addison, & LaRue, 2007; Petursdottir, Carr, & Michael, 2005), and the concomitant effects of conditioning a single verbal operant on the entirety of the verbal repertoire are seldom reported (cf. Mason & Andrews, 2014). Moreover, recent reviews of the literature bring to light issues of construct validity that appear to be prolific within verbal behavior research (Gamba, Goyos, & Petursdottir, 2015; Grow & Kodak, 2010). Inadequate descriptions of participants’ verbal behavior deficits increase the difficulty of interpreting variability in outcomes; obfuscating relevant participant characteristics prohibits generalization to the larger population.

Threats to internal validity have been addressed elsewhere in the behavioral literature. The use of pretreatment functional analyses to identify the variable(s) maintaining behavioral excesses has been shown to increase both treatment precision and efficacy (Beavers, Iwata, & Lerman, 2013; Hanley, Iwata, & McCord, 2003). This technology has been used to identify nuanced environmental determinants undetectable through mere descriptive assessment. The data generated through the experimental manipulation of antecedent and postcedent variables surrounding challenging behavior serve as judgmental aids for designing targeted interventions to weaken their occurrence.

Whether they address behavioral excesses or deficits, functional analyses are designed to systematically assess the environmental relations controlling behavior through experimental manipulation. Lerman et al. (2005) proposed the use of functional analysis methodology to determine the sources of control over the erratic verbal behavior of children with intellectual disabilities. Four conditions were run, with each controlling for the empirical examination of the primary verbal operants. The methodology for conducting functional analyses of verbal behavior has since been revised through subsequent investigations (Kelley et al., 2007; LaFrance, Wilder, Normand, & Squires, 2009; Normand, Machado, Hustyi, & Morley, 2011; Normand, Severtson, & Beavers, 2008; Plavnick & Ferreri, 2011).

As with functional analyses of behavioral excesses, these verbal operant analyses (VOAs) have been used to identify the environmental variables maintaining a particular response that already exists at some strength. The purpose of any functional analysis is to isolate the controlling variables responsible for maintaining the target behavior. Specific to the VOA, this means systematically manipulating the environment to ensure that only the controlling features relevant to the targeted verbal operant are present during each condition. Mands are emitted in the presence of restricted access to a specific reinforcer. Tacts are controlled by exteroceptive stimulation. Echoics are verbal episodes that share both point-to-point correspondence and formal similarity with a verbal stimulus, whereas intraverbals do not correspond in any consistent manner with a preceding verbal stimulus (Vargas, 1982). Each experimental condition includes a listener to mediate reinforcement while necessarily controlling for potential confounds from the other verbal operants.

Plavnick and Normand (2013) reported that such experimental analyses of verbal behavior may be beneficial for evaluating previous instructional efforts in addition to guiding the selection of future educational targets and procedures. Moreover, the enhanced validity of the VOA may have particular utility for assessing the emergence of untrained operants and evaluating the concomitant effects of targeted intervention on the overall repertoire.

In the pretreatment VOA, participants are exposed to four experimental conditions that repeat across three iterations of the assessment to provide sufficient power for analysis. Each administration begins with a free operant preference assessment (FOPA; Roane, Vollmer, Ringdahl, & Marcus, 1998) to serve as the tact condition, followed by mand, echoic, and intraverbal conditions in pseudorandom sequence. Response targets are identified through a series of FOPAs in addition to caregiver interviews and direct observation of the participant within the laboratory during the caregiver interview. Reinforcing stimuli (SR+) are identified through the participant’s allocation of time. As the participant engages with a new SR+, tacts are assessed by asking the participant to name the stimulus. For example, if a teddy bear is selected by the participant, after a few seconds of interaction the researcher points to the bear and says, “What’s this?” The emission of the target response bear—or any approximation thereof—is immediately followed by generalized reinforcement (e.g., edibles, high fives, pats, praise, tickles, tokens, etc.). Verbal community–specific responses are also reinforced. For instance, the child may call the teddy bear by name rather than saying bear. This information is gathered in advance during the caregiver interview to the extent possible. These procedures are repeated until five to 10 target responses have been identified.

Regardless of whether or not the target response was emitted during the tact condition, each SR+ selected during the FOPA is included in the remaining conditions. Ten trials are conducted in each of the four experimental conditions. Stimuli are presented in the order in which they were selected by the participant. Accordingly, if only eight stimuli were selected during the FOPA, these items represent the first eight trials, and the two highest preference items (those with which the participant engaged for the longest duration of time) are repeated for the final two trials. Following the tact condition, the remaining three conditions are presented pseudorandomly.
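The trial-ordering rule above can be sketched in a few lines of Python. This is a minimal illustration, not part of the published procedure: the function name `build_trials`, the stimulus names, and the durations are hypothetical, and cycling through the preference ranking when more than two repeats are needed is an assumption, since the text only describes the eight-stimulus case.

```python
def build_trials(durations, n_trials=10):
    """Return an ordered list of n_trials stimuli for one condition.

    durations maps each stimulus to its engagement time (in seconds) during
    the FOPA, listed in the order the participant selected the stimuli.
    """
    selected = list(durations)[:n_trials]
    if not selected:
        raise ValueError("at least one stimulus is required")
    # Rank by engagement time, longest first (ties keep selection order).
    by_preference = sorted(selected, key=lambda s: durations[s], reverse=True)
    trials = list(selected)
    i = 0
    # Assumption: fill the remaining trials with the highest-preference items.
    while len(trials) < n_trials:
        trials.append(by_preference[i % len(by_preference)])
        i += 1
    return trials


# Hypothetical FOPA result with eight selected stimuli.
fopa = {"bear": 40, "ball": 95, "car": 20, "blocks": 60,
        "book": 15, "drum": 30, "puzzle": 10, "bubbles": 80}
print(build_trials(fopa))
```

With eight stimuli, the first eight trials follow the selection order, and the last two repeat the two items with the longest engagement.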

Mand function may be assessed with each of the SR+ presented as part of a reinforcer preference assessment, such as a paired-stimulus (Fisher et al., 1992) or multiple-stimulus (DeLeon & Iwata, 1996) assessment. From the array of SR+ identified in the tact condition, the participant is allowed to select one. All other SR+ are removed. After 10 s of allowed manipulation, the selected item is also removed from exteroceptive stimulation (to control for a potential tact confound), and a conditional intraverbal prompt (e.g., “What do you want?”) is provided to evoke a mand from the participant. If a mand occurs within 5 s, the manded item is presented and the participant is allowed to continue interacting with the SR+ for the remainder of the interval. If no mand occurs within 5 s, the item is placed on the table or floor just beyond arm’s reach of the participant, who is then allowed to re-engage with the stimulus for the remainder of the interval. Throughout each trial, a mand for any item, not just the targeted one, is reinforced with 15 to 20 s of access. At the end of the interval, the preferred item is removed and the remaining stimuli are re-presented to the participant in a different order from the previous presentation. This sequence continues until all stimuli have been selected. The total condition lasts approximately 5 min.

Generalized mands such as “mine,” “please,” and so on are reinforced with continued access to the item, but no generalized reinforcement is provided. These responses are scored as incorrect, along with defective mands and the absence of any response. Any extraneous verbal behavior is also noted throughout this condition.

In the echoic condition, the SR+ identified by the FOPA are removed from the laboratory. At 30-s intervals, verbal stimuli corresponding to each of the SR+ identified in the FOPA are provided one at a time. If the participant emits an echoic response, generalized reinforcement is provided. All other responding is extinguished until the next 30-s interval during which another verbal stimulus is presented. These procedures are repeated for the duration of the condition.

In the intraverbal condition, the SR+ identified in the FOPA are removed from the laboratory. At 30-s intervals, a corresponding fill-in-the-blank verbal stimulus is provided for each item identified in the tact condition. This frame describes how the participant interacted with the SR+ during the FOPA and is structured such that the missing word is the target response (e.g., “You roll the ____”). Generalized reinforcement is provided for all targeted intraverbal responses within 5 s of the verbal discriminative stimulus (SD). These procedures are repeated for the duration of the condition.

The aforementioned procedures are each conducted three times sequentially. With each successive administration of the VOA, a new FOPA is conducted to assess fluctuation in motivating operations and to increase the diversity of responses. Each condition lasts approximately 5 min, equating to roughly 30 min per administration and approximately 90 min to conduct the entire assessment.

These procedures produce frequencies of verbal responding from the participant across each of the four experimental conditions. The numbers of mands, echoics, tacts, and intraverbals are each summed across administrations, with a ceiling of 30 verbal episodes for each of the four conditions, which exceeds Cohen’s (1992) minimum recommendation for sampling behavioral research.

The frequency count for each operant is then divided by the total number of responses across all four experimental conditions to yield a percentage of responding for each verbal operant relative to the other three. This measure of stimulus and motivational control subsumes a measure of response strength, both of which are a function of contingency history and are explained by the generalized matching law (Baum, 1974).
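The conversion from frequency counts to percentages can be sketched as follows. This is an illustrative sketch under stated assumptions: the function name `relative_strength`, the dictionary layout, and the example counts are hypothetical and do not come from any participant in this article.

```python
OPERANTS = ("mand", "echoic", "tact", "intraverbal")


def relative_strength(administrations, ceiling=30):
    """Sum counts across administrations and return each operant's percentage.

    administrations is a list of dicts, one per administration of the VOA,
    mapping operant name to the number of target responses emitted.
    """
    totals = {op: sum(a.get(op, 0) for a in administrations) for op in OPERANTS}
    for op, n in totals.items():
        # Ten trials per condition across three administrations gives 30.
        if not 0 <= n <= ceiling:
            raise ValueError(f"{op} count {n} is outside 0..{ceiling}")
    grand_total = sum(totals.values())
    if grand_total == 0:
        raise ValueError("no responses recorded; percentages are undefined")
    return {op: 100 * n / grand_total for op, n in totals.items()}


# Hypothetical data: the same counts observed in each of three administrations.
administrations = [{"mand": 2, "echoic": 4, "tact": 3, "intraverbal": 1}] * 3
print(relative_strength(administrations))
# {'mand': 20.0, 'echoic': 40.0, 'tact': 30.0, 'intraverbal': 10.0}
```

Because each percentage is taken against the grand total, the four values always sum to 100, which is what allows them to be compared as shares of a single repertoire.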

The Verbal Operants We Have Examined May Be Said To Be the Raw Material Out of Which Sustained Verbal Behavior Is Manufactured. (Skinner, 1957, p. 312)

The complexity of the verbal repertoire may be summarized by identifying the relative levels of operant strength that comprise its component parts. Although Skinner (1957) identified six primary verbal operants, the literature on verbal behavior highlights four of these as fundamental to functional communication: mand, echoic, tact, and intraverbal (Sautter & LeBlanc, 2006). We posit that the relative strengths of these four operants in relation to one another comprise the greater verbal repertoire. Within the VOA, response topography is held constant across experimental conditions to validate mand, echoic, tact, and/or intraverbal function. The frequency of responding for each of the four conditions is tallied and then divided by the total number of responses emitted across all four conditions to show the strength of motivational and stimulus control for each of the verbal operants relative to one another and expressed as a percentage of the entire repertoire. Although raw data may be important for showing the absolute strength of an operant (Johnston & Pennypacker, 2009), it is the relative strength of response that proves to be more pragmatic in the present analysis.

Using the VB–MAPP, Mason and Andrews (2014) employed stimulus control ratios to depict the verbal repertoire of a child with autism at both the onset and cessation of behavior–analytic intervention. Preintervention pie charts showed disproportional levels of mand, echoic, tact, and intraverbal relations. In contrast, after 13 weeks of intervention, the control ratios were significantly less skewed. Underlying this evaluation are Sundberg’s (2007) observations of the fluent speaker’s verbal behavior under proportional levels of stimulus control across these four primary verbal operants.

Figure 1 displays the relative levels of control over the respective verbal repertoires of four children diagnosed with ASD. The upper left control ratio is of a 4-year-old girl named Olivia. The verbal operant analysis demonstrated that Olivia’s verbal repertoire was primarily under echoic control and showed some strength to access specific reinforcement. However, verbal responding was altogether absent in the presence of nonverbal and verbal stimuli without point-to-point correspondence.

Fig. 1. The stimulus control ratios of four children diagnosed with autism spectrum disorder showing a range of proportionality from least (top left) to most (bottom right). IV = intraverbal

The upper right control ratio is of a 4-year-old girl, Nancy, whose verbal repertoire showed strength across all four conditions, although it was most heavily conditioned under echoic control. The combination of mands, tacts, and intraverbals comprised less than one third of the overall repertoire.

The lower left control ratio belongs to a 3-year-old girl named Katie whose verbal repertoire showed slightly more commensurate levels of stimulus control. A dominant source of strength can still be easily identified through visual analysis of Katie’s stimulus control ratio, but echoic control makes up less than half of the complete verbal repertoire.

The lower right control ratio shows the relatively proportional levels of stimulus control in conjunction with the motivating operations that govern the verbal repertoire of a 4-year-old boy named Tomás. No dominant source of control is evidenced; relative deficits, such as that of the intraverbal relation, may still be identified.

As depicted across the four charts of Fig. 1, the control ratio increases in proportion to the complexity of the verbal repertoire. This trend continues to the point that the level of responding equalizes across the four operants. Proportional levels of stimulus control appear to be a function of fluent responding within each verbal operant class. A perfectly balanced control ratio may be noted as neurotypical children develop fluent language, and it is at this level of responding that the individual operants may appear to be functionally indistinct. Perhaps the lack of a sufficient measure to assess the relative values of verbal operant strength has prevented Skinner’s (1957) analysis of language from being more widely adopted.

Interobserver Agreement Is the Bedrock upon Which Sound Behavioral Measurement Rests. (Watkins & Pacheco, 2000, p. 207)

Indeed, neurotypical children demonstrate the linguistic flexibility with which untaught relations are derived across verbal operants at a rate that appears to refute the notion of functional independence (Moran, Stewart, McElwee, & Ming, 2014). Words conditioned under the control of a specific verbal operant automatically transfer across operant classes. Figure 2 depicts what may be referred to as the “null verbal repertoire,” a hypothesized model in which response strength is proportional across environmental conditions and which states that there are no differences in levels of control across operant classes. In other words, our null hypothesis brackets Skinner’s notion of functional independence by assuming zero variance across operant classes. Fluent speakers demonstrate proportionate responding across independently assessed mands, echoics, tacts, and intraverbals. When comparing response ratios across four operant classes, the null hypothesis states that each begets 25% of the repertoire.

Fig. 2. A graphical representation of the null repertoire with a variance of zero across the mand, echoic, tact, and intraverbal relations

This hypothetical norm provides a standard of strength against which the deviations identified through a functional analysis of verbal behavior can be compared. An analysis of variance is conducted between the percentage of the repertoire allocated for each observed operant and the null (i.e., 25%) using a conventional reliability formula in which the sum of the agreements for each assessed operant is divided by the sum of agreements and disagreements for each assessed operant:

$$ \frac{\mathrm{Agreement}}{\mathrm{Agreement}+\mathrm{Disagreement}} $$

To calculate the agreement between the observed data and the null hypothesis, we divide the smaller of the two percentages by the larger of the two percentages (see Fig. 3). In doing so, “we assume that the overlap in the two counts represents agreement. We assume that nonoverlap represents disagreement,” explains Miller (2006, p. 65). The resulting coefficient, ranging from 0 to 1, yields an effect size of correspondence between the null hypothesis and the observed proportions of verbal operant strength upon which pragmatic instructional decisions and performance outcomes can then be verified.
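This step can be written compactly. The symbols here are shorthand introduced for illustration and are not part of the article's notation: p is the observed percentage for one operant and n is the null percentage (25 when four operants are compared).

$$ \mathrm{A}=\min \left(p,n\right),\quad \mathrm{A}+\mathrm{D}=\max \left(p,n\right),\quad \frac{\mathrm{A}}{\mathrm{A}+\mathrm{D}}=\frac{\min \left(p,n\right)}{\max \left(p,n\right)} $$

For a hypothetical observed mand value of 10% against the null value of 25%, the agreement is 10, the disagreement is 15, and the coefficient is 10/25 = 0.40.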

Fig. 3. The darker sections of both graphs represent agreement between the observed mand value (left) and the null mand value (right). The lighter portion of the graph on the right represents disagreement between the observed and null mand values. In this particular example, the null mand proportion is greater than the observed mand proportion and is therefore composed of both the agreement and disagreement. In other examples provided throughout this article, the observed proportion is greater and hence constitutes both agreement and disagreement

The verbal behavior stimulus control ratio equation (SCoRE) calculates the extent to which the participant’s verbal behavior correlates with the null hypothesis. The percentages of agreement between mands, echoics, tacts, and intraverbals across both sets of data are individually calculated and totaled. This aggregate is then divided by the sum of the percentages of agreements and disagreements for each operant:

$$ \frac{{\mathrm{A}}^{\mathrm{mand}}+{\mathrm{A}}^{\mathrm{echoic}}+{\mathrm{A}}^{\mathrm{tact}}+{\mathrm{A}}^{\mathrm{intraverbal}}}{{\left(\mathrm{A}+\mathrm{D}\right)}^{\mathrm{mand}}+{\left(\mathrm{A}+\mathrm{D}\right)}^{\mathrm{echoic}}+{\left(\mathrm{A}+\mathrm{D}\right)}^{\mathrm{tact}}+{\left(\mathrm{A}+\mathrm{D}\right)}^{\mathrm{intraverbal}}} $$

In this way, the verbal behavior SCoRE provides a statistic to quantify the individual’s functional verbal repertoire.

Quantifying the extent to which an individual’s verbal behavior deviates from the norm provides an objective method for describing the extent of a language deficit. Referring to Fig. 1, a verbal behavior SCoRE can be calculated for each child’s verbal repertoire. Note that the repertoire becomes more balanced as the size of the SCoRE approaches 1. To find Olivia’s SCoRE, we must first determine the relative strength of each operant by finding the percentages of the verbal repertoire they comprise. For instance, Olivia emitted one mand, 20 echoics, and no tacts or intraverbals, for a total of 21 responses. By dividing the value of each operant by the total and then multiplying by 100, we can obtain a percentage of the whole for each operant (mand = 4.5%, echoic = 95.5%, tact = 0%, intraverbal = 0%). These percentages are then compared with the balanced repertoire, which places each operant at 25%. For each operant, the smaller of the observed percentage and 25% counts as agreement and goes in the numerator, whereas the larger of the two (agreement plus disagreement) goes in the denominator. By summing the agreements and dividing this by the sum of agreements and disagreements, Olivia’s verbal behavior results in a value of 0.17 (see Table 1).
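The calculation can be sketched in Python, starting from the raw response counts. This is a minimal sketch rather than the authors' own implementation: the function name `score` is hypothetical, and the null value is taken as 100 divided by the number of operants (25% for four).

```python
def score(counts):
    """Compute a stimulus control ratio (SCoRE) from response counts.

    counts maps each operant to its number of responses. Every operant is
    compared with an equal share of the repertoire (the null value).
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no responses recorded; the ratio is undefined")
    null_pct = 100 / len(counts)  # 25.0 when four operants are assessed
    agreement = 0.0
    agreement_plus_disagreement = 0.0
    for n in counts.values():
        pct = 100 * n / total
        agreement += min(pct, null_pct)                    # overlap with the null
        agreement_plus_disagreement += max(pct, null_pct)  # overlap plus nonoverlap
    return agreement / agreement_plus_disagreement


# Response counts for Olivia as quoted in the text.
olivia = {"mand": 1, "echoic": 20, "tact": 0, "intraverbal": 0}
print(round(score(olivia), 2))  # 0.17

# A perfectly proportional repertoire (hypothetical counts) scores 1.
balanced = {"mand": 5, "echoic": 5, "tact": 5, "intraverbal": 5}
print(score(balanced))  # 1.0
```

Taking the minimum and maximum for each operant is equivalent to sorting each observed percentage into the numerator or the denominator by hand.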

Table 1 Conversion of the Ordinal Values of Verbal Operants to the Relative Value of the Verbal Repertoire for Olivia

Applying the same formula, we can calculate Nancy’s verbal behavior SCoRE. The sum of the values for each operant equals 29. By dividing each individual value by the total, we find the relative strength of control for mand (17.2%), echoic (69%), tact (10.3%), and intraverbal (3.4%) relations. Comparing these values against the null repertoire, Nancy’s SCoRE equates to 0.39 (see Table 2).
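Written out with the rounded percentages quoted above, the arithmetic for Nancy is as follows. Each numerator term is the smaller of the observed percentage and 25, and each denominator term is the larger, listed in the order mand, echoic, tact, intraverbal:

$$ \frac{17.2+25+10.3+3.4}{25+69+25+25}=\frac{55.9}{144}\approx 0.39 $$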

Table 2 Conversion of the Ordinal Values of Verbal Operants to the Relative Value of the Verbal Repertoire for Nancy

To find Katie’s verbal behavior SCoRE, we again begin by summing the absolute value of each verbal operant and then dividing the value of each operant by the sum. We then multiply by 100 to find the relative value of each operant (mand = 16.1, echoic = 45.2, tact = 25.8, intraverbal = 12.9) and compare this percentage against the null value of 25%. Katie’s verbal behavior SCoRE is 0.65 (see Table 3).

Table 3 Conversion of the Ordinal Values of Verbal Operants to the Relative Value of the Verbal Repertoire for Katie

Finally, we can apply the same formula to Tomás’s stimulus control ratio, which equates to a verbal behavior SCoRE of 0.86 (see Table 4).

Table 4 Conversion of the Ordinal Values of Verbal Operants to the Relative Value of the Verbal Repertoire for Tomás

The SCoRE provides a pragmatic valuation of verbal behavior, accounting for variation across four primary operants. Necessarily, a measure of this sort is sensitive to change over time. Keller and Schoenfeld (1950) refer to the operant level as the unconditioned rate of emission for a given response that appears as part of the general activity of the organism and determines the quickness with which the response can be reinforced. By quantifying the operant level of the verbal repertoire, the SCoRE provides a means of evaluating treatment effects and conducting pragmatic verbal behavior research.

Figure 4 displays the stimulus control ratios for a 3-year-old boy named Enrique assessed at the onset and conclusion of 13 weeks of behavior–analytic intervention. Visual analysis of the pre-SCoRE demonstrates disproportional control of the verbal repertoire among the four operants. Arguably, the discrepancy between echoics (47.5%) and tacts (39%) may not be significant (a difference of 8.5 percentage points), but visual analysis shows that mands and intraverbals are discernibly weaker by comparison. The post-SCoRE clearly portrays the development of mand and intraverbal relations, with all four operants balanced to within 5 percentage points of one another. The SCoRE’s precision of measurement eliminates any subjectivity of the overall gains, which may be quantified by subtracting the pre-SCoRE from the post-SCoRE.
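As a rough check on the intake value, and assuming the quoted percentages are exact: echoics and tacts together account for 86.5% of the repertoire, which leaves 13.5% for mands and intraverbals combined. Both of those operants therefore fall below 25% and contribute their full observed value to the numerator, however the 13.5% is split between them:

$$ \frac{13.5+25+25}{25+25+47.5+39}=\frac{63.5}{136.5}\approx 0.47 $$

This matches the intake SCoRE reported with Fig. 4. Subtracting it from the discharge SCoRE reported there gives an overall gain of .94 − .47 = .47.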

Fig. 4. The stimulus control ratios for Enrique at intake (left; SCoRE = .47) and discharge (right; SCoRE = .94) across 13 weeks of behavior–analytic intervention. IV = intraverbal; SCoRE = stimulus control ratio equation

Within the realm of research, the SCoRE allows us to specify participants’ respective levels of responding either before or after intervention (as discussed previously) or repeatedly over time. Testing before and after intervention provides a convenient method of conducting group research, such as randomized controlled trials, to evaluate the efficacy of specific interventions. The SCoRE may be particularly useful for testing emergent relations. When a response is reinforced within one operant class, we can easily document where, and the extent to which, it emerges across other classes.

Furthermore, the SCoRE may be used to summarize the efficacy of verbal behavior interventions that have already been evaluated in the literature. If the relevant information has been reported, the SCoRE provides an objective, systematic procedure for synthesizing previously reported results. For instance, using the data reported by Lerman et al. (2005), we can calculate a SCoRE of .42 for Linda.

Resolving Agreement Is Not Equivalent to Discovering Truth; Rather, It Affords Professionals Approaches to Assessment that Maintain Enough Scientific Integrity to Serve Pro Tempore as Best Approximations to Truth. (McDermott, 1988, p. 239)

Stimulus control ratios may be especially useful for guiding instructional decisions on transferring stimulus control (Barbera & Kubina, 2005; Bloh, 2008; Coon & Miguel, 2012; Sweeney-Kerwin, Carbone, O’Brien, Zecchin, & Janecky, 2007). The relative levels of strength depicted in the control ratios provide information about what types of prompting may be most effective when conditioning weaker verbal operants. For example, Katie’s data (Fig. 5) prescribe the use of echoic prompts to support the acquisition of mands. Similarly, the convergence of echoic stimuli and establishing operations may facilitate the acquisition of tact and intraverbal relations. In other cases, the simultaneous support of mand, echoic, and tact relations may be necessary to evoke intraverbal responding.

Fig. 5

Katie’s stimulus control ratio provides a framework for conditioning intraverbals in terms of most (a convergence of mand, echoic, and tact relations) to least (mand only) prompting and fading. IV = intraverbal

The convergence and subsequent fading of variables may be used in an errorless learning format. For instance, given the objective of strengthening Katie’s responding to intraverbal frames, the relatively improbable intraverbal response (12.9%) could be supported by converging echoic (45.2%), tact (25.8%), and mand (16.1%) functions in conjunction with the intraverbal SD to increase the probability of a response. Using most-to-least prompt fading, the mand function could then be eliminated so that the intraverbal response is supported by only echoic and tact prompts. Fading could be further facilitated by eliminating support from the tact while converging echoic and mand relations. The intraverbal response could then be evoked solely with the help of an echoic stimulus. Prompts could continue to be faded by eliminating the echoic stimulus and combining tact and mand relations. This could be followed by pairing the intraverbal stimulus with a tact prompt alone, and then with a mand prompt alone, before finally bringing the response solely under the control of an intraverbal stimulus.

In many cases, it is unlikely that the participant would need all seven incremental steps of the aforementioned fading process. However, these steps are designed to eliminate prompt dependency and stimulus overselectivity to bring the response solely under the control of the corresponding—in this case, intraverbal—stimulus relation. Such procedures may be particularly useful for conditioning mand relations for which a corresponding stimulus is far less discrete (Skinner, 1957).
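The ordering of these seven steps can be sketched in a few lines of Python. This is only an illustration, not a validated procedure: it assumes that the strength of converged prompts is the simple sum of their percentages in the control ratio (an additive assumption that has yet to be tested empirically), and the function name `prompt_hierarchy` is hypothetical.

```python
from itertools import combinations


def prompt_hierarchy(ratio, target):
    """Rank every combination of supporting operants from most to least support.

    Assumption (for illustration only): the support offered by converged
    prompts is the sum of their percentages in the control ratio.
    """
    prompts = [op for op in ratio if op != target]
    steps = []
    for size in range(1, len(prompts) + 1):
        for combo in combinations(prompts, size):
            steps.append((combo, sum(ratio[op] for op in combo)))
    # Most-to-least prompting: strongest combined support first.
    steps.sort(key=lambda step: step[1], reverse=True)
    return steps


# Katie's control ratio as reported in the text (percentages).
katie = {"echoic": 45.2, "tact": 25.8, "mand": 16.1, "intraverbal": 12.9}

hierarchy = prompt_hierarchy(katie, target="intraverbal")
for combo, strength in hierarchy:
    print(" + ".join(combo), round(strength, 1))
```

Under the additive assumption, the printed order runs from the convergence of echoic, tact, and mand prompts down to the mand prompt alone, which matches the sequence described above; the final step (the intraverbal stimulus alone) is the unprompted target and is therefore not listed.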

The control ratio may be applicable to systematic prompt fading as well. Once echoic and tact relations have converged to support Katie’s manding, the visual prompt that controls tacting may be faded to reduce the level of support required to induce a mand. When echoic prompts alone suffice, visual prompts may be reinstated so that echoic prompts can be faded and the mand conditioned with exteroceptive supports. Having conditioned the mand under the relevant motivating operations in addition to either echoic or tact supports, the alternation of these prompts may facilitate the abstraction of control to motivating operations alone.

The SCoRE also potentiates instructional grouping. Speakers with similar functional levels may be clustered together to address similar objectives and create opportunities for social interactions with peers. Furthermore, targets for social skills instruction may be clarified by organizing students into pairs or groups based on their respective SCoREs.

The Usefulness of Any Lawful Relation Depends upon the Sharpness of Reference of the Terms in Which It Is Stated. (Skinner, 1953, p. 200)

The SCoRE provides a mechanism for interpreting the correlation between the observed behavior and the null hypothesis and may be considered analogous to the coefficient of determination (R²), in which the complexity of the repertoire is measured between 0 and 1, regardless of sign. Verbal behavior SCoREs approaching 0 show minimal correlation with the null hypothesis, whereas SCoREs approaching 1 show near-perfect correlation. Consequently, Ferguson’s (2009) thresholds may be used to interpret the size of the repertoire. Below .20, we might refer to an individual’s verbal repertoire as emerging, consistent with the use of the term emergent to describe the untrained transfer across verbal operants. For SCoREs greater than or equal to .20, the repertoire may be described, in Ferguson’s terms, as practical. A moderate repertoire refers to a SCoRE at or above .50, and if the SCoRE is greater than or equal to .80, the individual’s verbal repertoire can be described as strong.
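As a minimal sketch, the thresholds above can be expressed as a simple lookup. The function name `describe_repertoire` is hypothetical, and the cut points are taken directly from the preceding paragraph.

```python
def describe_repertoire(score):
    """Map a SCoRE (0 to 1) to the descriptive labels given in the text."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("a SCoRE must fall between 0 and 1")
    if score >= 0.80:
        return "strong"
    if score >= 0.50:
        return "moderate"
    if score >= 0.20:
        return "practical"
    return "emerging"


# Enrique's intake and discharge SCoREs, as reported in the Fig. 4 caption.
print(describe_repertoire(0.47))  # practical
print(describe_repertoire(0.94))  # strong
```

Applied to Enrique’s data, the labels describe a shift from a practical repertoire at intake to a strong repertoire at discharge.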

Acknowledging these effect sizes may be pragmatic for multiple reasons. First, it provides a precise way to describe an individual’s verbal repertoire. Descriptive terms such as nonverbal, minimally verbal, low performing, and high functioning are notoriously unreliable, if not altogether inaccurate (Hanley, 2012; Miller, 2006). In contrast, the SCoRE provides a measure of magnitude for which we can specify various thresholds anchored to other effect sizes. For instance, a SCoRE of .2 is the equivalent of r = .2, which can be readily translated to d = .41 without altering the interpretation (Ferguson, 2009). This transferability potentiates the use of the SCoRE metric for conducting analyses of large data sets, such as meta-analyses of published literature.

Additionally, these thresholds can be used as benchmarks for progress monitoring. The efficacy of an intervention can be inferred when we note that an individual has progressed from having a practical verbal repertoire to having a moderate one. Similarly, the social validity of an intervention may be enhanced when speakers transition across benchmarks.

Multiple Control of Verbal Responses Is the Rule Rather than the Exception. (Michael et al., 2011, p. 3)

Given the apparent ubiquity of convergent multiple control, we propose that analyzing verbal operants individually is important, yet insufficient; verbal behavior scientists and practitioners must go one step further to examine the relative levels of verbal operant strength. Using the null verbal repertoire as a basis for comparison, the verbal deficits of individuals with autism and related disorders may be precisely quantified. The stimulus control ratio’s primary benefit is allowing behavior analysts to examine individual operants as parts of the gestalt verbal repertoire, thereby affording the prediction and control of variables that may influence the speaker’s verbal behavior.

The growing literature base on the emergence of untrained verbal operants supports the notion that verbal operants may be functionally interdependent (Fryling, 2016). Verbal behavior SCoREs may have value for assessing the emergence of other verbal operants during functional communication training, which predominantly focuses on conditioning the mand relation. Intermittently probing for untrained relations may provide additional data to serve as a prognostic indication of treatment effectiveness and outcomes.

Stimulus control ratios depict the relative level of strength across functionally independent operants and may be adjusted to fit the null hypothesis dictated by the record floors and ceilings imposed by a given assessment. The VOA described previously is only one of many verbal behavior assessments to which the SCoRE may be applied. The rapid advancement of this technology over the past 12 years leaves little doubt that the procedures will continue to be revised and improved (Plavnick & Normand, 2013). Other common measures of verbal behavior include the Assessment of Basic Language and Learning Skills—Revised (ABLLS–R; Partington, 2006), Promoting the Emergence of Advanced Knowledge (PEAK; Dixon, 2014), and the VB–MAPP (Sundberg, 2008). The SCoRE may be used with any of these criterion-referenced assessments and other measures of verbal behavior. However, a primary limitation of many verbal behavior assessments is that they may not account for the proportionality of responding across the verbal operants (see Table 5).

Table 5 A Comparison of the Number of Items Assessed Within Each Verbal Operant Domain by Different Verbal Behavior Assessments

When control ratios are calculated for each of the aforementioned assessments and compared with a null speaker’s repertoire, we can calculate a maximum SCoRE for each. As noted previously, the SCoRE for a VOA ranges from 0 to 1. The VB–MAPP shows the next greatest degree of behavioral relativity (range of 0 to .82), followed by the ABLLS–R (range of 0 to .72) and the PEAK (range of 0 to .54). Accordingly, disproportionate levels of control may be more easily observed on the VOA and VB–MAPP than on the ABLLS–R and PEAK. Asymmetrical control ratios increase the complexity of interpretation and may inhibit some of their more pragmatic implications regarding transfer of control. That is, varying the frequency of items assessed within each verbal operant domain precludes the identification of relative strength. As we have previously argued, the overselectivity of autism (Lovaas, Koegel, & Schreibman, 1979) is a manifestation of disproportionate levels of stimulus and motivational control. Transfer of stimulus control is premised upon proportionate levels of mands, echoics, tacts, and intraverbals and serves as a behavioral cusp for multiple convergent control.
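One way to read the maximum SCoRE is as the SCoRE earned by a speaker who answers every item correctly, so that the control ratio simply mirrors the number of items in each operant domain. The sketch below follows that interpretation. The item counts are hypothetical placeholders chosen for illustration (they are not the counts for any published assessment), and `max_score` is an assumed name.

```python
def max_score(item_counts):
    """SCoRE of a perfect scorer, whose control ratio mirrors the item counts."""
    total = sum(item_counts.values())
    null_share = 100.0 / len(item_counts)  # 25% apiece for four operants
    shares = [100.0 * n / total for n in item_counts.values()]
    numerator = sum(min(share, null_share) for share in shares)
    denominator = sum(max(share, null_share) for share in shares)
    return numerator / denominator


# Hypothetical item counts, for illustration only.
balanced = {"mand": 25, "echoic": 25, "tact": 25, "intraverbal": 25}
uneven = {"mand": 10, "echoic": 20, "tact": 30, "intraverbal": 40}

print(round(max_score(balanced), 2))  # 1.0
print(round(max_score(uneven), 2))    # 0.67
```

An assessment with equal numbers of items per operant can reach a SCoRE of 1, whereas an unevenly weighted assessment caps the attainable SCoRE below 1 even for a fully proportional speaker.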

A Test Is Simply a Convenient Opportunity to Observe Behavior—To Survey or Sample Our Dependent Variable. The Score May Be Used To Predict Some Aspect of the Larger Universe of Behavior from Which the Test Is Drawn. (Skinner, 1953, p. 199)

To close, we offer an example of how the results of a VB–MAPP assessment may be translated into a verbal behavior SCoRE. Figure 6 shows the verbal behavior results of a VB–MAPP for a 3-year-old boy with autism. Taking the results from the verbal operant domains, we can create a stimulus control ratio by summing the scores for the speaker’s mand (1), tact (5), echoic (4), and intraverbal (0) assessment and then dividing the value for each operant by the sum (10) to yield a percentage for each operant.

Fig. 6

The verbal operant data from a 3-year-old boy with autism shown on the likeness of a VB–MAPP grid (left) and as a stimulus control ratio (right). VB–MAPP = Verbal Behavior Milestones Assessment and Placement Program

Once we have developed the stimulus control ratio, we can then calculate the verbal behavior SCoRE by comparing our observed results with the null repertoire. The percentage of strength identified for each operant in the control ratio is compared against the null repertoire’s 25%. The smaller number of each comparison goes in the numerator, and the larger number is placed in the denominator. Then, the values within the numerator and denominator are summed, and the speaker’s SCoRE is calculated at .43.
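The calculation just described can be sketched in Python as follows. This is an illustrative implementation of the steps in the text rather than an official scoring tool; the function names `control_ratio` and `score` are assumptions, and the raw scores are those reported for the speaker in Fig. 6.

```python
NULL_SHARE = 25.0  # null repertoire: equal strength across the four operants


def control_ratio(raw_scores):
    """Convert raw operant scores to percentages of their sum."""
    total = sum(raw_scores.values())
    if total == 0:
        raise ValueError("at least one operant score must be nonzero")
    return {op: 100.0 * value / total for op, value in raw_scores.items()}


def score(ratio, null_share=NULL_SHARE):
    """Sum the smaller value of each comparison over the sum of the larger."""
    numerator = sum(min(pct, null_share) for pct in ratio.values())
    denominator = sum(max(pct, null_share) for pct in ratio.values())
    return numerator / denominator


# VB-MAPP verbal operant scores reported for the speaker in Fig. 6.
vb_mapp = {"mand": 1, "tact": 5, "echoic": 4, "intraverbal": 0}

ratio = control_ratio(vb_mapp)
print(ratio)                   # mand 10%, tact 50%, echoic 40%, intraverbal 0%
print(round(score(ratio), 2))  # 0.43
```

Here the numerator is 10 + 25 + 25 + 0 = 60 and the denominator is 25 + 50 + 40 + 25 = 140, so the SCoRE is 60/140, or approximately .43.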

Apart from demonstrating the present level of functional verbal performance, the stimulus control ratio serves as a curriculum guide, with the aim of balancing the relative levels of control across all four operants (Fryling, 2016; Mason & Andrews, 2014). To achieve this goal, the control ratio also provides an individualized hierarchy of most-to-least prompting. For instance, given the target of completing an intraverbal frame, the behavior technician may begin by converging mand, echoic, and tact relations. The level of prompting can then be systematically faded by momentarily eliminating mand function and converging only echoic and tact relations. The mand relation can then be reintroduced in conjunction with the tact while eliminating the echoic stimulus. Mand and echoic functions can then be converged before fading down to the isolated prompts from tact, echoic, and mand relations, respectively.

For the data provided, the summed value of mand and echoic control (1 + 4) is equal to that of the tact (5). Additional research should focus on the extent to which the convergence of multiple verbal relations exerts stronger control over the speaker’s response than a single operant with the same numeric value, as we have proposed here. In the meantime, we urge practitioners to use their own practice-based evidence to determine whether the convergence of multiple operants (e.g., mand and echoic) garners greater control over a response than a single operant (e.g., tact) of equal numeric value.

To date, the SCoRE assessment has been applied primarily to vocal verbal performers with a variety of verbal behavior deficits. However, given the prevalence of individuals with autism and other language disorders who use selection-based forms of communication, additional research should examine the extent to which the SCoRE may also be used to assess the environmental determinants controlling augmentative and alternative communication (e.g., picture exchange, speech-generating devices).

Other iterations of the SCoRE focus on the extent to which the intraverbal repertoire is influenced by varying degrees of derivational stimulus control (i.e., reflexive, symmetrical, and transitive). Measuring the development of equivalence formations may be important for researchers controlling for the prior establishment of reflexive, symmetrical, and transitive relations (Sidman, 2009), with the null hypothesis assuming proportional levels of responding at 33.33% apiece.

By numerically summarizing the degree of deviation for the entirety of a speaker’s verbal repertoire, the SCoRE further addresses Skinner’s (1957) concern with providing a parsimonious way of describing verbal behavior. Given the implications for a scientific account of language, the SCoRE may indeed afford a better way of talking about verbal behavior.