Token reinforcement procedures are used to increase appropriate behavior (e.g., completion of instructional tasks) and decrease problem behavior (e.g., aggression) in a wide array of clinical contexts, including schools (Kim et al., 2021), clinics (Falligant et al., 2021), hospitals (Frank-Crawford et al., 2019), and even correctional facilities (e.g., Brogan et al., 2018). A considerable body of basic and applied research suggests that tokens can establish preference for delayed reinforcers and promote self-controlled behavior (even among individuals with impulsive response tendencies; e.g., Fulton et al., 2020; Robinson & St. Peter, 2019).

Tokens are conditioned stimuli that help bridge temporal gaps between responses and deliveries of delayed reinforcers. Three components of token-reinforcement procedures may affect responding in these arrangements. First, the token-production schedule specifies the number of responses the organism is required to emit to earn a token. For example, in a fixed-ratio (FR) 3 token-production schedule, one token would be delivered following three responses. Second, the token-exchange schedule specifies the number of backup reinforcers each token is worth (see Hackenberg, 2009). For example, in an FR-1 token-exchange schedule, each token is worth a single unit of the backup reinforcer (e.g., one piece of food). Third, the exchange-production schedule specifies how many tokens the individual must accumulate before they can exchange the tokens for backup reinforcers. For example, under an FR-10 exchange-production schedule, tokens cannot be exchanged for backup reinforcers until the individual has accumulated 10 tokens. In contrast, under an FR-1 exchange-production schedule, each token can be exchanged as soon as it is earned.
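As an illustration only, the three schedule components described above can be represented as a small data structure. The sketch below is not part of the study's procedures; the class name `TokenArrangement` and its field names are hypothetical, and it assumes simple fixed-ratio values for every component.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenArrangement:
    """Hypothetical container for the three schedule components (all fixed ratio)."""

    token_production: int     # responses required to earn one token
    exchange_production: int  # tokens required before an exchange is permitted
    units_per_token: int      # backup-reinforcer units each token is worth

    def responses_per_exchange(self) -> int:
        # Responses needed to fill the board and reach an exchange opportunity.
        return self.token_production * self.exchange_production

    def units_per_exchange(self) -> int:
        # Backup-reinforcer units delivered when the full board is exchanged.
        return self.exchange_production * self.units_per_token


# FR-3 token production, FR-10 exchange production, one reinforcer unit per token.
example = TokenArrangement(token_production=3, exchange_production=10, units_per_token=1)
print(example.responses_per_exchange())  # 30
print(example.units_per_exchange())      # 10
```

Under these assumed values, 30 responses are required before the first exchange, which then yields 10 units of the backup reinforcer.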

A number of translational research studies have examined various aspects of exchange-production schedules with reference to their second-order effects. Second-order effects refer to how changes in exchange-production schedules affect behavior under different token schedule arrangements (e.g., how increasing the exchange-production schedule changes responding under lean or dense token-production schedules; Argueta et al., 2019; Falligant & Kornman, 2019). Research has also examined how changing different token schedule parameters and components affects preferences for exchange-production arrangements (e.g., Falligant & Kornman, 2019; Falligant et al., 2021; Ward-Horner et al., 2017). For example, DeLeon et al. (2014) found that, in their sample of individuals with intellectual and developmental disabilities (IDD), accumulated (FR-10) exchange-production schedules were generally preferred relative to distributed (FR-1) exchange-production schedules when token-production schedules were very dense, despite the delays to reinforcement that are associated with accumulated exchange-production schedules. However, Falligant et al. (2020) suggest that preferences for accumulated and distributed schedules vary depending on the token-production schedule, such that distributed schedules may be more preferred when token-production schedules are relatively lean. In other words, preferences for exchange-production schedules may shift from accumulated to distributed as token-production schedules become leaner.

In addition, DeLeon et al. (2014) found that accumulated schedules were more effective than distributed schedules at increasing (analog) target responding under FR-1 token-production schedules. Similarly, Robinson and St. Peter (2019) found that, using dense schedules, accumulated arrangements were more effective at decreasing escape-maintained problem behavior than distributed arrangements. However, it is not clear whether accumulated schedules are more effective at decreasing problem behavior under leaner token-production arrangements. Indeed, Frank-Crawford et al. (2021) found mixed second-order schedule effects for reducing escape-maintained problem behavior across accumulated and distributed schedules during schedule thinning (i.e., in which the token-production schedules were systematically increased).

Once clinically relevant reductions in problem behavior are achieved with rich token-production schedules, reinforcement schedule thinning is often necessary because the token-production schedules are too dense to sustain in the natural environment. During schedule thinning, the density (i.e., the availability, rate, or magnitude) of reinforcement is gradually reduced until reaching a terminal schedule value that is feasible to implement in the community and sustains the significant reduction in problem behavior and/or increase in appropriate behavior (e.g., Hagopian et al., 2011). Given that token-based behavioral interventions in the community often use fairly lean token-production schedules (Kim et al., 2021), it is critical to examine preference for and efficacy of these arrangements under more naturalistic token-production schedules. As described above, there is an emerging literature base examining this issue, but more research is needed, especially with respect to applications for decreasing escape-maintained problem behavior. Moreover, little research has examined the iatrogenic effects of exposure to non-preferred exchange-production arrangements during subsequent schedule thinning after a reliable preference for an alternative arrangement has been established. If individuals prefer distributed arrangements under very lean schedules, it is unknown whether accumulated arrangements under these lean schedules are associated with problem behavior or poor compliance.

Notably, Andzik et al. (2020) demonstrated that, even when the contingency for problem behavior producing escape remained intact, differential reinforcement for compliance with tokens decreased problem behavior while also maintaining high levels of compliance. This suggests that in contexts where the contingency for negative reinforcement remains intact via continuous access to a break (i.e., there is an SD for a functional communicative response [FCR] producing escape), token delivery should theoretically limit the number of times an individual emits the FCR and prevent excessive manding for escape during reinforcement schedule thinning. However, it is unknown how differences in exchange-production schedules may affect the establishing operation (EO) for escape and otherwise affect motivation for accessing breaks via the FCR during schedule thinning.

The purpose of the current study was to examine the differential preference for and effectiveness of accumulated and distributed schedules, across a range of token-production values, during a treatment evaluation for escape-maintained problem behavior with an individual with severe problem behavior. We evaluated these schedules across a range of work requirements in which negative reinforcement (15-s break) was continuously available for an alternative response.

Method

Participant and Setting

Drew was an 11-year-old male diagnosed with autism spectrum disorder, attention-deficit/hyperactivity disorder, disruptive behavior disorder, oppositional defiant disorder, and a moderate intellectual disability. Drew communicated vocally using phrases and short sentences. He was also able to follow multistep directions. Sessions were conducted five days per week in padded treatment rooms or a bedroom in the hospital inpatient unit where Drew was receiving services.

Response Measurement and Interobserver Agreement

The primary dependent variable was total problem behavior, which included self-injury, aggression, and disruption. Self-injury was defined as any attempts or successes of Drew hitting, punching, kneeing, or scratching himself, slamming his body into surfaces, or stomping his feet with force. Aggression was defined as any attempts or successes of punching, scratching, grabbing, hair pulling, kicking, pulling, shoving, or throwing objects within two feet of others. Disruption was defined as any attempts or successes of hitting, banging, or kicking objects, slamming or knocking over furniture, swiping objects from surfaces, tearing/ripping objects, ripping clothes, or throwing objects more than two feet from others. The secondary dependent variable was compliance, defined as completion of the targeted instruction within 5 s of a presented verbal or gestural prompt.

Interobserver agreement was assessed by having a second observer independently record the frequency of problem behaviors and compliance using a computerized data collection program (BDataPro; Bullock et al., 2017). Two observers recorded target behaviors simultaneously but independently during 31% of sessions. Agreement coefficients were calculated by using the exact interval agreement method, dividing the number of agreements by the number of agreements plus disagreements within 10-s intervals and multiplying by 100 for problem behavior and compliance. Mean agreement was 98.5% (range: 68%-100%) for problem behavior and 93% (range: 80%-100%) for compliance.
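The agreement calculation described above can be sketched in a few lines. This is an illustrative sketch only, not the routine used by BDataPro: the function name `exact_interval_agreement` and the sample counts are hypothetical, and it assumes an interval counts as an agreement only when both observers recorded exactly the same frequency in that 10-s interval.

```python
def exact_interval_agreement(primary, secondary):
    """Return percent agreement across intervals using exact count matching.

    `primary` and `secondary` are per-interval response counts from two
    observers, aligned so that index i refers to the same 10-s interval.
    """
    if len(primary) != len(secondary):
        raise ValueError("Observers must have the same number of intervals.")
    if not primary:
        raise ValueError("At least one interval is required.")

    # An interval is an agreement only if both counts are identical.
    agreements = sum(1 for a, b in zip(primary, secondary) if a == b)
    disagreements = len(primary) - agreements
    return 100 * agreements / (agreements + disagreements)


# Hypothetical counts for six consecutive 10-s intervals.
primary_counts = [0, 2, 1, 0, 0, 3]
secondary_counts = [0, 2, 0, 0, 0, 3]
print(round(exact_interval_agreement(primary_counts, secondary_counts), 1))  # 83.3
```

In this made-up sample the observers disagree in one of six intervals, so agreement is five sixths, or about 83.3%.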

Procedure

Pre-experimental Procedures

A functional analysis (FA) of Drew’s problem behavior (i.e., self-injury, aggression, and disruptive behavior) was conducted as described by Iwata et al. (1982/1994). Results from the FA indicated that Drew’s problem behavior was maintained by escape from demands. A tablet was identified as a high-preference item via a paired-stimulus preference assessment (PSPA; Fisher et al., 1992). Additionally, token training was conducted with Drew prior to the treatment analysis using the response-stimulus plus exchange (RSE) method, in which Drew initially earned tokens on an FR-1 token-production schedule and exchanged them according to an FR-1 exchange-production schedule, which was subsequently thinned. Data from the pre-experimental procedures are available upon request to the corresponding author.

Baseline

During baseline, the experimenter continuously presented academic demands (e.g., matching, reading, counting) using three-step prompting (i.e., verbal, gestural, and repeated gestural prompts). For safety reasons, the experimenter did not use physical prompts. Contingent upon problem behavior, the experimenter provided Drew with 30 s of escape from the presented demands (e.g., immediately withdrew task materials). Contingent upon compliance, the experimenter provided Drew with a brief, neutral statement of praise and began the presentation of the next demand.

Token Evaluation

During token evaluation sessions, the experimenter delivered academic demands using the same three-step prompting that was employed during baseline. The experimenter ignored all instances of problem behavior and continued to present demands throughout the session. If problem behavior occurred, then the experimenter continued implementing demands using the three-step prompting sequence. However, a 15-s break was always available throughout the session. Drew had previously learned the FCR, “Break please,” to access a break appropriately. Thus, contingent upon Drew saying, “Break please,” the experimenter removed the demand and stopped engaging with Drew for 30 s. During these requested breaks, the session time was paused and neither attention nor tangible items were available. His token board consisted of an 8.5 in × 11 in laminated sheet with Velcro to which the token(s), consisting of small picture icons of preferred stimuli (e.g., dump trucks), could be affixed. Drew had separate token boards for the distributed and accumulated conditions of the token evaluation (see below).

Distributed Condition

Prior to the start of each session, the therapist placed the token board (which contained a single empty slot for a token) in front of Drew, provided the SD (e.g., “It’s time to do some work. This is the distributed condition. You can do these problems if you want to. When you complete a problem, you will get one token right away to trade in for your tablet”), and delivered a demand. Contingent upon compliance in the absence of problem behavior, the experimenter placed one token directly on the token board. Following the delivery of one token on the board, Drew immediately exchanged the token by handing the experimenter the entire board to access a 30-s break with his tablet. After 30 s elapsed, the experimenter removed the tablet and continued presenting demands. This process continued until the experimenter had presented demands for a total of 10 min, excluding time for reinforcer access.

The initial token-production schedule was FR 1 (i.e., the experimenter immediately delivered one token following one instance of compliance). However, to evaluate the effectiveness of the token reinforcement conditions across more effortful work requirements (i.e., schedule thinning), the token-production schedule was systematically thinned to a terminal variable-ratio (VR) 5 schedule. Regardless of the token-production schedule, the exchange-production schedule in this condition remained constant at FR 1. In other words, Drew always exchanged the token board after earning a single token in the distributed token reinforcement condition.

Accumulated Condition

Prior to the start of each session, the therapist placed the token board (which contained 10 empty slots for tokens) in front of Drew, provided the SD (e.g., “It’s time to do some work. This is the accumulated condition. You can do these problems if you want to. When you complete a problem, you will get one token right away. When you earn 10 tokens you can trade them in for your tablet”), and delivered a demand. Procedures during the accumulated token reinforcement condition were similar to those in the distributed token reinforcement condition, with the following differences. The exchange-production schedule increased from one token in the distributed condition to 10 tokens in the accumulated condition. In other words, Drew could not exchange the token board until he had earned 10 total tokens. Each token was still worth 30 s of tablet access; thus, because Drew exchanged 10 tokens at once, he accessed his tablet for five continuous minutes.

In addition, the token-production schedule thinning that occurred during the distributed token condition also occurred during the accumulated token condition. For example, during the FR-2 token-production schedule, a token was delivered following two instances of compliance. However, Drew was still required to earn ten tokens to exchange the token board. Thus, in the FR-2, FR-4, and VR-5 accumulated token arrangement phases, Drew exchanged the token board following 20, 40, or an average of 50 compliance responses, respectively.
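The work requirements and reinforcer durations implied by each arrangement follow from simple multiplication, sketched below for illustration. The function and constant names are hypothetical, and the VR 5 schedule is represented by its mean ratio of 5, so the value returned for that schedule is an average rather than a fixed requirement.

```python
# Tokens required per exchange in each condition (from the procedures above).
TOKENS_PER_EXCHANGE = {"distributed": 1, "accumulated": 10}
SECONDS_PER_TOKEN = 30  # each token was worth 30 s of tablet access


def responses_per_exchange(token_production_ratio, condition):
    """Compliance responses needed (on average, for VR) to reach an exchange."""
    return token_production_ratio * TOKENS_PER_EXCHANGE[condition]


def access_seconds_per_exchange(condition):
    """Continuous tablet access earned at each exchange."""
    return SECONDS_PER_TOKEN * TOKENS_PER_EXCHANGE[condition]


# Mean ratio for each token-production schedule evaluated.
schedules = [("FR 1", 1), ("FR 2", 2), ("FR 4", 4), ("VR 5", 5)]

for label, ratio in schedules:
    for condition in ("distributed", "accumulated"):
        print(
            label,
            condition,
            responses_per_exchange(ratio, condition),
            access_seconds_per_exchange(condition),
        )
```

For the accumulated condition this reproduces the 20, 40, and (on average) 50 compliance responses per exchange noted above, each followed by 300 s of tablet access. In the distributed condition the same schedules require 2, 4, and an average of 5 responses per 30-s exchange.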

Concurrent Choice

We conducted a concurrent choice assessment to evaluate Drew’s preference for the two token reinforcement conditions. Prior to the beginning of a session, the experimenter presented Drew with the option to work for one token or 10 tokens using the token boards associated with the distributed and accumulated conditions, respectively. Contingent upon Drew selecting one of the two options, the contingencies of the selected arrangement were in place for the entirety of the 10-min session. Selection was defined as Drew touching one of the token boards or vocally saying “accumulated” or “distributed.” Token-production schedule thinning similar to that in the token evaluation was implemented across the choice sessions to assess whether preference shifted across increasing work requirements.

Results and Discussion

Drew’s problem behavior data are depicted in the top panel of Fig. 1. He engaged in high and variable rates of problem behavior during baseline. Problem behavior occurred at an average (SD) of 4.0 (5.7) instances per minute. During the token evaluation, problem behavior remained at near-zero levels across both distributed and accumulated token reinforcement conditions. Problem behavior occurred at an average rate of 0.02 (0.04) and 0.02 (0.06) per min across the distributed and accumulated reinforcement conditions, respectively. Problem behavior remained at low rates across the increasing token-production schedules. In addition, Drew did not request a break during any baseline or accumulated token reinforcement session, and he requested a break in only one distributed token reinforcement session.

Fig. 1. Problem Behavior, Compliance, and Schedule Selections. BL = baseline; TE = token evaluation

Drew’s compliance data are shown in the middle panel of Fig. 1. During baseline, compliance occurred at an average percent (SD) of 50.9 (34.7). During the token evaluation, Drew’s level of compliance increased consistently across both conditions. Drew’s compliance occurred at an average percent of 84.1 (9.7) and 94.8 (4.0) across the distributed and accumulated token conditions, respectively.

Drew’s selections during the choice assessment are depicted in the bottom panel of Fig. 1. Across all thirteen sessions, Drew exclusively selected the distributed token reinforcement condition over the accumulated token reinforcement condition. His preference for the distributed condition remained consistent across each of the increasing token-production schedules. During the sessions in which Drew selected the distributed condition, he engaged in an average (SD) of 0.05 (0.1) instances of problem behavior. As in the treatment conditions, Drew did not request a break during the choice assessment sessions. Although the compliance data from the choice assessment are not shown, Drew’s compliance averaged 86.5% (SD = 9.0) across the choice sessions.

As discussed previously, there is little research examining how behavior changes under non-preferred exchange-production arrangements during schedule thinning (after a preference for an alternative arrangement has been identified). Some research indicates that individuals may prefer distributed arrangements under very lean schedules (e.g., Falligant et al., 2020). However, it is unknown if accumulated arrangements under these lean schedules are associated with problem behavior or diminished compliance. Thus, in a supplemental analysis, we assessed how Drew’s problem behavior and compliance were affected by a variety of schedule thinning and generalization tactics commonly used to make behavioral interventions easier to implement in the community (e.g., additional schedule thinning, incorporation of variable-ratio (VR) token-production schedules, extending session duration, generalizing treatment to novel staff and contexts). During this phase, we required Drew to exchange his tokens via the accumulated exchange-production schedule. Recall that Drew’s preference for distributed arrangements was very stable across a range of initial token-production schedules (FR 1, FR 2, FR 4), and both accumulated and distributed arrangements supported low rates of problem behavior and high levels of compliance. When we subsequently thinned the token-production schedule to FR 5, Drew’s compliance remained very high and he engaged in low to zero rates of problem behavior (Fig. 2). These results were generally observed across subsequent modifications when we transitioned to a VR 5 token-production schedule, extended session duration to 15 min (beginning in session 74), and conducted generalization sessions with novel staff and contexts. 
These findings suggest that accumulated exchange-production schedules may remain effective in supporting appropriate behavior, even when distributed schedules are more preferred, under schedule conditions that are similar to those used in community-based settings.

Fig. 2. Problem Behavior and Compliance across Maintenance Sessions. Grey markers depict generalization sessions with novel staff or contexts. All sessions were conducted with accumulated exchange-production schedules

These results replicate and extend the current literature on preferences for different exchange-production schedules in token economies in several ways. We found that Drew’s preference for distributed arrangements was stable and did not change as a function of increasing work requirements during schedule thinning. Prior research has demonstrated that preferences shift from accumulated to distributed as work requirements increase (Falligant & Kornman, 2019; Falligant et al., 2020). However, it was not clear if or how preferences shift when distributed arrangements are initially preferred under dense token-production schedules given that accumulated schedules are typically preferred prior to schedule thinning. It is not clear why Drew preferred distributed arrangements under dense schedules, but one hypothesis may be related to a diminished EO for escape due to the continuously available SD for the break FCR. Although he rarely requested a break, its availability may have attenuated the aversiveness of work tasks and diminished the appetitive value of a prolonged break associated with the accumulated arrangement. It is also worth noting that this break differed from the enhanced break he received for completing tasks in the absence of problem behavior via his token reinforcement schedule (which involved both positive and negative reinforcement; Lalli et al., 1999).

Clinically, these outcomes demonstrate the efficacy of token-based interventions, as Drew progressed from struggling to complete any academic tasks without engaging in problem behavior to completing an average of 50 tasks in a single sitting to earn relatively brief access to his backup reinforcers. This significant increase in compliance and decrease in problem behavior across drastically increased work requirements was accomplished without full-physical guidance following noncompliance or excessive manding for a break (which was available throughout these sessions). Andzik et al. (2020) demonstrated that, even when the contingency for problem behavior producing escape remained intact, differential reinforcement for compliance with tokens decreased problem behavior and maintained high levels of compliance. Our results further support research demonstrating that conditioned reinforcers (tokens) for compliance may directly compete with behaviors that produce immediate access to a break. Here, an alternative response (i.e., emitting the FCR) was continuously reinforced, but Drew rarely emitted the FCR despite increasing work requirements with either the distributed or accumulated arrangement. These findings suggest that the addition of supplemental alternative reinforcement in the form of tokens (regardless of the exchange-production schedule) may be a useful tactic to decrease excessive manding during schedule thinning (for applications in which the availability of the FCR for a break is gradually reduced; Hagopian et al., 2011). Future work in this area is required to more systematically examine the relation between changes to token-production schedules and preference for distributed and accumulated exchange-production schedules across a greater number of individuals (with varied levels of adaptive functioning, functional classes of problem behavior, and skill levels).
Nonetheless, the present study may serve as a useful proof of concept for future applied research on this topic.