Introduction

Differential reinforcement of other behavior (DRO) has emerged as one of the most common methods used to decrease a wide range of problem behavior [1]. DRO is a behavior reduction procedure in which reinforcers are provided at the end of an interval during which the child engaged in other behavior and no problem behavior occurred. Typically, DRO methods are paired with an extinction procedure in which the reinforcers maintaining problem behavior are withheld when problem behavior occurs. The same reinforcer can then be delivered under the DRO contingency if no target behavior occurred during the DRO interval. Identifying the maintaining reinforcers requires a functional analysis (FA), which examines relations between environmental factors and the consequences maintaining the target behavior. Due to the ease of implementation and direct use of reinforcement for not engaging in undesired behavior, DRO procedures are an appealing option for clinicians.

DRO procedures have been effectively implemented to reduce self-injurious behavior (SIB) to near-zero rates. Vollmer, Iwata, Zarcone, Smith, and Mazaleski [2] accomplished this while thinning the interval schedule from 10 s to 5 min for two of the three female participants. Due to the criteria of a DRO procedure, supervision throughout the whole interval is required to determine if the target behavior did or did not occur during the interval. It may be cumbersome for caregivers to provide continuous monitoring of behavior over long periods while other household duties are being accomplished [3].

In many environments, supervision of a child during an entire interval may not be viable. Momentary DRO may be a promising alternative to traditional whole-interval DRO because constant observation of the participant is not required. Momentary DRO requires only a “snapshot” of observation to determine if reinforcement will be provided or withheld. Previous research has demonstrated that momentary DRO can be effective at maintaining response suppression first achieved with whole-interval DRO. Repp, Barton, and Brulle [4] found that a 5-min whole-interval DRO suppressed classroom disruptions, and these results were maintained by the use of a 5-min momentary DRO for one student. Because only one participant was exposed to momentary DRO, additional replication was warranted. Barton, Brulle, and Repp [5] further demonstrated the effectiveness of momentary DRO at maintaining low levels of problem behavior that were previously suppressed in three schoolchildren by using whole-interval DRO.

Variable momentary differential reinforcement of other behavior (VM DRO) is a variation of momentary DRO in which the length of the interval throughout the session varies. In the minimal research that has been conducted on VM DRO, Lindberg, Iwata, Kahng, and DeLeon [6••] found that VM DRO is as effective as fixed-interval (FI) and variable-interval (VI) DRO schedules in reducing SIB maintained by social-positive reinforcement. However, this study was limited in that only two participants were included to compare the various DRO procedures. Additionally, only one participant was exposed to VM DRO alone. Results of this study warrant further investigation of VM DRO as a method of reducing problem behavior.

Kahng, Abt, and Schonbachler [7•] successfully implemented a VM DRO procedure to decrease low-rate, high-intensity aggression maintained by social-positive reinforcement for a 15-year-old girl with a profound intellectual disability. However, the purpose of the Kahng et al. study was to investigate the utility of an all-day FA; thus, limited information about the VM DRO procedures is provided. A more thorough description of VM DRO in this context may be warranted.

Lastly, Toussaint and Tiger [7•] expanded the use of the VM DRO procedure by successfully reducing covert SIB maintained by automatic reinforcement. In this study, automatically maintained skin-picking was reduced to near-zero rates, while the interval was extended to 5 min. Toussaint and Tiger discussed how VM DRO may be uniquely suited for the treatment of covert behaviors due to the nature of the procedure and the lack of observation required.

Even though limited research has been conducted on VM DRO, many beneficial factors have been recognized. VM DRO can be applied easily in most settings, results in high rates of reinforcement, and has been shown to be effective at reducing problem behavior. The effectiveness of VM DRO may be partially due to the criterion for reinforcement, given that it is less stringent than other DRO schedules [7•]. In Lindberg et al. [6••], VM DRO resulted in the highest rates of reinforcement for all three participants. Dot was exposed to FI, VI, and VM DRO schedules and reinforcement rates were 37%, 35%, and 58%, respectively. Jodi was also exposed to FI, VI, and VM DRO, and reinforcement rates were 65%, 55%, and 87%, respectively. Bridget was exposed only to VM DRO and earned over 90% of the scheduled reinforcers.

The purpose of the current study was to further evaluate VM DRO by determining its effectiveness in reducing problem behavior maintained by social-positive reinforcement and in maintaining treatment effects. Given the results demonstrated in previous literature [4, 5], when the VM DRO procedure was ineffective at decreasing the target behavior, an FI DRO procedure was introduced to reduce the problem behavior to near-zero rates. As in the majority of previous DRO literature, the terminal criterion for FI DRO was set at a 5-min interval. After the terminal criterion was reached, VM DRO was re-introduced as a maintenance procedure. This additional phase was used to determine if VM DRO was a viable option to maintain low rates of problem behavior previously produced by a whole-interval DRO procedure.

Method

Participants and Setting

Two individuals diagnosed with developmental disabilities and autism spectrum disorder (ASD) participated. Both participants were referred to a university-based behavior clinic for assessment and treatment of problem behavior. Charlie was a 5-year-old male diagnosed with ASD. He received 12 h of early intensive behavioral intervention (EIBI) services per week, delivered three times a week for 4 h each day. Dennis was a 4-year-old male with ASD who received 8 h of EIBI services per week, delivered two times a week. Both participants had a limited verbal repertoire. Charlie had no vocal communication and used an augmentative and alternative communication (AAC) device to express wants and needs. Dennis had a limited vocal repertoire composed mostly of vocal approximations used to obtain desired items. Dennis primarily used sign language and a picture communication system to express wants and needs.

All sessions were conducted in a therapy room equipped with a video camera and a one-way mirror that allowed observation from an adjacent room. In the observation room, two independent observers recorded instances of problem behavior in real time using the BDataPro collection software [8]. The therapy room contained session materials, a table, and two chairs. Sessions were conducted multiple times per day, 2–3 days per week, depending on each participant’s service schedule.

Response Measurement and Interobserver Agreement

The dependent measure was the frequency of problem behavior. These data were converted to responses per minute by dividing the number of target responses by the total session time. During baseline and treatment conditions, reinforcer access time was subtracted from the total session time to ensure the rate of problem behavior was not skewed by engagement with preferred items. Topographies of SIB for Charlie included the following: body hitting, defined as the participant’s hand (open palm or closed fist) making forceful contact with his own body; biting, defined as the participant’s teeth coming into contact with his own skin, with the skin between his teeth; and head banging, defined as the participant’s head making forceful contact with another object, body part, or surface. Yelling for Dennis was defined as any vocalization, with a clear beginning and ending, that was louder than the normal conversational level for a given setting. Each pause or new breath was considered a new instance of yelling.
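The rate conversion described above can be sketched as follows. This is a minimal illustration and not the software used in the study; the function name and the example values are hypothetical.

```python
def responses_per_minute(response_count, session_seconds, reinforcer_access_seconds=0.0):
    """Rate of problem behavior with reinforcer access time excluded.

    All names and example values here are illustrative assumptions.
    """
    adjusted_seconds = session_seconds - reinforcer_access_seconds
    if adjusted_seconds <= 0:
        raise ValueError("adjusted session time must be positive")
    return response_count / (adjusted_seconds / 60.0)


# Hypothetical session: 12 responses in a 10-min session that included
# 120 s of reinforcer access, leaving 8 min of adjusted session time.
print(responses_per_minute(12, 600, 120))  # 1.5
```

Without the adjustment, the same hypothetical session would yield 1.2 responses per minute, which understates the rate during the time in which the behavior could be expected to occur.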

Data collection was conducted by graduate students enrolled in an applied behavior analysis program. All graduate students were trained until they met a fidelity criterion of 90% interobserver agreement (IOA) using BDataPro. A second observer independently scored the occurrence of SIB during 34% of FA sessions, 57% of baseline sessions, and 52% of treatment evaluation sessions for Charlie. For Dennis, the occurrence of yelling was scored by a second independent observer during 40% of FA sessions, 38% of baseline sessions, and 41% of treatment sessions. To assess IOA, each session was partitioned into 10-s intervals, and the two observers’ records were compared on an interval-by-interval basis. Each interval in exact agreement received a score of 1, and in all other intervals, a proportional score was calculated by dividing the smaller number of instances by the larger number of instances. We then summed the interval scores, divided the sum by the total number of intervals, and converted the result to a percentage. The mean agreement scores of SIB for Charlie were 97% (range, 92 to 100%) during the FA, 92% (range, 80 to 100%) during baseline, and 96% (range, 77 to 100%) during treatment sessions. The mean agreement scores of yelling for Dennis were 100% during the FA, 95% during baseline (range, 87 to 100%), and 100% during treatment sessions. Procedural fidelity data were collected on the accuracy of implementation during 54% and 41% of treatment sessions for Charlie and Dennis, respectively. The mean procedural fidelity scores were 97% (range, 62 to 100%) for Charlie and 91% (range, 63 to 100%) for Dennis.
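The interval-by-interval agreement calculation can be illustrated with a short sketch. It assumes each observer’s record has already been reduced to a count of responses per 10-s interval; the function name and the example counts are hypothetical and do not come from the study data.

```python
def proportional_ioa(primary_counts, secondary_counts):
    """Percentage agreement across intervals.

    Intervals with identical counts (including 0 and 0) score 1;
    all other intervals score smaller count / larger count.
    """
    if len(primary_counts) != len(secondary_counts):
        raise ValueError("both observers must have the same number of intervals")
    if not primary_counts:
        raise ValueError("at least one interval is required")
    total = 0.0
    for a, b in zip(primary_counts, secondary_counts):
        if a == b:
            total += 1.0
        else:
            total += min(a, b) / max(a, b)
    return 100.0 * total / len(primary_counts)


# Hypothetical counts for six 10-s intervals (1 min of a session).
primary = [0, 2, 1, 0, 3, 0]
secondary = [0, 2, 2, 0, 3, 1]
print(proportional_ioa(primary, secondary))  # 75.0
```

In this example, four intervals agree exactly, one scores 0.5 (1 vs. 2), and one scores 0 (0 vs. 1), giving 4.5 out of 6, or 75%.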

Experimental Design

For Charlie, an ABCACBC reversal design was utilized to demonstrate experimental control of treatment. For Dennis, an ABAB reversal design was utilized to demonstrate experimental control of treatment. Intervention was implemented across participants to evaluate effectiveness of the VM DRO procedure.

Procedures

Preference Assessment

A multiple stimulus without replacement (MSWO; [9]) preference assessment was conducted with each participant. This assessment was used to identify an array of highly preferred and moderately preferred stimuli for use in both assessment and treatment (Table 1).

Table 1 Results of MSWO preference assessments for Charlie and Dennis

Functional Analysis

Prior to the study, an FA similar to that of Iwata, Dorsey, Slifer, Bauman, and Richman [10] was conducted to determine the function of Charlie’s SIB. Three conditions (attention, tangible, and play [control]) were conducted using a multielement design. Each condition lasted 10 min. Therapists wore condition-specific colored t-shirts to enhance discrimination of conditions. During the attention condition, the participant had access to moderately preferred leisure items and the therapist ignored the participant. Contingent on the target behavior, the therapist delivered a brief verbal reprimand and physical attention (e.g., hand on the back). During the tangible condition, the therapist provided access to preferred items for 30 s and then removed the items at the end of the interval. After removing the tangible items, the therapist interacted with all of the preferred stimuli in their desired fashion until the target behavior occurred, and then the items were delivered for 30 s. During the play condition, the participant had access to moderately preferred items and received a brief praise statement every 30 s, and no demands were placed on the participant.

Latency-Based Functional Analysis

Prior to the study, a latency-based functional analysis (LBFA) was conducted to determine the variables maintaining Dennis’ yelling. The therapist implemented an LBFA similar to that of Thomason-Sassi, Iwata, Neidert, and Roscoe [11•]. Conditions consisted of attention, play, tangible, and escape. All sessions were conducted in the same manner as a standard functional analysis [10], except that the first instance of the target behavior produced the prescribed consequence and terminated the trial. If the target response did not occur during the test trial, the session lasted 5 min. An LBFA was chosen for Dennis due to limited session time. Additionally, an LBFA reduced Dennis’ contact with maintaining reinforcers, which could otherwise lead to more persistent responding during subsequent treatment sessions.

Baseline

Procedures during baseline were identical to those of the FA condition in which the participant engaged in the highest rate of problem behavior. Baseline conditions were 10 min and consisted of a tangible condition for both Charlie and Dennis. The tangible baseline condition emulated that of a standard FA. The therapist provided access to preferred items for 30 s and then removed the items at the end of the interval.

VM DRO

Sessions in this condition were 10 min, consistent with baseline. The initial VM DRO schedules were based on the mean interresponse time (IRT) of problem behavior during baseline sessions. Mean IRT was calculated by dividing the total baseline session time by the total number of responses during the baseline condition and rounding to the nearest second. During VM DRO sessions, the participants were not blocked from engaging in the target behavior. All instances of problem behavior were ignored, or a therapist provided the minimal amount of attention needed to ensure client safety. For Charlie, a small blue pad was placed underneath his head to prevent injury during episodes of severe head banging on walls, floors, and other surfaces. Due to the severity of the SIB, termination criteria were set for Charlie. Termination criteria included bleeding, swelling, bruising, wounds, and concussion symptoms (dizziness, impaired balance, nausea). Only one session during treatment met termination criteria (session 128), due to redness and swelling on Charlie’s hand caused by several bites. Delivery of reinforcement was contingent on not engaging in the target behavior at the moment the interval ended. Reinforcement was withheld if the participant engaged in the target behavior at the moment the interval ended. The reinforcers used during the VM DRO condition were the same as those used during baseline conditions. As in baseline, reinforcer access time was subtracted from the total session time to ensure the rate of problem behavior was not skewed by engagement with preferred items.
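The mean IRT calculation used to set the initial schedule can be sketched as shown here. The function name and the example totals are hypothetical; rounding halves upward is an assumption, since the text specifies only rounding to the nearest second.

```python
def mean_irt_seconds(total_baseline_seconds, total_responses):
    """Mean interresponse time, rounded to the nearest whole second.

    Halves are rounded up here; that tie-breaking rule is an assumption.
    """
    if total_responses <= 0:
        raise ValueError("mean IRT is undefined without at least one response")
    return int(total_baseline_seconds / total_responses + 0.5)


# Hypothetical baseline: three 10-min sessions (1800 s) with 45 responses in total.
print(mean_irt_seconds(1800, 45))  # 40
```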

For each VM DRO session, experimenters calculated five intervals of 50%, 75%, 100% (mean), 125%, and 150% of the mean interresponse time. The therapist conducted momentary checks in a random counterbalanced order. Prior to the start of the session, the experimenter wrote all five intervals three times in a random order; during the session, the experimenter went down the list in sequential order, placing a mark after each momentary check occurred. If the experimenter reached the bottom of the list during a session, they returned to the top of the list and continued. Experimenters notified the therapist via a Bluetooth headset when a momentary check was due. This method was used to keep momentary checks discreet and unpredictable to the client. The mean interval duration was increased by 50% after every two consecutive sessions in which the rate of problem behavior was below the 90% reduction criterion (i.e., reduced by at least 90% relative to baseline). If the target behavior increased above this rate for two consecutive sessions, the mean interval was decreased to the previous level for the next session. For Charlie, advancement and regression criteria were altered to a more stringent schedule at session 109 (Fig. 2, indicated by an asterisk) because of a repeated pattern in the data from sessions 88 to 108. After analyzing the data, experimenters hypothesized that limited exposure to each interval and rapid schedule thinning may have prevented low levels of problem behavior from being maintained on thinner schedules of reinforcement. Advancement criteria for Charlie were increased to three consecutive sessions in which SIB was below the reduction line (0.2 RPM), while regression criteria were changed to four consecutive sessions in which SIB was above the reduction line (0.2 RPM). Advancement and regression criteria for Dennis were consistent with Toussaint and Tiger [7•] throughout the entirety of the study. Dennis’ 90% reduction criterion was set at 0.5 RPM.
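The interval set, the randomized check list, and the original two-session advancement and regression rule can be sketched together. This is an illustration under stated assumptions and not the procedure script used in the study: all function names are hypothetical, rounding intervals to whole seconds is an assumption, and the 300-s cap reflects the terminal criterion reported for this study.

```python
import random

# Proportions of the mean interval used for the five momentary checks.
MULTIPLIERS = (0.50, 0.75, 1.00, 1.25, 1.50)


def vm_dro_intervals(mean_interval_s):
    """Five interval lengths (s) centered on the mean; whole-second rounding is assumed."""
    return [round(mean_interval_s * m) for m in MULTIPLIERS]


def check_order(mean_interval_s, repeats=3, rng=None):
    """Each interval written `repeats` times, then shuffled into one list."""
    rng = rng or random.Random()
    order = vm_dro_intervals(mean_interval_s) * repeats
    rng.shuffle(order)
    return order


def next_mean_interval(current_s, previous_s, rates_at_current, criterion_rpm,
                       terminal_s=300):
    """Two-session rule: thin by 50% or step back to the previous level.

    `rates_at_current` should hold only sessions run at the current interval,
    so the same pair of sessions is never counted toward two decisions.
    """
    last_two = rates_at_current[-2:]
    if len(last_two) == 2 and all(r < criterion_rpm for r in last_two):
        return min(round(current_s * 1.5), terminal_s)
    if len(last_two) == 2 and all(r > criterion_rpm for r in last_two):
        return previous_s
    return current_s


print(vm_dro_intervals(40))  # [20, 30, 40, 50, 60]
print(next_mean_interval(40, 40, [0.1, 0.0], criterion_rpm=0.2))  # 60
print(next_mean_interval(60, 40, [0.6, 0.9], criterion_rpm=0.2))  # 40
```

Returning to the top of the list after the last check, as described above, could be handled by wrapping the shuffled list in `itertools.cycle`. The stricter three-session and four-session criteria later used for Charlie would require changing the window size in `next_mean_interval`.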

Fig. 1 In the top panel are responses per minute of SIB during the attention, play, and tangible conditions of Charlie’s multielement functional analysis. In the bottom panel are the latencies to yelling during the attention, escape, play, and tangible conditions of Dennis’ multielement latency-based functional analysis.

FI DRO

There were only two differences between the FI DRO and VM DRO schedules. First, in the FI schedule, reinforcement was contingent on the participant not engaging in the target behavior for the entire interval. If the participant engaged in the target behavior at any time during the interval, reinforcement was withheld. Second, during the FI schedule, the interval length was constant throughout the session. Furthermore, for Charlie, FI DRO session time was increased to 15 min at session 118 (Fig. 2). Session time was increased because the FI DRO schedule had been thinned to almost 3 min, and experimenters wanted to ensure that multiple intervals, and therefore multiple opportunities to contact reinforcement, occurred during a single session.
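The difference between the two reinforcement criteria can be made concrete with a brief sketch. It assumes responses are recorded as timestamps in seconds from session start, and it treats a response as occurring "at the moment" of a check if it falls within a small tolerance of the check time. That tolerance, like the function names and example values, is an illustrative assumption and is not a parameter reported in the study.

```python
def whole_interval_reinforce(response_times, interval_start, interval_end):
    """FI (whole-interval) DRO: reinforce only if no response occurred in the interval."""
    return not any(interval_start < t <= interval_end for t in response_times)


def momentary_reinforce(response_times, check_time, tolerance_s=1.0):
    """Momentary DRO: reinforce unless a response coincides with the check.

    `tolerance_s` is an assumed way to operationalize "at the moment".
    """
    return not any(abs(t - check_time) <= tolerance_s for t in response_times)


# Hypothetical 60-s interval with responses at 12.0 s and 47.5 s.
responses = [12.0, 47.5]
print(whole_interval_reinforce(responses, 0.0, 60.0))  # False: reinforcement withheld
print(momentary_reinforce(responses, 60.0))            # True: reinforcement delivered
```

The same record of behavior therefore earns reinforcement under the momentary criterion but not under the whole-interval criterion, which illustrates why the momentary criterion is the less stringent of the two.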

Results

Charlie

Pre-Assessments

The results of the MSWO indicated that books, cars, and animals were all highly preferred stimuli for Charlie (Table 1). Additionally, the results of the FA for Charlie indicated that SIB was maintained by social-positive reinforcement in the form of access to tangibles (Table 2). Initially, there was no responding in the attention and play conditions. During the second attention session, the rate of SIB increased; however, in subsequent attention sessions, responding decreased and was variable. Responding eventually decreased to 0 for three consecutive sessions from sessions 27 to 29.

Treatment

Treatment results for Charlie are displayed in Fig. 2. The data show that Charlie’s baseline rates of SIB were steadily increasing. Based on the mean IRT during baseline, the initial VM DRO schedule was set at 40 s. During the first VM DRO treatment phase, rates of problem behavior decreased quickly. A spike in SIB occurred once the VM DRO schedule was thinned to 60 s, and responding persisted even after the schedule was returned to the initial interval. Due to the increase in responding, FI DRO was implemented, and rates of SIB decreased quickly, with a few spikes in responding following schedule thinning. Once responding remained low, baseline was reintroduced to demonstrate control of the FI DRO over the rates of SIB. Rates of responding in the second baseline were variable but increased steadily from session 66 to session 68. Upon returning to the second FI DRO treatment condition, there was an initial spike in SIB followed by a rapid reduction in responding. Rates of SIB remained low until a repeated variable pattern of responding occurred between sessions 88 and 108. Following this pattern of responding, advancement and regression criteria were changed to a more stringent schedule (Fig. 2, indicated by the asterisk). Rates of SIB in subsequent sessions remained low, and the schedule was thinned to the terminal criterion of 300 s. While the DRO schedule was being thinned, sessions were increased to 15 min (session 118) to allow Charlie more exposure to the DRO schedule and more opportunities to contact reinforcement. VM DRO was implemented following mastery of FI DRO 300 s to determine if VM DRO could maintain the low levels of responding produced by an FI DRO procedure. Rates of SIB increased quickly; thus, FI DRO was re-introduced, reducing rates of SIB back to near-zero levels.

Dennis

Pre-Assessments

The results of the MSWO indicated that bubbles, ball, and scooter were all high-preference stimuli for Dennis (Table 1). During the first 12 sessions of the multielement LBFA, responding occurred only in one tangible session. Due to the lack of responding, experimenters conducted a pairwise LBFA to enhance discrimination between functional analysis conditions. The pairwise LBFA consisted of play and tangible conditions. Once the pairwise design was used, responding consistently occurred in the tangible condition, while responding in the play condition occurred during only one of the four sessions. The results of the LBFA for Dennis indicated that his yelling was maintained by social-positive reinforcement in the form of access to tangibles (Table 2).

Treatment

VM DRO treatment results for Dennis are displayed in Fig. 2. These data show that Dennis’ baseline rates of yelling were steadily increasing. Once the VM DRO procedure was implemented, an immediate and sustained decrease in yelling occurred while the schedule was thinned to 140 s. The return to baseline was associated with an increase in responding and a higher rate of yelling compared with the first baseline. Responding was variable but centered around 9.0 RPM. During the second VM DRO phase for Dennis, the initial schedule was set at 19 s. An immediate reduction in rates of yelling occurred once the VM DRO schedule was implemented. Responding stayed at near-zero rates while the schedule was thinned to the terminal criterion of 300 s. The VM DRO was successful at reducing yelling for Dennis.

Fig. 2 In the top panel are responses per minute of SIB during baseline, VM DRO, and FI DRO conditions for Charlie. The numbers above the data points indicate the mean VM DRO or FI DRO interval duration (in seconds). The horizontal dashed line is at 0.2 RPM and depicts the 90% reduction criterion used for decision making. An asterisk indicates a change in advancement and regression criteria, and the plus sign indicates an increase in session time to 15 min. In the bottom panel are responses per minute of yelling during baseline and treatment conditions for Dennis. The numbers above the data points indicate the mean VM DRO interval duration (in seconds). The horizontal dashed line is at 0.5 RPM and depicts the 90% reduction criterion used for decision making.

Concluding Remarks

In the current investigation, high-rate problem behaviors were reduced to near-zero rates through the use of FI and VM DRO procedures. The results of this study suggest that the success of VM DRO procedures may be idiosyncratic. For Charlie, VM DRO did not reduce SIB. Furthermore, the VM DRO did not maintain low rates of SIB that were produced by FI DRO. The results for Charlie are inconsistent with previous research on VM DRO effectiveness to reduce problem behavior maintained by social-positive reinforcement [6••, 7•]. Additionally, the results suggest that VM DRO may not be as effective as a momentary DRO at maintaining low rates of problem behavior produced by an FI DRO [5]. The VM DRO procedure may not have been effective for Charlie due to levels of reinforcement during treatment. Charlie only received 78% of reinforcement opportunities during VM DRO treatment conditions. Furthermore, personal characteristics of Charlie may have hindered the effectiveness of the VM DRO procedures. Since Charlie had a very limited mand repertoire, he may have engaged in SIB as a form of manding for the desired tangibles. Finally, since an alternative behavior was not trained and reinforced during treatment sessions, SIB may have persisted due to the potential of intermittent reinforcement during VM DRO sessions.

For Dennis, the VM DRO procedure was successful at reducing rates of yelling while the VM DRO schedule was thinned to a 300-s interval. The results for Dennis are consistent with previous research [6••, 7•]. The outcomes of the current study suggest that VM DRO procedures may be a viable method for some individuals whose problem behavior is maintained by social-positive reinforcement. The VM DRO procedure may have been successful for Dennis because he earned 98% of reinforcement opportunities. In addition to contacting almost all reinforcement opportunities, Dennis often used vocal approximations to mand for highly preferred tangibles during treatment sessions. These mands could have contacted adventitious reinforcement during VM DRO sessions and persisted in subsequent treatment conditions.

One limitation of the current study is that FI and VM DRO procedures do not train and reinforce a socially valid alternative behavior. Given the limited verbal repertoire of both participants, a differential reinforcement of alternative behavior (DRA) procedure may have been a superior method of reducing the problem behavior while increasing a functional communication response (FCR). Such methods could produce more significant gains in client outcomes.

A second limitation of the study is that only one participant (Charlie) was exposed to VM DRO as a maintenance procedure following a reduction of problem behavior produced by FI DRO. It could be possible that the outcomes demonstrated by Repp et al. [4] and Barton et al. [5] using momentary DRO can also be produced by VM DRO. Further investigation is needed in this area.

A third limitation of the study is the criterion for reinforcement under VM DRO. Because the target behavior can occur in close temporal proximity to reinforcer access, the problem behavior could contact adventitious reinforcement. This could cause the behavior to persist during the VM DRO schedule and even strengthen the behavior over time. Clinicians should consider this possibility when using a momentary or VM DRO procedure for a client. In future studies using a VM DRO schedule, a programmed delay or a momentary window in which the check takes place may improve the procedure. By using a momentary window, the therapist can ensure that the target behavior does not occur in close temporal proximity to reinforcer access.

An area of future investigation for VM DRO would be social validity measures. It may be possible that VM DRO would be more accepted by parents for generalization because the procedure is flexible and does not require constant observation. However, social validity measures and further research are required to test these suggestions.

In conclusion, this study demonstrated that VM DRO procedures may not be effective at reducing problem behavior for all individuals and that outcomes may be idiosyncratic. Initial outcomes suggest that VM DRO may not be effective at maintaining results produced by an FI DRO, but further investigation is required. The findings add to the limited research on VM DRO and further support that it may be effective at reducing problem behavior maintained by social-positive reinforcement for some individuals.