A token economy is a flexible and comprehensive reinforcement-based behavioral intervention in which individuals earn conditioned reinforcers, or tokens, as immediate consequences for specified behaviors. Learners later exchange earned tokens for highly valued items or activities referred to as backup reinforcers. Token economies are implemented to motivate and reinforce appropriate behaviors in numerous instructional and therapeutic contexts (e.g., early intervention, K–12 schooling, residential facilities) because they have many advantages. For example, token systems are often used when the immediate delivery of tangible reinforcers (e.g., edibles, leisure items) is impractical or may interrupt the continuity of work (Kazdin & Bootzin, 1972). In addition, token systems are portable and customizable, and they can assist learners in tolerating delays to backup reinforcers (Kazdin & Bootzin, 1972). As such, token economies are among the most widely used procedures in behavior analysis, with research on token economies spanning over 80 years (Hackenberg, 2009, 2018; Matson & Boisjoli, 2009). In fact, tokens are currently among the most frequently delivered programmed consequences in educational settings with individuals with intellectual and developmental disorders (IDD), second only to praise/attention (Graff & Karsten, 2012).

Given their ubiquity, it is unsurprising that most textbooks in applied behavior analysis (ABA) provide guidance on how to effectively arrange token systems. For example, ABA textbooks often outline the essential components of token economies, suggest how to establish tokens as conditioned reinforcers, and describe how to implement token economies (e.g., Cooper et al., 2020; Miltenberger, 2015). Token training procedures recommended in textbooks and described in research vary widely, but often consist of several common steps. Training sometimes begins with a token conditioning phase, such as in Leon et al. (2016), during which tokens were presented along with backup reinforcers across several trials (i.e., stimulus–stimulus pairing). Training might also include exchange training, designed to establish an exchange response (e.g., handing the token to the therapist) as the final link in the chain leading to the backup reinforcer. For example, Fernandez et al. (2022) provided participants with a noncontingent token, which they then exchanged for a backup reinforcer by placing the token in the therapist’s hand.

Other studies have examined the schedule components comprising token reinforcement procedures. The token production schedule defines the work requirement to earn a token, and the exchange-production schedule specifies the contingencies that produce exchange opportunities (Hackenberg, 2018). For example, DeLeon et al. (2014) required participants to complete an academic task or activity of daily living to produce a token. The exchange-production schedule was then systematically increased, teaching the learner that they must accumulate a certain number of tokens before they can exchange them and how to store tokens. Argueta et al. (2019) systematically increased the exchange-production schedule across conditions. Each of these procedural components of a token economy can exert independent effects on behavior, and requires careful consideration to ensure desirable effects.
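The relation between the two schedule components described above can be illustrated with a minimal sketch. It assumes simple fixed-ratio (FR) requirements for both token production and exchange production; the class name, field names, and schedule values are hypothetical and are not drawn from any of the cited studies.

```python
from dataclasses import dataclass


@dataclass
class TokenEconomy:
    """Hypothetical fixed-ratio (FR) token arrangement; all names are illustrative."""
    token_production_fr: int     # responses required to earn one token
    exchange_production_fr: int  # tokens required to produce an exchange opportunity
    responses: int = 0           # responses made since the last token
    tokens: int = 0              # tokens currently held
    exchanges: int = 0           # exchange opportunities produced so far

    def respond(self) -> str:
        """Record one target response and report what it produced."""
        self.responses += 1
        if self.responses < self.token_production_fr:
            return "none"
        # Token-production requirement met: deliver one token.
        self.responses = 0
        self.tokens += 1
        if self.tokens < self.exchange_production_fr:
            return "token"
        # Exchange-production requirement met: all tokens are traded in.
        self.tokens = 0
        self.exchanges += 1
        return "exchange"


# FR 2 token production, FR 3 exchange production (arbitrary example values).
economy = TokenEconomy(token_production_fr=2, exchange_production_fr=3)
events = [economy.respond() for _ in range(6)]
print(events)
```

Under these assumed values, six responses yield three tokens and one exchange opportunity. Raising `exchange_production_fr` while holding `token_production_fr` constant corresponds to increasing the exchange-production schedule alone.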

Researchers also recommend a variety of research-based solutions for addressing common issues that can arise when using token economies. For example, to maintain the efficacy of the backup reinforcers, Hine et al. (2017) recommend conducting frequent assessments of preferences among potential reinforcers and routinely rotating the items available as backup reinforcers. Along these lines, Moher et al. (2008) found that levels of responding were less susceptible to fluctuations in motivating operations (MOs) with generalized reinforcers (i.e., tokens exchangeable for more than one backup reinforcer). To increase the value of backup reinforcers, one might restrict access to those items outside of the context of the token economy. Several studies have shown that free access to reinforcers outside of the context in which they must be earned (i.e., an “open economy”) can reduce levels of responding within the earning context (e.g., Kodak et al., 2007; Roane et al., 2005), although these studies did not always use a token economy. Hine et al. (2017) further discussed issues that may arise from token production, exchange production, and token-exchange schedules, such as ratio strain (i.e., cessation of the target behavior because the response requirements in the token production schedule are too high).

Published research, like that summarized above (see also Hackenberg, 2018), is the best source of evidence-based recommendations for implementing a token economy. However, practitioners and educators seeking information on token economies may not always have ready access to the published literature, as the research is published in various journals and sometimes hidden behind paywalls. And even when it is published, studies often fail to describe their token training methods with sufficient detail to guide practice (e.g., Ivy et al., 2017). Add to this the incomplete and varied recommendations from manuals and online resources, such as how-to guides (e.g., Ackerman et al., 2020; First Path Autism, n.d.; Leaf et al., 2012; National Professional Development Center on Autism Spectrum Disorder, n.d.), and the result is wide variation in the recommendations for setting up and maintaining a token economy. As it currently stands, there is no established “gold standard” for evidence-based recommendations.

It would therefore not be surprising to find wide variation as well in the ways in which token economies are actually implemented. This is supported by our anecdotal observations in educational settings: the procedures often bear only superficial resemblance to the procedures described in research and behavior analysis textbooks. Still, our observations are based on a limited sample, and more data are needed. To that end, the purpose of the present study was to survey common practices for establishing and using token economies in clinical and instructional settings among BACB certificants. In particular, respondents were asked about the contexts in which tokens are used, how tokens are established as reinforcers, the schedules used to train and establish the token system, how backup reinforcers are selected, and how and under what conditions token economies are discontinued. The aim was to gather much-needed data on the state of current practices, as a means of assessing the extent to which applications differ from research-based recommendations.

General Method

Subjects

We recruited participants through the Behavior Analyst Certification Board (BACB) mass email service. The invitation to participate in the survey was sent to 32,114 individuals within the United States who were certified as a board certified assistant behavior analyst (BCaBA), board certified behavior analyst (BCBA), or board certified behavior analyst-doctoral (BCBA-D). We will refer to BCaBAs, BCBAs, and BCBA-Ds for the remainder of this article as respondents. In addition, the survey was only sent to respondents whose primary emphasis was behavior analysis, positive behavior supports, and education, and whose primary area of work was with autism, developmental disabilities, or special education. To meet the criteria for inclusion in the survey, respondents had to answer that they currently use token economies with individuals with neurodevelopmental disorders in clinical or educational settings.

Survey Structure and Contents

The study used a 50-item survey (available in the supplementary information) to identify common practices for establishing and implementing token economies used by respondents. The content of the survey questions and available responses were drawn from token economy literature and practices commonly observed in clinical practice by the researchers. The development process for the survey included two rounds of pilot testing by three graduate students and six BCBAs (i.e., one graduate student and two BCBAs per pilot test) who had experience using token economies as part of research or clinical protocols. These individuals were excluded from further participation during the data collection phase. Feedback derived from the pilot tests, such as clarifications to questions and response options, was incorporated into the survey. The first section of questions inquired about the use of token economies and the context for that use. The next section inquired about token training and materials, including components of token training used, the stimuli used as tokens, and how backup reinforcers were identified. A third section asked about how the schedule components of a typical token economy (i.e., token-production schedule, exchange-production schedule, token-exchange schedule) were determined and if, and how, token economies were faded out or discontinued. In acknowledgement that respondents may arrange token economies differently across individuals in their care, many of the questions began with “What is your default strategy for. . . .” We also inserted the following language in the instructions: “Although we understand that token economies are frequently tailored to the individual learner, we would like to know more about your default strategies. Please consider what you might typically arrange when responding to questions which refer to the learner.” The final section asked for respondent demographics (e.g., certification level, degrees, work settings).

General Procedures

The survey was conducted through Qualtrics as a data collection platform and participants could access the survey for 4 weeks. We distributed an invitation to participate which included informed consent within the body of the email sent by the BACB. The email also described that the survey would take approximately 10–15 min to complete. Clicking the provided hyperlink at the end of the email was considered consent to participate and directed participants to the survey proper.

Data Analysis

The percentage of participants selecting each response option, or combination of response options, was calculated out of the total number of participants who answered each question. There was, thus, a variable number of participants responding to each question based on their previous responses. Most of the survey items adopted a yes/no or multiple-choice format. Multiple choice questions always contained an “other” option in which the participant could elaborate on their answer textually. Many questions served as a screener for follow-up questions, thus not all participants encountered all 50 survey questions.
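The per-question denominator described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' analysis script: the function name and the data are made up, and `None` is assumed to mark a respondent who skipped or never encountered the question.

```python
from collections import Counter


def response_percentages(answers):
    """Percentage choosing each option, out of respondents who answered.

    `answers` holds one entry per respondent. None marks a respondent who
    skipped or never saw the question and is excluded from the denominator.
    """
    answered = [a for a in answers if a is not None]
    if not answered:
        return {}
    counts = Counter(answered)
    return {option: round(100 * n / len(answered), 1) for option, n in counts.items()}


# Made-up data, not survey results: five respondents, one of whom skipped the item.
example = ["yes", "no", "yes", None, "yes"]
print(response_percentages(example))  # {'yes': 75.0, 'no': 25.0}
```

Because the denominator is the number of respondents who answered a given question, percentages for different questions are not based on the same number of respondents.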

Results

Demographic data are shown in Table 1. We received 364 responses (a 1.1% return rate); however, the bounce rate, or the rate at which survey emails were returned as undeliverable, is unknown. Data sets were not included in data analysis if responses were not recorded (n = 32) or the respondent indicated that they did not use token economies with individuals with neurodevelopmental disorders (n = 23). In total, there were 309 respondents that began and responded to some portion of the survey and 255 respondents that completed the survey (a 17.5% attrition rate). Data were recorded and included in data analysis from respondents who started but did not complete the survey. The majority of respondents held a master’s degree (83.9%) and were certified as BCBAs (86.3%). This is consistent with distributions published by the BACB, which reveal that roughly 86% of respondents above the registered behavioral technician level in the United States are BCBAs (Behavior Analyst Certification Board, n.d.). The area in which respondents earned their highest degree was in behavior analysis (51.4%; Table 1). A plurality of respondents, 38.4%, worked in public or private school (K–12) settings and 25.1% worked in home-based programs.
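The return and attrition rates reported above can be checked from the counts given in the text. The sketch below assumes that attrition is the share of included respondents who did not complete the survey, which is our reading of the reported figure; the variable names are illustrative.

```python
invited = 32_114     # invitations sent through the BACB email service
received = 364       # responses received
no_data = 32         # responses with nothing recorded
not_users = 23       # respondents who did not use token economies
completed = 255      # respondents who completed the survey

included = received - no_data - not_users
return_rate = 100 * received / invited
# Assumption: attrition = included respondents who started but did not finish.
attrition = 100 * (included - completed) / included

print(included, round(return_rate, 1), round(attrition, 1))  # 309 1.1 17.5
```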

Table 1 Demographic information by highest degree earned, level of certification, and work setting

Figure 1 shows the contexts surrounding the use of token systems (Fig. 1A) and the type of behavior targeted for training (Fig. 1B). A majority of respondents (54.4%, Fig. 1A) reported using token economies across all contexts offered: skill acquisition, skill maintenance, and behavior reduction. In addition, most respondents used a target response that was in the learner’s repertoire (i.e., a mastered task) during token training (74.7%), with about 21.8% indicating that the target behavior used in production training was “a task currently under acquisition” (Fig. 1B).

Fig. 1

Questions on the context in which token economies are used and which target behavior is used during token training. Note. Dark grey bars represent single item responses and light grey bars represent responses to a combination of items. Panel A: This figure shows the distribution of responses when certificants were asked about the context in which token economies were implemented. Panel B: This figure illustrates the target responses used during the token production component of token training

Figure 2 presents data on when token training is conducted (Fig. 2A), when training is terminated (Fig. 2B), and when token economies are terminated (Fig. 2C). Figure 2A reveals that the majority of respondents reported using specific procedures to establish tokens as conditioned reinforcers, either always (35%) or when the learner did not have prior experience with token economies (45.6%). Token training was usually terminated once the learner met a specified mastery criterion (69.4%; Fig. 2B). A majority (59.7%) of respondents report that they do not conduct a reinforcer assessment following token training to verify that tokens function as a conditioned reinforcer. Finally, a majority of respondents reported discontinuing token economies when appropriate (86.7%) (e.g., when the learner’s services are to be discontinued). The typical strategy used is schedule thinning (62%; Fig. 2C) rather than replacing it with a different intervention (33.5%; Fig. 2C).

Fig. 2

Questions on when to conduct token training, when to terminate token training, and how to discontinue the use of token economies. Note. Dark grey bars represent single item responses and light grey bars represent responses to a combination of items. The other combinations category listed above included several unique arrangement combinations, each of which was rarely selected. Panel A: This figure illustrates the conditions under which certificants reported initiating token training. Panel B: This figure shows the conditions under which certificants reported terminating token training. Panel C: This figure describes the strategies used when discontinuing the use of token economies

Figure 3 includes data from questions concerning token training. Respondents who reported conducting token training were most likely to use a combination of verbal statements of the contingencies, pairing or conditioning procedures, production training, and accumulation training (57.8%; Fig. 3A). It is interesting that of all the token training variations reported, 60% of respondents did not include token exchange training. Of the 272 respondents who included a pairing or token conditioning procedure (i.e., a method for establishing a relation between tokens and backup reinforcers) in token training, 15.1% used stimulus–stimulus (SS) pairing, in which the token and backup reinforcer are presented together, and 27.9% used response–stimulus–stimulus (RSS) pairing, in which a target response is required before the token and backup reinforcer are presented together (Fig. 3B). It is interesting that five respondents (about 2%) selected the “other” response and reported using a token conditioning procedure in which a response is required to produce a token, which is then immediately exchanged for a backup reinforcer (here termed response–stimulus plus exchange (RSE) conditioning).

Fig. 3

Questions on token training components and procedures. Note. Dark grey bars represent single item responses and light grey bars represent responses to a combination of items. The other combinations categories listed above included several unique arrangement combinations, each of which was rarely selected. Panel A: This figure depicts the different components that certificants reported using during token training. Panel B: This figure shows the distribution of pairing methods that certificants reported implementing. Please note that responses reporting the use of response-stimulus plus exchange (RSE) fell within the “other” category. Panel C: This figure illustrates the methods used by certificants to teach learners to exchange tokens. Panel D: This figure shows the methods used by certificants to teach learners to produce tokens

When training token exchange and accumulation, nearly 65% of respondents (Fig. 3C) reported using forward chaining alone or in combination with a verbal description of the contingencies. That is, one token is initially available for exchange. Once the learner reliably exchanges the token independently, the number of tokens delivered is increased across trials, sessions, or phases. In backward chaining, multiple tokens are delivered noncontingently and exchanged for the backup reinforcer. Only 13.4% of respondents reported using backward chaining alone or in combination with a verbal description. When training token production, nearly 40% of respondents (Fig. 3D) required a target response to produce one token, which was immediately exchanged. The number of tokens the learner could produce prior to exchange was then increased across trials or sessions.

Figures 4 and 5 contain data from questions concerning token selection and token exchange, respectively. Figure 4A shows that stimuli used as tokens in clinical settings were most often selected (70.1%) based on the learner’s interests (e.g., SpongeBob tokens, Paw Patrol tokens). Following token production, learners were typically allowed to manipulate the token (83.1%) and tokens were placed on a token board (80.8%; Fig. 4B). In most cases, learners were allowed to exchange their tokens once a specified exchange-production schedule had been met (62.5%; Fig. 5A). That is, the learner accumulated the number of tokens required to produce an exchange opportunity. Still, a sizeable proportion (10.3%) indicated that tokens were exchanged as they were earned. Tokens were usually exchanged all at once rather than one at a time (70.3%; Fig. 5B). It is interesting that 11.9% of respondents reported that tokens were never exchanged (Fig. 5A). That is, the backup reinforcer was automatically delivered once the exchange-production schedule was met. In these cases, tokens were used in clinical settings to mark the number of responses left before the end of the work session rather than as part of a traditional token economy.

Fig. 4

Questions on stimuli to be used as tokens and where tokens are accumulated. Note. Panel A: This figure depicts how stimuli to be used as tokens were selected. Panel B: This figure shows where tokens were placed once they were earned

Fig. 5

Questions on token exchange. Note. Panel A: This figure depicts the strategies reported by certificants on when tokens were exchanged. Panel B: This figure illustrates who (therapist or learner) exchanged the tokens and how they were exchanged. Panel C: This figure shows the strategies used by certificants to determine the token exchange schedule. Panel D: This figure depicts whether certificants reported using a consistent unit price, or the ratio of the response requirement to the magnitude of reinforcer

Figure 6 contains data from questions concerning the backup reinforcers. When identifying backup reinforcers, 37.3% of respondents (Fig. 6A) reported using some combination of interviews or surveys, direct observation, and preference assessments. Of 262 respondents, 32.8% reported conducting preference assessments (direct or indirect) multiple times a day to identify new or additional backup reinforcers (Fig. 6B). Backup reinforcers included any mix of one or more edible items, one leisure item/activity, and a break opportunity (22.2%; Fig. 6C), and access to these items was restricted across all environments nearly half of the time (46.6%) or within the clinical context (47%; Fig. 6F). Further analysis of the backup reinforcers reported revealed that 76.3% of respondents reported using a break from instruction as a backup reinforcer in combination with other backup items. Backup reinforcers were not typically displayed using a token store (65.1%). However, for those respondents who reported using a token store, backup reinforcers were most often displayed in a pictorial format (51.7%; Fig. 6E). Clients typically selected their preferred backup reinforcer prior to session (55.6%; Fig. 6D). That is, clients were allowed to preselect the backup reinforcer for which tokens would be exchanged contingent upon completion of the work requirement. A majority of respondents reported communicating to the client how much one unit of the backup reinforcer costs (58.9%), and prices of backup reinforcers were typically determined based on the client’s preference hierarchy (47.7%; Fig. 5C). If tokens were exchanged for access to leisure items or activities, 51.7% of respondents reported that tokens were exchanged for a predetermined amount of time based on the number of tokens accumulated (Fig. 5D). For example, each token could be exchanged for 30-s access to the client’s iPad. However, 48.3% of respondents reported that clients gained access to leisure items or activities for some set amount of time regardless of the number of tokens accumulated. In these cases, there is no direct relationship between tokens earned and the price of each unit of the backup reinforcer (i.e., unit price).
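The difference between the two exchange arrangements can be made concrete with a small sketch. Unit price is computed here as responses required per second of access, following the definition of unit price as the ratio of the response requirement to reinforcer magnitude. The 30-s value comes from the example above; the 120-s fixed duration, the one-response-per-token assumption, and all function names are hypothetical.

```python
def proportional_access(tokens, seconds_per_token=30):
    """Access time scales with tokens earned (30 s per token, as in the example)."""
    return tokens * seconds_per_token


def fixed_access(tokens, fixed_seconds=120):
    """Access time is the same however many tokens were earned (120 s is hypothetical)."""
    return fixed_seconds


def unit_price(responses, access_seconds):
    """Responses required per second of reinforcer access."""
    return responses / access_seconds


# Assumes one response per token (an FR 1 token-production schedule).
token_counts = [2, 4, 8]
proportional_prices = [unit_price(n, proportional_access(n)) for n in token_counts]
fixed_prices = [unit_price(n, fixed_access(n)) for n in token_counts]

print(proportional_prices)  # same value at every token count
print(fixed_prices)         # rises as more tokens are required
```

Under these assumptions, the proportional arrangement holds unit price constant as the token requirement grows, whereas the fixed-duration arrangement makes each second of access more expensive.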

Fig. 6

Questions on backup reinforcers. Note. Dark grey bars represent single item responses and light grey bars represent responses to a combination of items. The other combinations categories listed above included several unique arrangement combinations, each of which were rarely selected. Panel A: This figure shows the methods used by certificants to identify potential backup reinforcers. Panel B: This figure illustrates how often certificants reported conducting preference assessments. Panel C: This figure depicts the variety of backup reinforcers that tokens were reportedly exchanged for. Panel D: This figure shows the strategy used by certificants on how and when backup reinforcers were selected. Panel E: This figure illustrates the modality used to present the available backup reinforcers if a token store was used. Panel F: This figure depicts whether certificants implemented a closed economy

Discussion

This study assessed self-reported practices used by respondents when establishing and implementing token economies in applied settings. The survey results show broad agreement in the time and care put into token training, the use of systematic assessments to identify backup reinforcers and set schedule parameters, and the use of generalized tokens exchangeable for a variety of backup reinforcers. However, the survey also revealed considerable variability in the specific ways in which token economies are implemented, consistent with our anecdotal observations. Variation was seen across even the most basic components of a token economy, including the methods for establishing the tokens as reinforcers, the use of generalized reinforcers, the types of schedules and tokens used, and the types of behavior targeted for reinforcement. These will be discussed in turn.

The methods for establishing the tokens as reinforcers varied widely across respondents, with the majority involving some type of pairing procedure, in which tokens are presented together with a backup reinforcer. It is interesting that only five respondents (about 2%) indicated they use the response–stimulus–exchange conditioning procedure. To our knowledge, this method has not been explicitly evaluated in the applied token literature, but it is most similar to practices in the basic literature, in which token training typically begins with training the exchange response followed by training of the token production response. This procedure is also most similar to the procedures used effectively in establishing nontoken stimuli as conditioned reinforcers (Dozier et al., 2012; Holth et al., 2009) that have called into question the efficacy of simple pairing procedures. Future research should include direct comparisons of these different training methods as a basis for evidence-based recommendations.

A majority of respondents reported using a healthy range of backup reinforcers, ensuring generalized reinforcing functions of the tokens. This is consistent with research recommendations (Hackenberg, 2018). In combination with edibles and leisure items/activities, many in the current survey reported using a break from instruction as a backup reinforcer. Although most backup reinforcers are positive reinforcers, if the learner engages in escape-maintained behavior, it may be advisable to use the functional reinforcer (i.e., a break from instruction) as a backup reinforcer. For example, Wadsworth et al. (2015) used the results of a functional behavior assessment as the basis for backup reinforcer selection. They determined that noncompliance with academic demands was maintained by escape, and a break from instruction was used as a backup reinforcer for two participants. Compliance with academic demands increased for both participants when compared to baseline. However, contingent upon meeting the exchange-production schedule, the participants were allowed to take a break and no formal token exchange occurred. Therefore, the token economy in Wadsworth et al. (2015) functioned more like a progress bar in which the token board merely signaled how much work needed to be completed before the participant could escape the academic context rather than a traditional token economy. It should be noted that the consumption of positive backup reinforcers inherently involves a break from token-earning requirements. However, it is possible that a break might be a sufficiently effective backup reinforcer, even in the absence of positive reinforcers, in a token economy such as in Wadsworth et al. (2015). Future research should compare the effects of a traditional token economy, in which tokens are exchanged for tangible items, and a progress bar, in which a formal exchange does not occur.

Future research should also compare token schedules, in which the number of tokens is correlated with the magnitude of the backup reinforcer in the exchange period, and standard chain schedules, with a single reinforcer in the exchange period. Basic research on token schedules is based mainly on true token schedules (Hackenberg, 2009, 2018). By contrast, the majority of token schedules in applied token economies are chain schedules rather than true token schedules. Although respondents were not directly queried on this distinction, the findings that 70% of respondents indicated that tokens are exchanged all at once instead of one at a time, and that 48% answered that tokens are exchanged for a specific duration of access to backups regardless of the number of tokens earned, suggest the use of chain rather than true token schedules. In basic research, token schedules generate higher levels of responding than otherwise comparable chain schedules (Bullock & Hackenberg, 2015). Thus, although additional applied research is needed, the available evidence suggests using true token rather than chain schedules.

About 20% of respondents indicated using an acquisition task during production training. This introduces a level of complexity insofar as use of acquisition tasks in this context makes it challenging to disentangle the source of difficulty when token systems seem ineffective. In particular, failures to respond efficiently may be related to suboptimal token training, but they also may be related to skills deficits such that poor responding might have been observed even under the best of circumstances (i.e., the learners simply did not have the foundational skills to complete the task, independent of whether the tokens were effective consequences). Furthermore, few respondents reported conducting a reinforcer assessment to independently test the efficacy of tokens following the token training procedures. Reinforcer assessments are probably more common in research than in clinical practice, but they serve to ascertain that the token training procedures did in fact result in a consequence that could support responding. In the absence of a reinforcer assessment, it remains difficult to discern whether any subsequent failures of token systems to enhance target responding are attributable to skills versus performance deficits.

The tokens most commonly used were based on the learner’s interests (i.e., interest-based tokens), like a favorite character or animal, rather than novel tokens. However, there is little research on interest-based tokens (Carnett et al., 2014; Charlop-Christy & Haymes, 1998). Although both studies found that interest-based tokens produced greater therapeutic effects than a previously existing token economy (Charlop-Christy & Haymes, 1998) and an arbitrary token (i.e., the stimulus used as a token was selected arbitrarily; Carnett et al., 2014), neither article measured the effects of token-directed behavior with respect to each token type. That is, a token may evoke behavior that does not directly lead to the storage or exchange of the token, such as tapping the token in a stereotypic fashion. For example, nonhuman subjects have been shown to mouth the token if it was previously paired with food (Kelleher, 1958; Malagodi, 1967). Respondents reported that, when exchanges did occur, learners typically handled tokens during token production and exchange, which creates an opportunity in which token-directed behavior may interfere with the delivery of academic instruction. Thus, future research should evaluate the effects of interest-based and novel tokens on skill acquisition and token handling time during token production and exchange.

Some of the variability in the present results may have to do with the ways the research has been disseminated. Ivy et al. (2017) reported that only 52% of articles on token economies described token conditioning procedures in replicable detail. These data may be partly due to publication practices in which procedural details, particularly those of preexperimental procedures like token training, are edited down or included as supplementary information, reducing the likelihood that practitioners will contact this information. Relevant applied research is also published in a variety of journals, only some of which may be accessible to practitioners. For these reasons, researchers should continue to explore other avenues of dissemination. For example, popular behavior analytic podcasts might be one way in which researchers may present findings in an accessible manner that reaches a broader audience. In addition, both basic and applied researchers could create training modules (e.g., continuing education courses) and free online resources that consolidate information on token economies in the published literature (e.g., stimuli to be used as tokens, number and types of backup reinforcers to be used, and the effects of second-order schedules).

In balancing the number of questions on the survey and our ability to gather comprehensive information on respondent practices with token economies, there were several topics that we did not inquire about. For instance, although we asked respondents about token training components and procedures, we did not ask questions on the order in which these procedures were implemented. Further, we did not ask questions about the specific schedules of reinforcement used and when those schedules were thinned. Our questions regarding token production and accumulation training did not include response options in which the token production schedule may have been thinned first (i.e., all response options required that the exchange-production schedule was the first schedule modified). The basic literature on token economies suggests that, when conducting token training, the exchange-production schedule should be thinned before any modifications are made to the token production schedule (e.g., Bullock & Hackenberg, 2006; Hackenberg, 2018). However, we are unable to determine whether this practice is also reported in clinical and instructional settings.

The present study is not without limitations. One potential limitation is that only 309 certificants responded to the survey, 255 of whom completed it. The return rate is comparable to those of other surveys that used the BACB email list as a recruitment method (e.g., Colombo et al., 2021; Sellers et al., 2019). Nevertheless, we must interpret the generality of our results with caution. Another potential limitation is that the survey relies on self-report of clinical practices and may not accurately reflect how respondents establish and use tokens. That is, reported practices were not corroborated through observations of practitioner behavior, which could raise concerns about the validity of the present results. Recognizing these limitations, surveys can nevertheless yield useful data in the form in which they occur, that is, as verbal responses. Indeed, there is an expanding experimental literature with human subjects based on survey methods, some of which has shown good predictive validity and test–retest reliability (see Reed et al., 2022, for a brief review). Although self-report measures may serve as a good alternative when direct observation of behavior is impractical (Reed et al., 2022; Roma et al., 2017), future research should examine the correspondence between practices reported in the present study and actual practices used in clinical and instructional settings.

The survey did not specifically gather information on token economies applied in group settings, such as classrooms. Graff and Karsten (2012) found that tokens were the second most commonly used consequence in educational settings. In addition, the authors noted that 57% of respondents worked in public school settings and that 68% were not certified in behavior analysis. These token systems may differ not only in how they are established and used, but also in the context in which they are applied (see Kim et al., 2021, for a review). Future research should examine the extent to which token economy practices differ across therapeutic and classroom settings. In addition, researchers should identify whether these practices differ when the individual developing and implementing the token economy does or does not have formal behavior analytic training. Future research should also evaluate and compare common practices identified in the present study with practices that are typically recommended in textbooks and/or reported in research.

In sum, the results of this study show that current token economy practices in clinical and educational settings are highly variable and are not standardized. Further, many of the practices identified are disconnected from the token economy literature and lack a conceptual framework grounded in the basic and applied principles of token economies. For these reasons, it is likely that token economies are not as effective in these settings as they could be with stronger guidance from research findings. Although one of the greatest advantages of token economies is the flexibility with which they can be tailored to the individual needs of each learner, there is also a clear need for standardized methods. Manualized instruction based on evidence-based recommendations would best serve practitioners by providing guidelines on how to set up and maintain a token economy, and how to troubleshoot its components when an initial program fails to produce effective results. Providing practitioners with resources that outline standardized, evidence-based practices will lead to far more successful token economies in clinical and educational settings.