
1 Introduction

Addiction is one of the leading public health challenges in the USA (Nutt et al. 2006; NIDA 2005) with an estimated annual cost of over $600 billion (Volkow 2011). More than 20 million Americans meet diagnostic criteria for substance use disorders (SUDs) other than for tobacco (U.S. Department of Health and Human Services (HHS) 2016). Although existing substance abuse services are efficacious and replicable, considerable opportunities for improvement remain (Rösner et al. 2010). For example, only one in nine individuals with alcohol use disorder (AUD) benefits from treatment with medication, and brief psychotherapeutic interventions produce only small reductions in alcohol consumption (Foxcroft et al. 2016; Klimas et al. 2012). Perhaps this limited success is a result of treatments that were not designed specifically to modify any core feature of the disorder (e.g., Ahn and Wampold 2001; Bell et al. 2013). This lack of target specificity of psychotherapeutic treatments raises the question of how to develop a treatment that targets the core features of a disorder.

One process-driven answer to that question is the Experimental Medicine Approach developed by Claude Bernard in his classic text published in 1865, An Introduction to the Study of Experimental Medicine (Bernard 1957; Nielsen et al. 2018). The Experimental Medicine Approach consists of four steps (Bernard 1957; Nielsen et al. 2018). First, develop a hypothesis of a core process of the disorder. Second, devise a way to measure that process. Third, ascertain if that process is observed in the disorder. Fourth, deploy an intervention that engages the disorder-related process and determine if those changes produce any concomitant changes in any other components of the disorder (e.g., clinically relevant behavioral outcome changes). If the intervention changes the targeted processes as well as some other component of the disorder in a therapeutically appropriate way, then the intervention could be used as a treatment or part of a treatment, and that intervention would be one targeting a specific feature of the disorder.

With respect to addiction, a hypothesis was developed, based on clinical observations and then confirmed by early research, that those suffering from this disorder focus on the short-term and display an immediacy bias (Bickel et al. 2017; Petry 2001). Following up on that hypothesis, delay discounting (also referred to as temporal discounting, intertemporal choice, or time preference) was employed as a sensitive measure of this bias with appropriate granularity. In short, delay discounting is the rate at which an individual devalues a reward as a function of its delay to receipt (see Box 1). While delay discounting is a universal phenomenon, an excessive preference for immediately available rewards (i.e., high rate of discounting) suggests a shortened time horizon in which immediately available rewards are overvalued and future, potentially negative consequences are undervalued. Delay discounting is particularly relevant to alcohol and substance abuse. Early empirical findings, subsequent research, reviews, and meta-analyses have demonstrated that the rate at which individuals discount future rewards is strongly associated with their propensity for substance abuse as well as myriad other negative health behaviors (Amlung et al. 2016a, b for meta-analyses; see MacKillop et al. 2011; Snider et al. 2018a). Thus, this prior work can be seen as mapping on to the first three steps of the Experimental Medicine Approach. The fourth step, target engagement, informed a new conceptual model of addiction, referred to as Reinforcer Pathology Theory. Below we first describe this model and then review the data showing the effects of target engagement.

Box 1 Discounting and Valuation

Delay discounting (also called temporal discounting, intertemporal choice, or time preference) reflects the devaluation of a reinforcer as a function of the delay to its receipt. Discounting tasks offer a choice between a smaller and a larger reward, with the larger reward conditional on a delay or a level of uncertainty in its receipt (e.g., $50 now or $100 later; 100% chance of $50 now or 75% chance of $100 now). Monetary reinforcers are commonly used due to their universality and fungible nature, though other reinforcers including food, drugs, and sex have also been utilized.
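To make the choice structure concrete, the following is a minimal sketch of a single discounting choice. It assumes the hyperbolic form V = A / (1 + kD), which is widely used in this literature but is not specified in the text above; the function names and the two values of k are hypothetical and chosen only for illustration.

```python
def discounted_value(amount, delay, k):
    """Subjective value of a delayed reward under an assumed hyperbolic model."""
    return amount / (1.0 + k * delay)


def prefers_delayed(sooner_amount, later_amount, delay, k):
    """True if the discounted later reward is worth more than the immediate one."""
    return discounted_value(later_amount, delay, k) > sooner_amount


# Hypothetical discount rates (per day); not estimates from any study.
shallow_k = 0.001
steep_k = 0.05

# $50 now versus $100 in 30 days.
print(prefers_delayed(50, 100, 30, shallow_k))
print(prefers_delayed(50, 100, 30, steep_k))
```

Under these assumed rates, the shallow discounter chooses the delayed $100 and the steep discounter chooses the immediate $50, which is the pattern the text describes as an immediacy bias.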

Reinforcer valuation (e.g., craving, purchase tasks, self-administration) can be assessed using several well-validated measures. Craving for a commodity can be assessed through multiple methods, including self-reported intention to use the commodity and/or anticipated relief from using it (Sayette et al. 2000). Purchase tasks can assess behavioral economic demand for a reinforcer by allowing an agent to make real or hypothetical purchases across trials involving a range of prices (Roma et al. 2016). During self-administration tasks, individuals have the opportunity to work for and consume single units of a preferred commodity (Bickel et al. 1990).
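As an illustration of how purchase task responses are commonly summarized, the sketch below computes four observed demand indices for one respondent. The index definitions (intensity as consumption at the lowest price, breakpoint as the first price with zero consumption, Omax as peak expenditure, Pmax as the price at peak expenditure) are conventional ones assumed here rather than taken from the text, and the price and quantity values are invented.

```python
def demand_indices(prices, quantities):
    """Observed demand indices from one respondent's purchase task data."""
    pairs = sorted(zip(prices, quantities))          # order trials by price
    expenditures = [price * qty for price, qty in pairs]
    omax = max(expenditures)                         # peak expenditure
    pmax = pairs[expenditures.index(omax)][0]        # first price at peak expenditure
    intensity = pairs[0][1]                          # consumption at the lowest price
    # First price at which reported consumption is zero (None if never zero).
    breakpoint_price = next((price for price, qty in pairs if qty == 0), None)
    return {
        "intensity": intensity,
        "breakpoint": breakpoint_price,
        "omax": omax,
        "pmax": pmax,
    }


# Hypothetical responses: drinks purchased at each price per drink.
prices = [0.0, 0.5, 1.0, 2.0, 4.0, 8.0, 16.0]
drinks = [10, 10, 8, 6, 3, 1, 0]
print(demand_indices(prices, drinks))
```

Higher intensity, breakpoint, and Omax would all be read as greater valuation of the commodity in this framing.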

2 Reinforcer Pathology Theory

The concept of a reinforcer pathology is defined as the interaction between two important behavioral economic processes: (1) excessive preference for immediately available rewards and (2) the overvaluation of a particular commodity that offers brief, intense reinforcement (Bickel et al. 2011a, 2014). First, excessive preference for immediate rewards, or immediacy bias, is a process that may be measured by delay discounting. As mentioned above, the process of delay discounting is strongly associated with alcohol and substance use severity. For example, current cigarette smokers discounted future monetary rewards significantly more than never-smokers and ex-smokers (Bickel et al. 1999), heroin-dependent participants discounted the future significantly more than non-users (Kirby et al. 1999), and individuals with AUD discounted both future monetary and alcohol rewards significantly more than non-drinkers (Petry 2001). Second, overvaluation of a reward may be measured by an individual’s demand, craving, or self-administration of that commodity (see Box 1). Again, decades of literature have demonstrated a relationship between high value for a rewarding substance and severity of its use. For example, high demand and craving for alcohol predicted alcohol abuse and AUD (MacKillop et al. 2010a; Skidmore et al. 2014). The interaction of these two processes (discounting and valuation) has been described by an initial and an expanded version of Reinforcer Pathology Theory – 1.0 and 2.0, respectively.

Reinforcer Pathology Theory 1.0

Reinforcer Pathology Theory 1.0 proposes that delay discounting and valuation interact to synergistically predict severity of use. The concept of a reinforcer pathology may be illustrated as a 2 × 2 matrix (Fig. 1). That is, the individuals with the highest discounting rates and greatest valuation for their substance may be those at the greatest risk of SUD (cell D). In contrast, individuals with the lowest rates of delay discounting and very little valuation for substances of abuse are at the least risk for SUD (cell A). Individuals who fall in cells B and C demonstrate intermediate risk between the extremes of cells A and D. Perhaps these individuals overuse, but can retain a job and fulfill family obligations. We note that the delineation of these metrics into one of four cells is a simplified heuristic. We have added shading to indicate that the probability of individuals at risk may fall along a continuum.

Fig. 1

Reinforcer pathology and substance abuse risk. Individuals in cell A (low discounting, low demand) and cell D (high discounting, high demand) display the lowest and highest risk for developing substance use disorders, respectively. Individuals with an intermediate risk for substance use disorders, cells B and C, may display a combination of high discounting/low demand or low discounting/high demand
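The 2 × 2 heuristic can be sketched as a simple classification rule. This is only an illustration: the cutoffs are hypothetical parameters, the text treats risk as a continuum rather than four discrete bins, and because the caption does not say which mixed profile is cell B and which is cell C, the sketch returns a combined "B or C" label for both.

```python
def risk_cell(discount_rate, demand, discount_cutoff, demand_cutoff):
    """Place one individual in the 2 x 2 reinforcer pathology matrix.

    The cutoffs are hypothetical; the source describes the four cells
    as a simplified heuristic, not as validated thresholds.
    """
    high_discounting = discount_rate >= discount_cutoff
    high_demand = demand >= demand_cutoff
    if high_discounting and high_demand:
        return "D"        # highest risk
    if not high_discounting and not high_demand:
        return "A"        # lowest risk
    return "B or C"       # intermediate risk (mixed profile)


# Invented values for illustration only.
print(risk_cell(discount_rate=0.08, demand=12, discount_cutoff=0.02, demand_cutoff=8))
print(risk_cell(discount_rate=0.005, demand=2, discount_cutoff=0.02, demand_cutoff=8))
print(risk_cell(discount_rate=0.08, demand=2, discount_cutoff=0.02, demand_cutoff=8))
```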

Reinforcer Pathology Theory 1.0 is supported empirically by studies examining how both delay discounting and valuation of a reward relate to severity of misuse. For example, monetary discounting and demand for alcohol predicted alcohol-related consequences, as measured by the Young Adult Alcohol Consequences Questionnaire, in college students (Lemley et al. 2016). That is, those with the greatest discounting rates and highest demand for alcohol demonstrated the greatest number of alcohol-related consequences. Reinforcer Pathology Theory 1.0 does not specify whether delay discounting, demand, and craving for alcohol are independent processes; however, they are all associated with AUD criteria in alcohol users and are intercorrelated, suggesting that they overlap (MacKillop et al. 2010a).

The processes contributing to reinforcer pathology also map onto neurobiological mechanisms. The Competing Neurobehavioral Decision Systems (CNDS) theory describes two decision-making systems: (1) the impulsive decision system and (2) the executive decision system (Bechara and Damasio 2005; Bickel et al. 2007). The theory posits that the relative control between these two decision systems promotes either impulsive or self-controlled decision-making, respectively. For example, brain regions associated with the impulsive decision system including the ventral striatum, medial orbitofrontal cortex, and medial prefrontal cortex increased in activation when the subjective value of a reward increased and the delay to its receipt decreased (McClure et al. 2004). In contrast, the left dorsolateral prefrontal cortex (PFC), a structure of the executive decision system, promoted self-controlled decisions when presented with a delay discounting choice (Figner et al. 2010). Together, these examples illustrate that the processes driving reinforcer pathology can be derived from neuro-mechanistic underpinnings. Below we describe the neural circuits underlying these processes in greater detail. Importantly, the ability to identify functional phenotypes of severity of use may have a significant impact on precision medicine. Understanding the risk phenotypes will help to both identify individuals at risk (Fig. 1; cell D) and develop effective treatment interventions. More recently, the concept of reinforcer pathology has evolved to interpret the interaction of delay discounting and valuation in greater detail.

Reinforcer Pathology Theory 2.0

Reinforcer Pathology Theory 2.0 describes how delay discounting and demand may interact by illustrating how individuals integrate valuation of rewards as a function of their temporal horizon (Fig. 2). Specifically, delay discounting functionally measures the temporal window (i.e., how far the individual can imagine into the future) over which reinforcer value can be integrated. That temporal window interacts with the value of different reinforcers depending on the length of the temporal horizon. Consider the following example: alcohol and other substances of abuse deliver brief, intense reinforcement with immediate and reliable effects. In contrast, prosocial reinforcers (e.g., employment, relationships) are lower intensity, inconsistent, and accrue their value over longer temporal windows. If an individual’s temporal window is constricted (i.e., excessive delay discounting, Timepoint T/2), the summed relative value (area under the curves) is much greater for substance use than for prosocial reinforcers. This relative value translates, therefore, to overconsumption (i.e., overvaluation) and a lack of regard for delayed consequences – a reinforcer pathology. From this perspective, Reinforcer Pathology Theory 2.0 suggests the temporal window (i.e., an individual’s rate of discounting) as the target for intervention. Reinforcer Pathology Theory 2.0 would predict that expansion of the temporal window should decrease overvaluation and excessive demand of substances of abuse and conversely constriction of the temporal window would increase valuation. If these observations are supported, then this would be an example of how the Experimental Medicine Approach could identify potential interventions that target a disorder-specific process. Below we describe evidence for interventions that modulate the temporal window.

Fig. 2

The integrated values of two reinforcers over time (brief and extended reinforcers). The gray curve represents the subjective value of a brief reinforcer (e.g., drugs) at different time windows (T). The black curve represents the subjective value of an extended reinforcer (e.g., prosocial reinforcers) over the same windows. The areas under these curves represent integrated subjective value over the course of repeated choices for each reinforcer. Depending on the temporal window considered (T/2, T, or 2T), the integrated value of the two reinforcers may reverse.
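The reversal described in the caption can be reproduced numerically. The sketch below integrates two hypothetical value curves over windows of T/2, T, and 2T; the curve shapes (an exponentially fading brief reinforcer and a constant, low-intensity extended reinforcer) and the value T = 4 are assumptions made for illustration, not the functions plotted in Fig. 2.

```python
import math


def brief_value(t):
    """Hypothetical brief reinforcer: intense at first, fading quickly."""
    return 10.0 * math.exp(-t)


def extended_value(t):
    """Hypothetical extended reinforcer: low intensity, sustained over time."""
    return 2.0


def integrated_value(value_fn, window, steps=1000):
    """Area under value_fn from 0 to window, by the trapezoidal rule."""
    dt = window / steps
    total = 0.0
    for i in range(steps):
        left, right = i * dt, (i + 1) * dt
        total += 0.5 * (value_fn(left) + value_fn(right)) * dt
    return total


T = 4.0  # arbitrary time units
for label, window in [("T/2", T / 2), ("T", T), ("2T", 2 * T)]:
    brief = integrated_value(brief_value, window)
    extended = integrated_value(extended_value, window)
    winner = "brief" if brief > extended else "extended"
    print(f"{label}: brief={brief:.2f}, extended={extended:.2f}, larger={winner}")
```

With these assumed curves, the brief reinforcer has the larger integrated value at T/2 and the extended reinforcer has the larger integrated value at 2T, which is the sense in which widening the temporal window is predicted to shift relative valuation.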

3 Lengthening and Shortening the Temporal Window

Reinforcer Pathology Theory 2.0 suggests that interventions altering the temporal window should change valuation of substances of abuse. Hence, interventions that have been identified to modify discounting (Bickel et al. 2016a, b; Koffarnus et al. 2013; Rung and Madden 2018) have the potential of altering the valuation of substances. To date, only a small number of studies of these interventions have assessed both discounting and valuation of reinforcers. In this section, we discuss a novel approach that has shown changes in both the temporal window and the valuation of substances, namely, narrative theory.

3.1 Narrative Theory

Human beings are storytellers by nature (Gregg 1991; Hermans 1993; McAdams 1988; McAdams et al. 2006). Over the last few decades, the field of psychotherapy has been greatly influenced by the increasing interest in studying narratives and discovering the power of telling a tale (e.g., Krippner et al. 2007; McLeod 1997; Meier 2012; Speedy 2008; White and Epston 1990). A novel framework of narrative theory (Bickel et al. 2017) that aims to harness humans’ unique sensitivity to language and storytelling (Huth et al. 2016; Nummenmaa et al. 2014) has been utilized to study human behaviors and promote behavioral change. For example, narratives simulating future thinking (e.g., episodic future thinking (EFT)) are used to expand the temporal window of reward valuation, while narratives describing circumstances of insufficient resources (i.e., economic scarcity) are used to shorten the temporal window of reward valuation. Below, we discuss these examples in greater detail.

Episodic Future Thinking (EFT)

EFT is a narrative intervention based on the new science of prospection that was first identified by Gilbert and Wilson in a Science publication in 2007. A growing body of evidence suggests that prospection is crucial for understanding human cognition, affect, motivation, and action (Seligman et al. 2013). Individuals with damaged frontal lobe areas, as well as individuals with addiction, show impaired prospective thinking (i.e., deficits in planning prospectively; Griffiths et al. 2012; Heffernan 2008; Kurczek et al. 2015). One systematic method to stimulate prospection is EFT, which is a narrative manipulation in which participants generate narratives describing their own potential future experiences. Previous studies indicated that EFT expands the temporal window of reward valuation (i.e., shifts one’s preference from immediate to long-term rewards) in individuals with AUD (Bulley and Gullo 2017; Snider et al. 2016), smokers (Chiou and Wu 2017; Stein et al. 2016, 2018), those who are overweight/obese, and controls (Daniel et al. 2013a, 2015; Kaplan et al. 2016; Peters and Büchel 2010). Consistent with reinforcer pathology, EFT not only expands the temporal window of reward valuation but also decreases behavioral economic demand for addictive substances, such as alcohol (Bulley and Gullo 2017; Snider et al. 2016), cigarettes (Stein et al. 2018), and food (Sze et al. 2017) in alcohol-dependent individuals, smokers, and overweight/obese individuals, respectively. Moreover, EFT reduced self-administration of rewards, such as highly palatable snacks among the obese (Daniel et al. 2013b, 2015; O’Neill et al. 2016) and cigarettes among smokers (e.g., Stein et al. 2016). For example, in a study by Stein et al. (2016), 42 participants were randomly assigned to the EFT or the episodic recent thinking groups (ERT; a commonly used control for the effects of prospection in EFT in which participants imagine real-life past events; Daniel et al. 2015; Lin and Epstein 2014). The findings of the study indicated significantly lower rates of discounting and self-administration of cigarette puffs (with a medium effect size) among participants in the EFT group compared to the ERT group (Fig. 3). Similarly, other narratives that lengthen the temporal window such as those describing a long-term romantic relationship increased preference for larger delayed rewards and decreased craving for cigarettes among cigarette smokers (Athamneh et al. 2019).

Fig. 3

Discounted value of $1,000 across increasing delays and corresponding area under the curve (AUC) values in episodic future thinking (EFT) and episodic recent thinking (ERT) participants (left panels). Also pictured is the number of cigarette puffs earned in the cigarette self-administration task (right panel) in EFT and ERT participants. Data replotted from Stein et al. (2016)
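For readers unfamiliar with the AUC measure named in the caption, the following is a minimal sketch of one common way to compute it: delays are normalized by the longest delay, indifference points by the undiscounted amount, and trapezoid areas are summed. Starting the curve at full value at zero delay is an assumption of this sketch, and the indifference points below are invented, not the data from Stein et al. (2016).

```python
def discounting_auc(delays, indifference_points, amount):
    """Normalized area under the discounting curve (1.0 = no discounting)."""
    pairs = sorted(zip(delays, indifference_points))   # order by delay
    max_delay = pairs[-1][0]
    # Assumed starting point: full value at zero delay.
    xs = [0.0] + [delay / max_delay for delay, _ in pairs]
    ys = [1.0] + [value / amount for _, value in pairs]
    return sum(
        (xs[i + 1] - xs[i]) * (ys[i] + ys[i + 1]) / 2.0
        for i in range(len(xs) - 1)
    )


# Hypothetical indifference points for $1,000 at five delays (in days).
delays = [1, 7, 30, 90, 365]
shallow = [950, 900, 800, 700, 500]
steep = [800, 600, 400, 200, 100]

print(discounting_auc(delays, shallow, 1000))
print(discounting_auc(delays, steep, 1000))
```

A larger AUC corresponds to less discounting, so an intervention that lengthens the temporal window would be expected to raise AUC.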

Economic Scarcity

Economic scarcity is a narrative manipulation that describes situations of insufficient resources. Research and interest in investigating the effect of economic scarcity on decision-making have been increasing (Shah et al. 2012). Previous studies indicated that economic scarcity narratives (e.g., job loss and negative income shock) shorten the temporal window (Bickel et al. 2016c; Haushofer et al. 2013; Sze et al. 2017). Consistent with Reinforcer Pathology Theory 2.0, scarcity narratives shorten the temporal window and increase demand for unhealthy food among overweight/obese individuals (Mellis et al. 2018; Sze et al. 2017). Similarly, other narratives that shorten the temporal window such as those describing a short-term sexual relationship increased preference for smaller immediate rewards and increased valuation of cigarettes among cigarette smokers (Athamneh et al. 2019). Moreover, reading a narrative describing a natural disaster (i.e., a hurricane) shortened the temporal window, increased demand for highly palatable snack foods, and increased total consumption of these snacks among obese individuals (Snider et al. 2019).

3.2 Other Interventions Manipulating the Temporal Window

In this section, we highlight some other interventions that have manipulated the temporal window, but that have not necessarily altered valuation mainly because valuation was not assessed. As a caveat, the following will not be an exhaustive discussion of these various manipulations, as several recent reviews have provided an excellent discussion (Koffarnus et al. 2013; Rung and Madden 2018) and readers are encouraged to consult these reviews for more details. Rather, we will highlight some of the promising approaches that may operate similarly to those interventions that have influenced both delay discounting and valuation measures and therefore directly manipulate the temporal window.

3.2.1 Lengthening the Temporal Window

Outcome Framing

Several approaches have manipulated the way in which outcomes are framed to participants. In typical discounting tasks, two options are presented: an amount of money available now (e.g., $500 now) and an amount of money available after some delay (e.g., $1,000 in 1 year). Presenting options in this way implicitly suggests that if the immediate option is chosen, then any (larger) later amount of money is forgone. The converse is also true such that if the later option is chosen, then any (smaller) immediate amount is forgone. In one type of framing manipulation, the money forgone is explicitly stated in each trial. When the explicit-zero manipulation is applied, the two discounting options become an amount of money available now and no money later (e.g., $500 now and $0 in 1 year) and no money now and an amount of money after some delay (e.g., $0 now and $1,000 in 1 year).
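The difference between the two framings can be shown by generating the option text for a single trial. This is a minimal sketch; the function names are hypothetical, and the wording follows the examples given in the paragraph above.

```python
def standard_trial(sooner_amount, later_amount, delay_label):
    """Option text for a typical (implicit-zero) discounting trial."""
    return (
        f"${sooner_amount:,} now",
        f"${later_amount:,} in {delay_label}",
    )


def explicit_zero_trial(sooner_amount, later_amount, delay_label):
    """Option text with the forgone amount stated explicitly in each option."""
    return (
        f"${sooner_amount:,} now and $0 in {delay_label}",
        f"$0 now and ${later_amount:,} in {delay_label}",
    )


print(standard_trial(500, 1000, "1 year"))
print(explicit_zero_trial(500, 1000, "1 year"))
```

The amounts and delay are identical across framings; only the statement of the forgone outcome changes.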

Several studies have investigated whether explicitly framing outcomes reduces discounting rates (Koffarnus and Bickel 2014; Magen et al. 2008; Naudé et al. 2018; Radu et al. 2011; Wu and He 2012). Overall, these studies reported reductions in discounting after exposure to the explicit-zero manipulation, and as noted in the meta-analysis by Rung and Madden (2018), this manipulation resulted in significant decreases in impulsive choice (cf. Naudé et al. 2018).

A number of explanations have been proposed for why the explicit-zero manipulation results in decreased impulsive choice. Most relevant to Reinforcer Pathology Theory 2.0, Radu et al. (2011) propose that this intervention is acting upon the temporal window to shift focus toward more distal outcomes. By including “$0 in X delay” in the immediate option, attention is shifted away from the sooner option, and relatively more attention is allocated toward the larger monetary amount associated with the delayed alternative.

Delay Framing

As depicted in the aforementioned examples, the delays associated with the outcomes are usually framed in terms of days, weeks, and years. Another way in which preference toward delayed outcomes has been manipulated is by changing the way delays are presented. Presenting options in terms of days, weeks, and years is considered delay framing. Alternatively, presenting options in terms of specific, concrete dates is considered date framing. For example, instead of presenting the larger, later option as “$1,000 in 1 year,” the option is framed as “$1,000 on [actual date one year from today].”
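Converting a delay-framed option into a date-framed one is a small calculation. The sketch below uses Python's standard datetime module; the function names, the ISO date format, and the treatment of "1 year" as 365 days are assumptions made for illustration.

```python
from datetime import date, timedelta


def delay_framed_option(amount, delay_label):
    """Option text with the delay expressed as a duration."""
    return f"${amount:,} in {delay_label}"


def date_framed_option(amount, delay_days, today):
    """Option text with the delay expressed as a concrete calendar date."""
    payout_date = today + timedelta(days=delay_days)
    return f"${amount:,} on {payout_date.isoformat()}"


# A fixed reference date is used so the example is reproducible.
reference_day = date(2023, 1, 1)
print(delay_framed_option(1000, "1 year"))
print(date_framed_option(1000, 365, reference_day))
```

In a live task, `today` would be the session date (for example, `date.today()`), so each participant sees the actual calendar date of the delayed payout.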

A number of studies have evaluated how date framing affects discounting (DeHart and Odum 2015; Dshemuchadse et al. 2013; Klapproth 2012; Leboeuf 2006; cf. Naudé et al. 2018; Read et al. 2005). The fact that all of the aforementioned studies found that date framing reduced discounting is especially impressive given that these studies have recruited a diverse population including college students, adults, and substance users, as well as using a variety of different discounting tasks and measured outcomes. In a recent meta-analysis, Rung and Madden (2018) found framing delays as dates significantly reduced impulsive choice.

Several explanations have been proposed as to why date framing consistently reduces discounting (Rung and Madden 2018), including shifting attention toward the monetary amounts, rather than the delay, interfering with heuristics, and increasing the objective evaluation of the delay (i.e., a specific date may be perceived as more concrete). One explanation, consistent with Reinforcer Pathology Theory 2.0, could be that such manipulations are acting upon the temporal window. Two pieces of evidence may support this claim. First, as noted by Rung and Madden (2018), Klapproth (2012) found that substance users’ discount rates after the date manipulation did not significantly differ from those of healthy controls. Expanding the temporal window by presenting the discounting task with dates instead of delays led to the absence of significant difference in discounting rates of substance users when compared to healthy controls. Second, Naudé et al. (2018) observed a rate-dependent effect such that those with higher initial discount rates reduced their discounting after the date manipulation; however, the date manipulation did not further reduce discount rates among those with already low initial discount rates. In addition, the date manipulation did not differentially affect participants with different smoking status (i.e., never, ex-, current smoker), although this may be due to individuals’ initial discount rate explaining variance that would otherwise be attributed to differential smoking status.

Working Memory Training

One potential intervention for reducing impulsive choice is improving working memory (Brooks et al. 2017; Wesley and Bickel 2014). Working memory involves the central executive system which is important for self-regulation, decision-making, and problem solving (Barkley 2001; Barrett et al. 2004; Finn 2002). For example, in a study by Bickel et al. (2011c), participants completed several tasks related to working memory, such as auditory and visual recall. Participants who were exposed to this training showed decreases in discounting by approximately 50%. In a recent study, Felton et al. (2019) found that improvements in working memory, but not working memory training directly, significantly predicted decreases in discount rate. In addition, working memory training has been shown to improve processes that modulate delay discounting (e.g., EFT; Snider et al. 2018b). More research is needed, however, as this effect has not been consistently observed and may be due to a variety of reasons such as the specific impulsivity task used, the working memory training protocol implemented, or the way these results have been analyzed (Rass et al. 2015; Wanmaker et al. 2018). Nonetheless, improvements in working memory may be, to some extent, operating on the temporal window.

Transcranial Magnetic Stimulation (TMS)

Transcranial magnetic stimulation, whereby cortical excitability is increased or decreased via magnetically induced electrical currents, has shown initial promise in modulating temporally related decision-making (Cho et al. 2015; Figner et al. 2010; Sheffer et al. 2013, 2018). For example, several studies have shown that application of TMS to the left dorsolateral prefrontal cortex (DLPFC; Sheffer et al. 2013, 2018) and medial prefrontal cortex (Cho et al. 2015) decreased discounting of monetary gains. Though Sheffer et al. (2013) did not find that an acute session of TMS affected subsequent consumption of cigarettes, a more recent longitudinal application (8 sessions) of TMS decreased the risk of relapse in abstinent smokers (Sheffer et al. 2018).

3.2.2 Shortening the Temporal Window

Transcranial Magnetic Stimulation (TMS)

TMS has been shown both to lengthen the temporal window, as indicated by decreased discount rates, and to shorten it. In one application (Figner et al. 2010), TMS applied to the left DLPFC resulted in greater preference for immediate rewards over delayed rewards. Together, these studies suggest that TMS may alter the temporal window in two opposite directions. However, concurrent fMRI scanning after TMS application is necessary to determine whether a particular TMS procedure increases or decreases brain activity.

3.3 Other Interventions Changing Valuation

Several manipulations have been applied to changing valuation of substances, as measured via behavioral economic demand and cravings. In line with Reinforcer Pathology Theory 2.0, those interventions that alter the temporal window should also alter valuation for substances. Heretofore, we have discussed interventions targeting the temporal window, some of which have also shown changes in valuation. In contrast, we now discuss two primary manipulations that have targeted valuation, but that have not necessarily measured changes in discounting. An important note is that, consistent with Reinforcer Pathology Theory 2.0, changes in valuation do not necessarily have to result in changes in discounting. As few studies have examined changes in the temporal window concurrently with interventions targeted at changing valuation, more research is needed in this area to determine under what conditions a symmetrical effect is observed.

External Contingencies in the Alcohol Purchase Task

Although relatively fewer studies have examined potential interventions within the area of behavioral economic demand, several manipulations may have implications for altering the temporal window. In recent years, behavioral economic demand has most frequently been evaluated using a simulated or hypothetical purchase task. Purchase tasks provide a brief instruction set specifying assumptions (e.g., imagine a typical situation in which you normally drink alcohol/smoke cigarettes; imagine you have the same income and savings), and respondents indicate how much of a substance (e.g., alcoholic drinks, cigarettes) they would purchase and consume at a range of prices per unit of substance (e.g., price per drink, price per cigarette; for reviews see Kaplan et al. (2018) and MacKillop (2016)).

A number of manipulations have modified the instruction set to include additional external contingencies. Most notable has been the use of next-day responsibilities (Gentile et al. 2012; Gilbert et al. 2014; Skidmore and Murphy 2011) and driving after drinking (Teeters and Murphy 2015). Skidmore and Murphy (2011) evaluated how demand changed in a sample of students when two next-day responsibilities, a class and an exam, were introduced. The researchers found that demand was highest under the control condition (no next-day responsibilities), followed by a next-day class, followed by a next-day test. In a similar experiment, Teeters and Murphy (2015) evaluated changes in demand in a situation where participants were told to imagine they would be driving home in the evening after drinking at a bar. Participants showed reduced demand under this condition compared to a control condition with no external contingencies. Taken together, these manipulations may operate on the temporal window, whereby imagining consequences in the future (whether that is at the end of the night or the next day) altered valuation by way of reducing demand for the target substance.

Cues

Another way in which valuation has been changed is through the use of cues. Cues have been shown to reliably elicit cravings, which in turn result in increased self-administration in the laboratory (Perkins 2009; Tiffany and Conklin 2000), and from a behavioral economic perspective cravings may be related to in-the-moment valuation of a substance. To date, cue exposure has been shown to increase cravings and demand valuation for cigarettes (Acker and MacKillop 2013; MacKillop et al. 2012), alcohol (MacKillop et al. 2010b), and cannabis (Metrik et al. 2016). Michael, Amlung, and MacKillop (2014) found that alcohol-related cues increased craving and some aspects of valuation related to alcohol, but that these cues did not affect discounting rate. On the other hand, Metrik et al. (2016) found that cannabis-related cues increased craving and valuation measures for cannabis, as well as increased the attentional bias toward cannabis-related stimuli, which could provide some evidence that cues shorten the temporal window. Up until this point, we have broadly discussed Reinforcer Pathology Theory 2.0 in the context of behavioral interventions. In accordance with the Experimental Medicine Approach, once the target engagement demonstrates effects on aspects of the disorder, examination of the associated neuroscience becomes an important avenue for investigation.

4 Neural Circuits Underlying Time, Addiction, and Reinforcer Pathology

Healthy and non-addictive decision-making, as viewed within Reinforcer Pathology Theory 2.0, results from neural systems that are in balance. As discussed earlier, the CNDS theory posits that functional behavior results from a balance between an impulsive reward system and an executive system. The impulsive reward system is checked and balanced by the calculated, executive system (Bickel et al. 2007). The executive system consists of regions of the prefrontal and parietal cortices and regulates our executive functions such as attention, working memory, decision-making, planning, and behavioral inhibition (Bettcher et al. 2016). When faced with rewarding stimuli in our environment (e.g., alcohol, drugs, high-fat foods) that activate the impulsive system, consisting of limbic and paralimbic brain structures (e.g., midbrain, amygdala, posterior hippocampus, habenular commissure, striatum, insula, nucleus accumbens), the executive system ensures that control is maintained during consumption of these stimuli.

Addictive decision-making is marked by an overvaluation of immediate rewards and an overactivation of the impulsive system along with a devaluation of future rewards and an underactivation of the executive system (Bickel et al. 2014). As discussed previously, individuals with SUDs devalue the future, preferring smaller, immediate rewards over larger, long-term rewards. Short-term reinforcers, like alcohol and drugs, are powerful because they operate within short temporal windows. On the other hand, prosocial reinforcers such as family or employment have little value as these reinforcers tend to be rewarding within long temporal windows. Within this framework, we see that temporal organization may be altered in individuals with SUD. Indeed, individuals with SUDs show impairments in many types of memory including working memory, episodic memory (a form of long-term memory for autobiographical events), prospective memory, as well as other areas of cognitive functioning (Domínguez-Salas et al. 2016; Gould 2010). These memory systems give us an organized timeline of events in which we can frame our experience. We remember when we did something, how long ago it occurred, and when we need to do something in the future. That is, the brain allows us to frame our conscious experience within distinct temporal windows (i.e., short versus long).
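The preference for smaller, immediate rewards over larger, delayed rewards described above can be illustrated with a minimal sketch. It assumes the hyperbolic form V = A / (1 + kD) that is widely used in the delay discounting literature, where A is the reward amount, D is the delay, and k is the discounting rate. The function names, reward amounts, and k values below are hypothetical and chosen only for illustration; they are not taken from any study cited in this chapter.

```python
def hyperbolic_value(amount, delay_days, k):
    """Subjective present value of a delayed reward, assuming V = A / (1 + k * D)."""
    return amount / (1.0 + k * delay_days)


def prefers_delayed(immediate, delayed, delay_days, k):
    """True if the discounted delayed reward is worth more than the immediate one."""
    return hyperbolic_value(delayed, delay_days, k) > immediate


# Hypothetical discounting rates: a shallow discounter (long temporal window)
# and a steep discounter (short temporal window).
rates = {"shallow": 0.005, "steep": 0.1}

# Choice between a hypothetical $50 now and $100 in 30 days.
for label, k in rates.items():
    value = hyperbolic_value(100, 30, k)
    choice = "delayed" if prefers_delayed(50, 100, 30, k) else "immediate"
    print(f"{label}: discounted value of $100 = {value:.2f}, chooses {choice}")
```

Under these assumed values, the shallow discounter still values the delayed $100 above the immediate $50, whereas the steep discounter does not. This is one simple way to picture a shortened temporal window.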

The delay discounting paradigm as a temporal window evaluator is sensitive to many maladaptive health behaviors including addiction, which makes it an excellent behavioral marker to investigate brain mechanisms underlying addiction and recovery processes. Functional magnetic resonance imaging studies have shown that during delay discounting paradigms, when choosing the immediate over delayed reward, dopaminergically innervated areas of the impulsive system, including the ventral striatum, medial orbitofrontal cortex, and medial prefrontal cortex, are primarily activated (McClure et al. 2004, 2007). Areas of the executive system, on the other hand, including the dorso- and ventrolateral prefrontal cortex, the lateral orbitofrontal cortex, and the intraparietal cortex, show a greater level of relative activation (compared to the impulsive system) when choosing delayed over immediate rewards (McClure et al. 2004, 2007). In healthy individuals, greater discounting is associated with decreased activation of a frontoparietal-striatal network and a heightened activation of a temporal lobe network (Elton et al. 2017). Greater discounting is also associated with decreased prefrontal volume, leading to decreased executive decision system regulation, and heightened striatal and parahippocampal/hippocampal volume, leading to increased impulsive decision system regulation (Owens et al. 2017; Suckling and Nestor 2017; Tschernegg et al. 2015; Yu 2012). In addition, individuals with SUDs show altered functional connectivity between these networks, with greater alterations associated with greater discounting, greater levels of drug-related harm, and increased rates of drug relapse (Clewett et al. 2014; Contreras-Rodríguez et al. 2015; Yu 2012). Clearly, impulsive and addictive behaviors are associated with disordered brain structure, function, and connectivity in both impulsive and executive systems.

In order for recovery to occur, behaviors need to shift away from impulsive and unhealthy actions to planned and contemplative healthy decisions. Mechanistically speaking, the executive system needs to become fully functional and homeostatic balance needs to be restored between the impulsive and executive systems. As discussed above, researchers are focusing on developing behavioral and other interventional strategies to decrease impulsivity and possibly help improve recovery outcomes. In order to provide a mechanism for how decreasing delay discounting may serve as a therapeutic target for individuals with SUDs, we will need to understand the neural systems involved in delay discounting and how the brain modulates shifts in delay discounting during and/or after exposure to an intervention. A few recent studies have begun exploring these areas of research.

As discussed above, EFT, often referred to as mental time travel or prospective thinking, serves as a successful intervention to shift choices away from immediate and toward delayed rewards. A pivotal study by Peters and Büchel (2010) examined the neural mechanisms underlying the effect of EFT on delay discounting using functional magnetic resonance imaging. The authors found that valuation signals in the anterior cingulate cortex (ACC) and functional coupling or co-activation between the ACC, the hippocampus, and the amygdala supported the shift toward decision-making that favored long-term, patient choices (Peters and Büchel 2010). A similar study showed that activation in the medial rostral prefrontal cortex predicted future-oriented choices and that this effect was also associated with functional connectivity between the medial rostral prefrontal cortex and the hippocampus (Benoit et al. 2011). These findings suggest that the prefrontal cortex may be using information from the amygdala and hippocampus to guide healthy decision-making.

A recent study showed that recalling positive autobiographical memories before a delay discounting task reduced subsequent delay discounting (Lempert et al. 2017). The authors suggest that the neural mechanism underlying this reduction in delay discounting is an increase in activity in the striatum and temporoparietal junction, which occurs during the process of memory retrieval (Lempert et al. 2017). In addition, individuals who showed the largest decreases in impulsivity showed the greatest levels of similarity in ventromedial prefrontal cortex activation during memory recall and intertemporal choice (Lempert et al. 2017).

Working memory training has also yielded reduced discounting rates in stimulant addicts (Bickel et al. 2011c). To determine the unique brain regions of plausible causality between working memory and delay discounting, Wesley and Bickel (2014) performed a matched activation likelihood estimation meta-analysis. The study findings revealed that a region of the left lateral prefrontal cortex is involved in both working memory and delay discounting (Wesley and Bickel 2014), which indicates that this region may be a key target for therapeutic interventions.

Beyond behavioral interventions, both pharmacological and non-invasive brain stimulation interventions have been shown to alter rates of delay discounting. Modafinil, an atypical dopamine reuptake inhibitor typically used for narcolepsy and sleep-wake disorders, was administered to alcohol-dependent patients. Modafinil decreased delay discounting, and this effect was accompanied by increased activation in frontoparietal regions, reduced activation in the ventromedial prefrontal cortex, and increased functional connectivity between the superior frontal gyrus and ventral striatum (Schmaal et al. 2014). TMS, a non-invasive procedure currently used to treat depression and other psychiatric disorders (Brunoni et al. 2019), has also been used as an acute intervention to decrease delay discounting. This technique uses magnetic pulses delivered through a coil placed over the scalp to induce electrical currents in the underlying cortex, increasing or decreasing the activity of neuronal networks depending on the stimulation frequency. A review examining non-invasive brain stimulation procedures, cognitive functioning, and impulsivity identified the dorsolateral prefrontal cortex (DLPFC) as an important therapeutic target to alter delay discounting (Brevet-Aeby et al. 2016), with high-frequency (10 Hz) repetitive TMS (rTMS) being an effective protocol for decreasing delay discounting (Cho et al. 2015). More recently, a study in smokers showed that eight sessions of high-frequency rTMS of the left DLPFC in combination with smoking cessation education materials decreased delay discounting, increased abstinence, reduced risk of relapse, and increased study engagement (Sheffer et al. 2018).

In addition, new work from our lab shows that demand, the other behavioral economic component of reinforcer pathology, recruits similar brain regions to delay discounting (Deshpande et al. 2019). Namely, both tasks engaged the superior/middle frontal cortex and superior/inferior parietal lobes, areas of the executive system (Deshpande et al. 2019). In a real-world cannabis purchasing task, the decision to purchase cannabis was associated with activation of the dorsal striatum, frontoparietal and posterior parietal regions, anterior and posterior cingulate cortex, anterior insula, DLPFC, and middle and superior temporal gyri, again regions involved in both the impulsive and executive systems (Bedi et al. 2015). More work is warranted to investigate the neural substrates underlying changes in demand, though we hypothesize that these may be similar therapeutic targets to those shown to underlie changes in delay discounting.

As a final note, imaging studies have shed some light on brain biomarkers in both the impulsive and executive systems that may predict successful abstinence and treatment response. For example, in alcohol-dependent patients, larger frontal and parietal cortices predict longer time to any alcohol use and heavy drinking relapse (Rando et al. 2011). In addition, methamphetamine users who remained abstinent compared to those who relapsed 1 year after study completion showed decreased activity in the inferior frontal gyrus (IFG) and striatum during reinforcement learning, but greater activity in the striatum, insula, IFG, and ACC during response feedback (Stewart et al. 2014). Previous research has also shown that the longer the period of abstinence, the more the brain recovers from addiction-related brain changes. Specifically, dopamine transporters in the striatum, which are a marker of dopamine terminals, significantly increased from a period of less than 6 months of abstinence to between 12 and 17 months of abstinence (Volkow et al. 2001).

Collectively, these studies suggest that interventions targeting a range of brain regions in either the impulsive or executive systems may be instrumental in shifting the temporal window to favor long-term, goal-oriented choices. Altering decision-making in individuals with SUDs to favor future outcomes over immediate rewards will most likely require distinct changes not only in prefrontal cortical networks that underlie executive functioning and basal ganglia circuits that underlie reward and motivation, but also in other regions that support these two key systems, such as the hippocampus (memory), amygdala (emotion), and hypothalamus (stress).

5 Considerations for Intervention Development

As demonstrated above, substantial evidence exists to support the malleability of delay discounting and valuation, two key processes of Reinforcer Pathology Theory 2.0. Some manipulations may also intervene on addiction-common neural pathways, thereby providing a mechanism for addiction-related behavioral change. To date, however, important questions remain regarding both the permanency and generalizability of interventions that target the temporal window. Before interventions on delay discounting or valuation, apart from or in tandem with other target behaviors (e.g., substance use), can be widely implemented, the following gaps in knowledge must be addressed.

First, despite the evidence that the temporal window can be changed, long-term follow-ups have not been conducted to establish the permanency of experimentally induced changes. Whereas some interventions may not be expected to produce lasting changes on the temporal window (e.g., date/delay framing or explicit-zero manipulation), more intensive interventions may create such a change. Interventions on delay discounting aimed at expanding the temporal window, such as working memory training (Bickel et al. 2011c), financial education (DeHart et al. 2016; Lahav et al. 2015), or TMS (Cho et al. 2015), may create lasting, positive behavioral changes. Unfortunately, long-term follow-ups have yet to be conducted even when promising expansions of the temporal window have been established. In addition, the parameters of effective interventions, including dose magnitude and treatment length, are unknown. For example, if EFT can produce lasting changes in delay discounting, how often and for how long EFT must be administered and how often the individual must create new cues before they habituate remain unknown.

In regard to delay discounting specifically, while most interventions target monetary discounting, their effects on the discounting of other outcomes (e.g., food, alcohol) are less established. Some research suggests that delay discounting is a single, unitary trait-like process, meaning that delay discounting is consistent across time (Kirby 2009) and between outcomes (Bickel et al. 2011b; Friedel et al. 2014). If this is true, then an intervention that changes monetary discounting should also change the discounting of other commodities. However, conflicting evidence also suggests that delay discounting can be domain specific (Jimura et al. 2011; Lawyer and Schoepflin 2013), meaning that how an individual discounts one outcome is not necessarily related to how they discount other outcomes. Consistent with Reinforcer Pathology Theory 2.0, an individual who steeply discounts one outcome likely discounts most outcomes steeply, reflecting constrained variability of the temporal window. Therefore, in these individuals, an intervention that extends the temporal window could result in a decrease in delay discounting across multiple domains. However, the predictions of Reinforcer Pathology Theory 2.0 for individuals who do not typically discount steeply are less clear. In this instance, interventions that reduce delay discounting may be domain specific, reflecting variability of the temporal window.

Perhaps the most important question regarding interventions that expand the temporal window is whether temporary or permanent changes result in lasting improvements in related maladaptive behaviors, such as substance use or overeating, outside of laboratory settings. For example, due to the strong relationship between delay discounting and cigarette smoking, an intervention that directly targets delay discounting could, in turn, result in a reduction in cigarette smoking. In another example, O’Neill et al. (2016) found that EFT reduced short-term out-of-lab calorie intake, but delay discounting as the mechanism of EFT change was not assessed and a long-term follow-up was not conducted. To date, no study has addressed these limitations. This gap presents a significant shortcoming in the scientific literature, as the utility of delay discounting as a focus of intervention depends on its ability to produce meaningful behavioral changes and quality of life improvements beyond the laboratory. Furthermore, because of the wide range of behaviors that are related to delay discounting (Snider et al. 2018a), a reduction in delay discounting may not only improve the maladaptive target behavior but may also improve overall functioning in a variety of domains (e.g., eating, exercise, finances). Such a finding would further establish delay discounting as a key behavioral phenotype and target of intervention.

6 Conclusion

Here we outlined Reinforcer Pathology Theory 2.0, its neurobiological/behavioral underpinnings, and its implications for treating SUDs. For decades, the development of effective remedies for addictions and other psychiatric disorders has been hampered by the lack of specific treatable targets. Recently, efforts have been made to remedy this by searching for precise, heritable mechanisms that undergird multiple disorders (Insel 2014). By utilizing the Experimental Medicine Approach, the temporal window of valuation, as measured by delay discounting, has emerged as one such mechanism, and its integration with reinforcer valuation into the theory of reinforcer pathology allows for an experimental framework through which to develop novel interventions. We have provided an overview of a variety of ways in which interventions have been used to manipulate aspects of Reinforcer Pathology Theory, including narrative theory, framing manipulations, and neuromodulation (e.g., working memory training, TMS). These manipulations appear to be promising avenues for the modulation of the temporal window and/or valuation of reinforcers among those individuals who demonstrate a reinforcer pathology. Importantly, this body of research demonstrates that temporal discounting is a determinant of the valuation of these reinforcers. This growing body of evidence is currently being translated to real-world settings, and if empirical findings continue to support the Reinforcer Pathology Theory, then the outcomes of many individuals suffering from substance use and obesity disorders may be improved.