
1 Introduction

How is behavior that is rash and disinhibited different from behavior that is inflexible and repetitive? Such ‘failures’ in behavioral control, typically considered impulsive and compulsive in character, are present in all people and thus represent dimensional constructs or endophenotypes (Gottesman and Gould 2003) where only extreme ‘outliers’ inform clinical diagnoses and intervention (Fineberg et al. 2010; Robbins et al. 2012; Berlin and Hollander 2014). Understanding the neurobiology and psychology of impulsivity and compulsivity has become increasingly important in recent years in light of the widely recognized and much discussed inadequacies of a categorical-based diagnostic system for psychiatric disorders (Robbins et al. 2012; Insel et al. 2010; Millan et al. 2015). Of particular note, impulsivity and compulsivity are widely documented ‘neurocognitive endophenotypes’ present in a remarkably broad range of psychiatric disorders, including drug addiction (Brewer and Potenza 2008; de Wit 2009; Ersche et al. 2010), personality and mood disorder (Perry and Korner 2011; Lombardo et al. 2012), schizophrenia (Kaladjian et al. 2011), problem gambling (Verdejo-Garcia et al. 2008), suicide (Dougherty et al. 2004), attention-deficit/hyperactivity disorder (ADHD) (Sonuga-Barke et al. 1992), and obsessive compulsive disorder (OCD) (Fineberg et al. 2010).

Impulsivity can be defined as the tendency toward rash decisions without adequate forethought and often results in mistimed and premature actions (Evenden 1999; Durana and Barnes 1993). Such behavior is clearly different from compulsivity (Latin ‘compellere’: feeling forced or being cornered) where inflexible choices and actions are repeatedly carried out irrespective of changes in setting and despite the growing and obvious undesirable consequences of such behavior [e.g., in the context of OCD (Denys 2011)]. Historically, impulsivity and compulsivity have been regarded as sharply contrasting constructs along a single continuum. Thus, whereas impulsivity may arise through abnormalities in reward seeking (positive reinforcement), compulsive behavior is thought to arise and persist through the need to avoid harm or unpleasant subjective feelings (negative reinforcement), a view championed by Koob and Le Moal in their opponent process theory of addiction (Koob and Le Moal 2008). However, just as it is reasonable to question the singularity of the impulsivity construct and the inter-relatedness of different impulsivity subtypes (Winstanley et al. 2006; Dalley et al. 2011), compulsivity can be deconstructed into several components, including rigid stereotyped movements, maladaptive stimulus-response (SR) habits, attentional biases, perseveration, and a failure to extinguish responding when rewards are omitted (Robbins et al. 2012). As reviewed in this article, the evidence for the multifaceted nature of impulsivity and compulsivity is compelling, and both constructs represent tractable endophenotypes for translational research. The concept of endophenotypes (i.e., measurable heritable traits) in psychiatric research is not new (Gottesman and Gould 2003). However, with the increasing availability and sophistication of high-throughput translatable methodologies (e.g., behavioral/cognitive screening, brain imaging, genomics, proteomics, and metabolomics), the delivery of fine-grain mechanistic explanations of psychopathology is now a realistic prospect. With continued research into the considerable phenotypic heterogeneity of impulsivity and compulsivity and their underlying neurobiological mechanisms, further significant advances are anticipated in our understanding of complex polygenic mental disorders such as ADHD, OCD, and addiction (Fineberg et al. 2010). In this article, we review the latest research in this field and examine the implications of this work for the biological origins and treatment of impulsive and compulsive disorders.

2 Impulsivity

Impulsivity can be broadly divided into motor and decisional subtypes reviewed below. Motor impulsivity includes (i) waiting impulsivity or premature anticipatory responding prior to a cue predicting reward and (ii) response inhibition or stopping inhibition of a prepotent response. Decisional impulsivity includes (iii) delay and probabilistic discounting of reward and (iv) reflection impulsivity—the tendency to make rapid decisions without adequate accumulation and consideration of the available evidence.

2.1 Waiting Impulsivity

This form of impulsivity requires action restraint during the waiting period leading up to an expected reward. Traditionally, waiting impulsivity is assessed using so-called differential reinforcement of low rates of responding (DRL) schedules, first described by Ferster and Skinner (1957). DRL schedules set a minimum period of time between responses in order for reinforcement to occur and in rodents typically involve delays of 15–30 s (Evenden 1999). In humans, DRL schedules can be used to curb excessive behavior such as fast eating (Lennox et al. 1987) and even the number of times children seek assistance from classroom teachers (Austin and Bevan 2011).
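The DRL contingency described above can be illustrated with a minimal Python sketch. It is not taken from any of the cited studies: the function name `drl_reinforced`, the 20 s criterion (chosen from within the 15–30 s range quoted above), and the convention that the clock starts at session onset (t = 0) are all assumptions made for illustration.

```python
def drl_reinforced(response_times, min_interval=20.0):
    """Return one bool per response: True if that response is reinforced.

    A response is reinforced only when at least `min_interval` seconds have
    elapsed since the previous response. Timing the first response from
    session start (t = 0) is an assumption of this sketch. Every response,
    reinforced or not, resets the clock.
    """
    outcomes = []
    last_response = 0.0
    for t in response_times:
        outcomes.append(t - last_response >= min_interval)
        last_response = t
    return outcomes


if __name__ == "__main__":
    # Hypothetical response times in seconds from session start.
    times = [5.0, 30.0, 40.0, 65.0]
    print(drl_reinforced(times))  # [False, True, False, True]
```

In this sketch the responses at 5 s and 40 s are 'premature' in the sense that they arrive before the minimum interval has elapsed, so they forfeit reinforcement and restart the waiting period.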

Action restraint on DRL tasks is analogous to withholding premature responses on serial-reaction time tasks, where subjects must wait several seconds for a reward-predictive cue before responding. In a popular variant of this task, the 5-choice serial-reaction time task (5CSRTT), subjects (typically mice and rats) are trained to detect brief visual targets and to refrain from responding prior to their onset (Robbins 2002) (Fig. 1). High trait-like levels of impulsivity on this task predict the escalation of cocaine and nicotine self-administration (Dalley et al. 2007a; Diergaarde et al. 2008), an increased propensity to relapse following voluntary drug abstinence (Economidou et al. 2009), and the subsequent development of compulsive cocaine self-administration (Belin et al. 2008). Compulsivity in this context was assessed by the emergent tendency of high-impulsive (HI) rats to discount cocaine-associated adversity (i.e., the delivery of a mild electric shock rather than cocaine on roughly 50 % of trials). This inflexible form of behavior is outwardly similar to that of drug addicts who, despite acknowledging the deleterious impact of chronic drug abuse (i.e., personal and wider harms), rarely achieve spontaneous voluntary abstinence. In addition to being a trait marker for addiction, premature responding can be modulated in a state-dependent manner (i.e., as a secondary consequence of drug exposure). For example, opiate and stimulant drugs, given acutely, strongly increase premature responding (Cole and Robbins 1987; Pattij et al. 2009), similar to methamphetamine after a forced protracted period of withdrawal (Dalley et al. 2007b).

Fig. 1

Translational studies of waiting impulsivity in rodents and humans. a Rodent 5-choice serial-reaction time task (5-CSRTT). Rats are trained to wait for the occurrence of a brief visual stimulus presented in one of five open apertures in order to earn a food reward. Responding early (‘premature’ response), in a non-illuminated aperture (‘incorrect’ response), or not at all (‘omission’) is signaled by a loss of food reward and the houselight being briefly extinguished. b–e represent studies conducted using the 5-CSRTT in rats. b–d Reduced gray matter density in the NAcb core of high-impulsive (HI) rats compared with low-impulsive (LI) rats, measured using voxel-based morphometry, and associated with reduced levels of glutamic acid decarboxylase (GAD), microtubule-associated protein 2 (MAP2), and spinophilin in the left NAcb core. e [18F]fallypride-PET coregistered with horizontal MRI scans showing reduced D2/3 receptor availability in the ventral striatum of HI rats compared with LI rats prior to the oral administration of methylphenidate (‘pre-MPH’). Following MPH administration (‘post-MPH’), D2/3 receptor availability in the ventral striatum increased in HI rats but decreased in LI rats, consistent with a rate-dependency model. f Human 4-choice serial-reaction time task (4-CSRT). Subjects hold down the space bar with their index finger and release to press the box on the touch screen in which the green target appears. A premature response is measured as a release of the space bar before target onset. g–i represent studies conducted using the 4-CSRT. g, h Resting state functional connectivity correlating with waiting impulsivity in humans: subgenual cingulate and subthalamic nucleus (g) and ventral striatum and subthalamic nucleus (h). i Enhanced waiting impulsivity with central 5-HT depletion in healthy humans (TD = tryptophan depletion). j Elevated waiting impulsivity in abstinent alcohol (Etoh) and methamphetamine (Meth) dependence and current cannabis users (Cann) with no differences observed in obese subjects with and without binge eating disorder (BED)

Evidence for a role for premorbid impulsivity in predicting the development of alcohol use disorders is less clear with mixed results depending on the precise animal strain. Premature responding is associated with alcohol-preferring mouse strains compared with non-alcohol-preferring strains (Sanchez-Roige et al. 2014a) and is also associated with greater withdrawal severity from chronic alcohol among different mouse strains (Gubner et al. 2010). Acute alcohol exposure (Oliver et al. 2009) and early but not late abstinence following chronic alcohol exposure is also associated with increased premature responding in mice (Walker et al. 2011) suggesting that premature responding can be secondary to both acute and chronic alcohol exposure. However, alcohol-preferring rats do not differ in premature responding compared with non-alcohol-preferring rats either at baseline or following chronic exposure to alcohol (Pena-Oliver et al. 2015). Interestingly, high premorbid premature responding in rats is also associated with greater escalation of sucrose-seeking behavior and reinstatement following extinction in rodents (Diergaarde et al. 2009).

In humans, waiting impulsivity can be assessed using the analogous 4-choice serial-reaction time task (4-CSRT) and is enhanced across multiple substance use disorders (e.g., abstinent methamphetamine and alcohol use disorders) as well as in current cannabis users and current smokers (Voon et al. 2014) (Fig. 1). In this task, a premature response is defined as early anticipatory release of the space bar prior to the onset of a green target stimulus within one of four boxes on a touch screen (Voon 2014). That waiting impulsivity is elevated in current smokers, but not in ex-smokers or never-smokers, suggests either that nicotine exerts state-dependent effects or that those with higher levels of waiting impulsivity find it difficult to quit smoking. College-age binge drinkers at elevated risk for the development of later alcohol use disorders also show enhanced waiting impulsivity as tested using either the 4-CSRT (Morris et al. 2015) or the Sussex-5-CSRT (Sanchez-Roige et al. 2014b). The two tasks differ in that the target cue predicts reward in the former but not in the latter (Voon 2014). In contrast, obese subjects with and without binge eating disorder (BED) did not show impaired waiting impulsivity (Voon et al. 2014). This may reflect a diminished sensitivity of obese people to monetary rewards, but further studies (e.g., with food outcomes used instead) would be needed to test this possibility.

2.1.1 Neural Networks and Neurochemistry

The neural network underlying premature responding in the 5CSRTT has been extensively mapped in rodents using excitotoxins, intracerebral pharmacology, and selective immunotoxins [for review, see (Robbins 2002; Dalley et al. 2008)]. These studies indicate key roles of the infralimbic cortex—probably equivalent to the human subgenual anterior cingulate cortex (ACC)—nucleus accumbens (NAcb), and subthalamic nucleus (STN) (Chudasama et al. 2003; Baunez et al. 1995; Baunez and Robbins 1997; Aleksandrova et al. 2013) (Fig. 2) with dissociable contributions from dopaminergic, norepinephrinergic, serotonergic, GABA, and glutamatergic mechanisms (Winstanley et al. 2006; Dalley et al. 2011; Pattij et al. 2007; Hayes et al. 2014). Salient findings from research in rodents and humans are reviewed briefly below.

Fig. 2

Cortico-striatal circuitry of subtypes of impulsivity. A simplified box and arrow model of relevant fronto-striatal regions is shown to illustrate key anatomical commonalities and differences between subtypes of impulsivity. Abbreviations: NAcc nucleus accumbens; STN subthalamic nucleus; pre-SMA presupplementary motor area; VP ventral pallidum; GPe globus pallidus externa; GPi globus pallidus interna; SNr substantia nigra pars reticulata

Acute amphetamine increases premature responding in rodents, an effect attenuated by dopamine (DA)-depleting 6-hydroxydopamine lesions of the NAcb and D1/2 receptor antagonists (Cole and Robbins 1989). More specifically, stimulants, such as amphetamine, nicotine, and cocaine, and DA reuptake inhibitors increased premature responding, which was blocked by intra-NAcb D1 and D2 receptor antagonists (Pattij et al. 2007). HI rats have lower ventral striatal D2/3 receptor availability (Dalley et al. 2007a) and lower left gray matter density, and markers of GABA and dendritic spine function, in the NAcb core (Caprioli et al. 2013) (see Fig. 1). In rodents, the influence of methylphenidate on premature responding appears to be mediated by beta-adrenergic and D4 receptors (Milstein et al. 2010) and is influenced by dose (Navarra et al. 2008) and baseline impulsivity (Tomlinson et al. 2014). Atomoxetine, a selective norepinephrine (NE)-reuptake inhibitor, also dose-dependently decreases premature responding (Robinson et al. 2008a; Baarendse et al. 2013) in rodents and zebrafish (Parker et al. 2014) and decreases the likelihood of transition toward compulsivity as measured using a schedule-induced polydipsia procedure in HI rats (Ansquer et al. 2014). Crucially, convergent evidence indicates that the NAcb core and shell are functionally opposed with deep brain stimulation (DBS) of these regions producing opposite effects on premature responding (Sesia et al. 2008). DBS targeting the shell but not the core increases premature responding and DA levels in the NAcb, presumably via anti-dromic stimulation of ventral tegmental area projections (Sesia et al. 2010). Moreover, infusions of methylphenidate, a mixed DA/NE reuptake inhibitor, in the core but not the shell increased premature responding, whereas infusions of atomoxetine into the shell but not the core decreased premature responding (Economidou et al. 2012). 
In rodents, infusion of quinpirole, a D2/3 agonist, into the NAcb core increased premature responding in HI rats but within the shell increased locomotor activity instead (Moreno et al. 2013). Morphine, a mu-opioid receptor (MOR) agonist increases premature responding in the 5CSRTT, an effect blocked by the MOR antagonist naloxone (Pattij et al. 2009). MOR antagonism but not kappa-opioid receptor antagonism also selectively remediates amphetamine-induced impairments in premature responding (Wiskerke et al. 2011). This effect appears to be mediated in the NAcb shell and may interact with the mesolimbic dopaminergic system. Acute methylphenidate challenge is also associated with greater waiting impulsivity in healthy humans (Voon et al. 2015a) consistent with rodent studies of methylphenidate infusion in the NAcb core (Economidou et al. 2012), mediated most parsimoniously by increased DA neurotransmission in this region.

Central serotonin (5-HT) depletion and prefrontal and intra-NAcb 5-HT2C receptor antagonism are associated with greater premature responding in rodents, whereas prefrontal and intra-NAcb 5-HT2A receptor antagonism is associated with a decrease in premature responding (Winstanley et al. 2004a; Robinson et al. 2008b). Similarly, 5-HT2A receptor antagonists have been shown to decrease cocaine-induced premature responding in rodents (Anastasio et al. 2011) and 5-HT2C receptor agonists decrease nicotine-induced premature responding (Higgins et al. 2012). In humans, acute central 5-HT depletion through tryptophan depletion enhances waiting impulsivity (Worbe et al. 2014), thus corroborating findings in rodents (Winstanley et al. 2004a).

Greater premature responding in the human 4-CSRT task is associated with decreased resting state functional connectivity of the bilateral STN with bilateral subgenual cingulate and right ventral striatum (Morris et al. 2015) (Fig. 1). These findings provide translational evidence in humans for a network similar to that implicated in rodents. They also indicate a dissociation from motor response inhibition or action cancelation as captured by the stop-signal task (see Sect. 2.2), which instead was associated with lower connectivity between hyper-direct projections of the right presupplementary motor area (pre-SMA) and left STN together with dorsal caudate and STN connectivity. Furthermore, this network was shown to be dimensionally relevant across alcohol misuse, with impairment as a function of alcohol severity in social drinkers, binge drinkers, and subjects with alcohol use disorders. Using machine learning techniques, STN connectivity successfully classified those who misuse alcohol from healthy volunteers. Connectivity of the STN and ventral striatum also improved with abstinence, suggesting a state effect of alcohol on connectivity strength. Thus, human studies converge with rodent studies in implicating waiting impulsivity in substance use disorders, possibly mediated by dopaminergic, norepinephrinergic, and serotonergic mechanisms and similar underlying neural networks. Key translational findings of premature responding are shown in Fig. 1.

2.2 Stop-Signal Reaction Time

Response inhibition describes the capacity to inhibit a prepotent response. In human studies, differing subtypes of stopping behavior have been described, including fast reactive stopping in response to an external stop signal, proactive stopping in response to a cue predicting a stop signal (Aron 2011; Jaffard et al. 2008), and stopping in response to an internal signal (Schel et al. 2014). Fast reactive stopping is the most commonly tested form, with extensive translational evidence, and comprises (i) stopping prior to movement initiation (action restraint) as measured using the Go/NoGo task in which subjects must inhibit responding to an infrequently presented stop signal while responding rapidly to a frequently presented stream of Go signals; and (ii) stopping after movement initiation (action cancelation) as measured using the stop-signal task (SST) in which subjects inhibit responding to an infrequent stop signal with onset following a delay after the Go signal (Aron 2011; Chambers et al. 2009; Eagle et al. 2008). In the SST, the stop-signal delay is adjusted as a function of stopping success such that successful and failed stops each occur with a probability of approximately 50 %. The shorter the delay, the greater the likelihood of stopping; the longer the delay, the lower the likelihood of stopping. Whereas the Go/NoGo task measures discrete commission errors, the SST estimates the latency of the covert stopping process (stop-signal reaction time, SSRT) from the speed of responding to the Go signal and the probability of stopping at a given stop-signal delay (Logan et al. 1984; Verbruggen and Logan 2009). This is conceptually based on Logan’s race model, a competition between the Go and Stop processes in which the process that crosses the threshold first is executed. As the evidence for the neurobiology underlying human and rodent SSRT and Go/NoGo tasks has been extensively discussed (Aron 2011; Chambers et al. 2009; Eagle et al. 2008), only a brief summary will be presented here.
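The tracking procedure and the race-model logic can be made concrete with a short simulation. The Python sketch below is illustrative only and is not the analysis pipeline of any cited study. It assumes normally distributed Go reaction times, a fixed 'true' SSRT, a one-up/one-down staircase with a 50 ms step, and the commonly used integration estimate of SSRT (the Go reaction time at the quantile given by the probability of responding on stop trials, minus the mean stop-signal delay). All function names and parameter values are hypothetical.

```python
import random
import statistics


def simulate_stop_trials(n_stop=1000, true_ssrt=200.0, go_mean=500.0,
                         go_sd=100.0, start_ssd=250.0, step=50.0, seed=0):
    """Simulate stop trials under a simple race model (times in ms)."""
    rng = random.Random(seed)
    ssd = start_ssd
    ssds, responded = [], []
    for _ in range(n_stop):
        go_rt = rng.gauss(go_mean, go_sd)
        # Race: the response escapes inhibition if the Go process finishes
        # before the Stop process, which starts at the stop-signal delay.
        failed_stop = go_rt < ssd + true_ssrt
        ssds.append(ssd)
        responded.append(failed_stop)
        # One-up/one-down tracking: shorten the delay after a failed stop,
        # lengthen it after a successful stop, converging on ~50 % stopping.
        ssd = max(0.0, ssd - step) if failed_stop else ssd + step
    # Separate Go-only trials provide the Go reaction time distribution.
    go_rts = [rng.gauss(go_mean, go_sd) for _ in range(3 * n_stop)]
    return go_rts, ssds, responded


def estimate_ssrt(go_rts, ssds, responded):
    """Integration estimate: nth Go RT minus mean stop-signal delay."""
    p_respond = sum(responded) / len(responded)
    ordered = sorted(go_rts)
    index = min(len(ordered) - 1, int(p_respond * len(ordered)))
    return ordered[index] - statistics.mean(ssds), p_respond


if __name__ == "__main__":
    go_rts, ssds, responded = simulate_stop_trials()
    ssrt, p_respond = estimate_ssrt(go_rts, ssds, responded)
    print(f"p(respond | stop signal) = {p_respond:.2f}")
    print(f"estimated SSRT = {ssrt:.0f} ms")
```

Because the staircase holds the probability of responding near 50 %, the estimate in this sketch reduces to roughly the median Go reaction time minus the mean stop-signal delay, which is why SSRT can be inferred even though the stopping process itself is never directly observed.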

Meta-analyses show deficits in response inhibition particularly with the SSRT and to a lesser extent with Go/NoGo in ADHD (Lipszyc and Schachar 2010) and across most but not all substance use disorders including stimulants, nicotine, alcohol, and in pathological gambling and Internet use disorder but not opioid or cannabis abuse (Smith et al. 2014). Impairments in the SST have also been shown in unaffected siblings of stimulant-dependent subjects suggesting an endophenotypic risk factor for the development of addiction (Ersche et al. 2012) and predicted adolescent alcohol and drug-related problems (Nigg et al. 2006) and the progression from heavy alcohol use in adults to alcohol dependence (Rubio et al. 2008). Similarly, both OCD subjects and unaffected family members show impairments in the SST suggesting a cognitive endophenotype (Menzies et al. 2007) underlying the development of OCD. However, unlike premature responding, rodent studies have not yet shown that this form of impulsivity predicts compulsive substance use.

2.2.1 Neural Substrates

In rodent studies, the SST appears to be influenced by NE-ergic mechanisms with a rather more limited influence from DA and 5-HT. In rodents, methylphenidate and acute amphetamine influence SST performance as a function of baseline performance (Eagle et al. 2007; Feola et al. 2000). Low-dose acute amphetamine also improved the impairment in response inhibition produced by a medial striatal lesion, but high doses further impaired response inhibition (Eagle and Robbins 2003). Atomoxetine infused into the orbitofrontal cortex (OFC) and dorsal prelimbic cortex improved response inhibition on the SST (Robinson et al. 2008a; Bari et al. 2011; Bari et al. 2009), an effect blocked by guanfacine, an alpha-2a receptor agonist, but not by alpha-flupenthixol, a D1/D2 receptor antagonist (Bari et al. 2011). Similarly, an alpha-2a antagonist improved response inhibition unlike DA receptor agonists and antagonists (Bari and Robbins 2013). Further, neither D1/D2 receptor antagonists (Eagle et al. 2007) nor DA transporter inhibition (Bari et al. 2009) was shown to affect SSRT, similar to the lack of effects of 5-HT depletion and 5-HT transporter inhibition (Bari et al. 2009; Eagle et al. 2009). D1 and D2 antagonists infused in the dorsomedial striatum have been shown to improve and impair SSRT, respectively.

In humans, methylphenidate improves SST performance in those with impaired SSRT such as cocaine-dependent subjects (Li et al. 2010) as well as children (DeVito et al. 2009; Tannock et al. 1989) and adults (Aron et al. 2003a) with ADHD. In healthy subjects, direct comparisons show that acute methylphenidate but not atomoxetine or citalopram improves response inhibition on the SST (Nandam et al. 2011) although a direct comparison shows efficacy of atomoxetine but not citalopram in improving response inhibition on the SST (Chamberlain et al. 2007a). However, higher doses of atomoxetine (80 mg vs. 40 mg) impaired response inhibition on a Go/NoGo task (Graf et al. 2011). Thus, converging with rodent studies, NE appears to play a role in reactive stopping in humans with a possible U-shaped dose–response relationship.

Another type of inhibitory control, which has been more extensively studied in human subjects, is that of proactive stopping in which motor control is preparatory and goal-directed (Aron 2011; Jaffard et al. 2008). This form of stopping has similarities to ‘braking’ and ‘conflict-induced slowing’ in which slowing of reaction time may occur in the context of a conflict to prevent impulsive decisions until a decision is made (Frank et al. 2007; Frank 2006). Proactive stopping can be differentiated from reactive stopping by comparing response inhibition tasks in which either the stop signal is acted upon to countermand the action (proactive condition) or the stop signal is ignored or not present (baseline condition). Proactive inhibition is associated with a decrease in motor-evoked potential, an index of cortical excitability, which is suppressed to a greater extent than at rest when anticipating a stopping response (Cai et al. 2011).

Response inhibition implicates a network across rodent and human studies including the SMA, right inferior frontal cortex (rIFC), the STN, and caudate, which has been extensively reviewed elsewhere (Dalley et al. 2011; Morris et al. 2015; Aron 2011; Aron et al. 2003b) (Fig. 2). Studies of proactive inhibition implicate the same regions as reactive stopping including the SMA, rIFC, and STN (Jaffard et al. 2008; Zandbelt and Vink 2010; Obeso et al. 2013; Ballanger et al. 2009). The hyperdirect connections from the SMA and rIFC to STN are thought to underlie reactive stopping, whereas fronto-striatal circuitry via the direct and indirect pathways appears to be more important for proactive stopping (Zandbelt and Vink 2010; Smittenaar et al. 2013a; Majid et al. 2013).

2.3 Delay Discounting

Animals and humans demonstrate an inherent tendency to discount or devalue future outcomes (Ainslie 1975). Impulsive choice or delay discounting is a form of impulsivity that can be measured using intertemporal choice tasks. In these tasks, subjects choose between a small immediate reward and a larger but delayed reward. Intertemporal choice tasks used in animal studies are invariably based on feedback and range in duration from seconds to minutes, whereas, in human studies, hypothetical tasks involving longer durations from days to years are generally used. However, discounting can occur in humans over relatively short delay intervals of just seconds (Schweighofer et al. 2008; Gregorios-Pippas et al. 2009). The Experiential Discounting Task is an intertemporal choice task paired with real-time coin machine feedback developed to be sensitive to state changes in discounting and to model naturalistic choice context (Reynolds and Schiffbauer 2004). The devaluation of future reward can be reliably modeled by a hyperbolic function, $V_S = V_A/(1 + Kd)$, which discounts more steeply as the time of reward receipt approaches than an exponential function, $V_S = V_A e^{-Kd}$, which discounts at a constant proportional rate across delay intervals (Ainslie 1975; Mazur 1987). With such functions, the subjective value, $V_S$, is a modification of the actual value, $V_A$, by the delay ($d$) and a discount constant ($K$). $K$ represents the steepness of the temporal discounting curve and provides a measure of impulsivity. A hyperbolic fit implies that when the smaller reward is imminently available, the subjective value of the smaller immediate reward will be greater than the subjective value of the larger delayed reward, thus resulting in a preference reversal away from the larger future outcome toward the smaller immediate outcome (Ainslie 1975).
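The preference reversal predicted by hyperbolic but not exponential discounting can be checked numerically. The following Python sketch is a worked illustration under assumed values: the reward amounts (50 vs. 100), the 10-unit gap between the two rewards, and the discount constant K = 0.2 are hypothetical and were chosen only to make the effect visible.

```python
import math


def hyperbolic_value(amount, delay, k):
    """Subjective value under hyperbolic discounting: V_A / (1 + K * d)."""
    return amount / (1.0 + k * delay)


def exponential_value(amount, delay, k):
    """Subjective value under exponential discounting: V_A * exp(-K * d)."""
    return amount * math.exp(-k * delay)


def prefers_larger_later(value_fn, k, front_delay,
                         small=50.0, large=100.0, gap=10.0):
    """True if the larger, later reward has the higher subjective value.

    `front_delay` is the wait until the smaller reward; the larger reward
    always arrives `gap` time units after the smaller one.
    """
    sooner = value_fn(small, front_delay, k)
    later = value_fn(large, front_delay + gap, k)
    return later > sooner


if __name__ == "__main__":
    k = 0.2  # hypothetical discount constant
    for front_delay in (30.0, 0.0):
        print(front_delay,
              prefers_larger_later(hyperbolic_value, k, front_delay),
              prefers_larger_later(exponential_value, k, front_delay))
```

With these values the hyperbolic discounter prefers the larger reward when both outcomes are distant (front delay of 30) but switches to the smaller reward once it becomes immediately available (front delay of 0). The exponential discounter makes the same choice at both time points, because the ratio of the two subjective values depends only on the gap between the rewards.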

In animal studies, pre-existing impairments in impulsive choice predispose to greater cocaine self-administration and reinstatement of cocaine-seeking behavior (Perry et al. 2005, 2008), and greater use of alcohol (Mitchell et al. 2006; Poulos et al. 1995) and nicotine (Diergaarde et al. 2008). In humans, delay discounting is a core impairment implicated in ADHD (Noreika et al. 2013) and substance addictions across multiple drug categories, pathological gambling (Bickel et al. 2013; Bickel et al. 2014), and in obesity both with and without BED (Mole et al. 2014; Voon 2015).

2.3.1 Neural Substrates

Converging evidence implicates a role for DA in delay discounting. In primates, single unit striatal recordings to reward-predictive cues show that dopaminergic neuronal activity scales with magnitude and decreases with delay, thus reflecting the subjective devaluation of the delayed reward (Kobayashi and Schultz 2008). Similarly, in rodents, DA levels in the NAcb, assessed by in vivo voltammetry, scale with reward magnitude and decrease with delay (Saddoris et al. 2015). Notably, higher DA levels correlated with choice preference at shorter delays (Saddoris et al. 2015) and optogenetically enhanced DA release in the NAcb during reward-predictive cues shifted choice preference for delay but not magnitude-related decisions (Saddoris et al. 2015).

In rodents, delay discounting has been shown to correlate with ‘trait’ premature or anticipatory responding (Robinson et al. 2009), the latter associated with lower D2/3 receptor density in the ventral striatum (Dalley et al. 2007a). Acute administration of low and moderate doses of amphetamine decreases impulsive choice in rodent studies (Floresco et al. 2008; Richards et al. 1999; Wade et al. 2000; van Gaalen et al. 2006), whereas high, chronic doses of methamphetamine and cocaine increase impulsive choice in rodents (Richards et al. 1999; Roesch et al. 2007; Simon et al. 2007).

Studies in healthy humans show that increased questionnaire-based trait impulsivity (although not specifically delay discounting) is correlated with decreased D2/3 autoreceptor midbrain availability and with increased amphetamine-induced DA release in the striatum (Buckholtz et al. 2010). Striatal DA release was associated with greater wanting or desire for the stimulant. Using mediation analysis, the relationship between D2/3 receptor binding and impulsivity was in part mediated by striatal DA release. Lower D2/3 receptor availability in the ventral striatum appears to correlate more specifically with greater delay discounting in pathological gambling (Joutsa et al. 2015), methamphetamine dependence (Ballard et al. 2015), and alcohol use disorders (Oberlin et al. 2015). Levodopa, a precursor to DA, increases impulsive choice in healthy humans (Pine et al. 2010) and increases delay aversion in patients with Parkinson’s disease (PD) (Cools et al. 2003). Prefrontal cortical DA has also been implicated in impulsive choice. Thus, genetic polymorphisms associated with catechol-O-methyltransferase (COMT), an enzyme found in the PFC and responsible for DA breakdown, are associated with a U-shaped relationship between prefrontal dopaminergic function and impulsive choice (Kayser et al. 2012).

Norepinephrinergic and serotonergic mechanisms have also been implicated in impulsive choice. Atomoxetine decreases impulsive choice in rodents (Robinson et al. 2008a; Sun et al. 2012). By contrast, central 5-HT depletion increases impulsive choice in rodents (Mobini et al. 2000) and attenuates the decrease in impulsive choice induced by low-to-moderate doses of d-amphetamine (Winstanley et al. 2003). Furthermore, 5-HT1A receptor agonists similarly attenuate the effects of d-amphetamine but have no effect in rats depleted of DA in the NAcb (Winstanley et al. 2005). Interestingly, during the waiting period for primary and conditioned rewards, serotonergic activity is tonically increased in rats (Miyazaki et al. 2011) with optogenetic activation of dorsal raphe serotonergic neurons increasing the ability of rats to withhold responding to delayed rewards (Fonseca et al. 2015). In humans, the role of 5-HT is less clear since tryptophan depletion does not influence delay discounting in healthy controls (Worbe et al. 2014), with and without a family history of alcohol dependence (Crean et al. 2002) or with simulated binging of alcohol (Dougherty et al. 2015).

Rodent lesion studies implicate the NAcb core, OFC, amygdala, and hippocampus in delay discounting (Cardinal et al. 2004; Cardinal et al. 2001; Winstanley et al. 2004b). Human imaging studies also implicate the ventral striatum, OFC, lateral prefrontal cortex (lPFC), insula, amygdala, posterior cingulate, and parietal cortex in delay discounting for secondary rewards (Ballard and Knutson 2009; Tanaka et al. 2004; Kable and Glimcher 2007; McClure et al. 2004) and primary rewards (McClure et al. 2007) (Fig. 2).

More specifically, subregions of the OFC have been implicated in delay discounting with lesions of the medial OFC increasing delay discounting in rats and lesions of the lateral OFC decreasing delay discounting (Mar et al. 2011). These lesion studies converge with studies of single neuron activity in the OFC showing higher activity with time-discounted rewards after a short delay and lower activity after a long delay, independent of the encoding for absolute reward magnitude (Roesch et al. 2006). Similarly, stroke-induced lesions of the medial OFC increased delay discounting in humans (Sellitto et al. 2010). Thus, the OFC appears to play a specific role in encoding time-discounted rewards beyond value encoding to guide choice behavior.

The ventral striatum is a key structure implicated in single and dual valuation theories of temporal discounting in human studies. In the dual valuation system, the beta system engages limbic regions (ventral striatum and medial PFC) and is associated with the choice of the immediate reward, whereas delta regions (lateral prefrontal and parietal cortices) are activated during all decisions (McClure et al. 2004). The beta system is hypothesized to overvalue immediate rewards, while the delta system is considered to discount rewards at a constant rate over time. An alternate dual valuation system is hypothesized in which delay is coded in the lPFC and magnitude is coded in the ventral striatum (Ballard and Knutson 2009). The differential involvement of cortico-basal-ganglia loops has also been implicated, with ventro-anterior striatum and insula being preferentially involved in immediate choices and dorso-posterior striatum and insula being preferentially involved in delayed choices (Tanaka et al. 2004). In contrast, others have argued for a single valuation system with the ventral striatum representing the subjective value of the delayed choice (Kable and Glimcher 2007).
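The distinction between an overvaluing beta system and a constant-rate delta system can be illustrated with a minimal sketch of quasi-hyperbolic (beta-delta) discounting, the form from which this terminology is usually taken. The function names, the parameter values (beta = 0.7, delta = 0.99 per day), and the monetary amounts are illustrative assumptions and are not drawn from the studies cited above.

```python
def discounted_value(amount, delay_days, beta=1.0, delta=0.99):
    """Subjective value of a reward under beta-delta discounting.

    Immediate rewards (delay 0) keep their full value; any delayed reward
    is scaled by beta (a one-off penalty for not being immediate) and by
    delta ** delay (a constant per-day discount rate).
    beta and delta are hypothetical values chosen for illustration.
    """
    if delay_days < 0:
        raise ValueError("delay_days must be non-negative")
    if delay_days == 0:
        return float(amount)
    return beta * (delta ** delay_days) * amount


def chooses_later(sooner, later, beta, delta=0.99):
    """Return True if the later option has the higher discounted value.

    sooner and later are (amount, delay_days) tuples.
    """
    v_sooner = discounted_value(*sooner, beta=beta, delta=delta)
    v_later = discounted_value(*later, beta=beta, delta=delta)
    return v_later > v_sooner


if __name__ == "__main__":
    near = ((10, 0), (11, 1))    # 10 now vs 11 tomorrow
    far = ((10, 30), (11, 31))   # same pair pushed 30 days into the future
    for beta in (0.7, 1.0):
        print(beta, chooses_later(*near, beta=beta), chooses_later(*far, beta=beta))
```

With beta below 1, the sketch prefers the immediate reward when one option is available now but prefers the larger reward when both options are delayed, which is the preference reversal that dual valuation accounts aim to capture. With beta equal to 1 only the constant-rate delta term remains and no reversal occurs.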

2.4 Reflection Impulsivity

Reflection impulsivity is predominantly assessed in humans and describes the accumulation of evidence, evaluation of options, and rapid hypothesis testing prior to a decision (Kagan 1966). This form of impulsivity can be divided into perceptual and probabilistic decisions. Perceptual tasks include the matching familiar figures task (MFFT) in which subjects decide whether a pattern matches a series of similar patterns of which all but one differ (Kagan 1966). The impulsivity score captures the core features of the task: the extent of information sampling (or reaction time) and the accuracy of the decision. Children with ADHD perform more impulsively on the MFFT, and their performance improves with psychostimulant treatment (Brown and Sleator 1979). MDMA users but not cannabis users or alcohol-dependent subjects are also impaired on the MFFT (Quednow et al. 2007; Morgan et al. 2006; Weijers et al. 2001). This task may overlap with other perceptual decision tasks focusing on speed-accuracy trade-offs involving sensory discrimination (e.g., random dot motion task) (Gold and Shadlen 2007; Banca et al. 2014) and conflict-induced slowing to either probabilistic or perceptual conflict (Frank 2006; Wylie et al. 2009).

Other reflection impulsivity tasks assess probabilistic decisions more directly by measuring the extent of information sampling or evidence accumulation [e.g., the Beads-in-a-Jar task (‘Beads task’) (Volans 1976) and the information sampling task (IST) (Clark et al. 2006)]. In the Beads task, subjects must decide from which of two jars beads are being selected based on known probabilities of the color ratio of the beads within the jars. Participants are aware of the explicit probabilities of the alternate options with each piece of evidence accumulated associated with an expected probability of being correct. Using this task, elevated probabilistic reflection impulsivity has been observed in substance use disorders, pathological gamblers (Djamshidian et al. 2012), binge drinkers (Banca et al. 2015), and patients with PD and medication-induced behavioral addictions (Djamshidian et al. 2012). Reflection impulsivity tested using the Beads task is exacerbated by DA receptor agonists though not by Levodopa in studies of PD (Djamshidian et al. 2013). The IST is a similar information sampling paradigm that asks participants to decide which color is predominant in a 5 × 5 matrix by opening boxes to make a decision (Clark et al. 2006). Current or former amphetamine and opiate users sample less information compared to healthy volunteers (Clark et al. 2006). One study has shown an impairment in binge drinkers (Banca et al. 2015) although this was not confirmed in a second study (Townshend et al. 2014).

The IST is conceptually similar to the Beads task, yet a recent study in schizophrenia unexpectedly did not demonstrate impairments in the IST despite consistent reports of impairments in the Beads task (Huddy et al. 2013). Similarly, a binge drinking study showed an impairment with the Beads task but not the IST (Banca et al. 2015). One reason for this apparent discrepancy is that, unlike the Beads task, the IST presents the available information for sampling in a more explicit manner, thus encouraging subjects to think ahead about all possible outcomes and the overall task representation. In addition, the generative probability distribution is more uncertain and closer to 50% in the IST than the known explicit probabilities in the Beads task, which will also shift subjects toward being more cautious.
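The evidence accumulation required by the Beads task can be sketched as sequential Bayesian updating over the two jars. The sketch below is illustrative only: the 80:20 color ratio, the equal prior, the bead color labels, and the idea of summarizing reflection impulsivity as a lower decision threshold are assumptions made for the example rather than details taken from the studies cited above.

```python
def posterior_after_draws(draws, p_blue_a=0.8, p_blue_b=0.2, prior_a=0.5):
    """Posterior probability of jar A after each bead in `draws`.

    draws is a sequence of "blue" / "red" strings. Jar A is assumed to
    contain mostly blue beads (p_blue_a) and jar B mostly red beads.
    """
    posterior = prior_a
    history = []
    for bead in draws:
        like_a = p_blue_a if bead == "blue" else 1.0 - p_blue_a
        like_b = p_blue_b if bead == "blue" else 1.0 - p_blue_b
        # Bayes' rule with the current posterior as the prior.
        evidence = like_a * posterior + like_b * (1.0 - posterior)
        posterior = like_a * posterior / evidence
        history.append(posterior)
    return history


def draws_to_decision(draws, threshold=0.95, **jar_kwargs):
    """Number of beads sampled before either jar reaches `threshold`.

    Returns None if the threshold is never reached. A lower threshold
    yields earlier decisions based on less evidence.
    """
    for n, p in enumerate(posterior_after_draws(draws, **jar_kwargs), start=1):
        if max(p, 1.0 - p) >= threshold:
            return n
    return None


if __name__ == "__main__":
    sequence = ["blue", "blue", "blue", "red", "blue"]
    print(posterior_after_draws(sequence))
    print(draws_to_decision(sequence, threshold=0.95))
    print(draws_to_decision(sequence, threshold=0.75))
```

Under these assumptions each bead carries a known, explicit change in the probability of being correct, which is the property that distinguishes the Beads task from the more uncertain color distribution of the IST.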

2.4.1 Neural Substrates

Performance on the IST and the Beads task has also been associated with different volumetric correlates (Banca et al. 2015). Greater impulsivity in the Beads task was associated with smaller dlPFC and left inferior parietal volumes (Banca et al. 2015). The mechanisms underlying evidence accumulation can be subdivided into evidence-seeking and decision-making. The Beads task is associated with parietal activity during evidence-seeking and dlPFC activity during both evidence-seeking and decision-making (Furl and Averbeck 2011). The dlPFC is important for the resolution of uncertainty (Huettel et al. 2005) and computing differences between costs and benefits (Basten et al. 2010) with the accumulated difference represented in the parietal cortex signaling the final decision and confidence (Kiani and Shadlen 2009; Stern et al. 2010). In contrast, greater impulsivity in the IST was associated with greater left dorsal cingulate and right precuneus volumes (Banca et al. 2015). Similarly, in an fMRI study investigating evidence accumulation, greater uncertainty during evidence accumulation was associated with ACC and precuneus activity, whereas greater uncertainty during decision execution was associated with greater lateral frontal and parietal activity (Stern et al. 2010). The dACC is implicated in error and conflict monitoring processes (Scheffers and Coles 2000; Botvinick et al. 2001), and in coding unexpected and unpredicted outcomes during evidence accumulation (Stern et al. 2010; Oliveira et al. 2007).

3 Compulsivity

Compulsivity is generally understood to involve excessive repetitive actions that are incongruous to a situation. OCD is the canonical disorder of compulsivity characterized by anxiety-provoking obsessions and compulsive rituals (Stein and Hollander 1995). However, compulsivity in the form of repetitive and rigid stereotyped behaviors extends to autism, Tourette’s syndrome (TS), disorders of impulse control (pathological gambling, trichotillomania), eating disorders, and substance-related and addictive disorders. Various methods have been used in animals to research OCD subtypes and related impulsive–compulsive disorders. Though beyond the scope of the present article, these include genetic approaches [e.g., the Sapap3 knockout mouse showing excessive self-grooming and anxiety (Welch et al. 2007)], pharmacological ‘models’ [e.g., quinpirole-induced compulsive checking (Szechtman et al. 2001)], and behavioral models of autism (Kas et al. 2014), pathological gambling (Zeeb et al. 2009), and OCD [e.g., the signal attenuation task and marble burying behavior in rodents (Albelda and Joel 2012)]. Other compulsivity constructs, including cognitive inflexibility assessed by response perseveration and impaired attentional set-shifting, dominance of S-R habits, and rigid stereotyped behavior, are reviewed below.

3.1 Cognitive Inflexibility

Adaptive goal-directed behavior requires flexible cognitive control over reinforcement learning, working memory, and attentional set-shifting. Deficits in cognitive flexibility occur in PD, OCD, autism, ADHD, Alzheimer’s disease, schizophrenia, and addiction, among other disorders (Nilsson et al. 2015), endorsing the view that many measurable and potentially translatable constructs in psychiatry cut across diagnostic boundaries. Cognitive flexibility is most commonly assessed by reversal learning and attentional set-shifting tasks adapted for use in rodents, non-human primates, and humans (Iversen and Mishkin 1970; Fellows and Farah 2003; Tait et al. 2014). Optimal reversal learning requires the capacity to flexibly switch responding to changing stimulus-response contingencies and usually involves a single perceptual dimension where one stimulus is rewarded and the other is not. Attentional set-shifting tasks such as the Wisconsin Card Sorting Test (WCST) (Grant and Berg 1948) and the CANTAB intra-/extra-dimensional (IED) set-shift task (Robbins 2000) involve at least two superimposed perceptual dimensions, each containing at least two different stimuli. Set-shifting assesses the capacity to switch responding to previously irrelevant stimuli or to switch in response to changes in rules requiring attentional flexibility.

Current amphetamine and methamphetamine users show impaired set-shifting (Clark et al. 2006; Ornstein et al. 2000) which improves with prolonged abstinence (van den Hout et al. 2009; Johanson et al. 2006; Toomey et al. 2003). In alcohol dependence, impairments in set-shifting are associated with years of abuse (Tarter 1973) and with relapse (Pothiyil and Alex 2013) without any improvements with abstinence (Nowakowska et al. 2007). Impaired set-shifting is also observed in obese individuals with and without BED (Duchesne et al. 2010; Wu et al. 2014). Pathological gambling is associated with both reduced and unaffected set-shifting in the IED task (Grant et al. 2011) and the WCST task (Goudriaan et al. 2006). OCD patients and unaffected first-degree relatives exhibit deficits in ED set-shifting (Chamberlain et al. 2007b) suggesting that impaired set-shifting may be a cognitive endophenotype for OCD. Similarly, patients with TS show impaired ED set-shifting (Watkins et al. 2005).

Substance use disorders can also be associated with impaired reversal learning. Thus, reversal learning is impaired in cocaine use disorders (Camchong et al. 2011; Fernandez-Serrano et al. 2012; Ersche et al. 2008; Fillmore and Rush 2006) but not in amphetamine and opiate abusers (Ersche et al. 2008). The influence of alcohol dependence on reversal learning is less clear with deficient aversive eye-blink conditioning (Fortier et al. 2008) and evidence of slower reversal but with no increase in perseverative errors (Vanes et al. 2014). However, impaired reversal learning is present in pathological gambling with gain and loss outcomes (de Ruiter et al. 2009; Patterson et al. 2006) and in TS (Watkins et al. 2005).

3.1.1 Neural Substrates

Behavioral flexibility is widely accepted to depend on the OFC, lPFC, ACC, and caudate nucleus (Clarke et al. 2008; Cools et al. 2002; Rogers et al. 2000). Thus, damage to the OFC disrupts reversal learning in humans (Fellows and Farah 2003; Hornak et al. 2004), monkeys (Iversen and Mishkin 1970; Dias et al. 1996), and rats (Schoenbaum et al. 2002; Chudasama and Robbins 2003). Reversal learning is also impaired in rats by excitotoxic lesions and DA depletion of the dorsomedial striatum (or caudate) (Castane et al. 2010; O’Neill and Brown 2007), with lesions of this region also impairing attentional set formation (Lindgren et al. 2013). In contrast, lesions of the lPFC impair ED set-shifting in marmosets (Dias et al. 1996) while a brain imaging study in humans found dissociable activations in lOFC, lPFC, and ACC, respectively, at the point of reversal, attentional control, and when new searches were initiated (Hampshire et al. 2012).

Impaired reversal learning in rats, monkeys, and humans has been strongly linked with a reduction in brain 5-HT (Fineberg et al. 2010) and specifically within the OFC (Clarke et al. 2004). Moreover, intra-OFC 5-HT2C antagonism has been shown to improve both spatial and visual reversal learning in rats (Boulougouris and Robbins 2010; Alsio et al. 2015). In addition, a recent study found reduced markers of 5-HT function in the OFC and raphé nucleus of rats exhibiting behavioral inflexibility on a spatial serial reversal learning task (Barlow et al. 2015). Such changes were accompanied by reduced and increased expression of monoamine oxidase in the raphé nucleus and OFC, respectively. Regulation of reversal learning by 5-HT appears to be mediated cortically rather than within the caudate (Clarke et al. 2007, 2011).

In contrast, DA acting at the level of the caudate has been shown to play a key role in behavioral flexibility. In humans, reversal learning performance correlates with methylphenidate-induced DA release in this region (Clatworthy et al. 2009) while reduced D2 receptor availability predicts reduced ventral striatal activation during probabilistic reversals (Jocham et al. 2009) and is linked to OCD (Denys et al. 2004). Intriguingly, the D2 receptor agonist, bromocriptine, improved performance of a task-set-shifting task but only in individuals with genetically reduced levels of DA (van Holstein et al. 2011). This improvement was abolished by pretreatment with the D2 antagonist sulpiride. In rats and monkeys, neurochemically selective depletion of DA in the dorsomedial striatum (caudate) impairs reversal learning (O’Neill and Brown 2007; Clarke et al. 2011), consistent with evidence that systemic blockade of D2 but not D1 receptors impairs reversal learning in monkeys (Lee et al. 2007).

3.2 Stimulus-Response (Habit) Learning

Converging animal and human studies suggest that two different processes of learning are associated with decision-making. The acquisition of goal-directed actions involves decisions that are based on affective outcome and governed by knowledge of the association between actions and the value of their consequences, or response-outcome (R-O) associations. In contrast, habitual choices are based on previously reinforced choices or learned stimulus-response (S-R) associations driven by predictive stimuli and are divorced from the value of the outcome. Goal-directed and habitual learning are used in parallel, but with extended training control shifts toward the habitual system (Adams and Dickinson 1981; Dickinson and Balleine 2002). In rodent studies, the relationship between these two forms of learning is formally assessed following training by testing in extinction (without the outcome) after outcome devaluation or contingency degradation.

A set of paradigms using a computational account based on reinforcement learning has been applied to goal-directed and habit learning, also termed model-based and model-free control (Daw et al. 2011; Dolan and Dayan 2013). Goal-directed control is prospective and computationally demanding and is based on a learned internal model of the environment, whereas habitual control is retrospective, efficient, and based on the memory of the previously reinforced actions divorced from the predicted outcomes. The two-step task is a sequential two-choice decision task in which subjects make two choices on every trial leading to a rewarded or non-rewarded outcome (Daw et al. 2011). Choices at the first stage are associated with a likely (common) and an unlikely (uncommon) transition. Model-free habitual control is based on the repetition of a previously rewarded action regardless of this transition, whereas model-based goal-directed control takes into account the probability of state–state transitions and selects actions that will more likely lead to reward on future trials. Thus, after uncommon transitions, a reward will lead a model-free subject to choose the same first-stage stimulus on the next trial since the action values are updated based on the reward from the previous action. A model-based subject representing the task structure would, after receiving a reward following an uncommon transition, switch to the alternate first-stage stimulus since this would be more likely to lead to the rewarded second-stage state. Behavior is best captured by a hybrid model integrating both model-based and model-free learning (Daw et al. 2011), and healthy volunteers use a mix of both types of control. The outcomes of the two-step task have been shown to correlate with outcomes in conventional overtraining and outcome devaluation tasks (Friedel et al. 2014).
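The contrast between model-free and model-based choice after a single trial can be made concrete with a minimal sketch. It is a deliberately simplified illustration and not the fitted hybrid model of Daw et al. (2011): the 0.7 common-transition probability, the learning rate, the neutral starting values of 0.5, and the weighting parameter `w` are all assumptions made for the example.

```python
COMMON_P = 0.7  # assumed probability that first-stage action a leads to state a


def values_after_trial(action, state, reward, alpha=0.5):
    """First-stage action values after one trial, starting from neutral values.

    action: first-stage choice (0 or 1)
    state:  second-stage state reached (0 or 1); state == action is "common"
    reward: 1 if rewarded, 0 otherwise
    Returns (q_model_free, q_model_based), each a list of two values.
    """
    q_mf = [0.5, 0.5]      # cached first-stage values (model-free)
    v_stage2 = [0.5, 0.5]  # learned values of the two second-stage states

    # Model-free: credit the chosen first-stage action directly with the reward.
    q_mf[action] += alpha * (reward - q_mf[action])

    # Both systems learn the value of the second-stage state that was visited.
    v_stage2[state] += alpha * (reward - v_stage2[state])

    # Model-based: combine second-stage values with the transition structure.
    q_mb = [
        COMMON_P * v_stage2[a] + (1.0 - COMMON_P) * v_stage2[1 - a]
        for a in (0, 1)
    ]
    return q_mf, q_mb


def hybrid_values(q_mf, q_mb, w=0.5):
    """Weighted mixture of the two controllers (w = 1 is purely model-based)."""
    return [w * mb + (1.0 - w) * mf for mf, mb in zip(q_mf, q_mb)]


def would_stay(q_values, action):
    """True if the previously chosen action still has the higher value."""
    return q_values[action] > q_values[1 - action]


if __name__ == "__main__":
    for state, label in ((0, "common"), (1, "uncommon")):
        for reward in (1, 0):
            q_mf, q_mb = values_after_trial(action=0, state=state, reward=reward)
            print(label, reward, would_stay(q_mf, 0), would_stay(q_mb, 0))
```

In this sketch the model-free values depend only on whether the last trial was rewarded, whereas the model-based values depend on the interaction between reward and transition type, which is the behavioral signature used to separate the two forms of control.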

3.2.1 Neural Substrates

Lesions of the posterior dorsomedial striatum or prelimbic cortex prevent the expression of goal-directed learning, leaving intact habit learning that is insensitive to outcome devaluation and contingency degradation (Yin et al. 2005; Balleine and Dickinson 1998). Lesions of the dorsolateral striatum leave goal-directed behaviors intact, and lesions of the infralimbic cortex result in intact sensitivity to outcome devaluation despite extended training (Balleine and Dickinson 1998; Yin et al. 2004; Killcross and Coutureau 2003).

Human studies have translated tasks in animals to investigate neural correlates. The relationship between goal-directed behaviors and outcome value was assessed in a study in which subjects were moderately trained on two instrumental tasks with differing food outcomes, one of which was devalued by feeding to satiety (Valentin et al. 2007). Behaviors remained goal-directed with decreased actions for the devalued outcome in extinction associated with a decrease in OFC activity. In a study investigating overtrained habitual behaviors, subjects were trained on action–outcome contingencies in which one group that was extensively trained did not retain outcome sensitivity with testing in extinction relative to the other group that was minimally trained. Greater habitual behaviors over the course of learning were associated with increased cue-related activity in the posterior putamen (Tricomi et al. 2009).

An alternate design in humans uses a conflict procedure (de Wit et al. 2009). Subjects must first learn the contingencies between a cue (fruit), a response (left or right button), and an outcome (fruit) for points. When the cue and outcome were congruent, both goal-directed and habitual systems were recruited, whereas the habitual system was predominantly used when the cue and outcome were incongruent, as using the goal-directed system would be disadvantageous. The authors show that under conditions in which goal-directed action predominated, ventromedial prefrontal cortical activity was enhanced. Following acquisition, an instructed outcome devaluation test was then performed in which subjects were presented with two open boxes with one fruit previously associated with a left button press and another with a right button press. One fruit was shown with a cross indicating it had been devalued. This was followed by an additional ‘slips-of-action’ testing phase in which subjects were instructed that two of the six different fruit outcomes were devalued or would be associated with loss of points. Subjects were then shown boxes with fruit for which points could be earned by pressing for valued fruit outcomes and loss of points avoided by withholding pressing for devalued fruit outcomes. Habitual ‘slips-of-action’ toward the outcomes that were no longer rewarding were associated with greater white matter tract strength between the premotor cortex and the posterior putamen, whereas goal-directed actions were associated with greater tract strength between the ventromedial prefrontal cortex and caudate (de Wit et al. 2012a).

Human functional imaging studies focusing on the encoding of reward value signals relevant for action selection implicate the medial OFC extending dorsally along the medial PFC. These regions represent action–outcome associations (Daw et al. 2006) separate from stimulus-related value signals (Valentin et al. 2007). The caudate is also implicated in the online computation of action–outcome contingency to guide goal-directed learning (Tanaka et al. 2008; Liljeholm et al. 2011). Using the two-step task, both outcome prediction error (the difference between received and expected outcomes used in model-free learning) and state prediction error (the discrepancy between the observed and expected state transition used in model-based learning) converge on the ventral striatum (Daw et al. 2011). Greater model-based learning is associated with greater medial OFC and caudate volumes (Voon et al. 2015b).
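The two error signals described above can be written down in a few lines. The sketch below follows one common formalization in which the state prediction error is one minus the currently estimated probability of the observed transition. This choice, together with the function names, the learning rate `eta`, and the example probabilities, is an assumption for illustration and is not specified in the text.

```python
def outcome_prediction_error(reward, expected_value):
    """Difference between the received and the expected outcome (model-free)."""
    return reward - expected_value


def state_prediction_error(transition_probs, observed_state):
    """Surprise about the observed state transition (model-based).

    transition_probs maps each possible next state to its currently
    estimated probability for the action just taken.
    """
    return 1.0 - transition_probs[observed_state]


def update_transition_probs(transition_probs, observed_state, eta=0.2):
    """Shift probability toward the observed state; estimates still sum to 1."""
    spe = state_prediction_error(transition_probs, observed_state)
    updated = {s: p * (1.0 - eta) for s, p in transition_probs.items()}
    updated[observed_state] = transition_probs[observed_state] + eta * spe
    return updated


if __name__ == "__main__":
    probs = {"A": 0.7, "B": 0.3}  # hypothetical learned transition estimates
    print(outcome_prediction_error(reward=1.0, expected_value=0.6))
    print(state_prediction_error(probs, "B"))
    print(update_transition_probs(probs, "B"))
```

An uncommon transition produces a large state prediction error even when the reward itself was fully expected, which is why the two signals can be dissociated in the two-step task.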

State-dependent prediction error relevant to goal-directed behavior in the two-step task is also represented in the lPFC and intraparietal sulcus (Glascher et al. 2010). Transcranial magnetic stimulation of the right dlPFC impairs model-based but not model-free control; in contrast, stimulation of the left dlPFC disrupts model-based choices in a manner dependent on working memory capacity (Smittenaar et al. 2013b). Model-based control can also be impaired by concurrent performance of a demanding task (Otto et al. 2013) and under conditions of stress.

A study in healthy volunteers using the three-step task, an update of the two-step task, showed that values associated with goal-directed forward-planning trials were associated with caudate activity, whereas habitual trials over-trained over 3 days were associated with posterior putamen activity irrespective of the final choice (Wunderlich et al. 2012). The ventromedial PFC increased connectivity with both caudate and putamen during choice and encoded the chosen value, suggesting an active role as a value comparator.

In rodent studies, DA appears to strengthen habit formation as shown by the effects of a sensitizing regimen of amphetamine (Nelson and Killcross 2006). This effect depends on D1 receptor activation (Nelson and Killcross 2013). Furthermore, selective lesions of the nigrostriatal dopaminergic system impair habit formation (Faure et al. 2005). In humans, decreasing DA function with an acute dietary intervention encourages greater habitual control in the slips-of-action task (de Wit et al. 2012b). Consistent with these findings greater presynaptic DA synthesis, as measured using F-DOPA PET, correlates with greater model-based learning (Deserno et al. 2015).

A role for 5-HT in habit learning has also been suggested. Decreasing forebrain 5-HT and systemic 5-HT2C antagonism enhanced compulsive cocaine seeking in rodents, which was reversed by a 5-HT2C agonist (Pelloux et al. 2012). Overexpression of 5-HT6 receptors in the rodent dorsolateral striatum was associated with decreased habit learning with reduced lever pressing under extinction (Eskenazi and Neumaier 2011). Acute tryptophan depletion in healthy humans produces a shift toward habitual responding on the slips-of-action task (Worbe et al. 2015a). Similarly, this intervention impaired model-based goal-directed behaviors to reward outcomes but enhanced model-free habitual behaviors to loss outcomes in the two-step task. One possible mechanism whereby tonic 5-HT might enhance goal-directed behaviors is changing the long-run average reward representation by providing a positive or negative signal of the ‘goodness’ or ‘badness’ of the environment (Daw et al. 2002). Indeed, 5-HT signaling may signify the cost associated with deliberation (Niv et al. 2007; Keramati et al. 2011).

OCD subjects also show impaired goal-directed knowledge in the instructed outcome devaluation test, with greater responding to devalued outcomes indicating greater habitual responding in the slips-of-action test (Gillan et al. 2011). Similarly, OCD subjects were impaired on model-based goal-directed learning to reward on the two-step task. This has now been independently replicated at two sites with compulsivity correlating with model-based behaviors (Voon et al. 2015b, c). The phenomenology of compulsive symptoms in OCD may be better captured by aversive avoidance: OCD subjects showed greater habitual responding following overtraining to a virtual devaluation of a shock outcome in an aversive shock habit task (Gillan et al. 2013). These results contrast with a study using monetary loss outcomes in the two-step task where OCD subjects had greater model-based behaviors to loss outcomes (Voon et al. 2015c). These differing results may be related to sensitivity to motivational status with decreased sensitivity to monetary rewards and enhanced sensitivity to monetary losses in the two-step task, differential behavioral responding to losses as compared to shock outcomes, or differences in task design.

Alcohol-dependent subjects show impaired goal-directed learning based on the outcome devaluation test along with decreased activity in the ventromedial PFC and anterior putamen and increased activity in the posterior putamen (Sjoerds et al. 2013). Similarly, heavy drinkers showed greater activity in the dorsal striatum to drinking cues, whereas light drinkers showed greater prefrontal and ventral striatal activity (Vollstadt-Klein et al. 2010). One study testing alcohol-dependent subjects after 2 weeks showed impaired model-free behaviors (Sebold et al. 2014), whereas another study did not show any differences from healthy volunteers but did show that abstinence improved model-based learning (Voon et al. 2015b). Subjects with methamphetamine dependence and obese subjects with BED were also shown to have impaired model-based learning on the two-step task (Voon et al. 2015b).

3.3 Motor Stereotypy

Motor stereotyped behavior such as pacing, route tracing, and repetitive orofacial movements (e.g., sniffing, rearing, licking, and gnawing) can be induced in animals by high doses of psychostimulant drugs (Divac 1972; Fog 1972; Kelly et al. 1975), D1 and D2 receptor agonists (LaHoste and Marshall 1993; Kreipke and Walker 2004), and environmental variables such as feeding times in captive and commercial animals (Lawrence and Terlouw 1993). Intense orofacial stereotypies, however, appear to require the activation of both D1 and D2 receptors (Delfs and Kelley 1990). Motor stereotypies are recognized to reflect a disruption within basal ganglia circuitry and specifically dopaminergic modulation of the direct and indirect pathways of the dorsal striatum (or caudate putamen in humans) (Arnt 1985; Langen et al. 2011). Although the release of behavior is thought to depend on the balance in activity between the D1 and D2 receptor-modulated direct and indirect pathways (DeLong and Wichmann 2015), relative activity in the striosomal (patch) and extrastriosomal (matrix) compartments of the striatum may also be important. Thus, the intensity of psychostimulant-induced stereotypy in rats has been shown to correlate with greater relative activity in striosomal neurons than matrix neurons (Canales and Graybiel 2000). Striosomal neurons receive inputs preferentially from the frontal cortical regions (e.g., ACC, prelimbic cortex) and, for the most part, express dynorphin and substance P to form the direct striatonigral pathway (Gerfen 1992; Ragsdale and Graybiel 1990). Thus, the expression of motor stereotypies appears to involve an imbalance between cortico-striatal circuits at different levels of anatomical and functional organization, which may not be mutually exclusive (Langen et al. 2011).

Motor stereotypies in humans can include behaviors such as hand flapping in autism or punding behaviors observed at peak dose in cocaine users (Rylander 1972) and in the context of parkinsonian medications (Evans et al. 2004). Punding involves excessive non-goal-oriented repetitive behaviors. Punding has been reported on high doses of cocaine which included simple motor actions such as repeatedly playing with the intravenous pole or driving motorcycles around the block in circles. Similarly, punding has been associated with high doses of Levodopa and DA agonists such as apomorphine in patients with PD (Evans et al. 2004; Miyasaki 2007) or restless legs syndrome (Voon et al. 2011) and can include simple acts such as shuffling papers, collecting buttons, rearranging handbags, to more complex behavioral sequences such as repeatedly taking apart lawn mowers or hobbies such as gardening or painting. Although systematic studies are lacking, descriptive studies suggest a link between these behaviors and previous occupations or gender stereotypes (e.g., an accountant subsequently shuffling papers or a seamstress collecting buttons) suggesting a possible role for disinhibition of previously learned motor repertoires (Evans et al. 2004; Voon 2004).

4 Synthesis and Future Perspectives

The terms impulsivity and compulsivity are often used interchangeably; however, although the two constructs are neurally and psychologically distinct, they can be present in the same disorder (Fineberg et al. 2010). Distinct and overlapping fronto-striatal networks implicated in various impulsivity and compulsivity subtypes are depicted in Fig. 3. We emphasize the convergent and divergent roles of both neural anatomy and neurochemistry and their relevance to impulsivity and compulsivity.

Fig. 3

Fronto-striatal substrates of impulsivity and compulsivity. The prefrontal and striatal images represent actual resting state connectivity patterns in healthy volunteers based on functionally defined prefrontal seeds (Morris et al. submitted). The columns to the right represent the impulsivity (right) and compulsivity (left) subtypes that map onto these fronto-striatal networks. Abbreviations: DLPFC dorsolateral prefrontal cortex; VLPFC ventrolateral prefrontal cortex; IPFC inferior prefrontal cortex; LOFC lateral orbitofrontal cortex; D cing dorsal cingulate; SG cing subgenual cingulate; VMPFC ventromedial prefrontal cortex; MOFC medial orbitofrontal cortex; SMA supplementary motor area; PMC premotor cortex; pre-SMA presupplementary motor area; Ant PFC anterior prefrontal cortex

4.1 Neural Considerations: Fronto-striatal Nodes and Opponency Processes

Multiple levels of striatal anatomy and connectivity allow for convergent and divergent functional expression. The NAcb/VS and STN are critical convergent nodes in behavioral control (Fig. 2) and targets for therapeutic intervention, e.g., DBS for PD (Deuschl et al. 2006), OCD (Mallet et al. 2008), and addiction (Muller et al. 2013). Opponency processes also exist in the striatum on multiple levels providing fine-grained control and heterogeneity of function. Broadly, ventral striatal regions are more closely associated with impulsivity and dorsal striatal regions with compulsivity, with diverging prefrontal inputs. The NAcb/VS is implicated in waiting impulsivity (Dalley et al. 2007a; Caprioli et al. 2013) and delay discounting (Cardinal et al. 2004, 2001; Winstanley et al. 2004; Ballard and Knutson 2009; Tanaka et al. 2004; Kable and Glimcher 2007; McClure et al. 2004), subtypes of impulsivity which have been shown to correlate in rodents but not in humans (Voon et al. 2014). In rodents, delay discounting studies are conducted with real-time feedback with short delays, whereas human studies typically use hypothetical monetary feedback with long delays. Whether waiting impulsivity correlates with delay discounting in tasks involving shorter delays with real-time feedback remains to be established. Dorsal striatal regions are implicated in proactive stopping with the putamen more specifically implicated in proactive stopping of specific motor responses (Zandbelt and Vink 2010; Smittenaar et al. 2013a; Majid et al. 2013). The dorsomedial/caudate and dorsolateral/putamen are associated with the opponent processes of goal-directed and habitual behaviors, respectively (Daw et al. 2011; Yin et al. 2005; Balleine and Dickinson 1998; Yin et al. 2004; Killcross and Coutureau 2003; Tricomi et al. 2009; de Wit et al. 2012a; Voon et al. 2015b; Wunderlich et al. 2012).
The NAcb shell and dorsolateral striatal regions are integrated and linked via spiraling loops from the ventral to dorsal midbrain (Haber et al. 2000). The STN in the indirect pathway is a critical node relevant particularly to impulsivity and receives projections from fronto-striatal circuitry implicated in waiting impulsivity (Baunez and Robbins 1997; Eagle and Baunez 2010) with hyperdirect cortical connections to the STN implicated in fast reactive stopping (Aron 2011; Ballanger et al. 2009) and conflict-induced slowing (Frank et al. 2007).

Opponency processes exist on several levels in the striatum. The NAcb in rodents can be divided into a core and a shell with differing afferent and efferent connections, and DBS and DA- and NE-based interventions in these subregions have opposing effects on waiting impulsivity (Sesia et al. 2008, 2010; Economidou et al. 2012; Moreno et al. 2013). Waiting impulsivity in humans similarly implicates the ventral striatum (Morris et al. 2015), although it is unclear whether a subregional functional distinction similar to that in rodents exists. Lesions of the NAcb core also impair delay discounting in rodents (Cardinal et al. 2001), with striatal DA coding for delayed rewards (Kobayashi and Schultz 2008) and optogenetic manipulation of the dopaminergic system influencing delay-related choices (Saddoris et al. 2015). The fronto-striatal direct and indirect pathways, associated with D1 and D2 receptors, respectively, provide facilitating and inhibitory functions (DeLong and Wichmann 2015). Activation and antagonism of D1 and D2 receptors also have differential effects on stereotypies and locomotor behaviors (DeLong and Wichmann 2015). On a microstructural level, stereotypies are associated with greater engagement of striosomal (patch) compartments, which receive prefrontal inputs and send outputs to the direct pathway, relative to extrastriosomal (matrix) compartments (Langen et al. 2011; Canales and Graybiel 2000). Thus, subtypes of impulsivity and compulsivity map onto a striatal neural organization that, on both macro- and microstructural levels, has both convergent and divergent structural and functional features.

4.2 Neurochemical Considerations: Dopamine and Serotonin

DA influences multiple forms of impulsivity and compulsivity, which may in part be related to differential receptor expression in the striatum, with impulsivity and compulsivity implicating ventral and dorsal striatal regions, respectively. Thus, low D2/3 receptor availability in the ventral striatum, without changes in DA release, predicts high levels of premature responding in rats (Dalley et al. 2007a), and knocking down D2 receptors in the rodent putamen generates binge-like eating suggestive of compulsive behaviors (Johnson and Kenny 2010). Lower ventral striatal D2/3 receptor availability has also been shown to correlate more specifically with greater delay discounting in pathological gambling (Joutsa et al. 2015), methamphetamine dependence (Ballard et al. 2015), and alcohol use disorders (Oberlin et al. 2015). Further, low midbrain D2/3 receptor availability has been shown to correlate with both questionnaire-based impulsivity and striatal DA release in healthy volunteers (Buckholtz et al. 2010).

Impulsivity and compulsivity can be further dissociated by serotonergic function and opposing actions of the 5HT2A and 5HT2C receptors, which may be related to opposing effects on DA function (Cunningham and Anastasio 2014; Howell and Cunningham 2015) and differential actions on prefrontal and striatal regions. In rodents, prefrontal 5-HT depletion enhances waiting impulsivity (Winstanley et al. 2004a; Robinson et al. 2008b) and central 5-HT depletion enhances delay discounting (Mobini et al. 2000), with optogenetic activation of dorsal raphe neurons enhancing the capacity to wait for delayed rewards (Fonseca et al. 2015). In humans, central 5-HT depletion similarly enhances waiting impulsivity (Worbe et al. 2014). More specifically, systemic, intra-NAcb, and prefrontal 5HT2A antagonism decrease premature responding, whereas similar 5HT2C antagonism increases premature responding (Winstanley et al. 2004a; Robinson et al. 2008b). Although one study showed effects in both prefrontal and accumbal regions, another study demonstrated specificity to accumbal and not prefrontal regions. Systemic 5HT2C and 5HT2A antagonists enhance and impair reversal learning, respectively (Boulougouris et al. 2008), with effects of 5HT2C antagonists specific to the OFC (Boulougouris and Robbins 2010). Thus, 5HT2A antagonism within the NAcb decreases premature responding and 5HT2C antagonism within the OFC improves reversal learning. In contrast, decreasing forebrain 5-HT and systemic 5HT2C antagonism enhance compulsive cocaine seeking or habitual behaviors in rodents, which can be reversed by a 5HT2C agonist (Pelloux et al. 2012). These findings converge with human studies in which central 5-HT depletion encourages habitual rather than goal-directed learning (Worbe et al. 2015b).

5 Conclusion

In summary, we highlight the translational and back-translational relevance of subtypes of impulsivity and compulsivity. These constructs map onto distinct fronto-striatal neural and neurochemical systems that interact both at nodal convergent points and as opponent processes, highlighting both the heterogeneity and the commonalities of function. We emphasize the relevance of these constructs for understanding dimensional psychiatry.