Introduction

Gambling disorder (GD) is a growing public health concern. Access to gambling opportunities is expanding steadily, yet the aetiology of GD remains unknown and treatment options lack efficacy (Grant et al. 2012; Limbrick-Oldfield et al. 2013). GD was recently reclassified as an addiction disorder in DSM-5, in part due to the similarity in phenomenology between GD and drug addiction (Potenza 2006). In contrast to chemical dependency, a prototypical neurobiological phenotype, such as low D2/3 receptor expression, has not been identified for GD (Limbrick-Oldfield et al. 2013; Volkow et al. 2004; Volkow et al. 2007). Given the centrality of dopamine (DA) to substance use disorder and its undisputed role in motivational processes, this neurotransmitter system has understandably been the focus of much research into the neurobiological basis of GD. Nevertheless, a definitive understanding of the role played by dopaminergic dysfunction in the formation or maintenance of GD has remained elusive (Potenza 2013), potentially due to the highly heterogeneous nature of both GD itself and the structural characteristics of different gambling scenarios (see Cocker and Winstanley 2015 for discussion).

However, one form of GD that can be causally linked to aberrant dopaminergic function is the iatrogenic gambling seen following DA replacement therapy. This particularly devastating form of pathological gambling has been predominantly described in Parkinson’s disease (PD) but has also been observed in patients with restless leg syndrome (RLS), fibromyalgia and prolactinoma (Clark and Dagher 2014; Voon et al. 2007a; Weintraub and Potenza 2006). Although there are instances of GD and impulse control disorders (ICDs) emerging following l-DOPA treatment (Dodd et al. 2005), the majority of such cases develop in response to treatment with agonists that predominantly activate the D2/3 receptor subtypes, such as ropinirole and pramipexole (Voon et al. 2007b; Weintraub et al. 2015). Animal models have previously shown that both acute and chronic treatments with DA agonists with high affinity for the D2-like family promote a compulsive-like behavioural phenotype (Eagle et al. 2014; Szechtman et al. 1998), raising the possibility that dysfunction within the DA system could precipitate a compulsive form of gambling.

In support of this hypothesis, recent animal work has confirmed that repeated injections of pramipexole increase choice of uncertain outcomes in a probability-discounting task (Rokosik and Napier 2012). However, patients who develop GD as a result of DA agonist therapies appear to prefer simple, repetitive games, such as slot machines, in which there is little (if any) explicit computation of utility (Rossi et al. 2010). Within the panoply of gambling games, slots and other forms of electronic gaming machines (EGMs) have been suggested as a particularly virulent form of gambling, precisely because these machines are engineered to promote compulsive, or unthinking, play (see Murch and Clark 2015 for discussion). Thus, given the qualitatively similar style of gambling engagement, there may be particularly relevant parallels between the development of iatrogenic GD following DA agonist therapy and compulsive engagement with EGMs. It would therefore be useful to determine the impact of chronic D2/3 agonist treatment in a rodent model of this type of gambling.

We have developed and validated a rodent analogue of a simple slot machine, wherein rats share key behavioural features with human gamblers. Specifically, reward-salient cues are able to evoke the expectation of imminent reward. We have previously argued that this phenomenon is translationally analogous to the so-called near-miss effect. Acute administration of the D2-like agonist quinpirole potentiated erroneous expectations of reward (Cocker et al. 2014; Winstanley et al. 2011). Thus, our previous investigations indicate that acutely augmented dopaminergic signalling through D2-like receptors may increase animals’ responsivity to reward-related cues in a manner potentially comparable with human gamblers (Habib and Dixon 2010). Given the critical role D2-like receptors play in modulating performance on the rodent slot machine task (rSMT) and the observation that repeated administration of D2-like agonists can promote compulsivity (Eagle et al. 2014; Szechtman et al. 1998), chronic administration of D2-like agonists on the rSMT may represent a putative model for problematic engagement with gambling.

Here, we first aimed to determine whether chronic administration of the D2-like agonist ropinirole, a drug that is commonly associated with the formation of ICDs in PD, alters animals’ performance on the rSMT. Secondly, we assessed whether ropinirole promotes continued task engagement during extinction and reinstatement sessions. A failure to alter responding as contingencies change is a canonical measure of habitual responding (Balleine and O’Doherty 2010). The transition from goal-directed to habitual control of behaviour has been suggested to underlie the development of pro-addictive behaviours (Everitt et al. 2008); thus, potentiated task involvement could be considered evidence of the efficacy of chronic ropinirole as a model of problem gambling. Finally, we explored any ropinirole-induced changes in receptor expression and intracellular signalling pathways ex vivo in order to gain insight into the possible mechanism underlying behavioural change and hence potentially inform our understanding of how DA may contribute towards the aberrant decision making observed in problematic gambling. Such data may also be highly impactful for clinical use of D2/3 agonists in PD treatment; although a previous history of ICDs or earlier age of PD onset have been indicated as risk factors for the development of iatrogenic ICDs (Voon et al. 2007b), little is known about the aetiology of these psychiatric side effects or how to prevent them. As such, many neurologists have dropped this class of drugs from their already limited toolbox of effective treatments for PD, resulting in a heavy reliance on l-DOPA with all the associated motor side effects (Calabresi et al. 2010).

Methods and materials

Subjects

Subjects were 32 male Long Evans rats (Charles River Laboratories, St. Constant, Canada) weighing 275–300 g at the start of testing. Subjects were food restricted to 85 % of their free feeding weight and maintained on 14-g rat chow given daily. Water was available ad libitum. All animals were pair housed in a climate-controlled colony room maintained at 21 °C on a reverse 12-h light-dark schedule (lights off 8 a.m.). The testing and housing were in accordance with the Canadian Council of Animal Care, and all experimental protocols were approved by the Animal Care Committee of the University of British Columbia.

Overview of experimental timeline

Animals were first trained to perform the rSMT until behavioural stability was reached. Osmotic mini-pumps were then implanted containing either ropinirole or saline. One day later, rSMT testing was recommenced 6 days a week. Thus, a total of 24 sessions were completed by all animals during the 28 days of ropinirole/saline administration. The behavioural protocol was identical to pre-operative training for the first 14 sessions. For sessions 15–20 (6 sessions), rats were run in extinction, in that win trials did not result in reward delivery. In the final 4 sessions (sessions 21–24), reinforcement was reinstated for win trials. Immediately following the last day of ropinirole/saline administration, half of the animals from each drug group were sacrificed. Osmotic mini-pumps were surgically removed from the remaining animals, and behavioural testing continued on the rSMT for a further 4 weeks (20 sessions) in order to elucidate any long-term effects of ropinirole administration. Animals were sacrificed by live decapitation immediately following the last behavioural test session.
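The session schedule during pump delivery can be summarised in a short sketch. The function name `phase_for_session` and the phase labels are illustrative assumptions, not part of the original protocol software.

```python
def phase_for_session(session: int) -> str:
    """Map a post-implantation session number (1-24) to its protocol phase."""
    if not 1 <= session <= 24:
        raise ValueError("sessions during pump delivery are numbered 1-24")
    if session <= 14:
        return "standard rSMT"   # identical to pre-operative training
    if session <= 20:
        return "extinction"      # win trials no longer rewarded
    return "reinstatement"       # reward restored on win trials

# Four weeks of delivery at six test days per week gives 24 sessions.
counts = {}
for s in range(1, 25):
    phase = phase_for_session(s)
    counts[phase] = counts.get(phase, 0) + 1
print(counts)  # {'standard rSMT': 14, 'extinction': 6, 'reinstatement': 4}
```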

Rodent slot machine task

Testing took place in 16 standard five-hole operant chambers (Med Associates, St. Albans, VT), which have previously been described in detail (Cocker et al. 2012a). A full description of the behavioural apparatus, habituation and training can be found in our previous publications (Cocker et al. 2016a; Cocker et al. 2014; Cocker et al. 2016b; Winstanley et al. 2011), but in brief, animals were habituated to the operant chambers over two successive days during which time the boxes were turned on and the nose poke apertures baited with sugar pellets. Animals were then trained on simplified versions of the rSMT of increasing complexity before finally transitioning to the rSMT proper.

A diagram illustrating the different stages of each rSMT trial is shown in Fig. 1. Rats responded to a series of three flashing lights, analogous to the three wheels of a slot machine, which caused the lights to set to on or off. A win (i.e. reward available) was signalled by all three lights setting to on, whereas any other light pattern indicated a loss. At the end of the trial, rats chose between responding on the left (collect) lever, which resulted in 10 sugar pellets on win trials but a 10-s time penalty on loss trials, or starting a new trial instead by responding on the right (roll) lever. Hence, the optimal strategy was to choose the collect lever on win trials and the roll lever on all other trial types. The use of three active holes resulted in eight possible trial types ([1,1,1]; [1,1,0]; [1,0,1]; [0,1,1]; [1,0,0]; [0,1,0]; [0,0,1]; [0,0,0]). The incidence of the different trial types was distributed evenly throughout the session such that each trial type occurred at least once every 8 trials and not more than twice in every 16 trials. The exact sequence of trials was randomised within these constraints. Animals received five to six daily testing sessions per week until statistically stable patterns of choice had been established over five sessions across the different trial types, indicative of plateaued asymptotic performance (101 sessions total). All sessions lasted for 30 min, and animals could complete an unlimited number of trials within this time.
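One simple way to produce a trial order of this kind is to shuffle the eight trial types within successive blocks of eight. The sketch below is an assumption about how such a sequence could be generated, not a description of the task software. It satisfies the stated constraints when they are read over aligned blocks of 8 and 16 trials, and the names `make_sequence` and `optimal_lever` are hypothetical.

```python
import itertools
import random

# The eight light patterns for holes 2-4 (1 = light set to on, 0 = off).
TRIAL_TYPES = list(itertools.product([1, 0], repeat=3))

def make_sequence(n_blocks, seed=None):
    """Return n_blocks * 8 trials; each block of 8 contains every type once."""
    rng = random.Random(seed)
    sequence = []
    for _ in range(n_blocks):
        block = TRIAL_TYPES[:]   # copy so the master list is left untouched
        rng.shuffle(block)       # randomise order within the block
        sequence.extend(block)
    return sequence

def optimal_lever(pattern):
    """Collect is optimal only on a win ([1,1,1]); otherwise roll."""
    return "collect" if tuple(pattern) == (1, 1, 1) else "roll"

trials = make_sequence(n_blocks=4, seed=1)
for pattern in trials[:3]:
    print(pattern, optimal_lever(pattern))
```

Shuffling within blocks keeps every trial type equally frequent over a session, which matters because animals can complete a variable number of trials in the 30-min window.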

Fig. 1

Trial structure of the rSMT. All testing took place in five-hole operant chambers, of which the three middle holes (holes 2–4) were used for this task. a Animals initiated each trial by responding on the ‘roll’ lever. This lever retracted and the light inside hole 2 began to flash. Once the rat responded at this aperture, the light inside set to on or off for the remainder of the trial and either a 20 kHz (light on) or 12 kHz (light off) tone sounded for 1 s, after which the light in hole 3 began to flash. b Again, a nose poke response resulted in the light setting to on or off and the sounding of the 20/12 kHz tone, after which the light in hole 4 started to flash. c Once the rat responded in hole 4 and the light inside set to on or off, again accompanied by the relevant tone, both the ‘collect’ and roll levers were presented. The rat was then required to respond on one of the levers; the optimum choice was determined by the pattern of lights in holes 2–4. d On win trials, all three lights were set to on [1,1,1], and a response on the collect lever led to delivery of 10 sugar pellets. e If any of the lights had set to off (i.e. a ‘loss’ trial), a response on the collect lever led to a 10 s time-out period, during which reward could not be earned. If the rat chose the roll lever on any trial type, then the collect lever retracted, the potential reward or time-out was cancelled and a new trial began. Hence, on win trials, the optimal strategy was to respond on the collect lever to obtain the scheduled reward, whereas on loss trials, the optimal strategy was to instead respond on the roll lever and start a new trial (optimal strategy represented pictorially with ‘smiley’ emoticon). If the rat chose to collect, both the collect and roll levers retracted until the end of the reward delivery/time-out period, after which the roll lever was presented and the rat could initiate the next trial. f There were 8 possible trial types. Figure is modified from Winstanley et al. (2011)

Osmotic pump implantation

Animals were divided into two groups and matched for their task performance. Subsequently, animals were anaesthetised using 2 % isoflurane in O2 and subcutaneously implanted with osmotic mini-pumps (Alzet, DURECT Corporation, Cupertino, CA) delivering either 5 mg/kg/day ropinirole hydrochloride (Tocris, Ellisville, MO) or 0.9 % saline solution for 28 days. Doses were calculated for each rat using the Alzet guide and based on previous reports showing that this dose of ropinirole was similar to that used in a prolonged release formulation of ropinirole in human patients (Nashatizadeh et al. 2009) and showed neuroprotective and anxiolytic effects in rats (Iida et al. 1999; Matsukawa et al. 2007; Millan et al. 2004; Rogers et al. 2000). Animals were administered anafen and buprenorphine for systemic and local analgesia respectively. Animals were allowed to recover in their home cage for 1 day before testing resumed.
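The per-rat dose calculation follows from dividing the required daily drug mass by the daily volume the pump delivers. The sketch below illustrates that arithmetic only. The pump flow rate of 2.5 µL/h and the 0.3 kg body weight are assumed example values, since the pump model and individual weights at implantation are not given here, and the function name is hypothetical.

```python
def pump_concentration_mg_per_ml(dose_mg_per_kg_day, body_weight_kg,
                                 flow_rate_ul_per_h):
    """Drug concentration needed in the pump reservoir for a target dose."""
    daily_mass_mg = dose_mg_per_kg_day * body_weight_kg   # mg needed per day
    daily_volume_ml = flow_rate_ul_per_h * 24 / 1000      # mL delivered per day
    return daily_mass_mg / daily_volume_ml

# Assumed example: 5 mg/kg/day for a 0.3 kg rat with a 2.5 µL/h pump.
concentration = pump_concentration_mg_per_ml(5.0, 0.300, 2.5)
print(round(concentration, 2))
```

With these assumed values the reservoir concentration comes to approximately 25 mg/mL; a heavier animal or a slower pump would require a proportionally more concentrated solution.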

Extinction and reinstatement

The purpose of the extinction/reinstatement phase was to determine if chronic ropinirole administration would prevent task performance from declining when reward was no longer available, as might be expected if behaviour had become habitual rather than truly goal directed (Balleine and Dickinson 1998). Both ropinirole- and saline-treated animals performed the rSMT in extinction during which a collect response following a win trial no longer led to the delivery of sugar pellets. Animals were run on the extinction version of the program until the decrease in the number of trials completed by both groups appeared to plateau. Animals were then run on the regular version of the rSMT to reinstate performance.

Ex vivo analysis

Tissue samples from the medial prefrontal cortex (mPFC), orbitofrontal cortex (OFC), dorsal striatum and nucleus accumbens (NAC) were harvested and flash frozen. Tissue from the dorsal striatum and NAC was analysed via Western blotting to determine protein levels for DA D1 and D2 receptors, dopamine- and cyclic adenosine monophosphate (cAMP)-regulated phosphoprotein with molecular weight 32 kDa (DARPP), DARPP phosphorylated at threonine 34 (pDARPP34) and threonine 75 (pDARPP75), cAMP response element binding protein (CREB), pCREB, glycogen synthase kinase-3beta (GSK3β) and β-tubulin. Quantitative polymerase chain reaction (qPCR) was used to determine RNA levels of D1, D2 and serotonin (5-HT) 5-HT2A receptors in the mPFC and OFC.

Western blotting

Frozen tissue samples were defrosted on ice in 50 μL fresh lysis buffer: RIPA buffer (50 mM Tris, 150 mM NaCl, 10 % SDS, 1 % IGEPAL, 0.5 % Sarkosyl, pH 8.0, 4 °C) enriched with protease (complete protease inhibitor cocktail, Roche Diagnostics, Laval, QC, Canada) and phosphatase inhibitors (Halt phosphatase inhibitor cocktail, Thermo Scientific, Rockford, IL, USA). Samples were homogenised by sonication and centrifuged for 15 min (15,700×g, 4 °C). Protein levels in the supernatant were determined with a NanoDrop 2000 spectrophotometer (Thermo Scientific), and 75 μg protein per sample was loaded onto 10 % Tris polyacrylamide gel for electrophoresis separation. Samples were subsequently transferred to a polyvinylidene difluoride membrane, washed briefly in phosphate-buffered saline (PBS), and blocked for 1 h at room temperature (Odyssey blocking buffer, LI-COR Biosciences, Lincoln, NE). Blocked membranes were incubated overnight at 4 °C with primary antibodies 1/1000 CREB, 1/500 pDARPP32 (Thr34), 1/500 pDARPP32 (Thr75), 1/1000 GSK3β, and 1/500 pGSK3β (Cell Signalling Technology, Danvers, MA); 1/1000 pCREB (Ser133) and 1/200 D2 receptor (Millipore, Billerica, MA); 1/500 DARPP32 (BD Biosciences, Mississauga, ON, Canada); and 1/200 D1 receptor (Santa Cruz Biotechnology, Dallas, TX) in blocking buffer containing 0.1 % Tween-20. Membranes were thoroughly washed in PBST (4× each for 10 min) and incubated for 2 h at room temperature with the appropriate IRDye® secondary antibodies (1/10,000 goat anti-mouse, 1/10,000 goat anti-rabbit, LI-COR Biosciences) in blocking buffer containing 0.1 % Tween-20; goat anti-rabbit incubation solution also comprised 0.005 % SDS. Membranes were thoroughly washed with PBST (4× each for 10 min) and then with PBS (4× each for 10 min). Visualisation and quantification of protein levels were undertaken with the Odyssey Imaging System (LI-COR Biosciences), and protein content was normalised to levels of β-tubulin (0.2 μg/mL; anti-β-tubulin antibody, Millipore).
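The normalisation step can be illustrated with a minimal sketch: each target band is divided by the β-tubulin band from the same lane, and the result is then expressed as a fold change relative to the mean of the saline group. The band intensities below are invented for illustration, and the function names are hypothetical rather than taken from the imaging software.

```python
from statistics import mean

def normalise(target_bands, tubulin_bands):
    """Divide each target band intensity by its lane's β-tubulin intensity."""
    return [t / b for t, b in zip(target_bands, tubulin_bands)]

def fold_change(values, control_values):
    """Express normalised values relative to the control-group mean."""
    control_mean = mean(control_values)
    return [v / control_mean for v in values]

# Invented intensities (arbitrary units), two lanes per group.
saline = normalise([100.0, 120.0], [50.0, 60.0])      # [2.0, 2.0]
ropinirole = normalise([60.0, 90.0], [60.0, 60.0])    # [1.0, 1.5]
print(fold_change(ropinirole, saline))                # [0.5, 0.75]
```

Normalising to a loading control before computing the fold change means that lane-to-lane differences in total protein loaded do not masquerade as treatment effects.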

Quantitative polymerase chain reaction

Frozen tissue samples were homogenised in TRIzol (Invitrogen, CA). Genomic DNA was removed from the sample using the DNA-free kit (Ambion, TX) and cDNA synthesis was performed using the SuperScript VILO kit (Invitrogen, CA). Transcripts for genes of interest were quantified using real-time qPCR (SYBR GreenER; Invitrogen, CA) on a StepOnePlus 96-well thermocycler (Applied Biosystems, CA). All primers were custom synthesised by the UBC campus NAPS unit (Michael Smith Laboratories, UBC, Canada; for sequences, see Table 1) and validated for linearity and specificity. All PCR data were normalised to levels of glyceraldehyde-3-phosphate dehydrogenase (GAPDH), which did not vary across group, based on the following formula: ∆Ct = Ct (gene of interest) − Ct (GAPDH). Adjusted expression levels for ropinirole-treated animals were calculated relative to the saline-treated animals as follows: ∆∆Ct = ∆Ct (ropinirole group) − ∆Ct (saline group). Consistent with previous reports, expression levels relative to controls were then calculated using the following expression: 2^(−∆∆Ct).
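The three formulas above chain together as follows. This is a minimal sketch with invented Ct values, and the function name `relative_expression` is hypothetical.

```python
def relative_expression(ct_gene_treated, ct_gapdh_treated,
                        ct_gene_control, ct_gapdh_control):
    """Relative expression by the 2^(-ddCt) method, normalised to GAPDH."""
    d_ct_treated = ct_gene_treated - ct_gapdh_treated   # dCt, ropinirole group
    d_ct_control = ct_gene_control - ct_gapdh_control   # dCt, saline group
    dd_ct = d_ct_treated - d_ct_control                 # ddCt
    return 2 ** (-dd_ct)

# Invented example: the target amplifies one cycle later in treated tissue.
print(relative_expression(26.0, 18.0, 25.0, 18.0))  # 0.5
```

A ∆∆Ct of +1 therefore corresponds to half the control expression level, and a ∆∆Ct of 0 to no change, which presumes near-perfect doubling per cycle for both primer pairs.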

Table 1 The sequence of the primers used to detect mRNA for target genes using qPCR

Data analyses

All analyses were conducted using SPSS (v23.0). One of the key dependent variables analysed was the percentage of trials on which animals pressed the collect lever on different trial types. Where applicable, data were subjected to an arcsine transformation to limit the impact of an artificial ceiling (i.e. 100 %), and all data were analysed by within-subject analysis of variance (ANOVA). During training, collect lever choice was analysed in five session bins with session (five levels) and trial type (eight levels) as within-subject factors. A stable baseline was defined as the lack of a significant effect of session or trial type by session interaction. To determine the impact of the number of lights illuminated regardless of spatial position, data were pooled across two light trials ([1,1,0], [1,0,1], and [0,1,1]) and one light trial ([1,0,0], [0,1,0], and [0,0,1]). ANOVAs were then performed with session and lights illuminated (4 levels, 0–3) as within-subject factors. Likewise, ANOVAs were performed to determine whether choice was largely governed by illumination of a particular aperture by grouping trial types according to that hole’s light status (e.g. [1,1,0], [1,0,0], [0,1,0] vs [0,1,1], [0,0,1], [1,0,1]). The average latency to respond on the collect lever was analysed in a similar fashion with session and trial type as within-subject factors. The latency to respond in each aperture and the latency to respond to the subsequent hole based upon the illumination status of the preceding hole were analysed similarly with session and hole (3 levels, 1–3) or hole light status (4 levels; H2byH3on/off and H4byH3on/off) as within-subject factors respectively. Lastly, the number of trials completed was analysed via repeated measures ANOVA with session as a within-subject factor.
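The data preparation steps described above can be sketched briefly. The exact form of the arcsine transformation is not specified, so the common arcsine-square-root variant is assumed here; the percentages are invented, and the function names are hypothetical.

```python
import math

def arcsine_transform(percent):
    """Arcsine square-root transform of a percentage (assumed variant)."""
    return math.asin(math.sqrt(percent / 100.0))

def pool_by_lights(collect_pct):
    """Average % collect over trial types sharing the same number of lights."""
    groups = {}
    for pattern, pct in collect_pct.items():
        groups.setdefault(sum(pattern), []).append(pct)
    return {n: sum(v) / len(v) for n, v in sorted(groups.items())}

# Invented % collect responses for one animal, keyed by light pattern.
collect_pct = {
    (1, 1, 1): 99.0,
    (1, 1, 0): 70.0, (1, 0, 1): 75.0, (0, 1, 1): 80.0,
    (1, 0, 0): 35.0, (0, 1, 0): 40.0, (0, 0, 1): 45.0,
    (0, 0, 0): 15.0,
}
pooled = pool_by_lights(collect_pct)
print(pooled)  # {0: 15.0, 1: 40.0, 2: 75.0, 3: 99.0}
print({n: round(arcsine_transform(p), 3) for n, p in pooled.items()})
```

The transform stretches values near 0 % and 100 %, which is why it helps when win-trial performance sits close to the ceiling.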

During ropinirole administration, data were parsed into bin 1 (sessions 1–5), bin 2 (sessions 6–10) and bin 3 (sessions 11–14) in order to meaningfully analyse main effects of session over 14 days. Extinction sessions were analysed separately (sessions 15–20) as were reinstatement sessions (sessions 21–24). All analyses remained the same as above with the exception that drug treatment (two levels; saline vs ropinirole) was used as a between-subject factor. Similarly, data from the animals used to assess the effects of a washout period were broken down into four 5-session bins.

Western blot and PCR analyses were conducted with repeated measures ANOVA with drug (2 levels) and time point (2 levels: sacrificed immediately following cessation of ropinirole (early) or following a 4-week washout (late)) as factors. The significance level for all effects was p ≤ 0.05. Analyses for which p ≤ 0.1 were described as trends. All data are presented as mean + standard error of the mean (SEM).

Results

Data from eight of the animals were excluded from the analysis due to failure to meet established performance criteria of at least 50 trials completed per session and at least 50 % accuracy on clear loss trials ([0,0,0]) (Cocker et al. 2016a; Cocker et al. 2014; Cocker et al. 2016b; Winstanley et al. 2011). The total number of animals included in the analysis was therefore 24 (12 ropinirole and 12 saline).
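The inclusion rule can be expressed as a simple filter. This is a sketch only: how the two measures were averaged across sessions is not stated, so per-animal means are assumed, and the animal records below are invented.

```python
def meets_criteria(mean_trials_per_session, loss_accuracy_pct):
    """Apply both performance criteria: >= 50 trials and >= 50 % accuracy
    (roll lever chosen) on clear loss [0,0,0] trials."""
    return mean_trials_per_session >= 50 and loss_accuracy_pct >= 50.0

# Invented records: (animal id, mean trials per session, % correct on [0,0,0]).
animals = [
    ("rat01", 83.0, 85.0),   # passes both criteria
    ("rat02", 42.0, 90.0),   # too few trials
    ("rat03", 75.0, 40.0),   # poor accuracy on clear losses
]
included = [name for name, trials, acc in animals if meets_criteria(trials, acc)]
print(included)  # ['rat01']
```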

Baseline behaviour

Prior to osmotic mini-pump implantation, animals were matched for baseline performance into two groups. There were no statistical differences between animals in the ropinirole or saline group on any behavioural variable, and consequently the data for both groups are pooled here (group, all Fs < 0.63, NS). Consistent with all previous reports using this task (Cocker et al. 2016a; Cocker et al. 2014; Cocker et al. 2016b; Winstanley et al. 2011), responses on the collect lever varied significantly across different trial types (Fig. 2a; trial type, F 7,154 = 117.48, p < 0.0001), with erroneous responses becoming increasingly likely as the number of illuminated apertures within the array increased (Fig. 2b; lights illuminated, F 3,66 = 355.99, p < 0.0001; 3 vs 2, F 1,22 = 148.93, p < 0.0001; 2 vs 1, F 1,22 = 153.3, p < 0.0001; 1 vs 0, F 1,22 = 54.92, p < 0.0001). The latency to respond on the collect lever also varied between trial types and was contingent on the number of lights illuminated within the array (Tables 2 and 3; trial type, F 7,154 = 10.98, p < 0.0001; lights illuminated, F 3,66 = 12.94, p < 0.0001). Interestingly, there was no difference between winning and near-miss (two light) trials, but similar to previous reports, responses became progressively quicker when 1 and 0 lights were illuminated (3 vs 2, F 1,22 = 0.51, NS; 2 vs 1, F 1,22 = 22.54, p < 0.0001; 1 vs 0, F 1,22 = 17.32, p < 0.0001), indicating that erroneous responses on these trial types likely result from disinhibited or impulsive responding (Cocker et al. 2016a). Animals’ latency to nose poke into subsequent apertures was significantly reduced if the light in the preceding hole had set to on, indicating that signals concordant with a winning outcome serve to invigorate future responding, potentially indicative of the attribution of incentive salience to the illuminated lights (Tables 4 and 5; light status, F 1,22 = 19.13, p < 0.0001; see Cocker et al. 2016a; Cocker et al. 2014 for discussion). Animals completed an average of 83.12 ± 3.71 trials per session (saline 84.32 ± 3.36; ropinirole 81.92 ± 4.05).

Fig. 2

Baseline rSMT performance. a, b Animals (n = 24) showed optimal responding on win trials (1,1,1), choosing to collect the available reward nearly 100 % of the time. Similarly, animals showed a marked preference for the optimal response, now the roll lever, when no lights were illuminated (0,0,0), only responding on the collect lever approximately 15 % of the time. Erroneous collect responses increased to 39.98 % when one light was illuminated. However, when two lights were illuminated in the array, animals responded erroneously on the collect lever at a far greater than chance level (72.38 % ± 4.76 (SEM)), indicating that rats, like humans, treat such stimuli as more indicative of a win than a loss and are hence susceptible to the near-miss effect. There were no differences between the saline and ropinirole groups. All data shown are the mean across five sessions + SEM

Table 2 Latency to respond on the collect lever by trial type at baseline and for different time points during saline administration
Table 3 Latency to respond on the collect lever by trial type at baseline and for different time points during ropinirole administration
Table 4 Latency to respond at subsequent hole based on the status of the previous hole for baseline and different time points during saline administration
Table 5 Latency to respond at subsequent hole based on the status of the previous hole for baseline and different time points during ropinirole administration

Ropinirole

Chronic administration of the D2-like agonist initially produced robust impairments in animals’ ability to differentiate winning from losing trials in comparison to saline-treated controls (Fig. 3a, b; bin 1; trial type × drug F 7,154 = 3.21, p = 0.003; Fig. 3c; bin 2; trial type × drug F 7,154 = 3.07, p = 0.005). These deficits in performance are comparable with those observed following acute administration of a D2-like agonist (Cocker et al. 2014; Winstanley et al. 2011). However, over time, animals receiving ropinirole actually began to show improvements in performance on the rSMT (Fig. 3c, d), such that by the third trial bin, they were, if anything, less susceptible to erroneous collection responses on losing trials than control animals, although these results fell just short of statistical significance (Fig. 3e, f; bin 3; trial type × drug F 7,154 = 2.03, p = 0.06). The initial deficits observed may be best attributed to impaired performance on 2 and 1 light trials, although parsing the error rate by lights illuminated only reached trend-level significance (bin 1; lights illuminated × drug F 3,66 = 2.44, p = 0.072). In contrast, the impairments observed in the second bin of sessions were not dependent on the number of lights illuminated (bin 2; lights illuminated × drug F 3,66 = 0.26, NS), but rather on the status of the last light within the array, such that errors in bin 2 were more likely on trials in which the last light set to on (bin 2; last light × drug F 1,22 = 7.58, p = 0.01). Thus, chronic ropinirole administration appeared to alter sensitivity to reward-related stimuli presented within close temporal proximity to a decision point. Four weeks after drug delivery had ceased, there was no difference in performance between ropinirole- and saline-treated animals (Fig. 5a; time point × drug F 7,49 = 0.76, NS).

Fig. 3

Effect of chronic ropinirole on reward expectancy. a, b During the first trial bin, the D2/3 agonist led to an impairment in animals’ performance on the rSMT (n = 12 in each group), driven by an increase in erroneous collection responses on two and one light trials. c, d During the second trial bin, ropinirole-treated animals continued to show impairments in performance, although the deficits here were observed on trials wherein the last light in the array remained on. e, f By the final time bin, in contrast to the first two, ropinirole administration improved animals’ ability to differentiate winning from non-winning outcomes on the rSMT, albeit only at the level of a strong trend. In a similar manner to bin 2, these improvements were seemingly dependent on the status of the last aperture, with improvements in performance on trials when the last light set to off. All data shown are the mean across four or five sessions + SEM

With regard to the speed at which collect responses were made, ropinirole-treated animals were slower to respond on the collect lever during the first bin of drug treatment, but this effect abated in the subsequent bins (Tables 2 and 3; bin 1; time point × drug F 7,154 = 2.94, p = 0.006; bin 2 F 7,154 = 0.37, NS; bin 3 F 7,154 = 1.15, NS). The tendency observed at baseline for animals to respond more quickly in the subsequent hole if the light in the preceding hole had set to on was still present in both ropinirole- and saline-treated animals in the first two bins (Tables 4 and 5; bin 1; light status F 1,22 = 7.77, p = 0.011; light status × drug F 1,21 = 0.09, NS; bin 2; light status F 1,22 = 7.87, p = 0.01; light status × drug F 1,22 = 2.66, NS). However, in the last bin, the characteristic ‘slowing’ of the next response when a light in the stimulus array set to off was less apparent, though still present, in ropinirole-treated rats (Tables 4 and 5; bin 3; light status F 1,22 = 14.57, p = 0.001; light status × drug F 1,22 = 5.04, p = 0.035; ropinirole treated; light status F 1,11 = 10.30, p = 0.008; saline treated; light status F 1,11 = 9.77, p = 0.01). After 4 weeks of washout, any differences in the latency to respond at an aperture based on the preceding one were no longer evident (light status × drug F 1,6 = 1.80, NS).

In contrast to the bidirectional effects ropinirole produced on animals’ erroneous attempts to collect reward on non-winning trials, chronic ropinirole produced a clear, robust and sustained increase in the number of trials animals completed. This effect started to manifest after the first five sessions and became more pronounced throughout treatment (Fig. 4; bin 1; drug F 1,22 = 0.06, NS; bin 2; drug F 1,22 = 6.50, p = 0.02; bin 3; drug F 1,22 = 17.03, p < 0.0001). During extinction, both groups significantly reduced the number of trials completed, although animals receiving ropinirole neither extinguished responding as rapidly nor reached as low a level of responding as saline-treated animals (extinction; drug F 1,22 = 28.47, p < 0.0001; session × drug F 5,110 = 4.37, p = 0.001; ropinirole treated; session F 5,55 = 17.74, p = 0.0001; saline treated; session F 5,55 = 27.84, p < 0.0001). Furthermore, the increase in the number of trials completed in the ropinirole group was preserved throughout reinstatement (reinstatement; drug F 1,22 = 17.65, p < 0.0001) and remained elevated following the cessation of ropinirole treatment, albeit only at the level of a strong trend (drug F 1,7 = 5.09, p = 0.06). This effect began to ameliorate, such that by the last five sessions, there were no significant differences between the two groups (Fig. 5b; drug: washout session nos. 1–5 F 1,7 = 11.25, p = 0.01; nos. 6–10 F 1,7 = 6.44, p = 0.04; nos. 11–15 F 1,7 = 6.65, p = 0.04; nos. 16–20 F 1,7 = 2.21, NS).

Fig. 4

Effect of chronic ropinirole on the number of trials completed. In contrast to the bidirectional effects on reward expectancy, ropinirole administration produced a robust and sustained increase in the number of trials completed (n = 12 in both groups). This increase in trials, which could be argued to represent a compulsive style of engagement, presented after the first trial bin and continued throughout ropinirole administration including throughout extinction and reinstatement. All data shown as the mean + SEM

Fig. 5

Long-term behavioural effects of chronic ropinirole administration. a The significant improvement seen during ropinirole administration abated such that there were no significant differences in collect lever choice between saline- and ropinirole-treated animals during washout (n = 5 ropinirole treated, 4 saline treated). b The robust increase in the number of trials completed during ropinirole was preserved through the majority of the washout period but began to return to baseline and was not significantly elevated compared to control animals after 20 sessions. All data shown are the mean across 20 sessions + SEM

Western blot

Given that ropinirole’s primary pharmacological mechanism of action is as an agonist at DA D2/3 receptors, one obvious hypothesis would be that chronic ropinirole brings about behavioural change via alterations in dopaminergic signalling. In the dorsal striatum, there was a slight increase in D2 receptor protein following ropinirole administration but only at the level of a very weak trend (Fig. 6a; drug F 1,3 = 5.47, p = 0.1). This effect appeared to be driven by an increase in D2 receptor expression in the group sacrificed whilst ropinirole was still on board, which was not present in ropinirole-treated rats post-washout, but none of the follow-up analyses was significant (all Fs <1.72, NS). In contrast, the effects on D1 receptor protein were more pronounced: chronic ropinirole decreased expression of the D1 receptor protein, an effect that subsided following 4-week washout (Fig. 6b; drug × time point F 1,3 = 12.72, p = 0.04; early ropinirole vs early saline F 1,5 = 16.84, p = 0.009; late ropinirole vs late saline F 1,3 = 0.03, NS).

Fig. 6

Summary of changes in protein level in the dorsal striatum in animals treated with chronic ropinirole in comparison to saline. The largest change in the dorsal striatum was a decrease in pGSK3β that persisted in the washout group, indicating a potent activation of the alternative D2 signalling cascade underlying the observed behavioural effects. Data are expressed as the fold change relative to the mean value of the control group + SEM. Significant effects are denoted by *p ≤ 0.05 and # p ≤ 0.1
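The normalisation described in this caption (fold change relative to the control-group mean, with SEM) can be sketched as follows. This is an illustrative sketch only: the function name and the densitometry values are hypothetical and are not taken from the study.

```python
import math
import statistics


def fold_change_summary(control, treated):
    """Return (mean fold change, SEM) of treated values relative to the control mean."""
    control_mean = statistics.mean(control)
    # Express each treated sample as a fold change over the control mean
    folds = [value / control_mean for value in treated]
    mean_fold = statistics.mean(folds)
    # SEM = sample standard deviation / sqrt(n)
    sem = statistics.stdev(folds) / math.sqrt(len(folds))
    return mean_fold, sem


if __name__ == "__main__":
    # Hypothetical band intensities (arbitrary units), for illustration only
    saline = [1.0, 2.0, 3.0]
    ropinirole = [1.0, 3.0]
    mean_fold, sem = fold_change_summary(saline, ropinirole)
    print(f"fold change = {mean_fold:.2f} + {sem:.2f} SEM")
```

A value below 1 would correspond to a decrease relative to saline-treated controls, and a value above 1 to an increase.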

D2-like receptors are metabotropic receptors typically coupled to inhibitory G-proteins (Pierce et al. 2002). Activation of D2 receptors decreases cAMP production, inhibiting PKA, leading to a decrease in phosphorylation of DARPP at threonine 34 and a subsequent decrease in ERK signalling (Beaulieu et al. 2007). Ropinirole treatment led to a significant reduction in DARPP (Fig. 6c; drug F 1,3 = 10.05, p = 0.05), although this was only significant at the later time point (early ropinirole vs early saline F 1,3 = 1.72, NS; late ropinirole vs late saline F 1,3 = 10.50, p = 0.05). There was also a trend toward decreased total ERK and DARPP75 in all ropinirole-treated animals (Fig. 6d; ERK F 1,3 = 5.82, p = 0.095; Fig. 6e; drug DARPP F 1,3 = 6.2, p = 0.09), but no changes in pDARPP34, CREB or pCREB (Fig. 6f–h; drug pDARPP34 F 1,3 = 1.14, NS; CREB F 1,3 = 1.28, NS; pCREB F 1,3 = 0.37, NS). There is therefore only marginal evidence that ropinirole caused lasting change through this signalling pathway, at least in the dorsal striatum.

D2 receptors can also signal through an alternative G-protein-/cAMP-independent cascade via a β-arrestin-mediated AKT/GSK3β cascade during hyperdopaminergic conditions (Beaulieu et al. 2007; Li et al. 2012). Chronic ropinirole administration did not lead to any alterations in total GSK3β (Fig. 6i; drug F 1,3 = 2.35, NS). However, levels of pGSK3β were vastly decreased, an effect that was not ameliorated following washout (Fig. 6j; drug F 1,3 = 19.55, p = 0.02; drug × time point F 1,3 = 1.45, NS; early ropinirole vs early saline F 1,5 = 33.83, p = 0.002; late ropinirole vs late saline F 1,3 = 19.34, p = 0.02). GSK3β is constitutively active, such that phosphorylation renders it inactive. Hence, it would appear that ropinirole resulted in significantly greater activation of this alternative signalling pathway in the dorsal striatum.

In contrast to the multiple changes in both the canonical and alternative dopaminergic signalling cascades observed in the dorsal striatum, the only significant difference in protein expression that we detected in the nucleus accumbens was a ropinirole-induced decrease in pCREB in animals that were sacrificed immediately following the cessation of treatment (Fig. 7a–h; group F 1,3 = 10.79, p = 0.046; drug × time point F 1,3 = 0.01, NS; early saline vs early ropinirole F 1,5 = 7.96, p = 0.037; late saline vs late ropinirole F 1,3 = 5.33, NS).

Fig. 7

Summary of changes in protein level in the nucleus accumbens in animals treated with chronic ropinirole compared to saline. In contrast to the dorsal striatum, few alterations in protein level were observed in the nucleus accumbens with the exception of a decrease in pCREB. Data are expressed as the fold change relative to the mean value of the control group + SEM. Significant effects are denoted by *p ≤ 0.05 and # p ≤ 0.1

Polymerase chain reaction

In contrast to the observed changes in DA-related signalling in the striatum, there were no alterations in receptor messenger RNA (mRNA) for the D1, D2 or 5-HT2A receptors between saline- and ropinirole-treated animals at either time point in either the OFC or PFC (Table 6; all Fs <1.03).

Table 6 Changes in mRNA or protein in rats treated with chronic ropinirole or saline from both early and late time points

Discussion

Here, we demonstrate that chronic administration of the D2/3 agonist ropinirole produces invigorated engagement on the rSMT, as indicated by a marked increase in the number of trials completed, continued responding during extinction and a reduction in the degree to which stimulus valence modulated the speed of subsequent responses at the array. This potentially compulsive style of play was accompanied by a dramatic and long-lasting reduction in the inactive (phosphorylated) form of GSK3β in the dorsal striatum, potentially indicative of increased D2-mediated activation of the β-arrestin-AKT intracellular signalling cascade in this area. Chronic ropinirole treatment was also associated with decreased expression of D1 receptors in the dorsal striatum and reduced activation of CREB within the nucleus accumbens. These behavioural changes superficially resemble the increased desire to gamble observed in iatrogenic GD, and the molecular correlates observed may provide novel insight into the mechanism by which dopamine agonists effect such changes, and therefore through which these psychiatric side effects could be reversed.

From a clinical perspective, one important caveat with respect to the interpretation of the current data set is that we have not used a rodent model of PD. The loss of DA-producing neurons in conditions such as PD has been suggested to result in a compensatory increase in the sensitivity of receptors, so-called denervation supersensitivity (Lee et al. 1978), that may influence the response to DA agonist treatments. However, ICDs have been reported following treatment with DA agonists in conditions that are not associated with any perturbations of the DA system, such as fibromyalgia, RLS and prolactinoma (Clark and Dagher 2014). Furthermore, dorsolateral striatal dopamine depletion (a model of early stage PD) did not alter the ability of chronic pramipexole to increase risky choice in rats (Rokosik and Napier 2012; Tremblay et al. 2016). Hence, it is likely that the ICDs exhibited by PD patients following DA agonist therapy can largely be attributed to the actions of the drugs themselves, rather than to the interaction of these drugs and the pathophysiology of PD. Nevertheless, future studies utilising an animal model of PD would be useful in definitively addressing this point.

With regard to the behavioural processes through which dopamine agonist treatment may invigorate rSMT performance, one potential explanation is that augmenting the activity of D2 receptors may have altered the appetitive valence of the sugar reward. Chronic quinpirole administration has been shown to enhance the reward-facilitating effects of amphetamine with respect to responding for intracranial self-stimulation (Schmidt et al. 2013). However, although an increase in appetitive motivation might explain the increase in the number of trials completed, we would also expect decreased reward collection latencies and perhaps more rapid responses to illuminated apertures in the stimulus array, neither of which was observed. Relatedly, the increase in trials may arise as a result of ropinirole ameliorating a decrease in the appetitive value of reward throughout the session, such that ropinirole-treated animals continue to respond for food whilst their saline counterparts become sated as the session progresses. This is highly unlikely for a number of reasons. Firstly, animals do not earn enough sugar pellets on the rSMT to reach satiety. Given that roughly 1 in 8 trials are rewarded and that animals performed just over 80 trials on average at baseline, rats would therefore earn around 80–88 45-mg sugar pellets per session, equating to 3.6–4 g of food reward. Free-fed rats easily consume 20 g of food per day; hence, this amount would not result in a drop in motivation. There is no evidence that increased satiation throughout a session meaningfully impacts performance even in other behavioural tasks that yield comparatively larger amounts of sugar (Cocker et al. 2012a; Cocker et al. 2012b; Zeeb et al. 2009). Furthermore, animals’ weight does not predict the number of trials completed on the rSMT, suggesting that individual differences in satiation are not a principal component mediating task performance (Cocker et al. 2016a). Finally, dopamine agonists have been shown to reduce, rather than increase, free-feeding behaviour (Terry et al. 1995).
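The satiety arithmetic above can be reproduced with a short calculation. This is a minimal sketch using only the figures quoted in the text (80–88 pellets per session, 45 mg per pellet, roughly 20 g of daily free-feeding intake). The constant and function names are hypothetical.

```python
PELLET_MASS_MG = 45           # mass of one sugar pellet, from the text
DAILY_FREE_FEEDING_G = 20.0   # approximate daily intake of a free-fed rat, from the text


def pellets_to_grams(n_pellets: int) -> float:
    """Convert a pellet count to grams of food reward."""
    return n_pellets * PELLET_MASS_MG / 1000.0


def fraction_of_daily_intake(n_pellets: int) -> float:
    """Session reward as a fraction of the assumed daily free-feeding intake."""
    return pellets_to_grams(n_pellets) / DAILY_FREE_FEEDING_G


if __name__ == "__main__":
    for n in (80, 88):  # pellet range per session quoted in the text
        grams = pellets_to_grams(n)
        share = fraction_of_daily_intake(n)
        print(f"{n} pellets = {grams:.2f} g ({share:.0%} of daily intake)")
```

On these figures, a session's reward amounts to less than a fifth of what a free-fed rat would consume in a day, which is the basis for the argument that satiety is unlikely to drive the group difference.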

Another explanation is that the observed effects arose as a result of increased motor impulsivity or a general trend towards increased motoric output. The psychostimulant amphetamine, which potentiates DA, has been repeatedly demonstrated to increase impulsive action on operant tasks (Cole and Robbins 1987; Zeeb et al. 2009). Impulsivity, loosely defined as acting or making decisions without appropriate forethought, is a multifaceted construct (Winstanley 2011), and it is not intuitively obvious which aspect of rSMT performance may best be considered ‘impulsive’. Nevertheless, amphetamine leads to increased erroneous attempts to collect reward on 1 and 0 light trials. These errors are made more rapidly at baseline than collect errors on near-miss trials and may therefore represent impulsive-like responses (Cocker et al. 2016a; Cocker et al. 2014; Cocker et al. 2016b; Winstanley et al. 2011). However, although ropinirole did increase such mistakes in the first week of administration, this effect was only transient, whereas the increase in trials performed persisted for the duration of drug treatment. Indeed, if anything, ropinirole-treated rats made fewer such errors towards the end of testing. It is therefore hard to attribute the behavioural pattern caused by chronic ropinirole on the rSMT to a deficit in any specific aspect of impulse control.

The behavioural endophenotype exhibited following chronic ropinirole may instead be more related to compulsivity, rather than impulsivity. Compulsivity can broadly be regarded as performing repetitive, stereotyped behaviours according to certain rules (Fineberg et al. 2010). The relationship between impulsivity and compulsivity is complex. Traditionally, these multifaceted constructs have been viewed as diametrically opposed, with individuals exhibiting a preponderance of one at the expense of the other, yet more contemporary theories now suggest that the relationship between the two is dynamic and can shift over time (see Fineberg et al. 2010 for discussion). Thus, it may be reasonable to suggest that both impulsivity and compulsivity represent differing vulnerabilities towards the development of GD. There is ample evidence to indicate that impulsivity is a precursor for poor decision making, such that high trait impulsivity or the diagnosis of a concurrent ICD can enhance vulnerability towards both substance and behavioural addictions (Barrus et al. 2015; Jentsch and Taylor 1999; Michalczuk et al. 2011; Winstanley 2007). In contrast, the archetypal pathology of aberrant compulsivity, namely obsessive compulsive disorder (OCD), is rarely co-morbid with GD (Fontenelle et al. 2005). However, gamblers do score higher on self-report measures of compulsivity (Blaszczynski 1999) and many of the cognitive distortions such as an adherence to ‘lucky’ rituals, which have been suggested as central to the development of GD (Ladouceur et al. 1988; Toneatto et al. 1997), could be considered compulsive in nature.

EGMs such as slot machines are a particularly virulent form of gambling, in that these sorts of games are most often reported as the primary game of choice by patients reporting for treatment for GD (Breen and Zimmerman 2002) and those who develop ICDs following DA agonist treatment (Rossi et al. 2010). Additionally, EGMs are associated with the shortest latency between the onset of recreational play and the development of problematic engagement (Breen and Zimmerman 2002; Choliz 2010; Dowling et al. 2005). One hypothesis that may explain why EGMs are such a virulent form of gambling is that they appear to foster unthinking or compulsive play (Murch and Clark 2015). The compulsive style of play exhibited by gamblers on slot machines has been referred to as being in the ‘zone’ and manifests as attentional narrowing, with gamblers focusing exclusively on the game and less able to detect alterations in the environment (Diskin and Hodgins 1999; Schüll 2012). In regard to the rSMT, the large increase in the number of trials completed could potentially represent a similar behavioural response. Animals performing the rSMT undergo extensive training (100+ sessions); therefore, such a robust increase in the number of trials completed represents a fundamental shift in animals’ approach to the task. Importantly, this behavioural change cannot be explained through a general increase in the speed of responding. Hence, non-task related behaviour, such as grooming or exploration of the operant box, must necessarily have been reduced for the number of trials to have increased so dramatically, potentially indicative of greater task ‘focus’.

Although there was no overall difference in response latency between ropinirole- and saline-treated animals, the characteristic spike in the time taken to respond at the next aperture in the sequence when the light in the preceding hole remained off was far less pronounced following ropinirole administration. Thus, ropinirole-treated animals appeared less sensitive to the appetitive valence of the aperture and instead became more repetitive in their responding. Broadly speaking, goal-directed actions are governed by the action-outcome associations in effect, in addition to an appraisal of the outcomes’ value, whereas habitual responding is largely driven by simple stimulus-response (S-R) relationships (Balleine and O’Doherty 2010). As such, the fact that the behaviour of ropinirole-treated animals was less affected by the illumination status of the previous hole may indicate an increase in S-R control of behaviour. Genuinely goal-directed behaviour should also be significantly affected by outcome devaluation, whereas manipulations such as satiety or extinction have less impact on habit-based behaviour (Balleine and Dickinson 1998). Although all rats decreased the number of trials completed when win trials were no longer included in the rSMT, this extinction of performance through non-reward was significantly slower in ropinirole-treated rats, and performance never declined to the level of saline-treated controls. Thus, although ropinirole-treated animals remained broadly goal directed, the drug seemed to elicit more habit-like and compulsive task engagement in that they continued to engage with the game to a greater degree than saline-treated controls, even in the absence of reward.

Our hypothesis that an increase in compulsivity may underlie animals’ increased game play is potentially bolstered by the fact that most of the significant protein changes detected ex vivo were observed in the dorsal, rather than the ventral striatum (with the exception of a single decrease in pCREB within the nucleus accumbens). The transition in control of behaviour from ventral to dorsal striatum has been hypothesised to underlie the development of compulsive drug seeking (Everitt and Robbins 2005). The most robust change we observed was a decrease in phosphorylation, and therefore a decrease in inhibition, of GSK3β (Beaulieu et al. 2009; Beaulieu et al. 2005). The critical downstream effects of GSK3β activation that may result in behavioural change are as yet unknown, but GSK3β does play a fundamental role in a variety of functions, including receptor trafficking and cellular plasticity (see Li and Gao 2011 for discussion). Interestingly, the decrease in pGSK3β did not abate following cessation of ropinirole treatment, indicative of long-lasting alterations in this pathway. However, the putative increase in compulsive engagement with the rSMT (the rise in the number of trials completed) did return to baseline levels within 4 weeks. This discrepancy between the time course of the behavioural and the molecular changes may suggest that disinhibition of GSK3β is unlikely to cause compulsive rSMT engagement. However, normalisation of the number of trials exhibited by the ropinirole-treated group to the saline-treated group was slow, only resolving in the last few sessions. Whether the pervasive increase in active GSK3β would have eventually abated in line with the behavioural effect is unclear. Regardless, the most pragmatic way to address issues of causation would be to determine whether the increase in trials resulting from ropinirole administration can be blocked by GSK3β inhibitors.

Predominant activation of the β-arrestin/AKT/GSK3β pathway, rather than PKA-dependent intracellular signalling cascades, through ligand binding at the D2 receptor is thought to only occur during periods of increased DA release (Beaulieu et al. 2009). As such, it may not be surprising to find evidence of activation of this pathway following chronic administration of a DA agonist. Certainly, chronic administration of the D2-like agonist pramipexole increases the tonic activity of DA-producing neurons (Chernoloz et al. 2009). Excess DA transmission in idiopathic gambling is less clear, but some reports have suggested that GD patients exhibit augmented DA release during gambling episodes (Boileau et al. 2014; Linnet et al. 2011), and this may override the dips in phasic DA following unsuccessful outcomes (Schultz et al. 1997), making losses less aversive and promoting continued game play. Certainly, PD patients appear to exhibit deficits in learning from negative outcomes whilst on dopaminergic medication, potentially indicative of a failure to detect a dip in DA (Frank et al. 2004). In a potentially analogous manner, animals on the rSMT show increased erroneous collection responses following chronic ropinirole treatment, at least during the first two session bins. However, these effects abated by the third session bin and ropinirole-treated animals actually showed incremental improvements in differentiating between winning and losing outcomes in later sessions. Ultimately, whether steady-state slow-release ropinirole alters the perception of reward-related stimuli or loss is unclear based on the current data.

Interestingly, there was also a reduction in D1 receptor protein in the dorsal striatum. As D1 receptors typically signal through Gs proteins, which have broadly opposing effects to Gi proteins, it raises the possibility that the small decreases in total ERK and DARPP were due to a decrease in D1 receptor signalling. The reason for the reduction in D1 is unclear, but chronic administration of a D2-like agonist has previously been shown to result in desensitisation of autoreceptors (Chernoloz et al. 2009). G-protein-coupled receptors are internalised as a homeostatic mechanism to modulate G-protein-mediated signalling in response to agonist stimulation. Therefore, the decrease in post-synaptic D1 receptors may be a compensatory effect as a result of increased tonic DA levels following a loss of autoreceptor regulation (Beaulieu and Gainetdinov 2011). However, it is unlikely that the decrease in D1 receptors can account for the observed behavioural effects, as previous investigations acutely targeting the D1 system did not affect rSMT performance (Winstanley et al. 2011).

In contrast, qPCR did not reveal any alterations in either D1 or D2 receptor mRNA in prefrontal regions, which, coupled with the Western blot data, indicates that gross changes in receptor density contributed little to the behavioural effects seen here. Previous reports have shown that chronic ropinirole augments the activity of the serotonergic system (Chernoloz et al. 2009), but we did not see any changes in 5-HT2A receptor mRNA, a target chosen due to its relative abundance in the PFC and role in decision making (Winstanley et al. 2004). However, as we observed large-scale changes in downstream signalling molecules associated with DA receptor activation, without a change in gross receptor density, we cannot preclude a role for 5-HT here, particularly given the multitude of centrally located 5-HT receptors. Although we have not investigated modulation of rSMT performance by selective serotonergic ligands, acute amphetamine administration, which potentiates the actions of all monoamines, had a far less pronounced effect on the rSMT than quinpirole, arguably indicating a more prominent role for DA over 5-HT on this task (Winstanley et al. 2011).

Ultimately, these data provide novel insight into a potential role for aberrant dopaminergic signalling in the development of compulsive gambling. Chronic activation of D2-like receptors resulted in a potentially compulsive style of engagement on the rSMT putatively as a result of increased activation of the β-arrestin-mediated AKT/GSK3β signalling cascade in the dorsal striatum. These results have implications for the treatment of both idiopathic and iatrogenic gambling. Excitingly, lithium chloride is a potent inhibitor of GSK3β and has been extensively used in human subjects for the treatment of affective disorders (Beaulieu et al. 2005; Burgess et al. 2001). Moreover, lithium has previously been explored as a treatment for GD, with some success (Hollander et al. 2005; Pallanti et al. 2002). Our results indicate that gamblers who principally exhibit compulsive gambling engagement, such as slot machine players, may potentially respond favourably to lithium as a pharmacotherapy. More generally, these results continue to hint at heterogeneity within GD (Cocker and Winstanley 2015; Limbrick-Oldfield et al. 2013) and highlight the need for endophenotypic animal models that may aid in elucidating differing vulnerabilities.