Introduction

Nicotine in the form of smoked tobacco remains one of the world’s most popular recreational drugs, despite being one of the largest causes of preventable illness and death (Grant et al. 2015; Fagerström 2005; Taylor et al. 2002; World Health Organization 2013). A potential explanation for the extremely high rate of relapse in smokers attempting to quit (Ockene et al. 2000) is that cigarette-associated cues could trigger impulses for smoking behavior reflexively and automatically, as a form of maladaptive habit (Belin et al. 2013; Clemens et al. 2014; Tiffany 1990; Tiffany and Carter 1998; Wray et al. 2013; Yalachkov et al. 2009). In studies of habit formation, newly acquired instrumental actions that are only performed if necessary to earn desired outcomes (Adams and Dickinson 1981) are defined as “goal-directed” (Dickinson 1985). With repeated performance of the instrumental response, however, these behaviors can come to be automatically or “habitually” elicited by associated environmental stimuli (Adams 1982; Balleine and Dickinson 1998; Tricomi et al. 2009; Valentin et al. 2007). As drugs of abuse may themselves accelerate the rate at which habits are acquired (Belin et al. 2013; Belin-Rauscent et al. 2012; Corbit et al. 2012; Everitt et al. 2008; Everitt and Robbins 2005), the highly repetitive nature of taking multiple puffs while smoking each cigarette (Djordjevic et al. 2000) may make nicotine-seeking behaviors particularly likely to develop into habits.

Goal-directed behaviors can be identified through their sensitivity to outcome devaluation (i.e., reduction of the incentive value of an operant reward by either allowing consumption to satiety or pairing with a negative consequence) and contingency degradation (i.e., reduction of the necessity of operant responding for reinforcer deliveries) (Dickinson and Balleine 1994; Dickinson and Mulatero 1989; Yin and Knowlton 2006). These approaches have been used to show that operant responding for orally consumed alcohol (Corbit et al. 2012; Dickinson et al. 2002; Mangieri et al. 2012, 2014) and cocaine rewards (Miles et al. 2003) may come under habitual control more rapidly than natural rewards, such as sucrose. However, few studies have examined the development of habitual responding for intravenously self-administered drugs of abuse (Everitt and Robbins 2016; Olmstead et al. 2001; Root et al. 2009; Zapata et al. 2010). Furthermore, the devaluation of intravenous (IV) drug rewards through satiety may be challenging to interpret, due to unconditioned drug effects such as hyperactivity or sedation (Ostlund and Balleine 2008; OʼTousa and Grahame 2014). Recently, though, Clemens et al. (2014) reported that pairings of non-contingent IV nicotine and lithium chloride (LiCl) following 10 days of IV nicotine self-administration (NSA) training significantly reduced extinction test responding for nicotine, suggesting that NSA behavior was goal-directed. However, these authors also reported that those LiCl-nicotine pairings did not reduce responding for nicotine in a reacquisition test session (Clemens et al. 2014), making it unclear whether nicotine was successfully devalued (Adams and Dickinson 1981) and therefore difficult to interpret whether nicotine-seeking following limited training was a goal-directed or habitual behavior.

To help resolve these issues, experiment 1 in the present study sought to verify whether IV nicotine, after 10 days of NSA training on a fixed ratio 1 (FR-1) schedule, could be devalued by pairing with LiCl. In order to confirm that our LiCl dosing and pairing parameters were capable of effectively devaluing a reinforcer, as well as demonstrate that our methodological parameters were capable of identifying goal-directed behavior, the same subjects were concurrently trained to self-administer orally consumed saccharin solution. Only one of these reinforcers was devalued for any individual subject. We then sought to extend the results of experiment 1 using an alternate assay for habitual behavior, contingency degradation (Dickinson and Mulatero 1989; Yin and Knowlton 2006), in experiment 2.

Materials and methods

Subjects

Adult male Sprague-Dawley rats (200–250 g) were obtained from Charles River (Montreal, QC, Canada) and housed in a climate-controlled environment on a 12-h reverse light/dark cycle (lights on at 19:00 h). Subjects were double housed for a 1–2-week acclimatization period with food and tap water available ad libitum. Following this period, rats were individually housed and restricted to 20–25 g of lab chow per day to prevent excessive weight gain. All experimental procedures described were conducted in full compliance with the National Institutes of Health “Principles of Laboratory Animal Care” (8th Edition, 2011) and approved by the local Animal Care Committee of the Centre for Addiction and Mental Health.

Apparatus

Nicotine self-administration (NSA) sessions took place in 16 operant chambers operated by a Med Associates interface (St Albans, VT, USA) in 60-min sessions. Each operant chamber (30.5 × 24.1 × 21.0 cm) was equipped with a non-retractable inactive lever and a retractable active lever on opposite walls, 2.5 cm above a grid floor. Illumination of a house-light above the inactive lever signaled the start of a self-administration session. Nicotine was delivered via Tygon tubing connecting a PHM-100VS syringe pump (Med Associates) to a fluid swivel and a modified 22-gauge cannula protected by a metal spring. Responses on the active lever triggered nicotine delivery (0.1 mL/kg over approximately 1 s), as well as illuminating a white cue light (2 s) and enabling a tone-generator (2800 Hz, 2 s) both located above the active lever. Following nicotine infusions, a timeout period (30 s) occurred in which the house-light was disabled and active lever responses were recorded but had no consequence. Responses on the inactive lever were recorded but had no consequence throughout each session.

Saccharin self-administration (SSA) took place in a distinct room in chambers similar to those described for NSA above, except that the inactive lever was retractable and positioned on the same wall as the active lever, a fluid receptacle was positioned in between the two levers, subjects were not connected to any intravenous drug administration apparatus, and the session duration was 30 min. The position of the active lever was counterbalanced across rats, and for a given subject was consistent across food training, NSA, and SSA sessions. Active lever responses during SSA sessions dispensed 0.19 mL of saccharin solution over 2 s into the receptacle. Reward-associated cues and timeout periods were identical in NSA and SSA sessions. Following SSA sessions, receptacles were checked for unconsumed saccharin solution, which was recorded and used to calculate saccharin reinforcements consumed.
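As an illustration only, the conversion from residual volume to reinforcements consumed could be sketched as below. The text does not state the exact calculation; this sketch assumes that leftover solution is converted into an equivalent number of 0.19-mL deliveries and subtracted from the reinforcements earned. The function name and the example values in the test are hypothetical.

```python
DELIVERY_VOLUME_ML = 0.19  # volume dispensed per reinforcement (from the text)


def reinforcements_consumed(earned, leftover_ml, delivery_ml=DELIVERY_VOLUME_ML):
    """Estimate the number of saccharin reinforcements actually consumed.

    Assumption (not stated in the text): residual volume is expressed as an
    equivalent number of unconsumed deliveries and subtracted from the total.
    """
    if earned < 0 or leftover_ml < 0:
        raise ValueError("counts and volumes must be non-negative")
    unconsumed = leftover_ml / delivery_ml
    # Consumption cannot fall below zero even if the residual is overestimated.
    return max(earned - unconsumed, 0.0)
```

Under this assumption, a rat that earned 30 reinforcements and left 0.38 mL in the receptacle would be credited with 28 reinforcements consumed.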

Intravenous jugular catheterization

Subjects were anesthetized with isoflurane/oxygen, and incision sites treated with subcutaneous (SC) bupivacaine as a local anesthetic (0.1 mL, 0.125%). Also administered were the antibiotic Derapen SQ/LA (Wyeth Animal Health, Guelph, ON, Canada; 0.1 mL, SC) and the analgesic ketoprofen (5 mg/kg, SC). Catheters were constructed in-house and implanted into the right jugular vein as described previously (Corrigall and Coen 1989), and catheter patency was subsequently maintained by a daily flush of sterile heparin-saline solution (50 U/mL, 0.1 mL, IV). Subjects were given a minimum of 7 days to recover before the first operant session of the acquisition phase.

Drugs

Nicotine tartrate (Sigma-Aldrich, Oakville, ON, Canada), prepared fresh daily, was dissolved in 0.9% sterile saline, pH adjusted to 7 ± 0.2, and administered intravenously at a unit dose of 30 μg base/kg/infusion. Saccharin sodium salt hydrate (Sigma-Aldrich) was prepared fresh daily at a concentration of 0.1% w/v in distilled water. Lithium chloride (Sigma-Aldrich) was dissolved in sterile water at a concentration of 0.15 M and administered as an intraperitoneal (IP) injection (14.1 mL/kg), giving a dose of 89.7 mg/kg (Balleine and Dickinson 1992). Thiopental (2–4 mg, IV, 20 mg/mL) was used to test catheter patency at the end of each experiment.
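The dosing figures above can be cross-checked with a short calculation. The sketch below assumes a molar mass of 42.39 g/mol for LiCl and uses the 0.1 mL/kg infusion volume given in the Apparatus section; the function names are illustrative and not part of the original methods.

```python
LICL_MOLAR_MASS_G_PER_MOL = 42.39  # Li (6.94) + Cl (35.45)


def licl_dose_mg_per_kg(molarity_mol_per_l, injection_volume_ml_per_kg):
    """Dose delivered by injecting a LiCl solution at a given volume per kg."""
    # mol/L * g/mol = g/L, which is numerically equal to mg/mL.
    mg_per_ml = molarity_mol_per_l * LICL_MOLAR_MASS_G_PER_MOL
    return mg_per_ml * injection_volume_ml_per_kg


def nicotine_concentration_ug_per_ml(unit_dose_ug_per_kg, infusion_volume_ml_per_kg):
    """Solution concentration implied by a unit dose and an infusion volume."""
    return unit_dose_ug_per_kg / infusion_volume_ml_per_kg


# 0.15 M LiCl injected at 14.1 mL/kg
print(round(licl_dose_mg_per_kg(0.15, 14.1), 1))  # 89.7
# 30 ug/kg/infusion delivered in 0.1 mL/kg
print(round(nicotine_concentration_ug_per_ml(30, 0.1)))  # 300
```

The first result agrees with the stated LiCl dose of 89.7 mg/kg. The second implies a nicotine solution of about 300 μg base/mL, a figure the text does not state directly.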

Procedures

Experiment 1: LiCl devaluation of nicotine and saccharin

The week prior to surgery, rats (n = 24) were given operant training for 45 mg sucrose pellets (Bio-Serv, Flemington, NJ, USA) in a separate set of operant chambers with pellet magazines. Responses on the active lever were reinforced under an FR-1 schedule, with no timeout period or reward-associated cues presented. Food training was distributed across one 7-h and one 16-h session, in which subjects could, respectively, earn maximums of 400 and 600 pellets. Water was available ad libitum during food training sessions. Following recovery from surgery, rats were returned to the food training boxes for a “consolidation” session lasting for either 1 h or until 100 reinforcements were earned. Following completion of this consolidation session, subjects began NSA and SSA acquisition.

Over 10 consecutive days, all subjects were trained to self-administer saccharin as well as nicotine under FR-1 schedules. Each subject performed one operant session for nicotine and one operant session for saccharin each day, with sessions separated by 2–3 h. The order in which sessions were performed each day (nicotine or saccharin first) was counterbalanced across the acquisition phase.

Rats were then assigned into two counterbalanced groups, “nicotine-devalued” (n = 12) or “saccharin-devalued” (n = 12), based on numbers of active lever presses and reinforcements earned over the last 3 days of the acquisition phase (Fig. 1). The LiCl aversion-pairing procedure consisted of three repetitions of a 2-day pairing cycle. On the first day of each cycle, rats were allowed to freely consume saccharin solution from a Richter tube in a novel environment for a period of 30 min; for rats in the “saccharin-devalued” condition, this was followed by an IP injection of LiCl, while rats in the “nicotine-devalued” group received an IP injection of sterile saline of the same volume. On the second day of each pairing cycle, all subjects were given non-contingent nicotine infusions; rats were transported to a novel experimental room equipped with polycarbonate “pairing” cages and nicotine infusion pumps and allowed to habituate for 1–2 min in these cages while catheter dead-space was filled with nicotine solution. Based on the average numbers of nicotine infusions self-administered during the first 10 min of the final acquisition sessions, we delivered five nicotine infusions at the self-administration dose (0.03 mg/kg/infusion) at 1, 2, 4, 6, and 9 min after catheter filling. Following the final infusion, each rat was immediately removed from its pairing cage and those in the “nicotine-devalued” condition received an IP injection of LiCl, while those in the “saccharin-devalued” condition received an IP injection of saline. On both days of the pairing cycle, rats were returned to transport cages following the appropriate injection, where they were monitored for 15 min before being returned to their home cages. Subjects were fed 4 h following the final LiCl injection.

Fig. 1 Experimental timelines and group assignments for both experiments

The day following completion of the devaluation phase, an extinction test in each of the NSA and SSA contexts was performed in a counterbalanced order. Extinction test sessions were 10 min in duration, in which the houselight was continually illuminated, active and inactive lever responses were recorded but otherwise had no consequence, and no rewards or associated cues were presented at any time; for extinction tests in the NSA context, animals were connected to the intravenous self-administration apparatus as in training, although no drug infusions were delivered. Reacquisition of NSA and SSA was then evaluated over the next 2 days in “reacquisition sessions”, identical to the operant sessions in the acquisition phase.

Experiment 2A: contingency degradation of nicotine and saccharin

Rats (n = 26) were trained to self-administer nicotine and saccharin over 10 sessions as described in experiment 1. Following acquisition, rats were assigned into “nicotine-degraded” (n = 13) or “saccharin-degraded” (n = 13) groups in a counterbalanced manner, based on active lever responses and reinforcements earned (for both saccharin and nicotine) from the last 3 days of acquisition (Fig. 1). For each reinforcer, the degradation phase consisted of two conditions: the “degraded” condition (in which active lever responses were recorded but were otherwise inconsequential), and the control, “non-degraded” condition (in which rats experienced operant sessions identical to normal acquisition, i.e., lever pressing led to reinforcer delivery at FR-1). To control for variable nicotine exposure in those subjects that were “non-degraded” for nicotine and avoid the potential confound of distinct nicotine administration histories between groups, each animal in the degraded condition was “yoked” pairwise to an animal in the non-degraded control condition with a comparable active response and reinforcement history. Animals in “degraded” conditions received non-contingent rewards and cues that were instead contingent on the responding of the non-degraded animal of each pair. Rats received six such sessions for each reinforcer. This yoking procedure was based on Fanelli et al. (2013) and Yin et al. (2006). The following day, rats received a 10-min extinction test for each reward type as described in experiment 1.
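The yoking arrangement can be illustrated with a minimal sketch. It is not the control software used in the study: the function names are hypothetical, press times are given in seconds from session start, and the 30-s timeout is assumed to run from the reinforced press. The non-degraded animal earns deliveries under FR-1 with a timeout, and its yoked partner receives the same deliveries regardless of its own lever pressing.

```python
def reinforced_times(press_times, timeout_s=30.0):
    """Times of active-lever presses that earn a reinforcer under FR-1.

    Assumption: the timeout is measured from the reinforced press itself;
    presses made during the timeout are counted but earn nothing.
    """
    earned = []
    next_available = float("-inf")
    for t in sorted(press_times):
        if t >= next_available:
            earned.append(t)
            next_available = t + timeout_s
    return earned


def run_yoked_pair(control_presses, degraded_presses, timeout_s=30.0):
    """Deliveries for a non-degraded animal and its yoked, degraded partner."""
    deliveries = reinforced_times(control_presses, timeout_s)
    return {
        # Control animal: deliveries are contingent on its own responding.
        "control": {"presses": len(control_presses), "deliveries": deliveries},
        # Degraded animal: presses are recorded but inconsequential, and it
        # receives the control animal's deliveries at the same times.
        "degraded": {"presses": len(degraded_presses), "deliveries": list(deliveries)},
    }
```

Because both members of a pair receive identical delivery times, reinforcer exposure is equated across conditions while the response-outcome contingency holds only for the non-degraded animal.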

Experiment 2B: contingency degradation of nicotine alone

To verify that the degradation of saccharin responding and repeated extinction testing did not alter how rats performed in the NSA context, experiment 2A was repeated in a separate group of rats (n = 24) using intravenous nicotine as the only reinforcer. These rats performed a single operant session each day throughout the acquisition phase and were similarly divided into “nicotine-degraded” (n = 12) and non-degraded “control” (n = 12) conditions, as above. This experiment was otherwise identical to experiment 2A.

Statistical analysis

Acquisition and reacquisition data were analyzed using repeated-measures analysis of variance (ANOVA), with the within-subject factors of reinforcer and day, and a between-subject factor of either devaluation or degradation condition (for experiments 1 and 2, respectively). Data from single-day comparisons (extinction tests) were analyzed with repeated-measures ANOVA using the within-subject factor of reinforcer and between-subject factor of devaluation (or degradation) condition. Greenhouse-Geisser corrections were applied in all analyses for which Mauchly’s test of sphericity was significant. For presentation and analysis, reinforcements and active lever responses in reacquisition and extinction test data were converted to a proportion of baseline responding (mean of the last 2 days of self-administration). The criterion for significance was p < 0.05.
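The baseline normalization can be sketched as follows. The function name and the example counts in the test are hypothetical; only the definition of baseline (the mean of the last 2 days of self-administration) is taken from the text.

```python
from statistics import mean


def proportion_of_baseline(acquisition_counts, test_count, baseline_days=2):
    """Express a test-session count as a proportion of baseline responding.

    Baseline is the mean of the final `baseline_days` acquisition sessions.
    """
    if len(acquisition_counts) < baseline_days:
        raise ValueError("not enough acquisition sessions to compute baseline")
    baseline = mean(acquisition_counts[-baseline_days:])
    if baseline == 0:
        raise ValueError("baseline responding is zero; proportion is undefined")
    return test_count / baseline
```

For example, a rat that made 20 and 24 active lever presses on the last 2 acquisition days and 11 presses in a test session would score 0.5 of baseline.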

Results

Experiment 1: effects of LiCl devaluation on NSA and SSA extinction and reacquisition

Active lever responses, inactive lever responses, and reinforcements earned during acquisition of SSA (a) and NSA (b) in experiments 1 and 2 are shown in Fig. 2. In experiment 1, seven rats were excluded due to loss of catheter patency; a total of 17 rats were used in the final data analysis, nine in the nicotine-devalued group, and eight in the saccharin-devalued group. Following training, rats were assigned into matched groups based on reinforcements and active lever presses for both saccharin and nicotine.

Fig. 2 Mean (±SEM) active lever responses (open triangles), inactive lever responses (closed triangles), and reinforcements earned (open circles) during the 10 acquisition sessions of saccharin self-administration (a) (n = 38) and nicotine self-administration (b) (n = 58) in experiments 1 and 2

For the extinction tests in experiment 1, planned comparisons within each reinforcer revealed a significant effect of devaluation on responding on the saccharin-associated lever [F (1, 15) = 9.34, p < 0.05], but not on the nicotine-associated lever (p > 0.05). This is because rats subject to LiCl-saccharin pairings responded less than their controls during the SSA extinction test, while rats given LiCl-nicotine pairings did not respond differently from their controls during the NSA extinction test. ANOVA on active lever responding additionally revealed a significant effect of reinforcer [F (1, 15) = 12.46, p < 0.05]. There were no significant effects on inactive lever responding during the extinction test (Table 1).

Table 1 Mean inactive lever press data (±SEM) collapsed across the final 2 days of acquisition (baseline), the six degradation sessions (for experiment 2), extinction testing, and the 2 days of reacquisition testing (for experiment 1). Asterisk denotes data different from nicotine-trained groups (p < 0.05), n = 8–11 per group

For the reacquisition tests in experiment 1, ANOVA on reinforcements earned (Fig. 3b, top) revealed a significant reinforcer × devaluation interaction [F (1, 15) = 35.67, p < 0.05]. Rats given LiCl-saccharin pairings earned fewer saccharin reinforcements than their controls on both days of reacquisition, whereas those given LiCl-nicotine pairings earned fewer nicotine reinforcements than controls on the first day only.

Fig. 3 Effect of LiCl devaluation on extinction and reacquisition performance in experiment 1. a Active lever responses (±SEM) made on the levers previously associated with saccharin (left) and nicotine (right) during the extinction test sessions. Control: open bars, devalued: closed bars. b Reinforcements earned (top) and active lever presses (bottom) during the reacquisition test sessions for saccharin and nicotine. Circles: saccharin; squares: nicotine. Control: open symbols, devalued: closed symbols. Asterisk different from control (p < 0.05), n = 8–9 per group

Similarly, an ANOVA on active lever responses made during reacquisition (Fig. 3b, bottom) revealed a reinforcer × devaluation interaction [F (1, 15) = 19.10, p < 0.05], as rats given LiCl-saccharin pairings responded significantly less for saccharin on both days of reacquisition compared to control, but those given LiCl-nicotine pairings made significantly fewer responses for nicotine on only the first day of reacquisition. There were no significant effects on inactive lever responding during the reacquisition test sessions (Table 1).

Experiment 2A: contingency degradation in subjects performing both NSA and SSA

Three rats were excluded due to loss of catheter patency, and two due to a failure to respond in operant sessions. A total of 21 rats were used in the final data analysis: 10 rats in the saccharin-degraded group and 11 in the nicotine-degraded group. Rats were assigned to non-degraded (control) and degraded groups for each reinforcer at the end of training, based on reinforcements and active lever presses for both saccharin and nicotine. During degradation sessions, saccharin and nicotine reinforcements delivered to rats in the degraded and non-degraded control conditions were equated via yoking. The mean daily number (±SEM) of saccharin reinforcements thereby delivered to both groups across the six degradation sessions was 28.4 ± 1.3, and for nicotine was 12.7 ± 0.5.

Analysis of active lever responses during the degradation sessions revealed a significant reinforcer × day × degradation interaction [F (3.5, 67.3) = 6.20, p < 0.05]. This is because degradation of the contingency between lever pressing and reinforcer delivery significantly reduced lever pressing for saccharin across the degradation phase (Fig. 4a), but did not affect responding for nicotine (Fig. 4b). Inactive lever responding during the degradation sessions was not affected by the contingency manipulation for either reinforcer (Table 1).

Fig. 4 Mean (±SEM) active lever responses made for saccharin (a) and nicotine (b) in the final two sessions of acquisition (average of days 9 and 10) (left side) and the 6 days of contingency degradation (right side) in experiment 2. Responding for those rewards was either non-degraded (open symbols) or degraded (closed symbols), and subjects were trained to respond for either both nicotine and saccharin (triangles) or nicotine alone (squares). Asterisk different from control (p < 0.05), n = 10–11 per group

Analysis of active lever responses in the extinction tests which followed the degradation phase (Fig. 5) revealed a reinforcer × degradation interaction [F (1, 19) = 27.75, p < 0.05]. Rats subject to degradation of the contingency between lever pressing and saccharin delivery pressed significantly less on the saccharin-associated lever than their controls during the extinction test, whereas rats subject to degradation of the contingency between lever pressing and nicotine delivery did not differ from their controls on the nicotine-associated lever. There was a significant effect of reinforcer on inactive lever responding during the extinction test, as overall, responding was higher on the inactive lever during the extinction tests for saccharin [F (1, 19) = 4.79, p < 0.05; Table 1].

Fig. 5 Effects of contingency degradation of SSA or NSA on extinction responding. Data are mean (±SEM) active lever responses (as ratio of baseline) made during the extinction tests for saccharin (left side) and nicotine (center) in animals that received training with both reinforcers. Active lever responding of rats trained to self-administer only nicotine (nicotine-alone) is on the right. Control: open bars; degraded: closed bars. Asterisk different from control (p < 0.05), n = 10–11 per group

Experiment 2B: contingency degradation in rats performing NSA alone

Three rats were excluded due to loss of catheter patency; 21 rats were used in the final data analysis, with 10 rats in the nicotine-degraded group, and 11 rats in the nicotine non-degraded control group. At the end of training, rats were assigned to non-degraded and degraded groups, based on reinforcements and active lever presses. There were no differences in numbers of lever presses or reinforcements between the two groups (p > 0.05).

As in experiment 2A, reinforcements delivered to rats in the non-degraded control and nicotine-degraded conditions during degradation sessions were equated via yoking; the mean daily number of nicotine reinforcements thereby delivered to both groups in the six contingency-degradation sessions was 12.8 ± 0.4.

Degradation of the contingency between lever pressing and nicotine reward did not affect responding for nicotine during the degradation sessions. No significant effects were observed for active (Fig. 4b) or inactive (Table 1) lever pressing during the degradation sessions (p > 0.05). During the extinction test, there were no differences in active (Fig. 5, right) or inactive lever responding (Table 1).

Discussion

In the present work, we employed LiCl devaluation and contingency degradation procedures to determine whether nicotine-seeking behavior following limited duration self-administration training is a goal-directed or habitual behavior. We used a paradigm in which rats were trained to self-administer both orally consumed saccharin and IV nicotine in separate sessions in order to verify that the conditioning procedures we used were effective in devaluing reinforcers. In extinction tests, we found that responding for saccharin was highly sensitive to reductions in outcome value and the instrumental contingency. In contrast, responding for nicotine was insensitive to either manipulation. These results suggest that following comparable durations of training for both reinforcers, saccharin-seeking responses were goal-directed whereas nicotine-seeking responses were habitual.

LiCl devaluation

In experiment 1, the pairing of LiCl with saccharin or nicotine selectively reduced self-administration of the devalued reward during reacquisition tests, indicating that both reinforcers were devalued (Adams and Dickinson 1981), consistent with data from the extended training group of Clemens et al. (2014). Although passive versus active nicotine infusions have been shown to have different effects (e.g., Donny et al. 2000), our findings suggest that the interoceptive cues from passive IV nicotine infusion were effectively associated with the aversive properties of LiCl, such that self-administered nicotine lost incentive value as an operant reward. Pairing LiCl with saccharin reduced reacquisition responding for saccharin, and also reduced responding during the extinction test in the SSA context; this pattern of effects suggests that responding for saccharin was dependent on saccharin’s incentive value, and that saccharin-seeking was goal-directed. In contrast, the pairing of LiCl with nicotine did not influence responding during the extinction test in the NSA context, suggesting that nicotine-seeking was not sensitive to changes in nicotine’s incentive value, which indicates that responding for nicotine was habitual (Balleine and Dickinson 1998).

However, our observation that LiCl-nicotine pairings did not affect extinction responding but did suppress reacquisition responding is not consistent with the results of Clemens et al. (2014), who found that LiCl-nicotine pairings reduced extinction responding but did not affect reacquisition responding in rats that received 10 days of training. The reasons for these discrepant results are not clear, but procedural differences between the two studies may help to explain them. Differences in the pairing procedures used may have resulted in differential strengths of LiCl-nicotine associations between the Clemens study and ours. To potentially make our pairing procedure more effective, we used a 1.4× higher dose of LiCl (Paredes-Olay and López 2002) and utilized five nicotine infusions in each pairing, in contrast to the two nicotine infusions administered per pairing by Clemens et al. (2014). Another difference which may have contributed to differences in LiCl-nicotine associative strength is that the nicotine infusions were “experimenter delivered” by Clemens et al. (2014), but were delivered in the present work via infusion pumps. Our use of pumps reduced the amount of experimenter handling (Grabus et al. 2006), potentially increasing the subjects’ experience of interoceptive nicotine cues. Furthermore, Clemens et al. (2014) employed nose-poking as the operant response; differences between the two behavioral responses in drug seeking have been demonstrated; for example, only rats trained to lever-press demonstrate cue-induced reinstatement (Clemens et al. 2010). Other procedural differences are that subjects were group housed in the study conducted by Clemens et al. (2014) but individually housed in the present study, and that nicotine infusions were delivered over a duration of 3 s in the Clemens study, but over 1 s in ours. To date, there is no existing work specifically comparing 1 and 3 s infusion durations on NSA. One study showed that 1-s infusions led to enhanced NSA compared to 3 and 60 s (Sorge and Clarke 2009). The effects of different infusion rates on development of habitual responding remain to be determined.

In this study, our interpretation of enhanced development of habitual responding for nicotine is based on the finding that LiCl-nicotine pairings did not affect responding in the extinction test but reduced responding in the NSA reacquisition test, while LiCl-saccharin pairing reduced responding in both tests. However, in the absence of data showing that NSA transitions from goal-directed to habitual control, a rival interpretation of our findings is that the substrates underlying impaired nicotine reacquisition are not the same as those controlling extinction responding, i.e., separate mechanisms produced the differential effects of devaluation on extinction responding and reacquisition.

Contingency degradation

In order to test the possibility that our IV NSA training regimen led to habit formation, we employed another distinct behavioral assay, contingency degradation. Devaluation and contingency degradation, respectively, rely on one of two goal-directed learning processes, namely “incentive” and “contingency” learning (Balleine and Dickinson 1998). As these two processes may be mediated by separate neuroanatomical substrates (Balleine and O’Doherty 2010), it is desirable to employ both methods when evaluating a behavior as goal-directed or habitual (Dickinson and Balleine 1994; Yin and Knowlton 2006).

A major concern with interpreting contingency degradation data in the case of nicotine is that nicotine can have a complex influence on general activity (Clarke and Kumar 1983a, b; Miller et al. 2001; Stolerman et al. 1995) and injections of nicotine can reinstate nicotine-seeking behavior (Chiamulera et al. 1996; Shram et al. 2008). As this could potentially mask any contingency learning which took place during nicotine degradation sessions, the extinction test took on an additional role in ruling out nicotine-mediated behavioral activation; because no nicotine or cues were delivered during extinction testing, contingency learning masked by nicotine would be reflected in reduced extinction responding relative to that of subjects in the non-degraded control condition. We found that following IV NSA and oral SSA training, using parameters the same as in experiment 1, saccharin-seeking responses were suppressed by contingency degradation, and this suppression persisted into extinction testing. This indicates that the reduced responding of saccharin-degraded rats during degradation sessions was due to learning of the new instrumental contingency, rather than satiety (Dickinson and Mulatero 1989; Serlin and Torregrossa 2015). In contrast, nicotine-seeking responses were unaffected by nicotine degradation during the degradation sessions and in extinction testing. This suggests that contingency learning did not occur (Ostlund and Balleine 2008).

While the contingency degradation experiments for nicotine suggest that responding for nicotine was habitual, there are potential alternative interpretations. First, the nicotine non-contingently infused during the degradation sessions may have acutely impaired contingency learning. Although Donny et al. (1998) reported that passive nicotine infusions were incapable of maintaining unreinforced lever-press behavior in rats previously trained to respond for food, suggesting that this possibility is unlikely, it cannot be entirely ruled out by the design of the present work. To address this limitation, it would need to be shown that a contingency degradation paradigm is still effective in subjects under the effects of acute nicotine. It is also possible that contingency degradation of responding for nicotine may merely take longer than for saccharin, and this is why we did not see decreases in responding by the degraded groups over the six degradation sessions. Had we subjected the rats to more degradation sessions, it is possible we would have eventually observed reduced responding in the nicotine-degraded groups, thereby ruling out the possibility that nicotine impaired contingency learning.

Another possibility is that animals in the nicotine-degraded conditions persisted in lever pressing for nicotine due to the enhancing properties that nicotine has been shown to have on the reinforcing effects of cues themselves (Caggiula et al. 2008). However, our observation that nicotine-seeking responses persisted into the drug- and cue-free extinction session makes this an unlikely explanation. One possible way to rule this out in future studies would be to conduct nicotine-degradation sessions without cues.

Methodological concerns

Other interpretational issues pertain to the timeout periods we employed, the use of concurrent training for multiple rewards, and the distinct natures of orally consumed versus intravenously delivered rewards. First, in our study, timeout periods of 30 s were used in both NSA and SSA training; FR-1 schedules employed in this way may resemble fixed interval 30-s reinforcement schedules (albeit with houselight illumination signaling reward availability). How fixed interval training differs from other schedules in its effect on habit formation is poorly understood; however, DeRusso et al. (2010) found that fixed interval training produced habitual behavior at a slower rate than random-interval training (possibly due to the more predictable contingency on fixed interval schedules). This suggests that our use of timeout periods was unlikely to promote habit formation.
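The resemblance between an FR-1 schedule with a 30-s timeout and a fixed interval 30-s schedule can be illustrated with a minimal sketch. The simulation below is hypothetical and is not part of the study's methods: it assumes evenly spaced lever presses and a 1-h session (neither is taken from the present work), and the function name `rewards_earned` is introduced here purely for illustration.

```python
def rewards_earned(press_interval_s: int,
                   session_s: int = 3600,
                   timeout_s: int = 30) -> int:
    """Count rewards under FR-1 when each reward starts a timeout.

    Assumptions (illustrative only): presses occur every
    `press_interval_s` seconds, the session lasts `session_s` seconds,
    and presses made during the timeout are never reinforced.
    """
    rewards = 0
    next_available = 0  # earliest time at which a press is reinforced
    for t in range(press_interval_s, session_s + 1, press_interval_s):
        if t >= next_available:
            rewards += 1
            next_available = t + timeout_s
    return rewards


if __name__ == "__main__":
    # Compare reward counts across press rates, with and without timeout.
    for interval in (60, 30, 10, 1):
        with_timeout = rewards_earned(interval)
        no_timeout = rewards_earned(interval, timeout_s=0)
        print(interval, with_timeout, no_timeout)
```

Under these assumptions, the number of rewards stops increasing once presses are spaced 30 s apart or closer, whereas without a timeout it grows in proportion to the number of presses. In other words, at high response rates the timeout caps the reward rate in the same way a fixed interval schedule would.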

The second issue is that rats may maintain goal-directed control over behavior when trained with two distinct action-outcome contingencies (Kosaki and Dickinson 2010). However, this effect predominantly occurs when the option to perform both actions is present in a “choice” extinction test (Colwill and Rescorla 1985; Halbout et al. 2016); when extinction test responding for each reward is evaluated in a distinct and specific context, as was done in our study, habitual behavior can develop (Killcross and Coutureau 2003). We designed our study to maximize the distinction between the SSA and NSA training and testing contexts. NSA and SSA sessions were conducted in different rooms, and the chambers were made distinct by different arrangements of levers and the presence or absence of the reward receptacle (present only in SSA sessions) and the shielded drug line connected to the implanted catheter of each subject (present only in NSA sessions). This separation appears to have been effective: we observed strong reinforcer-dependent devaluation and contingency degradation effects, which would not be expected had cues or contexts generalized strongly between NSA and SSA sessions. A related concern may be that access to an alternative and palatable reward (saccharin) could perturb patterns of nicotine self-administration (Huynh et al. 2015; Pelloux et al. 2015). However, we do not believe this to be an issue for two reasons: 1) we specifically separated the NSA and SSA sessions by more than 2 h, and 2) in experiment 2, responding for nicotine did not differ between rats responding for nicotine and saccharin and those responding for nicotine alone. Therefore, it is unlikely that NSA was affected by the performance of SSA.

Finally, it is important to mention that the direct comparison of SSA and NSA is problematic. This is because the routes of administration of the two reinforcers have different response requirements (i.e., saccharin also requires licking and drinking) and elicit different sensory and neuronal stimulus properties, which may influence how these rewards are internally represented by the rats. These differences may in part account for our observation that the effects of LiCl pairing were more persistent for saccharin than for nicotine during reacquisition testing.

Time-course of habit formation

That nicotine-seeking is more prone to habit formation than saccharin-seeking would be consistent with previous findings that habitual responding for drugs of abuse such as ethanol and cocaine develops faster than for natural reinforcers like food pellets or sucrose solution (Corbit et al. 2012; Dickinson et al. 2002; Miles et al. 2003). A surprising aspect of the current work was the speed at which habitual nicotine-seeking appeared to develop. Another major factor that influences how quickly habitual responding develops is the reinforcement schedule used during training (Dickinson et al. 1983).

Training animals on ratio schedules of reinforcement (as we did in the present study) is thought to be less likely to produce habitual responding than training on interval schedules (Yin and Knowlton 2006). While daily random ratio training has been shown to produce habitual behavior in approximately 2 to 4 weeks (Corbit et al. 2012; Mangieri et al. 2012, 2014), at least one report exists of habitual responding for a natural reward following 10 days of training under an FR-1 schedule (Adams 1982), which is what we observed with nicotine in the present study. As the critical feature of training schedules that promote habit formation is a weak correlation between changes in response rate and changes in reward rate (Balleine and Dickinson 1998; Yin and Knowlton 2006), the reasons underlying these discrepancies in the time-course of habit formation are unclear. However, one methodological issue that may have contributed is that Adams (1982) conducted LiCl-reward pairings in the operant chambers, so context-LiCl associations may have confounded extinction test results (Colwill and Rescorla 1985). Indeed, other studies examining habit formation under fixed ratio schedules have found goal-directed behavior to persist for several weeks of training (Hay et al. 2013; Shillinglaw et al. 2014).

Conclusions

The current findings suggest that nicotine may facilitate a transition to habitual responding, as do other drugs of abuse. While this lends credence to current conceptual accounts of addiction which propose that the chronic self-administration of drugs may potentiate habitual response behaviors (Belin et al. 2013; Everitt et al. 2008; Everitt and Robbins 2005, 2016), the apparent rate at which this occurred suggests that nicotine could have unique interactions with the neural substrates underlying habit formation, compared to other drugs of abuse. It is thought that the development of habits is mediated by a general shift in behavioral control from those neural networks responsible for goal-directed actions to those for habitual responses (Belin-Rauscent et al. 2012; Knowlton 2015; Smith and Graybiel 2016), and in support of this, Clemens et al. (2014) have already reported increased c-Fos expression in brain regions following extended NSA training. Our most important finding is that nicotine seeking was not goal-directed even after limited operant training. Therefore, a key extension of the current work would be to determine the relative contributions made to the acquisition of NSA by those striatal subregions specifically involved in the acquisition and expression of either goal-directed or habitual behavior.