Introduction

Relapse to drug use after a period of abstinence is one of the main obstacles to the effective treatment of drug abuse. The extinction/reinstatement procedure is a widely used animal model of relapse (see reviews by Carroll 1998; Spealman et al. 1999; Stewart 2000; Shalev et al. 2002). In this procedure, the availability of a self-administered drug is discontinued until responding ceases. Then, various manipulations are compared to assess their ability to reinstate the response. The most commonly studied, and most consistently effective, manipulation within this procedure is "priming" by response-independent drug delivery (Gerber and Stretch 1975; de Wit and Stewart 1981; Schenk and Partridge 1999). Other manipulations that can reinstate "drug seeking" after extinction include response-independent presentation of drug-paired stimuli (Davis and Smith 1976; de Wit and Stewart 1981), "stress" (i.e. inescapable footshock; Shaham et al. 1996; Ahmed and Koob 1997), and priming by drugs other than the training drug (de Wit and Stewart 1981; Slikker et al. 1984).

The face validity of this extinction/reinstatement model clearly extends to situations where abstinence results from the unavailability of the drug or the blocking of drug effects by antagonist treatment (e.g. see Fuchs et al. 1991; Kosten and Kosten 1991). Furthermore, the extinction/reinstatement model appears to have direct relevance to behaviorally-based treatments that seek to eliminate the conditioned effects of drug-related stimuli by presenting the stimuli in the absence of drug effects (e.g., O'Brien et al. 1988). However, non-reinforcement is obviously not the only means by which drug seeking can be reduced. For example, some therapeutic treatments involve associating drug-related cues with aversive events, or making an aversive event contingent on drug-taking behavior (Childress et al. 1985; Crowley 1985; Frawley and Smith 1992; Brewer 1993). Abstinence can also occur spontaneously (i.e. without treatment) due to inherently aversive consequences of drug use (Cunningham et al. 2000; Klingemann 1991; Burman 1997). Unfortunately, the reinstatement of drug seeking that has been suppressed by non-extinction procedures has received little attention by researchers.

To our knowledge, there has only been one previous study in which researchers attempted to reinstate punished drug self-administration. Smith and Davis (1974) allowed rats to self-administer either morphine or amphetamine for three sessions, then punished the self-administration response with footshock. This manipulation was highly effective in eliminating self-administration. When the punishment contingency was discontinued, responding failed to resume over the course of four sessions lasting 12 h each, even when priming injections were given. The authors attributed the effectiveness of the punishment procedure to the fact that it was instituted early in training, after only three days of acquisition. However, the length of training was not systematically manipulated in that study, and there have been no subsequent studies of punished drug self-administration in rats.

Although only Smith and Davis (1974) have previously attempted to produce reinstatement of punished self-administration responding, punishment has been studied extensively with drug reinforcers in primates. For example, it has been shown that the level of suppression produced by punishment in primates depends on both the unit dose of the reinforcing drug and the intensity of the shock (Grove and Schuster 1974; Bergman and Katz 1989). When constant training parameters are maintained, punishment sometimes loses its effectiveness over time (Bergman and Johanson 1981). In choice procedures, monkeys prefer a non-punished alternative when doses are equal, but they may prefer the punished alternative when it is associated with a higher dose (Johanson 1977). Thus, the suppression of drug self-administration by punishment in primates has generally been less extreme than the lasting suppression observed by Smith and Davis (1974) in rats.

The goals of the present study were to examine the effects of punishment on self-administration of remifentanil (a short-acting µ-opioid agonist with reinforcing properties comparable to those of heroin; see Panlilio and Schindler 2000) and to determine whether response-independent priming with remifentanil can reinstate punished responding. This was accomplished by systematically replicating the study of Smith and Davis (1974), extending its design in several ways. First, the amount of self-administration training prior to the introduction of punishment was systematically varied to determine whether punishment has a more robust effect if instituted early in training. Second, post-punishment testing continued for a sufficient number of sessions to allow self-administration to resume even in rats that did not receive priming. Third, the priming manipulation was conducted more systematically and compared to appropriate non-priming control conditions. Finally, parallel manipulations were performed in rats trained with food. Although punished food responding has been studied quite extensively (see review by Azrin and Holz 1966), to our knowledge neither the priming of punished responding nor the length of training prior to punishment appears to have been investigated before with food reinforcement.

Materials and methods

Subjects

Thirty-three experimentally-naive, male, Long-Evans hooded rats (Charles River Laboratories, Wilmington, Mass., USA), weighing approximately 350–400 g, were individually housed with free access to water. One week after the rats were received (and several weeks before the start of the experiment), food was restricted to approximately 15 g/day of laboratory chow, given in the home cage. Under this feeding regimen, body weights were maintained close to their original values throughout the course of training. Rats in all groups (food-trained as well as drug-trained) were food restricted. On training days (Monday to Friday), food was given after the session. Lights in the cage room were turned on at 1800 h and off at 0600 h (reversed light cycle), and experiments were conducted between 0900 and 1500 h.

At least 3 days prior to the beginning of training, rats in the self-administration groups were catheterized (see Panlilio et al. 1996 for a more detailed description of the surgical procedure). In brief, catheters consisting of 4 cm of Silastic tubing (0.044 mm ID, 0.814 mm OD) connected to vinyl tubing (Dural Plastics, 0.5 mm ID, 1.0 mm OD) were implanted in the right jugular vein under aseptic conditions using IP ketamine (60 mg/kg) and xylazine (10 mg/kg) anesthesia. Catheters were flushed before and after each training session with 0.1 ml of saline solution.

The animal facilities were fully accredited by the American Association for the Accreditation of Laboratory Animal Care (AAALAC), and all procedures were conducted in accordance with the guidelines of the Animal Care and Use Committee of the NIDA Intramural Research Program and the National Research Council (1996).

Apparatus

Experimental chambers (30×24×29 cm, Coulbourn Instruments, Allentown, Pa., USA) were enclosed individually in sound attenuation chests. Each experimental chamber had a stainless-steel grid floor and two nose-poke holes in the right wall. Previous data indicated that, in the absence of reinforcement, responding occurred equally in both holes (Panlilio and Schindler 2000). A shielded white light bulb (type #1820), situated on the wall above the nose-poke holes, was illuminated at all times except during reinforcement and time-out periods, when the light was pulsed at a rate of 5 Hz. Food pellets (45 mg; Product #F0021; Bio-Serv, San Diego, Calif., USA) were delivered to a food trough situated between the two nose-poke holes. Scrambled shock (1.5 mA, 0.5 s) was delivered from Coulbourn shockers (model H13-16) to the grid floor. Drug solution was delivered through Tygon tubing, protected by a metal spring and suspended through the ceiling of the experimental chamber from a single-channel fluid swivel (Alice King Chatham Medical Arts, Hawthorne, Calif., USA). This tubing was attached to a syringe pump (Med-Associates, St Albans, Vt., USA) that delivered fluid at a rate of 3.19 ml/min using a 10-ml syringe. To reduce tension on the catheter, the spring was attached to a 20-mm plastic screw that was mounted on the rat's head during catheterization surgery. Experimental events were controlled by computer using a MED Associates interface. Remifentanil, a rapidly-metabolized µ-opioid agonist (see Panlilio and Schindler 2000), was obtained from the DHHS Supply Service Center (Perry Point, Md., USA) as the commercial formulation, Ultiva (manufactured by Glaxo Wellcome and later by Abbott Laboratories). The infused solution had a concentration of 50 µg/ml, and the duration of the infusion (approximately 0.5 s) was based on the body weight of the rat.
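The relation between body weight and infusion duration can be illustrated with a short calculation. The following is only a sketch under stated assumptions: it supposes that the duration was obtained by dividing the required volume by the pump flow rate, which the text does not spell out, and it uses the unit dose of 4 µg/kg given in the acquisition procedure. The function name and parameter names are hypothetical.

```python
def infusion_duration_s(body_weight_kg,
                        dose_ug_per_kg=4.0,        # unit dose from the acquisition procedure
                        conc_ug_per_ml=50.0,       # concentration of the infused solution
                        pump_rate_ml_per_min=3.19  # syringe pump flow rate
                        ):
    """Estimate the pump run time (s) for one infusion.

    Assumption: duration = volume / flow rate. The source states only that
    the duration was based on body weight.
    """
    volume_ml = dose_ug_per_kg * body_weight_kg / conc_ug_per_ml
    flow_ml_per_s = pump_rate_ml_per_min / 60.0
    return volume_ml / flow_ml_per_s


# A 375-g rat needs 0.03 ml per infusion, i.e. a run time of about 0.56 s,
# which is in line with the "approximately 0.5 s" reported above.
duration_375g = infusion_duration_s(0.375)
```

Under these assumptions, rats in the reported weight range (350–400 g) would receive infusions lasting roughly 0.53 to 0.60 s.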

Acquisition and maintenance

Each rat was trained with either food or remifentanil (4 µg/kg per infusion). For the first two sessions, which lasted 4.5 h each for food and 6.25 h each for remifentanil, food pellets or infusions of remifentanil were delivered independently of the rat's behavior. During these sessions, the nose-poke holes were removed from the chamber. Reinforcers were delivered at random intervals averaging 150 s for food and 225 s for drug. Each presentation of food or remifentanil was paired with flashing of the houselight for 5 s. Beginning with the next session, the nose-poke holes were installed and a continuous-reinforcement schedule was instituted. A response in the left nose-poke hole immediately activated the feeder or infusion pump and also caused the houselight to flash for 5 s. During this 5-s time-out period, responding had no programmed consequences. Following time-out, the next response in the left hole was reinforced. Responses in the nose-poke hole on the right side were recorded but never reinforced. Each session under the continuous-reinforcement schedule lasted until 100 reinforcers had been received.
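The continuous-reinforcement contingency with its 5-s time-out can be sketched in a few lines. This is an illustrative reconstruction, not the MED Associates program used in the study; the function name is hypothetical, and treating a response that falls exactly at the end of the time-out as reinforced is an assumption.

```python
def reinforced_responses(response_times_s, timeout_s=5.0, max_reinforcers=100):
    """Return the times (s) of left-hole responses that would be reinforced.

    A response is reinforced unless it falls within the time-out that follows
    the previous reinforcer; the session ends after max_reinforcers.
    """
    reinforced = []
    timeout_ends = float("-inf")  # no time-out is in effect at session start
    for t in sorted(response_times_s):
        if len(reinforced) >= max_reinforcers:
            break  # session over: 100 reinforcers received
        if t >= timeout_ends:
            reinforced.append(t)
            timeout_ends = t + timeout_s  # houselight flashes for 5 s
    return reinforced


# Responses at 2.0 s and 6.0 s fall inside time-out periods and are not reinforced.
example = reinforced_responses([0.0, 2.0, 5.5, 6.0, 11.0])
```

Responses made during the time-out are still counted as responses, which is why response rates can rise without any change in reinforcement rate.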

Rats were divided into four groups. The Short-Term Food group (n=8) and the Short-Term Drug group (n=8) were trained with the continuous-reinforcement schedule for 3 days. The Long-Term Food group (n=8) and the Long-Term Drug group (n=9) were trained with the continuous-reinforcement schedule for 27 days. Each of these groups was divided into two subgroups (Primed and Non-Primed), with four rats in each subgroup except the Long-Term Non-Primed Drug subgroup, which included five rats.

Punishment

After short- or long-term training with the continuous-reinforcement schedule, punishment training was conducted for 3 days (with the final punishment session occurring on a Monday). During punishment sessions, which lasted 5 h each, left-hole responses continued to produce either food or remifentanil (paired with flashing of the houselight), but each response also produced a 0.5-s, 1.5-mA shock.

Reinstatement

After 3 days under the punishment schedule, the punishment contingency was discontinued and reinstatement testing was begun. During this testing, responding in the left hole continued to produce food or remifentanil, but no shock was given. Each of the four groups was split into two subgroups, Primed and Non-Primed. In the priming condition, five response-independent reinforcers were given 1 min apart at the beginning of each session, and series of five reinforcers were also given 1, 2, and 3 h later if responding did not commence. In the non-priming condition, response-independent reinforcers were not given. Each time that food or drug was delivered, the houselight was flashed for 5 s. These sessions lasted for 5 h or 100 reinforced responses, whichever came first. This testing phase was continued with each rat until it reached a criterion of responding 100 times within a session.
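The timing of the priming deliveries described above can be sketched as follows. The code is a hypothetical illustration rather than the control program used in the study. It assumes that the first series is always delivered, that a later series is withheld once any response has occurred before its scheduled start, and that a series already under way is completed; the text does not specify these details.

```python
def priming_times_min(first_response_min=None,
                      series_starts_min=(0, 60, 120, 180),
                      primes_per_series=5,
                      spacing_min=1):
    """Return the session times (min) at which priming reinforcers are delivered.

    first_response_min is the time of the rat's first response, or None if the
    rat never responds during the session.
    """
    times = []
    for start in series_starts_min:
        responding_began = (first_response_min is not None
                            and first_response_min < start)
        if start > 0 and responding_began:
            break  # responding has commenced, so no further series are given
        times.extend(start + i * spacing_min for i in range(primes_per_series))
    return times


# A rat that first responds 90 min into the session receives the series
# scheduled at 0 and 60 min, but not those at 120 and 180 min.
primes = priming_times_min(first_response_min=90)
```

Under these assumptions, a rat that never responds receives 20 priming reinforcers in a session.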

Statistical analysis

Data were subjected to multi-level analysis with maximum likelihood estimation (PROC MIXED; see Singer 1998) using SAS software (SAS Institute, Cary, N.C., USA). For the highest-order interaction that was significant for each analysis, the Tukey-Kramer procedure was used to conduct pair-wise comparisons, maintaining an overall significance level of 0.05 for each set of comparisons. Thus, all paired comparisons described as "significant" have P<0.05.

The number of minutes required to obtain 100 reinforcers under the continuous-reinforcement schedule was analyzed using the following factors: reinforcer (food versus remifentanil), length of training (short-term versus long-term), treatment subgroup (primed versus non-primed), and day of training. For this measure, which is inversely proportional to reinforcement rate since the number of reinforcers was always 100, the analysis included data from the following days: day 1 of continuous-reinforcement training, day 3 of continuous-reinforcement training (the final day of this phase for the short-term groups), day 27 of continuous-reinforcement training (the final day of this phase for the long-term groups), and the criterion day (i.e. the first day on which 100 reinforcers were obtained during reinstatement testing). For data from the criterion day, the amount of time was calculated starting with the first response.

Response rates (responses/min) were analyzed using the following factors: reinforcer, length of training, treatment subgroup, response type (active versus inactive hole), and day. The days analyzed were: days 1, 3 and 27 of continuous-reinforcement training, each of the 3 days of punishment training, and the criterion day. Responses during the 5-s time-out period were included in calculations of response rates. There were no responses during time-out on the 3 days of the punishment phase, so shock rates were equal to response rates during this phase. For the criterion day, time prior to the first response was not included in the calculation of response rates. To determine whether priming produced a general increase in locomotion, paired comparisons were made between subgroups' rates of inactive-hole responding during the criterion session, even though the appropriate interaction term (reinforcer×length of training×treatment×response type) was not significant in the overall analysis of response rates.
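The response-rate measure, including the criterion-day rule of excluding time before the first response, can be sketched as below. The function and argument names are hypothetical, and the handling of sessions with no responses (a rate of zero) is an assumption made for the illustration.

```python
def response_rate_per_min(response_times_min, session_end_min,
                          exclude_latency=False):
    """Compute responses/min for one nose-poke hole in one session.

    response_times_min includes responses made during time-out periods.
    With exclude_latency=True (criterion day), the time before the first
    response is left out of the denominator.
    """
    if not response_times_min:
        return 0.0  # assumption: no responses gives a rate of zero
    start = min(response_times_min) if exclude_latency else 0.0
    duration = session_end_min - start
    if duration <= 0:
        raise ValueError("session must extend beyond the first response")
    return len(response_times_min) / duration


# Four responses in a 20-min session: 0.2/min over the whole session, but
# 0.4/min when the 10 min preceding the first response are excluded.
whole_session = response_rate_per_min([10, 12, 14, 20], 20)
criterion_day = response_rate_per_min([10, 12, 14, 20], 20, exclude_latency=True)
```

Excluding the initial latency keeps the criterion-day rate from being diluted by the period before a rat resumed responding, so it can be compared with pre-punishment rates.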

During reinstatement testing, responding was essentially "all-or-none", with rats either obtaining all 100 of the available reinforcers within the session or only responding 0–2 times. Therefore, the effects of priming were assessed based on the number of sessions required to reach the 100-reinforcer criterion. This variable was analyzed with the following factors: reinforcer, length of training, and treatment.

Results

Acquisition and maintenance

All rats acquired the nose-poking response and received 100 reinforcers on the first day of continuous-reinforcement training. Figure 1 shows that the amount of time required to receive 100 reinforcers tended to decrease over the course of training, but the only substantial change in this measure was that remifentanil-trained rats took longer on the first day than on subsequent days. Food-trained rats responded faster than remifentanil-trained rats throughout acquisition and maintenance training (see Fig. 2). Although response rates in remifentanil-trained rats reached stability by the third day, responding in the long-term food group increased more than 2-fold by the end of training. This increase was mainly due to responding during the time-out period and did not affect reinforcement rates.

Fig. 1.

Amount of time (min; mean±SEM) before all 100 available reinforcers were obtained by the short-term and long-term remifentanil-trained and food-trained groups during the first training session with the nose-poke operandum installed (session 1), during the third training session (session 3, the final pre-punishment session for the short-term groups), during the 27th training session (session 27, the final pre-punishment session for the long-term groups), and during the criterion session (criterion, the first session during reinstatement testing in which 100 reinforcers were obtained). Times for the criterion session were calculated starting with the first response. Statistical analysis of this measure indicated that the effects of reinforcer [F(1,25)=134.94, P<0.0001], day of training [F(2,25)=21.21, P<0.0001], and the interaction of these two factors [F(2,25)=16.89, P<0.0001] were significant. There were no significant differences in these data between long-term and short-term groups or between primed and non-primed subgroups (P>0.1). Paired comparisons revealed that 100 reinforcers were obtained significantly faster on day 3 compared to day 1 and on day 27 compared to day 1 for remifentanil-trained rats (P<0.05), but days 3 and 27 did not differ from each other (P>0.8). This measure did not differ significantly across these three training days for food-trained rats (P>0.9). Food-trained rats received 100 reinforcers significantly faster than remifentanil-trained rats on each of the days shown here (P<0.05). On the criterion day, time to the 100th reinforcer did not differ from the last training day for either remifentanil- or food-trained rats (P>0.9). Left panel: remifentanil-trained groups. Right panel: food-trained groups. Open symbols: short-term groups. Filled symbols: long-term groups. Circles: mean of primed and non-primed subgroups. Upward-pointing triangles: primed subgroups. Downward-pointing triangles: non-primed subgroups. Note that left and right panels use different scales

Fig. 2.

Responses per minute (mean±SEM) in the active and inactive nose-poke holes on days 1, 3, and 27 of continuous-reinforcement training (crf1, crf2, and crf3), on days 1, 2 and 3 of punishment training (pun1, pun2, and pun3), and on the first day that the 100-reinforcer criterion was met after the punishment contingency was discontinued (crit). Statistical analysis of these data indicated significant main effects of reinforcer [F(1,25)=320.97, P<0.0001], day of training [F(6,301)=86.83, P<0.0001], and response type [active versus inactive; F(1,301)=370.25, P<0.0001], as well as significant interactions of reinforcer×day [F(6,301)=73.21, P<0.0001], reinforcer×response type [F(1,301)=301.96, P<0.0001], day×response type [F(6,301)=79.47, P<0.0001], and reinforcer×day×response type [F(6,301)=67.56, P<0.0001]. Notably, there were no significant main effects or interactions involving training history (short versus long) or treatment subgroup (primed versus non-primed) for this measure (P>0.1). Therefore, the following paired comparisons (all significant P<0.05 and non-significant P>0.15) were conducted based on reinforcer, day, and response type. For both remifentanil- and food-trained rats, comparisons revealed significantly higher rates of responding in the active hole than the inactive hole on continuous-reinforcement days 1, 3, and 27, and on the criterion day, but not on any of the punishment-phase days. When active-hole response rates were compared across the remifentanil and food groups, response rates were found to be significantly higher in the food-trained rats than the remifentanil-trained rats on continuous-reinforcement days 1, 3, and 27, and on the criterion day, but not on any of the punishment-phase days. 
Response rates in the inactive hole did not differ significantly between remifentanil- and food-trained rats on any day, and inactive-hole response rates also did not differ significantly across days within the remifentanil-trained rats or within the food-trained rats. For remifentanil-trained rats, active-hole response rates on days 3 and 27 of continuous-reinforcement training were significantly higher than on the first day, but days 3 and 27 did not differ from each other. For food-trained rats, active-hole response rates on days 1, 3 and 27 of continuous-reinforcement training increased progressively, with rates on each of these days differing significantly from rates on each of the other 2 days. For both remifentanil- and food-trained rats, active-hole response rates were significantly reduced on the first day of punishment training compared to the last day of continuous-reinforcement training, but rates did not differ significantly across the 3 days of punishment training. Active-hole response rates were significantly higher on the criterion day than on the last day of punishment training for the remifentanil-trained rats, but these criterion-day rates did not differ from the last day of continuous-reinforcement training. Criterion-day active-hole rates were also significantly increased compared to the last day of punishment training for the food-trained rats; while these criterion-day rates were not significantly different from those on day 3 of continuous-reinforcement training, they were significantly lower than on day 27. Left panel: remifentanil-trained groups. Right panel: food-trained groups. Open symbols: short-term groups. Filled symbols: long-term groups. Circles: active hole. Squares: inactive hole. Note that left and right panels use different scales

Punishment

Responding in the active hole was rapidly suppressed by the punishment contingency in all rats (see Fig. 2). Very few responses occurred during this phase, and the ones that did were mainly during the first 15 min of the first session. No rat emitted more than ten punished responses over the 3 days, and many rats did not respond in the active hole at all on days 2 or 3 of punishment training.

Reinstatement

Figure 3 shows that the priming procedure effectively reinstated responding, regardless of the reinforcer type (remifentanil versus food) or length of training prior to punishment (3 versus 27 days). In the absence of priming, none of the rats in either the food or remifentanil groups responded at all on the first day that the punishment contingency was discontinued. On average, the long-term remifentanil-trained rats that did not receive priming resumed responding slightly faster than the other three non-primed groups, but this difference was not significant (P>0.8).

Fig. 3.

Number of sessions (mean±SEM) required before all 100 available reinforcers were obtained within a session in sub-groups that received priming (i.e. experimenter-delivered remifentanil or food) or no priming after short-term training with remifentanil (Short Remi), long-term training with remifentanil (Long Remi), short-term training with food (Short Food), or long-term training with food (Long Food). Statistical analysis indicated that the effect of priming was significant [F(1,32)=28.9, P<0.0001], but the effects of reinforcer, length of training, and all interactions involving these effects were not significant (P>0.3). Filled bars primed groups. Open bars non-primed groups

After the first few responses, responding during the reinstatement phase typically occurred at about the same rate as during the last pre-punishment session (see criterion-day data in Figs. 1 and 2). This was true of all groups except the long-term food group, which no longer had high rates of time-out responding.

To evaluate the possibility that priming promoted active-hole responding simply by increasing levels of locomotor activity, response rates in the inactive hole on the criterion day were analyzed. Figure 4 reveals that there was no evidence of non-specific locomotor effects. That is, priming did not significantly affect inactive-hole responding for either remifentanil- or food-trained rats.

Fig. 4.

Inactive-hole response rates (mean±SEM) on the criterion day. Priming did not have a significant effect on inactive-hole responding in any of these groups (P>0.9). Response rates in the long-term drug subgroups were significantly higher than in the short-term drug subgroups (P<0.05), but the short- and long-term food subgroups did not differ significantly (P>0.8). Filled bars primed groups. Open bars non-primed groups

Discussion

Self-administration of remifentanil was robustly suppressed by response-contingent footshock. When this punishment contingency was discontinued, self-administration eventually resumed in all rats. However, rats resumed responding much sooner when they received response-independent priming infusions of remifentanil. These results suggest that punishment effects are probably not more robust in rats than in primates, despite the apparent contrast between the lasting punishment effect obtained by Smith and Davis (1974) in rats and the sometimes transitory nature of punishment-induced suppression of self-administration that has been observed in monkeys (Johanson 1977; Bergman and Johanson 1981).

The basic design and parameters of the present study were modeled on those of Smith and Davis (1974), including 1) the number of sessions in each phase of training (3 days of continuous reinforcement followed by 3 days of punishment); 2) the number of self-administered infusions (300 over 3 days in the present study versus an average of 281 in the earlier study); and 3) the nominal shock intensity (1.5 mA). Many aspects of the results obtained in these two studies were also comparable, including the rapid acquisition of responding and the rapid suppression of this responding by punishment. However, Smith and Davis found that punished self-administration responding did not resume even after the punishment contingency was discontinued and the drug was presented non-contingently.

Perhaps the most important procedural differences between these two studies were that priming presentations were more frequent and more time was allowed for responding to resume in the post-punishment phase of the present study. During this phase, Smith and Davis (1974) allowed one 12-h session for all of their rats and three additional 12-h sessions for four rats that received primes. In the present study, daily 5-h sessions were continued until responding resumed, which took up to 10 days in non-primed rats. Other procedural differences between the study of Smith and Davis (1974) and the present study include, respectively: 1) the strain of rats (Sprague-Dawley versus Long-Evans); 2) the reinforcer (morphine or amphetamine versus remifentanil or food); 3) the temporal parameters of the priming procedure (details of this procedure were not given by Smith and Davis); 4) the operant response (lever-pressing versus nose-poking); and 5) the nature of pre-training prior to the acquisition phase. During pre-training, the rats of Smith and Davis received response-contingent saline infusions for one session, while in the present study rats received non-contingent drug infusions or food-pellet presentations for two sessions. This non-contingent drug presentation during pre-training was intended to facilitate the subsequent acquisition of the self-administration response by making the flashing houselight a conditioned reinforcer. It is unlikely that this pre-training procedure increased the likelihood of reinstatement induced by priming in the final stage of the study, however. The nose-poke operandum was not available during the pre-training phase, and non-contingent reinforcers were never given during the acquisition, maintenance, or punishment phases. Therefore, the training procedures of the present study should not have caused non-contingent drug delivery to become a discriminative stimulus for nose-poking.

Although response-contingent shock has been found to suppress drug self-administration in a number of studies involving primates (as briefly described in the Introduction, above), there are conditions under which shock has been reported to enhance drug self-administration in rats. In a study by Beck and O'Brien (1980), response-contingent delivery of a low-intensity shock to an electrode implanted in the foreleg of rats was found to produce a dramatic enhancement of morphine self-administration, leading to intake of lethal levels of drug. In another study (Dib and Duclaux 1982), non-contingent delivery of shock through an electrode pasted to the base of the tail was found to enhance intracerebroventricular morphine self-administration in rats during a 15-min shock period. In the Dib and Duclaux study, the enhancement of self-administration during the period of inescapable shock was attributed to reinforcement of responding due to the drug's analgesic properties. A similar explanation might also be applied to the phenomenon observed by Beck and O'Brien; although they intended their shock intensity to be below the pain threshold, the enhancements of self-administration were dependent on the duration of the shock. In contrast, the procedures of both the present study and that of Smith and Davis made it unlikely that the analgesic effects of remifentanil or morphine, respectively, could have occurred rapidly enough to alleviate the effects of the shock, which was delivered contiguously with the infusion.

In addition to the two studies showing enhanced morphine self-administration during shock presentation (Beck and O'Brien 1980; Dib and Duclaux 1982), a number of studies have shown that extinguished self-administration responding in rats can be reinstated by a period of non-contingent shock presented in the context where self-administration had previously occurred (see review by Shaham et al. 2000). Non-contingent shock can also enhance the subsequent acquisition of cocaine self-administration in rats (Goeders and Guerin 1994). However, in these procedures, shock was delivered at a time when there was a low probability of responding due to extinction or because the response had not yet been acquired. In contrast, ongoing drug self-administration can be robustly suppressed by shock presented either independent of responding (Kearns et al. 2002) or contingent on the self-administration response (present study; Smith and Davis 1974). These punishment studies used a higher shock intensity (1.5 mA) than the studies showing shock-induced enhancement or reinstatement of self-administration (which typically used 0.5 mA). It may also be important that a fairly high shock intensity was used early in training in the punishment studies. With food responding, it is known that if a low-intensity, response-contingent shock is used early in training, even intense shock may fail to suppress responding later in training (Azrin and Holz 1966).

The priming effect observed here was not as immediate as that typically observed in the extinction/reinstatement procedure. Although many rats in the present study did resume responding during the first priming session, this did not occur after the first set of primes. An average of more than one session was required before primed rats resumed responding, regardless of the reinforcer (drug versus food) or length of training. However, once rats began responding during the reinstatement phase, they quickly returned to their pre-punishment rates (except for the long-term food group, which no longer exhibited high rates of time-out responding). These results indicate that priming has qualitatively, but perhaps not quantitatively, similar effects in the extinction/reinstatement and punishment/reinstatement procedures.

Response-contingent drug was continuously available during both the punishment and reinstatement phases of the present study. Thus, resumption of responding under this procedure may be considered "reacquisition" as well as "reinstatement." The availability of the drug throughout the experiment, particularly during the response-reduction phase, distinguishes this reinstatement procedure from those using extinction. This feature may make the punishment/reinstatement procedure a particularly relevant model of situations where human drug abusers receive both reinforcing effects of the drug and aversive consequences. However, if desired, the procedure could be modified to determine whether discontinuing response-contingent reinforcement during punishment and/or priming would influence the reinstatement effect. In addition, since both the reinforcer and its associated stimulus (flashing light) were presented during priming in the present study, it could be useful in the future to determine whether the drug-paired stimulus contributes to the reinstatement effect. Previous work with the extinction/reinstatement procedure has demonstrated important differences between the neural mechanisms underlying reinstatement produced by drug priming, presentation of drug-paired stimuli, and stress (Shalev et al. 2002).

The evidence obtained here demonstrates that priming can have similar effects in both the punishment/reinstatement and extinction/reinstatement procedures. However, it is possible that other manipulations might differentially affect punished and extinguished responding. For example, it is already known that response-independent presentation of mild shock can reinstate extinguished self-administration responding, but not extinguished food responding (Ahmed and Koob 1997; Shaham et al. 2000). It could be important to determine whether such a manipulation can also reinstate self-administration that has been punished by more intense, response-contingent shock. Any differences that exist between reinstatement in the extinction and punishment procedures might suggest more comprehensive strategies for preventing relapse to drug abuse.