Steven R. Goldberg entered graduate school only a few decades after drug abuse research started in earnest [Footnote 1]. The first arguably rigorous experimental studies of drug dependence in animals were published by Tatum and Seevers (1929) on cocaine and Tatum et al. (1929) on morphine. There were also noteworthy studies by Lawrence Kolb (e.g., Kolb and DuMez 1931), but the major sustained contributions came from the prolific output of Seevers and his colleagues, initially at the University of Wisconsin, and a bit later and for the rest of Seevers’ career at the University of Michigan. In 1935, the Public Health Service Hospitals for treatment of drug abusers (“Narcotic Farms”) were established, with the one in Lexington, Kentucky, having both treatment and research components. The treatment component was intended to give a “new deal” for drug addicts in terms of humane treatment (Campbell 2006). The research component initially had only a clinical focus and was led by Clifton K. Himmelsbach [Footnote 2], with a charge of developing a better understanding of opioid dependence and evaluating treatments. The substantial contributions establishing the fundamental characteristics of opioid dependence and withdrawal from clinical studies by Himmelsbach (e.g., 1941), coupled with studies by Seevers at Michigan, remain today the most important studies of opioid dependence. A detailed history of the early research, and the role of the National Academy of Sciences in that effort, has been documented by Eddy (1973) and by May and Jacobson (1989).

The historical context for behavioral studies of drugs as reinforcers

The initial studies by both Himmelsbach and Seevers characterizing opioid dependence focused on the withdrawal syndrome. A question that seemed important at the time was whether “addiction” with its volitional component was a uniquely human malady. That question was addressed by Shirley Spragg (1940), with prompting from Robert Yerkes, then director of the Yale Laboratories of Primate Biology, Orange Park, FL. Spragg conducted what were the first studies on drug self-administration, though they were not characterized as such. One feature that distinguished Spragg’s studies from those of Seevers was Spragg’s emphasis that the chimpanzees be trained to initiate the drug injections through their own actions [Footnote 3]. Films taken during the course of the studies were particularly compelling, showing the chimpanzees on leashes pulling the investigator to the room in which the injections were administered. There was a sensational nature to these films, which spoke to, even if they did not prove, Spragg’s conclusion that addiction was not uniquely human.

The first wave—response-reinforcer contingencies

It was not until a decade-and-a-half later that the next experiments on drug self-administration were published. The studies were conducted by Harold Coppock and John Nichols, who used an ingenious contraption to show that rats in morphine withdrawal could be trained to turn their heads to the right or to the left when morphine solutions were injected i.p. as a consequence (Headlee et al. 1955). The authors spoke of the response as an instance of opioid-withdrawal avoidance learning: behavior reinforced by its consequences. These findings and a few others factored prominently in the views formulated by Abraham Wikler on the role of conditioning factors in opioid addiction and relapse (Wikler 1948, 1965), which were the basis for Steve Goldberg’s doctoral thesis.

Probably the most convincing of the initial papers on self-administration was published in Science (Weeks 1962), demonstrating that rats could be trained to press a lever when those responses produced intravenous injections of morphine. Weeks conceptualized and reported the lever pressing as operant behavior reinforced by morphine injection. Weeks, trained at the University of Michigan, was influenced by Seevers (who at Michigan was not?) but did not work with him. At the time of his self-administration research, Weeks was working at the Upjohn Company. Upjohn had little interest in drug abuse but had a particularly enlightened unwritten policy that senior scientists could devote approximately 10% of their time to “personal research.” In his own later account, Weeks said that “having an animal take its own drug seemed to be a good method for studying addiction” (Weeks 2004). Upjohn paid for supplies, Weeks built the apparatus at his home, and he was off and running. Weeks saw being a renowned “gadgeteer” as a virtue and worked out many of the technical difficulties with a procedure for IV drug self-administration. A bit later, Tomoji Yanagita was at Weeks’ door to learn the particulars of drug self-administration for his experiments with monkeys in the Michigan laboratories (Yanagita 1970).

At about the same time, Travis Thompson and Charles R. Schuster at the University of Maryland published their studies on morphine self-administration in rhesus monkeys. Thompson and Schuster (1964) stated that “research techniques based upon the behavioral principles of operant conditioning have provided a complementary approach to standard pharmacologic analysis of physical dependence upon opiates…” (p. 7). Critical aspects of their complex procedure included chained schedules of morphine injection, alternating with periods of food reinforcement and avoidance of presentations of electric shock. Using these procedures, Thompson and Schuster advanced the study of drug self-administration in several ways. They compared the effects of withdrawal with those of antagonist administration and of response-independent morphine injections. The study clearly showed that self-administration of morphine was reliable and orderly and a function of the fundamental variables involved in drug dependence. Additionally, the complexity of the behavioral situation was clearly intriguing to Thompson and Schuster, and Schuster was to follow these complexities later with Steve Goldberg.

Seevers, seeing the trend toward research on drug self-administration and a paper presented by Schuster at a Committee on Problems of Drug Dependence annual meeting, recruited Schuster to the University of Michigan. With Tomoji Yanagita and Gerald Deneau, Seevers had already initiated studies of drug self-administration (Yanagita et al. 1965a, b; Deneau et al. 1965). Those studies were the first to document reinforcing effects of opioids in subjects that had not been made dependent before the self-administration studies began. Additionally, Deneau et al. (1969) characterized the findings in behavioral terms. To quote the authors: “the present experiments were designed to determine whether a monkey, having received the first injection of any drug by spontaneous lever press, would continue to seek reinforcement by increasing the number of lever presses and maintain such a pattern of self-administration for long periods of time” (p. 32). Yanagita worked out many of the technical details with input from Weeks, who had of course worked out many of those details for chronic catheterization of rats. Yanagita and Seevers devised a minimally restraining harness and arm compatible with chronic catheterization of large primates (Yanagita et al. 1965a, b; Yanagita 1970), another ingenious contraption.

There are substantial implications inherent in the conceptualization of drug self-administration as an instance of operant behavior reinforced by drug administration. Considerations of drug dependence from social, legal, and psychiatric perspectives have been and continue to be replete with explanations involving failures of social systems, as well as moral depravity, abnormal psychiatric characteristics, or failures of will on the part of individual drug addicts. Research on drug self-administration in experimental animals, presumably devoid of social, legal, and psychiatric considerations, directs research on dependence to contingencies between responses and consequences, which can be examined from a natural science perspective. Further, with the pioneering studies in self-administration, drugs could now be subsumed within the class of reinforcing stimuli, and rules that apply to one reinforcer will likely apply to another. Moreover, the “volitional” aspects of drug dependence that had been a cause of concern for many within the scientific community were now accessible for study, with all of the rigor demanded by true experimentalism.

The second wave—stimulus-stimulus contingencies

By the time Steve Goldberg entered graduate school at the University of Michigan, as Schuster’s first Ph.D. student, the idea that a drug could function as a reinforcer had a firm foothold in the drug abuse field. As characterized above, this was a first wave of applying behavior analysis to the problem of drug dependence, and a substantial paradigm shift. Nonetheless, the reinforcing effects of drugs did not encompass all that was involved in drug abuse, and the analysis of behavior encompasses much more than reinforcement (cf. Catania 2013). There were opportunities to expand the horizons of the behavioral analysis of drug abuse, certainly not only in further studies of reinforcing effects of drugs but also in other areas. Steve Goldberg’s initial contributions involved effects other than reinforcing effects, with drugs delivered contingent on other stimuli (Pavlovian or respondent conditioning). The conceptualization of drugs as having stimulus functions was facilitated by the studies of drug self-administration in which drugs were considered reinforcing stimuli [Footnote 4].

A look back at Lexington, Kentucky, for context

The Harrison Narcotics Tax Act, passed in 1914, regulated and taxed distribution of opioids in the USA and was interpreted by the court system to mean that opioids could be prescribed for pain relief but not for the treatment of addiction (Musto 1999). Before the Act, physicians were free to invent any treatment for drug dependence and did so with wide variety and little empirical assessment (Campbell 2006). One primary mission of the PHS Hospital in Lexington was to use clinical science to develop effective methods for withdrawing and curing opioid addicts. In this mission, the PHS Hospital was an abysmal failure, with many of the discharged patients returning shortly thereafter. Assessments of relapse rates of patients discharged from the PHS Hospital vary but are uniformly high, ranging from 75 to 97% (e.g., Duvall et al. 1963; Pescor 1943).

In the 1940s, Abraham Wikler was a psychiatry resident at the PHS Hospital in Lexington and was responsible for the management of opioid withdrawal among newly admitted prisoners. Wikler became convinced that the normally “stormy” withdrawal syndrome became even more so when he was actually on the ward. During interviews with post-addicts, Wikler probed the “recollections” of the precise circumstances of previous relapses. Although the subjects invariably attributed relapse to the desire to “get off the natural” or “to get high,” questioning revealed that in many instances, relapse occurred when the post-addict was returning to his home environment or when he unexpectedly encountered an active addict or a former drug supplier on the street. These circumstances precipitated withdrawal symptoms (chills, running nose, watery eyes, nausea, “flu-like” symptoms), and the individual started looking for “a fix” and eventually relapsed [Footnote 5]. In addition, when discussions turned to drugs during group therapy sessions at Lexington, the post-addicts began to yawn and to wipe their noses and eyes without recognition of these as withdrawal-like symptoms (Wikler 1978).

To account for these phenomena as well as the high rate of relapse among those who were withdrawn from opioids at Lexington, Wikler proposed what has been called a two-factor hypothesis of relapse. The first factor is respondent (Pavlov 1927) conditioning, and the second factor is operant self-administration of drugs. According to the hypothesis, the environment in which drug withdrawal had occurred in the past serves as a conditional stimulus that elicits withdrawal symptoms. The subsequent alleviation of withdrawal symptoms then reinforces drug self-administration, which ultimately leads to relapse (Wikler 1948, 1965, 1977, 1978). This hypothesis remains the most cogent explanation of relapse to this day. But in the mid-1960s, it lacked empirical support.

Steve Goldberg’s doctoral thesis

To put Wikler’s hypothesis to the test, Steve Goldberg used a conditioned suppression procedure (Estes and Skinner 1941), which was originally devised to study anxiety objectively. In this procedure, a behavior such as lever pressing is most often reinforced, typically with food pellets, according to some intermittent schedule of reinforcement. Once the behavior is reliably maintained, an exteroceptive stimulus is occasionally presented and, after habituation so that the stimulus alone has no effect on lever pressing, it is reliably followed by a noxious stimulus, in most cases electric shock. In the terms of Pavlov, the exteroceptive stimulus was the conditional stimulus (CS) and the electric shock the unconditional stimulus (US). In the study by Estes and Skinner, as well as scores of studies that followed, responding was eventually suppressed (the conditioned response) during the previously ineffective pre-shock stimulus, even though the pre-shock stimulus and electric shock were presented independently of responding and had no effect on the likelihood of food reinforcement.

The procedure reported by Goldberg and Schuster (1967) was for all intents and purposes identical, except that morphine-dependent rhesus monkeys (12 or 8 mg/kg/day, depending on the study) were surgically prepared with chronic jugular catheters through which an intravenous injection of an opioid antagonist could be delivered. The antagonist used was nalorphine (∼0.4 mg/kg), as the more pure antagonist, naloxone, was not in widespread use at the time. Again, in the terms of Pavlov, the nalorphine injection served as the US. Lever-press responding was maintained under a fixed-ratio schedule in which every tenth response produced a food pellet.

A cumulative record of responding maintained under the fixed-ratio schedule (Fig. 1a) shows a characteristic performance; a brief pause in responding was followed by an abrupt transition to a high rate of responding up until food reinforcement, giving the record a step-like appearance. The record from the last session of illuminating a light for 10 min with IV saline injection 5 min after its onset (Fig. 1b) shows no effect of saline, as indicated by no change in the slope of the cumulative response curve. In the following session (Fig. 1c), the light was paired with nalorphine injection (in this experiment, ∼0.2 mg/kg, IV) for the first time. The monkey responded normally during the light until a few moments after the nalorphine injection, at which time responding was completely suppressed and remained so throughout the session. The arrow shows the point after injection at which emesis and excessive salivation were elicited.

Fig. 1

The development of conditioned behavioral changes to nalorphine as indicated by cumulative response records of responding maintained by food under an FR 10 schedule as it occurred in time. Each record is for the complete 1-h food component of the 2-h session for Monkey M2113. Ordinates: cumulative responses; abscissae: time. Diagonal slash marks on the curve show the presentations of food. The presentation of the 10-min light stimulus and the injection after 5 min are indicated by brackets where appropriate on the record. The monkey was maintained on morphine at a dose of 2 mg/kg every 6 h. a Control session before light-injection pairings; b Session 5 in which the light and IV saline injection were presented without notable effect. c, d The first and tenth conditioning sessions in which 0.2 mg/kg IV nalorphine was injected 5 min after the onset of the light stimulus. Arrows indicate the occurrence of emesis and excessive salivation (modified from Goldberg and Schuster 1970)

By the tenth session of pairing the light and nalorphine injection (Fig. 1d), suppression of ongoing operant responding occurred immediately after the light was turned on, before the injection, and responding never resumed for the remainder of the experimental session. Some, but not all, of the monkeys, including the one shown in Fig. 1, showed emesis and salivation with the illumination of the light. In contrast, nalorphine injection at the same dose in non-dependent subjects had no effects on the rate of ongoing operant responding (Goldberg and Schuster 1967).

In another part of Goldberg’s dissertation (Goldberg and Schuster 1970), the transfer to the post-dependent state of the conditioned effects and their enduring nature were examined. If Wikler’s hypothesis of conditioned withdrawal as a contributor to relapse was to hold, it had to occur well after individuals were withdrawn from and no longer dependent on opioids. In this second study, the conditioned suppression procedure was used again. Once the suppression was established while morphine-dependent, the rhesus monkeys were taken off morphine, and the effectiveness of the light was tested at 1-month intervals.

Figure 2a shows cumulative records of responding in a session that was conducted 1 month after morphine treatment stopped. Shortly after the light was turned on, responding ceased entirely. That cessation of responding lasted the entire length of the CS. In contrast to what was seen with morphine-dependent subjects, responding resumed after the light was turned off, likely because frank withdrawal was not precipitated. At 2 months after morphine treatment had stopped (Fig. 2b), the subjects were again presented with the CS followed by a saline injection, and as before, responding was completely suppressed during the CS presentation. This occurred again in one of the subjects at 3 and 4 months after morphine treatment had stopped (Fig. 2c, d, respectively). There was no indication of any loss of control by the CS when it was presented at these 1-month intervals for up to 4 months after morphine treatment had stopped. Daily sessions with CS-saline pairings were subsequently conducted in order to extinguish the conditioned suppression produced by the CS (Goldberg and Schuster 1970).

Fig. 2

The persistence of conditioned behavioral changes to a light previously presented before nalorphine injection as indicated by cumulative response records of performances in a morphine-withdrawn monkey. All details of the reinforcement schedule and the recording of responding are as in Fig. 1. a Performance under the fixed-ratio schedule and suppression of responding by the stimulus previously paired with nalorphine injection 30 days after morphine withdrawal. b–d Performances under the fixed-ratio schedule and suppression of responding by the stimulus previously paired with nalorphine injection at 60, 90, and 120 days after morphine withdrawal (modified from Goldberg and Schuster 1970)

In summary, Goldberg’s doctoral thesis established that withdrawal signs can be conditioned through stimulus-stimulus contingencies, as suggested by Wikler. Further, the conditioning occurred rapidly; only a few pairings of the stimuli were necessary for the effect. However, not all signs were conditioned in all subjects; conditioned emesis and excessive salivation occurred only in some subjects. The effects of conditioning were long lasting; months after the subjects were withdrawn from morphine, stimuli associated with precipitated withdrawal continued to be effective. Taken together, the results supported the respondent conditioning part, established through stimulus-stimulus contingencies, of Wikler’s two-factor hypothesis. More globally, the results emphasize the role of environmental context as a factor in relapse to drug use.

Stimulus-nalorphine contingencies and opioid self-administration

In another study (Goldberg et al. 1969), the same “conditioned suppression” procedure was used with monkeys trained to self-administer morphine (0.1 mg/kg/inj). The total daily dose of 11 to 18 mg/kg depending on the subject was sufficient to produce withdrawal on its discontinuation. In this experiment, the CS (a flashing light) was presented for a total of 40 min during each daily session. Ten min after the flashing lights began, nalorphine (0.1 mg/kg, IV) was injected. Figure 3 (Sessions 1 to 4) shows that after saline was injected, the frequency of morphine self-administration was less than five injections during the following 30 min, a rate similar to that before the onset of flashing light. When nalorphine injections replaced saline (Sessions 5 to 14), response rates immediately increased more than fivefold, an effect opposite the nalorphine-induced suppression observed with food-reinforced responding in the previous experiments. Following a single session probe without injection (Session 15), saline injections had effects like those of nalorphine, increasing morphine self-administration, followed by an orderly decrease (Sessions 16 to 20) to the levels obtained before nalorphine injection in the first several sessions. Subsequent conditions (Sessions 21 to 34) replicated these basic effects with a more rapid decrease when saline replaced nalorphine injections (Sessions 31 to 34).

Fig. 3

Frequency of morphine self-administration (0.1 mg/kg/injection) in rhesus monkeys in the 30-min period immediately following the intravenous injection of saline or nalorphine (0.1 mg/kg) during conditioning in three morphine-dependent rhesus monkeys. Each point represents the average frequency of self-administration in the three monkeys, and the vertical bars represent the range. Injections of saline or nalorphine were omitted on the control days (C). (Modified from Goldberg et al. 1969)

The increase in rates of morphine self-administration is similar to that obtained previously by Thompson and Schuster (1964) when morphine injections were withheld, prompting a withdrawal syndrome. This acceleration in morphine self-administration can be put in the context of negatively reinforced (“avoidance”) behavior. In other studies with the Estes-Skinner procedure applied to a baseline of responding maintained by electric-shock postponement (avoidance) rather than food reinforcement (e.g., Herrnstein and Sidman 1958), an acceleration rather than a suppression of responding was obtained. The nalorphine injection in the experiment by Goldberg et al. (1969) may be functioning as a CS for the precipitated withdrawal that will unfold moments after its injection. The acceleration of operant responding is consistent with an interpretation that opioid self-administration in dependent subjects is behavior that postpones withdrawal. Studies of human opioid addicts at Lexington had suggested that continued opioid use was due to the suppression by opioids of withdrawal in its early stages rather than the euphoria occurring with the initial use of the drugs (Wikler 1952).

The analogy to avoidance responding can be extended further. Studies conducted by Goldberg (with F. Hoffmeister when he was at the Institut für Pharmakologie der Farbenfabriken, Bayer AG; Goldberg et al. 1971) and by Downs and Woods (1975) characterized avoidance and termination of opioid antagonist injections in morphine-dependent rhesus monkeys. A series of studies unfolding at about the same time demonstrated that after a history of electric shock postponement, responding could be indefinitely maintained by response-produced presentations of noxious electric shock (McKearney 1969). By analogy, it should be possible to maintain responding with injections of opioid antagonists that precipitate withdrawal. Just such a demonstration was published somewhat later by Woods et al. (1975). In that study, morphine-dependent rhesus monkeys (10 mg/kg/day) were initially trained to avoid naloxone injections (2.0 μg/kg/inj). In subsequent conditions of the study, a second-order schedule was introduced in which completion of one schedule requirement was treated as a single response “unit” that was reinforced according to another schedule (Kelleher 1966). In the Woods et al. (1975) study, completion of the unit schedule requirement was followed by a brief stimulus that was also presented with the ultimate presentation of the reinforcer. Figure 4 shows cumulative records of responding maintained by a schedule of response-produced naloxone injection in which each 30th response produced a visual stimulus, and every 10th stimulus presentation was followed by an injection of naloxone (2.0 μg/kg/inj). This is a very low dose, no doubt, but its effectiveness was established by removing the injection (“No Injections”), at which point responding rapidly extinguished. Responding resumed when naloxone injections were reinstated under this complex schedule.
These studies with responding maintained by naloxone injection emphasize that the determinants of reinforcement and punishment are varied and that stimuli can function in seemingly diametrically opposed ways which depend on how the stimuli are scheduled, the ongoing behavior of the subject, and the history of the individual (Morse and Kelleher 1977).
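
The second-order arrangement described for the Woods et al. (1975) study can be made concrete with a small counting sketch. This is only an illustration of the contingency as summarized in the text (each 30th response produces a brief stimulus, every 10th stimulus is accompanied by an injection, and the session ends after ten injections). The function name and parameters are hypothetical, and the 1-min timeout after each injection is deliberately ignored to keep the arithmetic visible.

```python
def second_order_fr_fr(n_responses, unit_size=30, units_per_injection=10,
                       max_injections=10):
    """Count brief stimuli and injections under an FR 10 (FR 30:S) schedule.

    Hypothetical helper for illustration; timeouts are not modeled.
    """
    stimuli = 0
    injections = 0
    for i in range(1, n_responses + 1):
        if i % unit_size == 0:
            # Completion of one FR unit: brief stimulus is presented.
            stimuli += 1
            if stimuli % units_per_injection == 0:
                # Every tenth completed unit also produces an injection.
                injections += 1
                if injections == max_injections:
                    break  # session ends after the last scheduled injection
    return stimuli, injections


# One injection requires 30 x 10 = 300 responses under these parameters.
print(second_order_fr_fr(300))
```

Under these assumed parameters, a full ten-injection session requires 3000 responses, which gives a sense of how much behavior the schedule maintained with a very low dose of naloxone.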

Fig. 4

Cumulative records of responding maintained by IV naloxone injection. Abscissae: time; ordinates: cumulative number of responses. The upper record shows the eighth session of responding when every completed FR 30 unit produced a 1.5-s flash of the house light and every tenth completed FR 30 unit produced an injection of naloxone (0.002 mg/kg/injection) followed by a 1-min timeout accompanied by illumination of the experimental chamber. The center record shows performance in the third session with the naloxone infusion pump disconnected. The lower record shows resumption of naloxone-maintained responding in the first session in which naloxone injections were again scheduled. In this and in subsequent sessions, the schedule was changed so that every fifth completion of an FR 30 unit produced the injection. Injections of naloxone or saline are indicated by downward deflections on the horizontal line below the cumulative curve. Each session was terminated after ten injections or about 1 h (modified from Woods et al. 1975)

Further studies of drug self-administration

Above, it was emphasized that conceptualizing drug self-administration in terms of the reinforcing stimulus functions of drugs suggested that findings obtained with one reinforcer will apply to another. Many of the first studies of drug self-administration found response rates to be relatively low compared to those maintained by more conventional reinforcers, and rates of responding decreased with increasing dose per injection or magnitude of the reinforcer. Conversely, traditional psychological learning theory would suggest that the rates of responding are directly related to reinforcer magnitude [Footnote 6], possibly suggesting unique qualities of drugs as reinforcing stimuli.

Goldberg (1973) examined performances maintained by either cocaine or d-amphetamine and compared those performances to those maintained by food reinforcement. Initial studies were conducted with responding maintained under fixed-ratio schedules with a 1-min timeout following each injection. The response rates maintained approached those maintained by food reinforcement, and the dose-effect curve for the self-administered drug had a bell shape: as dose increased, response rates first increased and then decreased. The use of a timeout following each injection likely limited the disruption of behavior due to the accumulation of drug. These drug-induced effects on response rates resulting from repeated injections were explored further by Spealman and Kelleher (1979) and were found to contribute importantly to the decreases in response rates obtained at the higher doses per injection. Though the bell-shaped dose-effect curve had been obtained sporadically prior to the study by Goldberg (1973), his study emphasized the importance of examining a wide range of doses and implied that the two limbs of the bell-shaped dose-effect curve were a function of differing sets of influences.

In further studies assessing the potential similarities of performances maintained by scheduled drug injections and those maintained by conventional reinforcers, Goldberg (1973) used second-order schedules. Figure 5 (left) shows cumulative records of performances under a second-order schedule in which every 10th response produced a brief visual stimulus, and the first completion of 10 responses after the lapse of 5 min produced the brief stimulus and a cocaine injection for subject S319 or food presentation for subject S384 (Goldberg 1973). A 1-min timeout period (not shown in the records) followed each injection or food presentation, during which all stimulus lights were turned off, and responding had no scheduled consequences. This sequence was repeated 15 times within an experimental session. One important aspect of the second-order schedule was the establishment of sequences of large amounts of responding over extended periods of time, beyond what had been previously obtained in studies of drug self-administration. As these records show, the extended high rates of responding maintained by cocaine and their temporal patterns were, under these conditions, similar in all important aspects to those maintained by food reinforcement (compare right and left panels of Fig. 5).
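
The contingencies of this second-order schedule can be sketched in a few lines of code. The following is a minimal sketch, assuming the schedule exactly as summarized in the paragraph above (a brief stimulus after every 10th response, the reinforcer delivered with the first unit completed after 5 min, a 1-min timeout, and 15 cycles per session). The function name, parameter names, and event labels are hypothetical and are not taken from Goldberg (1973).

```python
def second_order_fi_fr(response_times, unit_size=10, interval_s=300.0,
                       timeout_s=60.0, max_reinforcers=15):
    """Return (time, event) pairs for a fixed-interval (fixed-ratio:S) schedule.

    'S' marks a brief stimulus alone; 'S+R' marks the brief stimulus together
    with the reinforcer (drug injection or food). Illustrative sketch only.
    """
    events = []
    count = 0             # responses accumulated toward the current unit
    interval_start = 0.0  # start of the current interval (after any timeout)
    reinforcers = 0
    for t in sorted(response_times):
        if reinforcers >= max_reinforcers:
            break
        if t < interval_start:
            continue  # timeout: responding has no scheduled consequences
        count += 1
        if count == unit_size:
            count = 0
            if t - interval_start >= interval_s:
                # First unit completed after the interval has elapsed.
                events.append((t, "S+R"))
                reinforcers += 1
                interval_start = t + timeout_s
            else:
                events.append((t, "S"))
    return events


# A hypothetical subject responding steadily once per second for 400 s.
events = second_order_fi_fr([float(i) for i in range(1, 401)])
print(events[29], events[30])
```

With steady responding of one response per second, the sketch yields 29 brief stimuli before the first reinforcer at 300 s, which illustrates how most of the responding in each cycle is followed only by the brief stimulus.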

Fig. 5

Representative cumulative records of performances of monkeys S319 and S384, showing the effects of changing the injection dose of cocaine (S319) or the amount of food per presentation (S384) under second-order schedules. Abscissae: time; ordinates: cumulative number of key-press responses. Each record represents a complete session which lasted until 15 cocaine injections or food presentations occurred. Short diagonal strokes on the cumulative records indicate presentations of a 2-s stimulus light. Similar strokes on the event records (solid horizontal lines) indicate cocaine injections or food presentations accompanied by 2-s presentations of the stimulus lights. The recording pen reset to the bottom of the cumulative record when cocaine was injected or food was presented. After each cocaine injection, there was a 1-min time-out period not shown on the records, during which responses had no programmed consequences (modified from Goldberg 1973)

Another attribute of the performances under the second-order schedule is that rates of responding across a wide range of doses of cocaine or amounts of food were similar. This was evident in dose-effect curves for cocaine or analogous graphs for food reinforcement that had very shallow slopes (Fig. 6, filled symbols). The relatively shallow slopes reflect a decreased exclusivity of the role of the terminal reinforcing stimulus (food or cocaine) in maintaining responding under the second-order schedule and a relatively enhanced control over behavior of the briefly presented stimulus. The open symbols in these graphs show the low rates of responding during the timeout period that followed injections or food presentations.

Fig. 6

Effects of varying the dose of cocaine per injection or the amount of food per presentation on the rate of responding maintained under the second-order schedules in monkeys S319 and S384, respectively. In the presence of green stimulus lights, each FR component completed by the monkeys (FR 20 for S319 and FR 10 for monkey S384) during a 5-min fixed interval produced a brief 2-s light (S); the first FR component completed after the 5-min interval ended produced the light and either a cocaine injection (S319) or food presentation (S384). After each cocaine injection or food presentation there was a 1-min time-out period during which the green lights were not present and during which responses had no programmed consequences. Abscissae, dose of cocaine per injection (top) or amount of food per presentation (bottom), log scale; ordinates, mean rate of responding in the presence of the green stimulus lights (black dots) and during timeout periods (white dots). Each session lasted until 15 cocaine injections or 15 food presentations occurred. Four daily sessions were conducted at each value of cocaine injection or food presentation. Each point represents the mean of results of the last three sessions at each condition, and brackets represent the range (modified from Goldberg 1973)

Goldberg’s study demonstrated that under suitable conditions for each event, the self-administration of cocaine was generally similar to responding maintained by food presentation. Thus, when drug injections occur infrequently in time, self-administration is very similar to responding maintained by more conventional reinforcers. The distribution of injections over a longer time period minimizes the pharmacological effects that can decrease otherwise high response rates. These findings with drugs remind us that the capacity of any reinforcer to control behavior depends upon a variety of conditions, and those conditions can vary significantly across different reinforcing stimuli. For example, with food reinforcers, commercially available pellets are most often used for the obvious advantages of consistency of size and content across individual pellets and their milling to fit commercially available dispensing equipment. It is often taken for granted that substantial engineering has gone into the development of those pellets, rendering them of a size that is sufficiently large to be effective but sufficiently small to ensure no substantial change in the level of food deprivation when delivered many times within an experimental session. Results of Goldberg’s studies with food presentation indicate that at the largest magnitude, food reinforcement produced substantial satiation across the session. A disruption in cocaine-reinforced responding was also obtained at the highest doses; it was similar in form to, but dissimilar in function from, the decreases in rates of responding maintained by food reinforcement.

Second-order schedules maintained high rates of responding between injections with drugs as diverse as cocaine, morphine, and nicotine (e.g., Goldberg 1973; Goldberg et al. 1976; Goldberg and Tang 1977; Goldberg et al. 1981), and it has been suggested that stimulus-stimulus contingencies between the drug and the stimulus briefly presented at the completion of the unit schedule give the stimulus conditioned reinforcing effects. Only a few studies have examined whether stimulus-stimulus contingencies are necessary for the high rates of responding maintained under second-order schedules. Two such studies (Goldberg et al. 1979; Katz 1979) compared the effects of completely removing the brief visual stimulus following the completion of each “unit schedule” with the effects of presenting a stimulus that had never been presented with the drug injection (a non-paired stimulus), which allowed for an analysis of the role of conditioned reinforcement established through stimulus-stimulus contingencies. In both of these studies, removal of the stimulus and substitution of a non-paired stimulus each decreased the rates of responding maintained under the second-order schedule; however, the decreases in response rates were greater with the brief stimuli removed entirely than with the substitution of a non-paired stimulus. This outcome is not dissimilar to those of many studies conducted with food reinforcement, and in some of those studies response rates with a non-paired stimulus that was never presented with food reinforcement were similar to those obtained with the paired stimulus (see Gollub 1977 for a review). That rates of responding maintained by non-paired stimuli were still higher than those maintained with brief stimuli omitted indicates that response-contingent stimuli scheduled to occur in a regular relation with drug injections, but without explicit pairing with the reinforcing stimulus, can also enhance sequences of behavior, whether the ultimate consequence is a drug injection or some other reinforcer.

The use of second-order schedules suggested new ways to study the control of behavior by stimuli associated with drug injections and presented opportunities to examine variants of scheduled injections of drugs and related stimuli to generate a variety of performances within experimental sessions. In one example, performances of monkeys were maintained with intramuscular injections of morphine or cocaine (Goldberg et al. 1976; Katz 1979). One obvious advantage of the intramuscular (IM) route of injection is that complications stemming from surgically implanted catheters can be avoided. In order to minimize disrupting effects of injecting via the intramuscular route, the schedule was arranged so that a single injection was delivered only at the end of the experimental session. Figure 7 shows a performance of a rhesus monkey under an FI 60-min (FR 10:S) schedule of intramuscular cocaine (1.5 mg/kg) injection (Goldberg et al. 1976). The only consequence of responding prior to that injection was the brief stimulus presentations scheduled according to the 10-response fixed-ratio unit schedule. One of the many interesting aspects of the performance is that, with only one cocaine injection each day, available 1 h after the session started, the subject emitted well over 600 responses. These results show dramatically the power of the second-order schedule to maintain extended sequences of responding over long periods of time with infrequent drug injections. In several papers, Goldberg and co-authors drew an analogy between the extended sequences of responding under the second-order schedule and the long sequences of behaviors in the daily activities of drug abusers, involving obtaining funds for the purchase, then procurement, preparation, and finally injection of the drug.

Fig. 7

Representative performance of Monkey M-681 under the second-order schedule of intramuscular cocaine injection. Ordinates: cumulative number of key-pressing responses; abscissae: time. Short diagonal strokes on the cumulative record indicate 2-s presentations of a red light at the completion of each 10-response fixed-ratio unit schedule. The recording pen reset to the bottom of the record after 500 responses had cumulated and at the end of the session. The session ended with an intramuscular 1.5 mg/kg injection of cocaine accompanied by a red light (modified from Goldberg et al. 1976)
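
The FI 60-min (FR 10:S) contingency described above can be stated compactly as a rule applied to response times. The following is a minimal Python sketch for illustration, not the control program used in the original studies. The function name `second_order_schedule`, its parameters, and the evenly spaced response times are hypothetical, and a unit completed exactly when the interval elapses is assumed to count as completed after it.

```python
def second_order_schedule(response_times, fi_seconds, fr_size):
    """Apply an FI (FR:S) second-order schedule to a list of response times.

    Hypothetical helper for illustration only. Times are seconds from
    session start. Returns (brief_stimulus_times, reinforcer_time), where
    reinforcer_time is None if the responses end before reinforcement.
    """
    brief_stimulus_times = []
    responses_in_unit = 0
    for t in sorted(response_times):
        responses_in_unit += 1
        if responses_in_unit < fr_size:
            continue  # unit schedule not yet completed
        responses_in_unit = 0
        if t >= fi_seconds:
            # First unit completed after the interval elapsed (assumed >=):
            # brief stimulus plus the reinforcer, and the session ends.
            return brief_stimulus_times, t
        # Unit completed during the interval: brief stimulus only.
        brief_stimulus_times.append(t)
    return brief_stimulus_times, None


# Assumed steady responding: one response every 5 s, starting at 3 s.
responses = [3 + 5 * i for i in range(800)]
stimuli, reinforcer = second_order_schedule(responses, fi_seconds=3600, fr_size=10)
print(len(stimuli), reinforcer)  # 72 3648
```

With this assumed response pattern, 72 unit schedules end with the brief stimulus alone, and the 73rd, completed at 3648 s, ends the session with the injection.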

Another interesting aspect of performances maintained by a single injection delivered only at the end of a session is the ability to study the effects of one drug given as a pretreatment on behavior reinforced by a second drug. Importantly, with this schedule arrangement, the effects of the first drug on behavior per se are not due to a pharmacological interaction with the second, reinforcing drug. In the study by Goldberg et al. (1976), a low dose of nalorphine (0.03 mg/kg) increased rates of responding maintained by morphine. The increase in responding produced by the low dose of nalorphine in the absence of an interaction with the reinforcement produced by morphine is consistent with other data suggesting that termination of initial prodromal withdrawal symptoms contributes importantly to morphine self-administration.

In summary, scheduling drug injections according to second-order schedules demonstrated that performances maintained by drug injections could be similar to those maintained by more conventional reinforcers. This demonstration, however, required suitable conditions, those being conditions that minimized the pharmacological effects of the drugs that disrupt behavior. Under second-order schedules, the performances maintained appeared to be less influenced by the magnitude of the ultimate reinforcer and importantly influenced by the response-dependent presentations of brief stimuli that had been associated with the reinforcer. These stimulus-stimulus contingencies give the previously neutral stimulus a capacity to maintain responding at levels greater than those maintained without the brief stimulus or with a non-paired stimulus. However, a non-paired stimulus effectively enhanced responding, emphasizing a role of the scheduling of events in time and in relation to behavior as an important determinant of drug self-administration. Finally, the use of second-order schedules provided ways of assessing the effects of other variables (e.g., drug injection) on the behavior maintained exclusive of pharmacological interactions with the drug reinforcer.

The second wave today

The use of second-order schedules has continued, with studies extending their use into areas traditionally outside the domain of most studies on schedule-controlled behavior (e.g., sexual activity, Everitt et al. 1987; alcohol drinking, Lamb et al. 2015; neurological substrates of self-administration, Everitt and Robbins 2000). On some occasions, second-order schedules have been exploited to distinguish between poorly defined constructs, so-called drug-taking and drug-seeking. The distinction suggests that these are behaviors controlled by different, though possibly overlapping, sets of variables. Other than the obvious difference, whether the response does or does not result in a drug injection, the distinction is reminiscent of that in the feeding literature distinguishing between consummatory behavior and behaviors reinforced with food. Put in the context of Goldberg’s research, a critical question is whether the behaviors so called are under the control of different sets of variables, that is, whether there are substantive functional differences among the behaviors categorized as such when the direct effects of the drugs are eliminated. Another consideration is whether the distinctions are due primarily to differences in scheduling rather than differences in the type of behavior.

One of the important contributions of the research on second-order schedules has been the variety of scheduling techniques applied to study the role of stimulus-stimulus contingencies in innovative ways. Unfortunately, much of what is published today on how stimulus-stimulus contingencies influence behavior currently or previously maintained by drug injections has been singularly unimaginative. For example, much of the current research on the effects of stimulus-stimulus contingencies is limited to one-to-one pairings of stimuli. If the research on response-stimulus contingencies (e.g., Ferster and Skinner 1957) has taught us anything, it is that the schedule of contingent relations can create a whole host of different behavioral outcomes. Variations in stimulus-stimulus contingency relations other than one-to-one are likely the rule outside the laboratory but unfortunately have been little studied inside the laboratory.

Much of the current research on drug self-administration has a focus on the neurological proximal causes of drug-taking behavior. At the same time, those accounts largely neglect the analysis of the events occurring in the individual’s history that were critical in the establishment and continued maintenance of the behavior. This is not to say that CNS substrates are unimportant, but rather to highlight the difference between proximal cause and the environmental determinants of behavior. Simply put, if we were to know every neuronal circuit that is activated when a stimulus occasions a response, we would admittedly know much. However, our jobs as behavioral scientists would be woefully incomplete. This is because we would still need to know what events outside the organism are important to initiate activity within those circuits and, most importantly, how they came to do so. Steve Goldberg’s studies on the role of stimulus-stimulus contingencies in drug dependence point toward significant relations between environmental events and behavioral outcomes, relations that dictate what behavioral phenomena need to be “explained” in terms of CNS substrates. An undue focus on the proximal neurological “causes” of behavior, with neglect of the environmental contingencies that are determinants of behavior, flows from an invalid insistence that anything with the adjectival prefix “neuro” is more fundamental than all else. Action at a historical distance is hardly novel in behavioral science but is often a challenging subject matter. Nonetheless, although the past environmental occurrences of responses and stimuli, or of stimuli together in time, may be complex and seem mysterious, the tools of analysis in terms of contingency relations between responses and stimuli, as well as between stimuli and other stimuli, are readily available to unlock that mystery and lead to a more complete understanding for the prediction and control of behavior.