Motivation comprises many related psychological processes, but can be defined generally as the energizing of behaviour in pursuit of a desired goal. A number of functional impairments in psychiatric disease fall under the rubric of “motivational” deficits. For example, one of the critical diagnostic symptoms of a major depressive episode is “markedly diminished interest or pleasure in all, or almost all, activities most of the day, nearly every day”, while the diagnostic criteria for schizophrenia explicitly include avolition as a negative symptom (American Psychiatric Association 2013). Uncovering the biological basis of these impairments requires the use of animal models. Rodents are particularly useful in this regard because of the many molecular manipulations that have been optimized for use in rats and mice.

Modelling motivational deficits in rodents presents a particular challenge because such deficits, by their nature, involve the absence of overt behavioural responding. This makes it all the more difficult to identify the crucial motivational components that are affected by an experimental manipulation. There are any number of reasons why an animal might not engage in some behaviour, and not all of these have to do with the psychological or neurobiological processes researchers are interested in. If an animal responds less in a task, it could be because it is less motivated. However, it could also be that the animal is satiated, fatigued or motor-impaired, among other reasons. Conversely, if an animal responds more in a task, it could be due to increased motivation, but it could also be due to general hyperactivity. Thus, without a careful experimental analysis, it is impossible to separate specific effects on goal-directed motivation from effects on other processes.

Aside from the difficulty of interpreting increases or decreases in behavioural output as specific changes in motivation, another reason why studying motivation is so difficult is that any behaviour from which we infer motivation is not caused by a monolithic process, but is itself the result of a combination of multiple psychological processes. Consider all of the psychological processes involved in the simple act of a rat pressing a lever for access to a food reward. For one, there is the physical effort required to press the lever. There is also a hedonic component to the behaviour. If the reward does not produce a positive hedonic reaction, that is, if the rat does not “like” the reward, it may not press the lever. The rat must also be able to represent not only the anticipated cost of the effort it will expend to press the lever (along with other costs, such as the time spent on the work requirement that could otherwise be devoted to other foraging activities), but also the anticipated benefit of the reward at the end of the work. These expectations must then be compared via some type of cost/benefit calculation. If the anticipated benefit outweighs the associated costs, the rat will choose to expend the effort of the lever presses.

In addition, in experiments which assess the willingness of animals to work for many trials (and many rewards), there is the question of how quickly the animal satiates. If animals are not hungry, they will not work for food. In cases where the work requirement takes some amount of time to complete, we must consider the extent to which the animal is tolerant of the delay to the reward. Related to this is how accurately the animal is able to represent the time it takes to complete the work requirement. If an animal overestimates or underestimates the elapsed time, this will affect its willingness to expend the effort. All things being equal, an animal that is less tolerant of delays to reward, or has an exaggerated representation of the time it will take to fulfil the work requirement, will work less for a reward that is temporally distant.

We can see that interpreting deficits in performance on some behavioural task becomes all the more challenging when we tease apart the many psychological processes involved in these performances. The challenge is to isolate these specific psychological processes via a careful behavioural analysis and identify the neurobiology of these separable aspects of motivation (see Berridge and Robinson 2003, and also Robinson et al., in this volume for related discussion). Because of the many processes involved in motivated performance, and the potential for confounding influences, careful and rigorous behavioural analysis of motivation in animal models is as much about identifying experimental factors and psychological processes that are not related to a behavioural performance of interest, as it is about identifying factors and processes that are (Ward et al. 2011). We give an example of this type of analysis below.

This careful behavioural analysis becomes even more critical given the pervasive use of genetically modified animal models of various diseases in contemporary biomedical research (Ellegood and Crawley 2015; Nestler and Hyman 2010; O’Tuathaigh et al. 2014). With the relative ease of manipulating specific genes, mouse (and increasingly, rat) models of various genetic risk factors are easily tractable. The impact of deleting or overexpressing specific gene products is difficult to predict or understand; however, it is most certainly the case that these manipulations impact much more than the circumscribed psychological or disease process of interest (Berridge and Robinson 2003; O’Tuathaigh et al. 2010, 2013).

Fortunately, there is a wealth of behavioural procedures, developed throughout the history of experimental psychology, with which to dissect motivational processes. Furthermore, researchers have employed these procedures to understand the neurobiology of various components of motivation and have revealed mostly separable neural substrates. In this chapter, I discuss some of these methods, with a particular focus on methods for dissecting specific psychological aspects of motivation and for ruling out alternative interpretations of observed behavioural deficits.

1 Aspects of Motivation Impacted in Psychiatric Disease

Diagnostic criteria for a range of psychiatric diseases include deficits in motivation. Although these types of deficits vary across diagnoses and patients, they generally fall into two categories: anhedonia and avolition. Anhedonia refers to the inability to experience pleasure from normally pleasurable activities (Ribot 1897). While it has been studied in a range of psychiatric conditions, it is historically most strongly associated with major depression (Gorwood 2008) and schizophrenia, although current research in patients suggests that the prevalence of anhedonia in schizophrenia may be exaggerated due to the format of clinical intake interviews, and may instead reflect decreased anticipatory motivation or deficits in other cognitive processes (Burbridge and Barch 2007; Gard et al. 2007; Strauss and Gold 2012). More information on the clinical concepts and methods of assessment of motivational deficits in patients with schizophrenia is provided by Reddy et al., in this volume.

Avolition refers to a general lack of motivation or drive to engage in goal-directed activities. It is considered to be a critical component of the functional deficits in schizophrenia (Foussias and Remington 2010) and is resistant to current therapeutic strategies. In recent years, there has been a renewed focus on avolition and motivational deficits in schizophrenia. An in-depth review of studies aimed at dissecting the behavioural components of deficits in motivation in patients with schizophrenia is provided by Waltz and Gold, in this volume. The recent increase in such research is due to the critical nature of these deficits and their likely interaction with other symptom clusters in the disease, given the aberrant function of prefrontal-striatal circuitry (critical to both motivational and cognitive processes) in patients (Barch and Dowd 2010; Barch 2005). Such an interaction may contribute to the lack of functional improvement in patients following cognitive-remediation therapy, even when such therapy results in significant improvement in specific domains of cognitive functioning (Hogarty et al. 2004; Medalia and Choi 2009; Velligan et al. 2006). Results like these suggest that successfully treating an underlying motivational impairment may be critical to achieving meaningful cognitive and functional gains. For an in-depth discussion on this subject, please see Saperstein and Medalia, in this volume. In addition, symptoms in other psychiatric diseases that have historically been considered separately from goal-directed motivation (apathy, emotional and mood disturbances) are increasingly recognized as having a core deficit in avolition. Thus, impairments in motivation likely play a critical role in producing functional deficits in depression (Chen et al. 2015) and Parkinson’s disease (Pagonabarraga et al. 2015), among others. For a detailed discussion on the similarities and differences in the mechanisms underlying motivational deficits in depression and schizophrenia, please see Barch et al., in this volume.

2 Assessing Anhedonia

2.1 Preference Assessments

The most common way to assess anhedonia in rodents is to present them with a palatable substance and assess some behavioural measure of preference, pleasure or hedonic reaction. There are a number of ways to assay these types of reactions in rodents. The simplest is the sucrose preference test (Muscat and Willner 1989). This assay takes advantage of the fact that when given a free choice between water and a sucrose solution, rodents will generally choose to consume more of the sucrose solution (Muscat and Willner 1989). The development of this preference is thought to be indicative of a pleasurable hedonic response to the palatable sucrose solution. Thus, preference for the sucrose solution and overall sucrose consumed over some time period is thought to be a measure of hedonic capacity in rodents.
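As a minimal illustration of how such a preference measure is often computed, the sketch below expresses sucrose intake as a fraction of total fluid intake. The function name, the units and the example intakes are hypothetical and are not taken from any of the studies cited here.

```python
def sucrose_preference(sucrose_ml: float, water_ml: float) -> float:
    """Return sucrose intake as a fraction of total fluid intake (0 to 1).

    A value near 0.5 indicates no preference; values approaching 1 indicate
    a strong preference for the sucrose solution.
    """
    total = sucrose_ml + water_ml
    if total <= 0:
        raise ValueError("total fluid intake must be positive")
    return sucrose_ml / total


# Hypothetical intakes (ml) for a single animal over one test period.
print(sucrose_preference(12.0, 4.0))
```

Reporting total sucrose consumed alongside this ratio helps to distinguish a shift in preference from a change in overall fluid intake.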

Interpreting increased sucrose preference and consumption as a measure of hedonic reaction to reward is complicated, however, by other factors that contribute to flavour preference. For example, preference for a specific flavour has been shown to be related to not only the sweetness of the solution, but also the caloric benefit the animal obtains from consuming the solution (Bolles et al. 1981; Mehiel and Bolles 1984). Thus, the hedonic contribution of increased preference for and consumption of sucrose over the course of an experiment is difficult to disentangle from other factors. Giving animals a brief, single exposure and measuring consumption is less likely to suffer from this confound, and this method is widely used as a quick general assay of hedonic functioning.

Another method for assessing taste preference is to measure the number of licks a rodent makes to a taste stimulus during an experimental session. In these assays, a number of flavours or concentrations can be delivered during the same session and the number of licks to each solution is measured (see Glendinning et al. 2002). These experiments are readily implemented with the use of commercially available gustometers. In addition, because these assessments are within-subject, all animals experience all solutions or concentrations of the test solution, so that any differences are not due to differences in post-ingestive consequences.

2.2 Taste-Reactivity Testing

Another method for assaying hedonic reaction to reward, the taste-reactivity test (Grill and Norgren 1978), takes advantage of the evolutionary conservation of the hedonic response evoked by certain taste stimuli. In a number of species, from humans to mice, exposure to a sweet solution, such as sucrose, elicits a response consisting of rhythmic tongue and mouth movements and tongue protrusions (Berridge 2000). This characteristic response is quite different from that observed when a bitter solution (for example, quinine) is presented (mouth gapes, head shaking), and is thought to reflect a positive hedonic reaction to reward (Berridge and Robinson 2003; Pecina et al. 1997, 2003; Pecina and Berridge 2005). By exposing animals to varying solutions and videotaping, coding and scoring their facial expressions, researchers can assess differences in both positive and negative hedonic reactions and the impact of manipulating various neurobiological circuits on the hedonic response (see Robinson et al., in this volume for more detail).

3 Assessing Avolition

Avolition is generally assessed in rodents by determining how willing an animal is to make a particular response in order to procure some type of reward. This type of behaviour is thought to be goal-directed, meaning that the animal is motivated to engage in the behaviour in anticipation of earning the reward. Goal-directed behaviour is thought to involve a cost/benefit computation in which the anticipated benefit of the to-be-earned reward is weighed against the anticipated effort of the task at hand (Salamone et al. 2007; and see Redish et al., in this volume for further discussion). By manipulating aspects of the required effort and/or the characteristics of reward, combined with measuring and manipulating neural activity in putatively involved brain areas, researchers can interrogate different aspects of the cost/benefit computation, thus gaining insight into impaired motivation.

3.1 Fixed- and Progressive-Ratio Schedules

One of the simplest experimental arrangements to test avolition is a fixed-ratio (FR) schedule of reinforcement. In this schedule, once the animal is initially trained to lever press, it is rewarded after it makes a specified number of lever presses (e.g. FR1, FR5, FR25). Over the course of several sessions, the number of presses required for reward can be increased. The extent to which animals keep pressing in the face of this increasing work requirement gives an index of how motivated they are to obtain the reward and allows researchers to assess the effects of targeted manipulations (Salamone et al. 2003).

In a progressive-ratio (PR) schedule (Hodos 1961), a variant of an FR schedule, animals are rewarded after a certain number of lever presses are emitted, but the work requirement for each reward increases following reward delivery. With each subsequent reward, the effort becomes more and more difficult, until at some point the animal refuses to complete the work requirement. The work requirement at which the animal no longer responds is taken as an index of motivation. Animals are generally trained to first respond on a low requirement FR schedule. Once responding is established, the PR schedule is implemented. During this schedule the response requirement increases according to some mathematical rule following each reward. A number of progression steps have been used, including exponential, geometric and arithmetic progressions with various step sizes (Killeen et al. 2009; Richardson and Roberts 1996). The requirement continues to increase until the animal ceases responding for a set period of time (e.g. 3 min) or until the session has continued for a predetermined amount of time (e.g. 2 h).
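To make the difference between progression rules concrete, the following sketch generates the sequence of response requirements under three illustrative rules. The function names, the assumption that an arithmetic schedule starts at its step size, and the exponential parameters (scale 5, rate 0.2) are choices made here for illustration; individual studies use their own starting values and step sizes.

```python
import math


def arithmetic_pr(step: int, n_rewards: int) -> list[int]:
    """Requirement grows by a fixed step (assumes the first ratio equals the step)."""
    return [step * i for i in range(1, n_rewards + 1)]


def geometric_pr(start: int, factor: int, n_rewards: int) -> list[int]:
    """Requirement is multiplied by a constant factor after each reward."""
    return [start * factor ** i for i in range(n_rewards)]


def exponential_pr(n_rewards: int, scale: float = 5.0, rate: float = 0.2) -> list[int]:
    """Requirement follows round(scale * e^(rate * n) - scale) for reward number n."""
    return [round(scale * math.exp(rate * n) - scale) for n in range(1, n_rewards + 1)]


print(arithmetic_pr(2, 5))    # [2, 4, 6, 8, 10]
print(geometric_pr(1, 2, 5))  # [1, 2, 4, 8, 16]
print(exponential_pr(6))      # [1, 2, 4, 6, 9, 12]
```

Comparing the three sequences shows how quickly the geometric rule outpaces the others, which is why the choice of progression strongly affects how many data points are collected before an animal stops responding.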

An important consideration here is that choosing the right progression is key to identifying differences in motivation using this schedule. Depending on the progression step used, the response requirement and required effort increase either slowly or very rapidly. Choosing too difficult a progression will result in the animal quitting very soon, providing limited data to analyse. This can be particularly problematic if a difference in motivation between groups is subtle. Similarly, choosing too easy a progression may also obscure differences. Thus, it can be helpful to test animals using several different progressions in a parametric design (Simpson et al. 2011).

Several measures of performance on the PR schedule can be used to assay motivation. The most common is the break point, which is the last ratio completed before the animal quits responding or the session ends. This measure gives an indication of the amount of effort an animal is willing to expend for the reward. Other commonly used measures are the total number of lever presses and the number of obtained rewards. Another way to analyse PR data is to present a log-survivor plot, which shows the proportion of animals still working on the schedule as a function of elapsed time or response requirement (e.g. Drew et al. 2007; Simpson et al. 2011). These plots are particularly informative because the data from all subjects are shown and considered, whereas the average measures may obscure individual differences or be unduly influenced by extreme cases.
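The two measures described above can be sketched as follows. This is an illustrative implementation under simple assumptions: the break point is derived from a cumulative press count and a known list of requirements, and each animal is summarized by a single quit time. The function names and example values are hypothetical.

```python
def break_point(requirements: list[int], total_presses: int) -> int:
    """Return the last ratio fully completed given the total presses emitted.

    Returns 0 if the animal did not complete the first ratio.
    """
    used = 0
    last_completed = 0
    for req in requirements:
        if used + req > total_presses:
            break
        used += req
        last_completed = req
    return last_completed


def survivor_curve(quit_times: list[float], time_points: list[float]) -> list[float]:
    """Proportion of animals still responding at each time point."""
    n = len(quit_times)
    return [sum(q > t for q in quit_times) / n for t in time_points]


# A doubling schedule and an animal that emitted 20 presses in total.
print(break_point([1, 2, 4, 8, 16], 20))  # 8

# Hypothetical quit times (min) for four animals.
print(survivor_curve([10, 30, 60, 120], [0, 20, 60, 100]))  # [1.0, 0.75, 0.25, 0.25]
```

Plotting the survivor proportions on a logarithmic axis gives the log-survivor form; because every animal contributes a step to the curve, a single extreme case cannot dominate the summary in the way it can with a group mean.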

3.2 Effort-Related Choice Tasks

Another task that has been widely used to study the neurobiology of reward-motivated behaviour is the effort-based choice task first used by Salamone et al. (1991; for review, see Salamone et al. 2007). This task assesses the trade-off between expending effort to obtain a more-preferred reward and consuming a freely available less-preferred reward. Animals are trained on an FR schedule, which is gradually increased until it is moderately demanding. Working on this schedule is rewarded with a highly preferred reward. Concurrently with the FR schedule, a less-preferred reward (usually home-cage chow) is freely available to the animal, usually in a dish on the floor of the testing chamber. Thus, throughout the session, the animal can make a choice between working for the more-preferred reward or consuming the less-preferred home-cage chow. This task is thought to specifically assay reward-motivated behaviour. By including the choice between the lever and the freely available chow, it can also dissociate goal-directed motivated behaviour from motivation to consume reward. Usually, animals will work more for the preferred reward, and consume less of the freely available chow. By increasing the FR requirement (or manipulating neural functioning in some way), researchers can determine how various conditions or manipulations affect the willingness to expend effort for a preferred reward (Salamone et al. 2007 and see Salamone et al., in this volume).

In addition to the operant effort-based procedure, a T-maze variant of this task has also been used (Salamone et al. 1994; Izquierdo and Belcher 2012). In this procedure, animals are trained in a T-maze and given forced choice trials where they experience both highly preferred and less-preferred rewards at the end of different arms of the maze. On free-choice trials, the rats will generally choose the arm with the highly preferred reward. The effort requirement for this reward can be increased by placing a short barrier (15 cm) in the highly preferred arm. Over trials, the height of the barrier is increased until the rat changes its preference to the less-preferred arm. This task, along with the operant version described above, has been successfully used to clarify the role of dopamine transmission in reward-motivated behaviour (see Salamone et al. 1991, 1994, 2003, 2007; and Salamone et al., in this volume).

4 Assessing Outcome Representation and Flexible Use of Represented Outcomes in Behaviour

As discussed above, motivation is a multifaceted concept, and research has demonstrated the critical involvement of a number of psychological processes in the overt manifestation of a reward-motivated response. Chief among these are processes that allow for the anticipation of rewards. These include the computation of the value of expected outcomes as well as the ability to update this representation with changing contingencies or to flexibly adapt behaviour in different situations. Two methods for assessing the extent to which animals can flexibly use represented outcomes in adaptive behaviour are devaluation and Pavlovian-to-instrumental transfer (PIT) tasks.

A number of variations of the devaluation procedure have been employed, but the basic premise is the same across variations. In devaluation procedures, in separate sessions animals learn to associate a particular response with a particular outcome. For example, pressing one lever during a morning session may result in a chocolate-flavoured food pellet, while pressing a different lever in an afternoon session produces a banana-flavoured pellet. In these sessions, animals learn the association between a particular response (pressing a specific lever) and a particular outcome (pellet flavour). Prior to the test session, one of the rewards is devalued. This devaluation can be accomplished by pairing the reward with gastrointestinal malaise via an injection of lithium chloride to induce a conditioned taste aversion. Alternatively, the animals can be given free access to as much of one type of outcome (usually counterbalanced across test-session days) as they will consume. Directly after devaluation, animals are given access to both response options at the same time for the first time under extinction conditions, where making either response will not be reinforced with any outcome. If the animal has learned the association between a particular response and outcome, and if it updates the value of the outcome and uses the current value to guide behaviour, during this extinction session it will make fewer responses on the response option that normally produces the devalued outcome (Balleine and Dickinson 1998).

In devaluation procedures, an animal learns that different responses produce unique outcomes, and the extent to which these unique outcomes are valued dictates the animal’s willingness to engage in the response during the devaluation test. Representations of unique outcomes can also impact performance in discrimination procedures. In a differential outcomes procedure, animals are trained to make different responses in the presence of different stimuli (Trapold 1970). Under these conditions, animals learn the discrimination much more quickly if correct responses produce different outcomes than if they produce the same outcome. This differential outcomes effect (DOE) provides more evidence that animals can use information about anticipated rewards to guide their behaviour.

Another way to assess the flexibility with which animals use information about the learned value of expected outcomes is the PIT task. In PIT, an animal is first trained in a simple Pavlovian conditioning protocol, in which a stimulus (e.g. a tone or light) predicts the delivery of a reward. Under these conditions, animals will readily learn to make an anticipatory response during the conditioned stimulus. Once this response is established, the animals are trained to make an instrumental lever-press response for the same reward, which they easily learn. The key test takes place in sessions in which the previously learned conditioned stimulus is presented to the animal while it is engaged in the lever-press task. Under these conditions, an animal that has learned the association between the conditioned stimulus and the reward will elevate its lever-press responding in the presence of this cue, indicating anticipation of the reward previously predicted by the cue. This procedure assays the ability of a cue learned in one context (Pavlovian conditioning) to transfer its value and impact behaviour in another context (instrumental lever-press responding). Furthermore, by including stimuli and outcomes that are not explicitly paired with instrumental responding, one can separate the general excitatory effects of reward-predicting stimuli (stimuli associated with reward in the initial training phase but not explicitly paired with instrumental responding) from outcome-specific motivational activation (Corbit and Balleine 2005, 2011; Corbit et al. 2007).
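One common way to quantify the transfer effect is an elevation score: the lever-press rate during the cue minus the rate in an equivalent period just before the cue. The sketch below assumes that press counts and period durations are available for each test; the function name and example values are hypothetical and are not drawn from the studies cited above.

```python
def pit_elevation(pre_cs_presses: int, cs_presses: int,
                  pre_cs_seconds: float, cs_seconds: float) -> float:
    """Return the change in press rate (presses per second) produced by the cue.

    Positive values indicate that the cue elevated instrumental responding
    above the pre-cue baseline.
    """
    if pre_cs_seconds <= 0 or cs_seconds <= 0:
        raise ValueError("period durations must be positive")
    return cs_presses / cs_seconds - pre_cs_presses / pre_cs_seconds


# Hypothetical test: 12 presses in the 60 s before the cue, 30 during a 60 s cue.
print(pit_elevation(12, 30, 60.0, 60.0))
```

Computing this score separately for a cue paired with the same outcome as the lever and for a cue paired with a different outcome gives one way of comparing outcome-specific with general transfer.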

5 Separating Goal-Directed from Arousal Components of Motivation

One of the most difficult aspects of studying and modelling motivation in rodents is separating goal-directed action from general arousal. For example, it is well-documented that amphetamine and other stimulants increase performance on PR schedules (Mayorga et al. 2000; Olausson et al. 2006) and these results are often interpreted as evidence that these substances increase motivation. This interpretation is complicated, however, by the fact that these drugs also reliably increase measures of arousal, including locomotor activity (Hall et al. 2008; McNamara et al. 1993). Thus, pharmacological and, increasingly, genetic manipulations that appear to enhance motivation and goal-directed behaviour may involve an enhancement in arousal and hyperactivity instead of, or in addition to, an increase in motivation.

In tasks such as the FR or PR schedule, long considered the gold standard in terms of assaying motivation, it is impossible to differentiate the contribution of these two separate components to motivated performance. These shortcomings may also be relevant to the treatment of motivational deficits in patients. Motivational impairments produce a significant functional burden in patients, and there are currently no effective treatments. Deficits in goal-directed behaviour are seen as a critical aspect of the impairment in functioning. However, given the failure of most methodologies for studying motivation to separate this aspect of behaviour from general arousal, targeting specific aspects of impaired motivation in preclinical models and translation of efficacious treatment strategies to patients is difficult. Methods for separating these components of motivation will therefore be useful for the development of successful treatment strategies.

In a recent series of elegant experiments, Bailey and colleagues developed and validated a novel behavioural task with the express purpose of dissociating goal-directed from arousal components of motivated behaviour (Bailey et al. 2015). In their method, called the progressive hold down (PHD) task, mice are required to hold a response lever down for a progressively longer period of time after each reward. Bailey and colleagues show that this schedule is sensitive to manipulation of variables that are known to impact motivation, such as food deprivation and increasing sucrose concentration. Most importantly, they characterized the effects of methamphetamine on both PR and PHD performance. In the PR schedule, they replicated the results of numerous prior studies, showing that methamphetamine increased overall lever presses, rewards earned, and time spent working in the session. These results have sometimes been interpreted previously as indicating increased motivation.

When they tested the effects of methamphetamine on performance of the PHD task, they found that it also increased the number of lever presses, but because of the nature of the schedule, this increase in lever presses came at the expense of fewer earned rewards. More careful analysis showed that methamphetamine specifically increased bursts of short-duration presses, which the authors interpreted as indicating a general increase in arousal (hyperactivity), rather than an increase in goal-directed motivation. Thus, using their method, they were able to separate methamphetamine’s effects on goal-directed motivation from its effects on general arousal. Future research using procedures such as this is critical to further elucidation and understanding of the neurobiology of motivation.
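The logic of this dissociation can be sketched with a toy scoring function. This is not the procedure of Bailey et al. (2015): the starting hold requirement, the multiplicative increase after each reward, and the example press durations are all assumptions made here purely for illustration.

```python
def score_phd_session(hold_durations: list[float],
                      start_s: float = 1.0,
                      factor: float = 2.0) -> tuple[int, int]:
    """Score a session in which each reward requires a longer lever hold.

    hold_durations: duration (s) of each lever press, in order of occurrence.
    start_s, factor: hypothetical initial requirement and growth factor.
    Returns (rewards earned, presses that fell short of the requirement).
    """
    required = start_s
    rewards = 0
    short_presses = 0
    for duration in hold_durations:
        if duration >= required:
            rewards += 1
            required *= factor  # the next reward needs a longer hold
        else:
            short_presses += 1
    return rewards, short_presses


# A few sustained holds versus many brief presses (hypothetical data).
sustained = [1.0, 2.0, 3.0, 4.0]
bursty = [0.2] * 10 + [1.0] + [0.3] * 10
print(score_phd_session(sustained))  # (3, 1)
print(score_phd_session(bursty))     # (1, 20)
```

In this toy example the bursty pattern contains far more presses yet earns fewer rewards, which mirrors the reasoning above: on a hold-down schedule, an increase in press count alone does not translate into more earned rewards.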

6 Dissecting Psychological Components of Motivation: An Example Using an Animal Model of the Negative Symptoms of Schizophrenia

Over the past several years, we have undertaken a behavioural characterization of the motivational deficits in a transgenic mouse model of the negative symptoms of schizophrenia. This mouse models the well-replicated finding of increased striatal dopamine D2 receptor activity in patients with schizophrenia (Abi-Dargham et al. 2000). In this model, transgenic expression of the human D2 receptor is directed via the CaMKIIα promoter, resulting in an increase in D2 receptor expression that is limited to the striatum and olfactory tubercle of D2R-OE mice (Kellendonk et al. 2006). Fortuitously, this model has a 15% increase in overall striatal D2 receptor expression, thus recapitulating the degree of increased D2 activity in patients. Importantly, by employing the tetracycline-controlled gene expression system in this model (Mayford et al. 1996), Kellendonk et al. were able to control transgene expression in a temporal manner, thus allowing for excess D2 receptors to be “turned off” by feeding the mice doxycycline. This allows for separation of the acute, or reversible, effects of D2 overexpression from more permanent dysfunction that results from developmental changes.

The initial evidence of a motivational deficit in D2R-OE mice came from an experiment designed to assess interval timing in these mice (Drew et al. 2007). Interval timing deficits have been documented in schizophrenia (Carroll et al. 2009; Gomez et al. 2015), and we wished to determine if striatal D2R overexpression contributed to such timing deficits. We tested mice in the peak procedure (Roberts 1981). In this procedure, on some trials, called fixed interval (FI) trials, animals earn a reward for making a response after a specified interval of time has elapsed since the beginning of the trial. Interval timing can then be assessed on probe, or peak, trials. During these trials, the reward is omitted and the trial continues for much longer than the usual FI trial (three to four times as long). While the rate of responding as a function of time within the trial provides an index of the subject’s timing accuracy (lever pressing should peak at the expected time of reward), the average rate of responding over the course of the entire trial provides a measure of the subject’s motivation. When we tested D2R-OE mice on the peak procedure (see Fig. 1a), we found that they were indeed impaired in timing accuracy (their peak rate of responding was shifted later than the reinforced interval). In addition, they displayed a dramatic reduction in overall response rates, indicative of decreased motivation compared to controls (Drew et al. 2007; Ward et al. 2009). Strikingly, turning off the transgene rescued motivation, as indexed by response rates, and partially rescued timing (Fig. 1a; Drew et al. 2007).
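The two measures drawn from peak trials, the time of peak responding and the overall response rate, can be illustrated with a simple binning routine. The sketch assumes that response times (in seconds from trial onset) have been pooled across peak trials; the function name, the bin width and the example data are hypothetical, and published analyses typically fit a curve to the response-rate function rather than taking the maximum bin.

```python
def peak_trial_summary(response_times: list[float], trial_duration: float,
                       n_trials: int, bin_width: float):
    """Summarize pooled peak-trial responding.

    Returns (rate per bin in responses/s, centre of the bin with the highest
    rate, mean response rate over the whole trial).
    """
    n_bins = int(trial_duration // bin_width)
    counts = [0] * n_bins
    for t in response_times:
        idx = int(t // bin_width)
        if 0 <= idx < n_bins:
            counts[idx] += 1
    rates = [c / (bin_width * n_trials) for c in counts]
    peak_bin = max(range(n_bins), key=rates.__getitem__)
    peak_time = (peak_bin + 0.5) * bin_width
    mean_rate = sum(counts) / (trial_duration * n_trials)
    return rates, peak_time, mean_rate


# Hypothetical responses from a single 60 s peak trial, binned at 10 s.
times = [5, 18, 22, 24, 26, 27, 35, 50]
rates, peak_time, mean_rate = peak_trial_summary(times, 60, 1, 10)
print(peak_time)  # 25.0
print(mean_rate)
```

Keeping the two outputs separate reflects the distinction drawn above: a shift in the peak time speaks to timing accuracy, whereas a change in the mean rate speaks to motivation.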

Fig. 1 (a) Performance of control and D2R-OE mice on the peak-interval procedure (see text for details). The figure shows response rate as a function of time in the peak trials. Data from Drew et al. (2007). (b) Performance of control and D2R-OE mice on the progressive-ratio schedule. The percentage of mice that were still responding on the schedule as a function of time in the session is shown. Also shown are the data from a group of D2R-OE mice in which the transgene was turned off by feeding mice doxycycline. Data from Drew et al. (2007).

To more systematically assess motivation in D2R-OE mice, we first tested them on the PR paradigm (Fig. 1b). We used a schedule in which the work requirement doubled following each reward. We found that D2R-OE mice earned fewer rewards, had lower break points and quit working sooner than controls on this schedule. Thus, D2R-OE mice displayed a motivational deficit. This deficit was rescued when the transgene was turned off (Fig. 1b).

The specific nature of the motivational deficit in D2R-OE mice is not clear from the PR performance. As discussed above, there are any number of psychological processes that combine to produce the outward behavioural outcome of persistence in pressing the lever (e.g. Ward et al. 2011). For example, perhaps D2R-OE mice do not find the reward hedonically satisfying (do not “like” the reward). Data on reward retrieval suggested this was not the case, as D2R-OE mice and controls retrieved and consumed all of their earned rewards (Drew et al. 2007). Furthermore, D2R-OE mice displayed a preference for sucrose equivalent to that of controls in a sucrose preference test (Ward et al. 2012). When we tested D2R-OE mice using the taste-reactivity test described above, we found no difference between D2R-OE mice and controls in several measures of hedonic reactivity to reward (Fig. 2), including positive hedonic reactions to increasing sucrose concentration (Fig. 2a) and increased lick rates as a function of increasing sucrose concentration in a gustometer test (Fig. 2b; Ward et al. 2012).

Fig. 2

a Positive hedonic reactions as a function of increasing sucrose concentration in control and D2R-OE mice. Data from Ward et al. (2012). b Standardized lick ratio (a measure of hedonic reaction) as a function of increasing sucrose concentration for control and D2R-OE mice. Data from Ward et al. (2012). c Number of rewards earned as a function of increases in the progressive-ratio work requirement for control and D2R-OE mice. Data from Simpson et al. (2011). d Number of rewards earned as a function of increases in the progressive-ratio work requirement for a separate group of control and D2R-OE mice in which transgene expression had been turned off by feeding the mice doxycycline. Data from Simpson et al. (2011). e Response rate as a function of time in extinction for control and D2R-OE mice. Lines through the data are the best fits of the negative exponential equation. f The value of the decay parameter (rate of extinction) from the negative exponential equation for all control and D2R-OE mice. In all figures, error bars represent one standard error above and below the mean

Another possible reason why D2R-OE mice could quit working sooner on the PR schedule is that they may satiate or fatigue more easily than controls, and are therefore unwilling or unable to produce the same behavioural output. We addressed this by parametrically manipulating the work requirement in the PR schedule (Simpson et al. 2011). We tested D2R-OE and control mice on PR +1, +2, +5 and +10 schedules (counterbalanced for order). We found that at all schedule requirements, D2R-OE mice pressed less than controls (Fig. 2c). This performance deficit was rescued by turning off the transgene (Fig. 2d). The key data for addressing the question of satiety come from a comparison of rewards earned by the different genotypes during the PR +1 and PR +2 schedules. As shown in Fig. 2c, D2R-OE mice earned the same number of rewards under the PR +1 schedule as control mice did under the PR +2 schedule. Furthermore, the response effort of D2R-OE mice, although lower than that of controls at all PR schedule values, was still modulated by work requirement, indicating that performance deficits were not the result of a ceiling on lever presses or reward consumption. These data indicate that the deficit in PR performance displayed by D2R-OE mice is not caused by satiety or fatigue, because under the right conditions they are capable of working just as hard and consuming just as many rewards as control mice.
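
The logic of the PR +1 versus PR +2 comparison can be made concrete with a short sketch. It assumes, purely for illustration, that each PR +k schedule starts at k presses and adds k presses after every reward; the actual starting values are not given in the text, and the number of rewards used below is hypothetical.

```python
def arithmetic_requirements(step, n_rewards):
    """Requirements under a PR +step schedule, assuming the first requirement equals step."""
    return [step * (i + 1) for i in range(n_rewards)]


def total_presses(step, n_rewards):
    """Total lever presses needed to earn n_rewards under a PR +step schedule."""
    return sum(arithmetic_requirements(step, n_rewards))


# Hypothetical example: presses needed to earn 10 rewards on each schedule.
for step in (1, 2, 5, 10):
    print(f"PR +{step}: {total_presses(step, 10)} presses for 10 rewards")
```

Under these assumptions, earning a given number of rewards on PR +2 takes twice as many presses as on PR +1. Equal reward counts across those two schedules therefore mean equal consumption with unequal work, which is why the comparison speaks to satiety.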

In addition, as the work requirement increases in the PR schedule, so too does the amount of time that an animal must wait to obtain a reward. Perhaps D2R-OE mice are less tolerant of delays to reward? We tested D2R-OE mice on a schedule which was modelled after a PR schedule, except that instead of the response requirement increasing after each reward, the inter-reward interval doubled after each reward, and the mice received a reward if at least one response was made during this interval. Performance of D2R-OE mice was identical to that of controls under these conditions (Simpson et al. 2011).

Also related to the delay between response initiation and reward receipt is the question of whether the behaviour of D2R-OE mice extinguishes (decreases when responses do not produce reward) more quickly than that of control animals. To test this, we trained D2R-OE mice on a variable ratio 25 (VR 25) schedule, in which rewards were given for completing a required number of lever presses. The number of lever presses required for reward varied from trial to trial, with an average requirement of 25 presses. Following stable performance on this schedule, we exposed the mice to one session of extinction in which no rewards were delivered. As shown in Fig. 2e, the overall rate of responding was lower over the course of the entire extinction session for D2R-OE mice than for controls. To assess the rate of extinction, we fit negative exponential equations to the extinction curves from individual mice. These fits yielded a parameter for the decay rate of the functions, a measure of the rate of extinction (Fig. 2f). This analysis showed that, notwithstanding the overall lower response rates displayed by D2R-OE mice, their rate of extinction was not different from that of controls (see Simpson et al. 2011, for similar analysis and results). This result supports the idea that D2R-OE mice are not quitting the PR task early because they are more sensitive to extinction, but are perhaps quitting due to their decreased willingness to work.
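
The extinction analysis can be sketched as follows. The exact form of the negative exponential equation and the fitting method are not given in the text, so this example assumes R(t) = R0 · exp(−k·t) and a simple least-squares fit on log-transformed rates. The data are synthetic and serve only to show that a lower overall response rate (smaller R0) is separable from the rate of extinction (k).

```python
import math


def fit_negative_exponential(times, rates):
    """Fit rate = r0 * exp(-k * t) by least squares on log(rate).

    Returns (r0, k). All rates must be positive, because the fit uses logarithms.
    """
    logs = [math.log(r) for r in rates]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return math.exp(intercept), -slope  # decay rate k is the negated slope


# Synthetic curves (hypothetical values): same decay rate, different starting rates.
times = [0, 5, 10, 15, 20]
control = [40.0 * math.exp(-0.1 * t) for t in times]
lower_rate = [20.0 * math.exp(-0.1 * t) for t in times]

print(fit_negative_exponential(times, control))
print(fit_negative_exponential(times, lower_rate))
```

Fitting each animal separately and then comparing the k values across genotypes mirrors the per-mouse analysis described above. A nonlinear fit on untransformed rates would be a reasonable alternative, particularly when some time bins contain zero responses.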

Performance on the PR schedule also involves the appreciation of the relationship between the animal’s behaviour and the delivery of reward, that is, the contingency between responding and reward. In an unpublished experiment, we tested D2R-OE mice in a contingency-degradation protocol (Hammond 1980; Barker et al. 2014) in which we trained them to respond on a lever for rewards on a VR 25 schedule. This simple schedule was overlaid with a separate, independent schedule which presented free rewards with some specified frequency. Under these conditions, as the frequency of free rewards increases and the contingency between response and reward is degraded, animals usually decrease their rate of responding (Hammond 1980). As the frequency of free rewards increased, rate of responding decreased in both D2R-OE and control mice, but there was no difference in the degree of the decrease, indicating intact contingency appreciation in D2R-OE mice.

It should be noted that none of these results definitively determines the specific nature of the motivational deficit in D2R-OE mice. Thus far, our analyses have only eliminated alternative explanations. To further specify the nature of this deficit, we tested D2R-OE mice on the effort-related choice paradigm (Ward et al. 2012). Similar to the results from the PR schedule, D2R-OE mice responded significantly less for a preferred reward (Fig. 3a). However, they consumed significantly more of the freely available home-cage chow (Fig. 3b). This shift in choice reversed and came to match that of controls when the transgene was turned off (Fig. 3a, b).

Fig. 3

Performance of D2R-OE mice on the effort-based choice task and sensitivity to differences in reward value. a Number of lever presses emitted for the preferred reward by control and D2R-OE mice. b Grams of freely available chow consumed by control and D2R-OE mice. Scored bars represent data from mice in which the transgene had been turned off by feeding the mice doxycycline. Error bars represent one standard error above and below the mean. c Sensitivity to differences in the distribution of rewards across response alternatives during concurrent-schedule testing. Data from Ward et al. (2012)

Thus, the deficit in D2R-OE mice seems to be due to a decreased willingness to work for reward rather than a general decreased motivation for reward. We suggest that this decreased willingness to work is the result of distortions in the cost/benefit computation required for motivated behaviour. This could occur in at least two ways. First, D2R-OE mice could be impaired in their ability to represent the value of the outcome of their work, leading to decreased motivation to engage effort. Second, D2R-OE mice could be impaired in their ability to accurately represent the anticipated cost of the effort required to procure reward, leading to an exaggerated anticipated cost.

Given the critical role of outcome representation in motivated behaviour, we assessed the extent to which D2R-OE mice are able to represent the value of future outcomes (Ward et al. 2012). We exposed control and D2R-OE mice to a concurrent schedule in which we varied the frequency of rewards obtained for responding on two different levers. Mice received on average five rewards per minute (variable interval 20 s schedule) for pressing one lever, and 0.5 rewards per minute (variable interval 120 s schedule) for responding on the other lever. If mice are sensitive to the distribution of rewards across response options in this procedure, their ratio of lever presses will approximately match the ratio of rewards obtained from the two response options (Herrnstein 1961). We found that D2R-OE mice were less sensitive than controls to the distribution of rewards across response options in the concurrent-schedule procedure (Fig. 3c), indicating that D2R-OE mice are less sensitive to the value of response options associated with different reward frequencies.
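
The matching prediction, and one way of quantifying sensitivity to it, can be sketched as follows. The first function implements strict matching (Herrnstein 1961), using the reward rates quoted above as an approximation to obtained rewards. The second estimates sensitivity as the slope of log response ratio against log reward ratio, a generalized-matching formulation that is assumed here for illustration and is not stated in the text as the measure plotted in Fig. 3c. The response ratios are hypothetical.

```python
import math


def matching_prediction(rate_a, rate_b):
    """Strict matching: proportion of presses on lever A equals proportion of rewards from A."""
    return rate_a / (rate_a + rate_b)


def matching_sensitivity(response_ratios, reward_ratios):
    """Slope and bias from a least-squares fit of log10(response ratio) on log10(reward ratio).

    A slope of 1 corresponds to strict matching; smaller slopes indicate
    reduced sensitivity to the reward distribution.
    """
    xs = [math.log10(r) for r in reward_ratios]
    ys = [math.log10(b) for b in response_ratios]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    bias = 10 ** (mean_y - slope * mean_x)
    return slope, bias


# Reward rates quoted in the text: 5 and 0.5 rewards per minute.
print(matching_prediction(5.0, 0.5))

# Hypothetical animal whose response ratios follow the square root of the reward ratios.
reward_ratios = [0.1, 1.0, 10.0]
response_ratios = [r ** 0.5 for r in reward_ratios]
print(matching_sensitivity(response_ratios, reward_ratios))
```

On this reading, an animal that matches perfectly yields a slope near 1, whereas an animal that distributes its presses more evenly than the rewards warrant yields a slope below 1.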

7 Conclusions

In summary, the results of our experiments with the D2R-OE mice indicate that these mice have a deficit in goal-directed motivation. This impairment is not due to decreased hedonic reaction to reward, increased fatigue or satiation, decreased tolerance for delay, decreased sensitivity to contingency or increased susceptibility to extinction. The deficit appears to be due to a decreased willingness to expend effort, which results from a compromised cost/benefit computation, likely due to a deficit in either representing outcomes or using this information to guide behaviour. In fact, recent results indicate that although D2R-OE mice have relatively intact representations of different outcome types, representations learned in one specific situation cannot be used flexibly to support adaptive behaviour in a new situation (Mezias et al., unpublished results).

The work described above with the D2R-OE mice is an example of the way that experimental methods can be leveraged to elucidate psychological mechanisms of motivational deficits in rodent models. By taking advantage of the reversible nature of the genetic manipulation in D2R-OE mice, we were able to gain much ground in uncovering the specific psychological processes that produced their performance deficits. These types of methods can be used in concert with the increasingly sophisticated molecular manipulations developed by researchers to make unprecedented gains in the characterization of rodent models of psychiatric disease. This approach can also be used to investigate novel targets for therapies to enhance motivation in psychiatric disease (Simpson et al. 2011).

In conclusion, as noted above, motivational impairments are critical to functional deficits in a number of psychiatric diseases, and the need for effective treatments is great. Motivated behaviour involves a number of factors and psychological processes, including hedonic functioning, physical effort, satiety, tolerance of delay, the ability to represent and adaptively use information about the value of outcomes, and the ability to represent the effort or time required to procure reward. All of these factors enter into a cost/benefit computation, the outcome of which will determine whether an animal will expend the time and effort required to obtain the reward. By utilizing the wealth of experimental methods available for dissecting the component psychological processes involved in motivation, and continuing to invent new assays, researchers can better assess motivational impairments in animal models of psychiatric disease. Partitioning motivated behaviour into its component psychological processes will deepen and enrich understanding of the separate and sometimes dissociable (as in the case of hedonic reaction and goal-directed motivation; Berridge and Robinson 2003) neurobiological underpinnings of this behaviour, and will lead to more specific and targeted treatment strategies. These improvements will in turn foster more translatable outcomes from preclinical research to clinical populations.