Dopamine Prediction Errors and the Relativity of Value

Sakagami, Masamichi; Tanaka, Shingo

doi:10.1007/978-981-10-0207-6_9

Part of the book series: Advances in Cognitive Neurodynamics (ICCN)

Abstract

Value, or reward prediction, is thought to be generated by neural circuits linking the basal ganglia and the midbrain dopamine area. Midbrain dopamine neurons calculate the difference between estimated and actual rewards and send this reward prediction error to the basal ganglia, particularly the striatum, where value is coded. Value reflects the quality and quantity of the reward, the timing of reward delivery, and other such factors. In many cases, however, the value actually experienced depends on the context in which rewards are given and cannot be explained merely by the physicochemical properties of the reward.

We conducted two kinds of experiments on the reward prediction error in midbrain dopamine neurons of Japanese monkeys to investigate the neuronal mechanisms of relative value coding. In Nomoto et al. [1], we recorded single-unit activity from midbrain dopamine neurons while monkeys performed a random dot discrimination task with an asymmetric reward schedule (one direction was associated with a large reward and the opposite direction with a small reward). We observed biphasic cue-evoked dopamine responses at the times of stimulus detection (early component) and stimulus identification (late component), and both components reflected reward prediction errors. The early component was trial-type independent: it could detect the onset of the cue but could not discriminate the direction of the random dot motion, and it coded the prediction error of the average value of all the cues in the task. The late component was trial-type specific and coded the prediction error of the specific value of the cue. The magnitude of the late component corresponded to the difference between the value of the specific cue and the average value coded by the early component. These results indicate that dopamine prediction error signals are computed from moment-to-moment reward predictions about the cue stimulus, and that the magnitude of the later signal is modulated by that of the preceding one. The dopamine reward prediction error, particularly in its later stages, can therefore be relative, that is, dependent on the context. We can thus postulate that value itself in the basal ganglia may be relative rather than absolute, because value in the basal ganglia depends on the relative reward prediction error signal from dopamine neurons [2].
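
The relationship between the two components can be made concrete with a small numerical sketch. It is an illustration only, not the analysis used in [1]: the cue values, the equal cue probabilities, the zero pre-cue baseline, and the function name two_component_rpe are all assumptions chosen for the example.

```python
def two_component_rpe(cue_values, cue_probs, cue, baseline=0.0):
    """Split a cue-evoked prediction error into early and late parts.

    early: average value of all cues minus the pre-cue prediction
           (the baseline of 0.0 is an assumption of this sketch).
    late:  value of the identified cue minus that average value.
    """
    average_value = sum(cue_probs[c] * cue_values[c] for c in cue_values)
    early = average_value - baseline
    late = cue_values[cue] - average_value
    return early, late


# Hypothetical values in arbitrary units; not taken from the recordings.
cue_values = {"large": 0.75, "small": 0.25}
cue_probs = {"large": 0.5, "small": 0.5}

for cue in cue_values:
    early, late = two_component_rpe(cue_values, cue_probs, cue)
    print(cue, early, late)
```

Under these assumptions the early component is the same for both cues, while the late component is positive for the large-reward cue and negative for the small-reward cue. Changing the average value of the task changes the late component for a cue whose own value is unchanged, which is the sense in which the signal is relative.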

Value can also be modulated by the effort or cost invested in obtaining it. Although recent reports indicate that cost information is transmitted to dopamine neurons, it is still unclear whether a cost that has already been paid modulates the activity of dopamine neurons. We examined whether the response of dopamine neurons to reward-predictive cues was modulated by the cost preceding these cues [3]. Two Japanese monkeys performed a saccade task. After fixating a fixation point, the subjects were required to make a saccade to a condition cue, and then a target appeared. In the high-cost condition, a long fixation on the target was required; in the low-cost condition, only a short fixation was required. After fixating the target, the subjects made a saccade to the reward cue and obtained the reward. While the subjects performed the task, the activity of dopamine neurons was recorded from the midbrain dopamine area (substantia nigra pars compacta, SNc). The dopamine neurons showed phasic responses to the condition cues and the reward cues. The response to the low-cost cue was larger than that to the high-cost cue, and this difference was independent of the subtype of dopamine neuron (value type or salience type). In contrast, the responses to the reward cue after the high-cost condition were larger than those after the low-cost condition, indicating that the paid cost increased the reward prediction error signal. We also showed that the paid cost enhanced learning speed in an exploration task, in which the subjects had to learn the relationship between the reward cue and the reward trial by trial. Furthermore, the paid cost increased the response of the dopamine neurons to the reward itself. From these results, we suggest that cost information is integrated in the dopamine neurons and that the paid cost amplifies the value of the reward.
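
One way to picture how an amplified prediction error could speed learning is a delta-rule update whose error term is scaled by a gain. This is a hypothetical sketch, not the model or analysis of [3]: the gain values, the learning rate, the criterion, and the function name trials_to_criterion are assumptions, and the study does not state that the cost acts multiplicatively.

```python
def trials_to_criterion(gain, alpha=0.1, reward=1.0, criterion=0.9,
                        max_trials=1000):
    """Count rewarded trials needed for the learned value to reach criterion.

    gain scales the prediction error; it is an assumed stand-in for the
    effect of the paid cost, not a measured quantity.
    """
    value = 0.0
    for trial in range(1, max_trials + 1):
        prediction_error = gain * (reward - value)
        value += alpha * prediction_error
        if value >= criterion:
            return trial
    return max_trials


low_cost_trials = trials_to_criterion(gain=1.0)   # unscaled prediction error
high_cost_trials = trials_to_criterion(gain=1.5)  # amplified error (assumed)
print(low_cost_trials, high_cost_trials)
```

With a larger gain the value reaches the criterion in fewer trials, which mirrors, in a purely qualitative way, the faster learning reported after the high-cost condition.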

According to recent studies on value generation in the nigrostriatal circuit, value learning can be explained by a relatively simple reinforcement learning model [4–6]. This means that the learned value depends basically on the experienced probability of the contingency between an event (stimulus or response) and an outcome. Dopamine neurons calculate the prediction error so that the value can be updated appropriately. Because dopamine neurons calculate the prediction error moment to moment, and because circumstances are usually structured rather than simple, the moment-to-moment prediction error signal is relative, not absolute. We can therefore postulate that the value coded in the basal ganglia, which receive the prediction error signal from dopamine neurons, is also relative; this remains to be clarified in future work.
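
The simple reinforcement learning account can be sketched with a delta-rule (Rescorla-Wagner style) update, in which the value moves toward each outcome by a fraction of the prediction error. This is a generic textbook form used for illustration, not necessarily the specific models of [4–6]; the learning rate, the outcome sequence, and the name learn_value are assumptions.

```python
def learn_value(outcomes, alpha=0.05, initial_value=0.0):
    """Update a single cue value from a sequence of outcomes."""
    value = initial_value
    for reward in outcomes:
        prediction_error = reward - value   # actual minus predicted reward
        value += alpha * prediction_error   # move value toward the outcome
    return value


# Hypothetical cue rewarded on 3 of every 4 trials
# (experienced reward probability 0.75).
outcomes = [1.0, 1.0, 1.0, 0.0] * 100
learned = learn_value(outcomes)
print(round(learned, 2))
```

With these settings the learned value settles close to the experienced reward probability of 0.75, fluctuating slightly with the most recent outcomes.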

References

1. Nomoto, K., Schultz, W., Watanabe, T., Sakagami, M.: Temporally extended dopamine responses to perceptually demanding reward-predictive stimuli. J. Neurosci. 30(32), 10692–10702 (2010)
2. Schultz, W.: Multiple dopamine functions at different time courses. Annu. Rev. Neurosci. 30, 259–288 (2007)
3. Tanaka, S., Sakagami, M.: The enhancement of the reward prediction error signal in the midbrain dopamine neuron by the cost paid for the reward. Abstract retrieved from Abstracts in Society for Neuroscience (Abstract No. 776.01/LLL19) (2013)
4. Daw, N.D., Niv, Y., Dayan, P.: Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control. Nat. Neurosci. 8(12), 1704–1711 (2005)
5. Rangel, A., Camerer, C., Montague, R.: A framework for studying the neurobiology of value-based decision-making. Nat. Rev. Neurosci. 9(7), 545–556 (2008)
6. Samejima, K., Ueda, Y., Doya, K., Kimura, M.: Representation of action-specific reward values in the striatum. Science 310(5752), 1337–1340 (2005)

Author information

Authors and Affiliations

Brain Science Institute, Tamagawa University, Tokyo, Japan
Masamichi Sakagami & Shingo Tanaka

Corresponding author

Correspondence to Masamichi Sakagami.

Editor information

Editors and Affiliations

Inst for Cognitive Neurodynamics, East China Univ of Sci and Tech, Shanghai, China
Rubin Wang
Inst for Cognitive Neurodynamics, East China Univ of Sci and Tech, Shanghai, China
Xiaochuan Pan

About this paper

Cite this paper

Sakagami, M., Tanaka, S. (2016). Dopamine Prediction Errors and the Relativity of Value. In: Wang, R., Pan, X. (eds) Advances in Cognitive Neurodynamics (V). Advances in Cognitive Neurodynamics. Springer, Singapore. https://doi.org/10.1007/978-981-10-0207-6_9

DOI: https://doi.org/10.1007/978-981-10-0207-6_9
Published: 30 January 2016
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-0205-2
Online ISBN: 978-981-10-0207-6
eBook Packages: Biomedical and Life Sciences (R0)
