Backpropagation and the brain

Lillicrap, Timothy P.; Santoro, Adam; Marris, Luke; Akerman, Colin J.; Hinton, Geoffrey

doi:10.1038/s41583-020-0277-3

Perspective
Published: 17 April 2020

Volume 21, pages 335–346 (2020)

Timothy P. Lillicrap^1,2,na1 (ORCID: orcid.org/0000-0001-8918-486X),
Adam Santoro^1,na1,
Luke Marris^1,
Colin J. Akerman^3 &
Geoffrey Hinton^4,5

Abstract

During learning, the brain modifies synapses to improve behaviour. In the cortex, synapses are embedded within multilayered networks, making it difficult to determine the effect of an individual synaptic modification on the behaviour of the system. The backpropagation algorithm solves this problem in deep artificial neural networks, but historically it has been viewed as biologically problematic. Nonetheless, recent developments in neuroscience and the successes of artificial neural networks have reinvigorated interest in whether backpropagation offers insights for understanding learning in the cortex. The backpropagation algorithm learns quickly by computing synaptic updates using feedback connections to deliver error signals. Although feedback connections are ubiquitous in the cortex, it is difficult to see how they could deliver the error signals required by strict formulations of backpropagation. Here we build on past and recent developments to argue that feedback connections may instead induce neural activities whose differences can be used to locally approximate these signals and hence drive effective learning in deep networks in the brain.
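
The closing claim of the abstract, that differences between feedback-induced activities can stand in for explicit error signals, can be illustrated with a toy sketch. This is not the authors' model: the network sizes, the tanh hidden layer, the squared-error function, the nudging strength `beta` and, above all, the assumption that the feedback pathway uses the exact transpose of the forward weights are all choices made here so that the match is easy to verify. The feedback pathway only ever carries activity patterns; the error signal appears as the difference between two hidden activity patterns.

```python
import math
import random

def forward(x, W1, W2):
    """Forward pass: tanh hidden layer followed by a linear output layer."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    y = [sum(w * hj for w, hj in zip(row, h)) for row in W2]
    return h, y

def feedback(y, W2):
    """Assumed feedback pathway: sends an output pattern back to the hidden
    layer through the transpose of the forward weights W2."""
    return [sum(W2[k][j] * y[k] for k in range(len(y)))
            for j in range(len(W2[0]))]

def backprop_error_signal(y, target, W2):
    """Exact dE/dh for E = 0.5 * sum((y - target)^2) with a linear output."""
    return feedback([yk - tk for yk, tk in zip(y, target)], W2)

def activity_difference_signal(h, y, target, W2, beta=0.1):
    """Error signal read out as a difference of two hidden activity patterns."""
    # Nudge the output a fraction beta of the way towards the target.
    y_nudged = [yk - beta * (yk - tk) for yk, tk in zip(y, target)]
    # Feedback carries activities, not errors: once for y, once for y_nudged.
    g_free, g_nudged = feedback(y, W2), feedback(y_nudged, W2)
    # Hidden activity induced by feedback in the nudged condition.
    h_nudged = [hj + gn - gf for hj, gn, gf in zip(h, g_nudged, g_free)]
    # The locally available activity difference, rescaled by beta.
    return [(hj - hn) / beta for hj, hn in zip(h, h_nudged)]

# Hypothetical toy problem: 3 inputs, 4 hidden units, 2 outputs.
rng = random.Random(0)
W1 = [[rng.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(4)]
W2 = [[rng.uniform(-0.5, 0.5) for _ in range(4)] for _ in range(2)]
x, target = [0.5, -0.2, 0.1], [0.3, -0.1]

h, y = forward(x, W1, W2)
bp_signal = backprop_error_signal(y, target, W2)
local_signal = activity_difference_signal(h, y, target, W2)
max_gap = max(abs(a - b) for a, b in zip(bp_signal, local_signal))
print("largest mismatch between the two signals:", max_gap)
```

With linear, symmetric feedback the two signals agree up to floating-point rounding. With learned or random feedback weights the agreement would only be approximate, which is the regime the abstract has in mind.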

**Fig. 1: A spectrum of learning algorithms.**

**Fig. 2: Comparison of backprop-trained networks with neural responses in visual ventral cortex.**

**Fig. 3: Target propagation algorithms.**

**Fig. 4: Empirical findings suggest new ideas for how backprop-like learning might be approximated by the brain.**

Author information

These authors contributed equally: Timothy P. Lillicrap and Adam Santoro

Authors and Affiliations

DeepMind, London, UK
Timothy P. Lillicrap, Adam Santoro & Luke Marris
Centre for Computation, Mathematics and Physics, University College London, London, UK
Timothy P. Lillicrap
Department of Pharmacology, University of Oxford, Oxford, UK
Colin J. Akerman
Department of Computer Science, University of Toronto, Toronto, Canada
Geoffrey Hinton
Google Brain, Toronto, Canada
Geoffrey Hinton

Contributions

T.P.L. and A.S. contributed equally to this work. T.P.L., G.H. and A.S. researched data for the article, and T.P.L., G.H., C.J.A. and A.S. wrote the article. The authors all provided substantial contributions to discussion of the content and reviewed and edited the manuscript before submission. The authors contributed equally to all aspects of the article.

Corresponding authors

Correspondence to Timothy P. Lillicrap or Geoffrey Hinton.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information

Nature Reviews Neuroscience thanks Y. Amit, J. DiCarlo, W. Senn and T. Toyoizumi for their contribution to the peer review of this work.

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Glossary

Auto-encoders: Networks showing unsupervised learning in which the target is the input itself. One application of auto-encoding is the training of feedback connections to coherently carry ‘targets’ to earlier layers.
Backpropagation of error (backprop): An algorithm for explicitly computing the changes to prescribe to synapses in deep networks in order to improve performance. It involves the flow of error signals through feedback connections from the output of the network towards the input.
Credit assignment: Determination of the degree to which a particular parameter, such as a synaptic weight, contributes to the magnitude of the error signal.
Deep learning: Learning in networks that consist of hierarchical stacks, or layers, of neurons. Deep learning is especially challenging because it is hard to assign credit to the vast number of synapses situated deep within the network.
Error function: An explicit quantitative measure for determining the quality of a network’s output. It is also frequently called a loss or objective function.
Error signals: Contribution to the error by the activities of neurons situated closer to the output. In backpropagation, these signals are sent backward through the network in order to inform learning.
ImageNet: A large dataset of images with their corresponding word labels. The task associated with the dataset is to guess the correct label for each image. ImageNet has become a de facto standard for measuring the strength of deep-learning algorithms and architectures.
Internal representations: Hidden activity of a network that represents the network’s input data. ‘Useful’ representations tend to be those that efficiently code for redundant features of the input data and lead to good generalization, such as the existence of oriented edges in handwritten digits.
Learning: The modification of network parameters, such as synaptic weights, to enable better performance according to some measure, such as an error function.
Reinforcement learning: Learning in an interactive trial-and-error loop, whereby an agent acts stochastically in an environment and uses the correlations between actions and the accumulated scalar rewards to improve performance.
Supervised learning: Learning in which the error function involves an explicit target. The target tends to contain information that is unavailable to the network, such as ground truth labels.
Target: The desired output of a network, given some input. Deviation from the target is quantified with an error function.
Unsupervised learning: Learning in which the error function does not involve a separate output target. Instead, errors are computed using other information readily available to the network, such as the input itself or the next observation in a sequence.
Weights: Network parameters that determine the strength of neuron–neuron connections. A presynaptic neuron connected to a postsynaptic neuron with a high weight will greatly influence the activity of the postsynaptic neuron, whereas a connection with a low weight will have little influence.
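
Several of the glossary terms above (weights, target, error function, error signals, credit assignment, backprop) can be seen working together in a very small example. The sketch below is illustrative only and is not taken from the article: it assumes a hypothetical network with one input, one tanh hidden unit and one linear output, a half-squared-error function, and made-up names such as `forward`, `backprop` and `train`.

```python
import math


def forward(w1, w2, x):
    """Forward pass: input -> tanh hidden unit -> linear output."""
    h = math.tanh(w1 * x)   # hidden activity (an internal representation)
    y = w2 * h              # network output
    return h, y


def error(y, target):
    """Error function (assumed here): half the squared deviation from the target."""
    return 0.5 * (y - target) ** 2


def backprop(w1, w2, x, target):
    """Credit assignment: gradient of the error with respect to each weight."""
    h, y = forward(w1, w2, x)
    delta_out = y - target                          # error signal at the output
    grad_w2 = delta_out * h                         # credit for the output weight
    delta_hidden = delta_out * w2 * (1.0 - h ** 2)  # error signal sent backward
    grad_w1 = delta_hidden * x                      # credit for the hidden weight
    return grad_w1, grad_w2


def train(w1, w2, x, target, lr=0.1, steps=50):
    """Learning: repeatedly nudge each weight against its error gradient."""
    for _ in range(steps):
        g1, g2 = backprop(w1, w2, x, target)
        w1 -= lr * g1
        w2 -= lr * g2
    return w1, w2
```

Calling `train(0.3, -0.2, 1.0, 0.5)` returns updated weights whose output lies closer to the target than the initial output did. The backward error signal `delta_hidden` depends on the forward weight `w2`, which is the kind of requirement the article describes as hard to reconcile with cortical feedback connections.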

About this article

Cite this article

Lillicrap, T.P., Santoro, A., Marris, L. et al. Backpropagation and the brain. Nat Rev Neurosci 21, 335–346 (2020). https://doi.org/10.1038/s41583-020-0277-3

Accepted: 07 February 2020
Published: 17 April 2020
Issue Date: June 2020
DOI: https://doi.org/10.1038/s41583-020-0277-3
Springer Nature Limited

This article is cited by

A sparse quantized hopfield network for online-continual memory
- Nicholas Alonso
- Jeffrey L. Krichmar
Nature Communications (2024)
Inferring neural activity before plasticity as a foundation for learning beyond backpropagation
- Yuhang Song
- Beren Millidge
- Rafal Bogacz
Nature Neuroscience (2024)
Forward layer-wise learning of convolutional neural networks through separation index maximizing
- Ali Karimi
- Ahmad Kalhor
- Melika Sadeghi Tabrizi
Scientific Reports (2024)
Learning efficient backprojections across cortical hierarchies in real time
- Kevin Max
- Laura Kriener
- Mihai A. Petrovici
Nature Machine Intelligence (2024)
Learning high-level visual representations from a child’s perspective without strong inductive biases
- A. Emin Orhan
- Brenden M. Lake
Nature Machine Intelligence (2024)
