Abstract
During learning, the brain modifies synapses to improve behaviour. In the cortex, synapses are embedded within multilayered networks, making it difficult to determine the effect of an individual synaptic modification on the behaviour of the system. The backpropagation algorithm solves this problem in deep artificial neural networks, but historically it has been viewed as biologically problematic. Nonetheless, recent developments in neuroscience and the successes of artificial neural networks have reinvigorated interest in whether backpropagation offers insights for understanding learning in the cortex. The backpropagation algorithm learns quickly by computing synaptic updates using feedback connections to deliver error signals. Although feedback connections are ubiquitous in the cortex, it is difficult to see how they could deliver the error signals required by strict formulations of backpropagation. Here we build on past and recent developments to argue that feedback connections may instead induce neural activities whose differences can be used to locally approximate these signals and hence drive effective learning in deep networks in the brain.
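As a rough illustration of the closing claim, the following Python sketch (not taken from the article) uses a toy chain of one input, one hidden unit and one output. The function names, the nudging strength `beta` and the assumption that feedback reuses the forward weight `w2` are all illustrative choices. Feedback here carries output activity rather than an explicit error, and the difference between the hidden activity in a free phase and in a target-nudged phase recovers the backpropagated error signal. In this toy case with linear feedback the match is exact by construction; in a real network it would at best be approximate.

```python
import math

def forward(x, w1, w2):
    """Forward pass of a toy chain: x -> h = tanh(w1 * x) -> y = w2 * h."""
    h = math.tanh(w1 * x)
    y = w2 * h
    return h, y

def backprop_hidden_error(y, t, w2):
    """Explicit backprop error signal at the hidden unit: dE/dh for E = 0.5 * (y - t)**2."""
    return w2 * (y - t)

def activity_difference(h, y, t, w2, beta):
    """Hypothetical two-phase scheme in which feedback carries activity, not error.

    Free phase: the hidden activity h comes from the forward pass.
    Nudged phase: the output is moved a fraction beta toward the target, and
    the resulting change in output activity is fed back through w2
    (assumed here to mirror the forward weight).
    """
    y_nudged = y - beta * (y - t)
    h_nudged = h + w2 * (y_nudged - y)
    # The scaled difference between the two hidden activities is the local signal.
    return (h - h_nudged) / beta

x, t = 0.8, 0.3          # input and target (arbitrary illustrative values)
w1, w2 = 0.5, -1.2       # forward weights (arbitrary illustrative values)

h, y = forward(x, w1, w2)
delta_bp = backprop_hidden_error(y, t, w2)
delta_diff = activity_difference(h, y, t, w2, beta=0.1)
print(delta_bp, delta_diff)
```

The design point is that nothing in `activity_difference` transmits `y - t` as a separate error channel to the hidden unit; the hidden unit only sees two activity states and compares them locally.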
Author information
Contributions
T.P.L. and A.S. contributed equally to this work. T.P.L., G.H. and A.S. researched data for the article, and T.P.L., G.H., C.J.A. and A.S. wrote the article. The authors all provided substantial contributions to discussion of the content and reviewed and edited the manuscript before submission. The authors contributed equally to all aspects of the article.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Peer review information
Nature Reviews Neuroscience thanks Y. Amit, J. DiCarlo, W. Senn and T. Toyoizumi for their contribution to the peer review of this work.
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Glossary
- Auto-encoders: Networks showing unsupervised learning in which the target is the input itself. One application of auto-encoding is the training of feedback connections to coherently carry ‘targets’ to earlier layers.
- Backpropagation of error (backprop): An algorithm for explicitly computing the changes to prescribe to synapses in deep networks in order to improve performance. It involves the flow of error signals through feedback connections from the output of the network towards the input.
- Credit assignment: Determination of the degree to which a particular parameter, such as a synaptic weight, contributes to the magnitude of the error signal.
- Deep learning: Learning in networks that consist of hierarchical stacks, or layers, of neurons. Deep learning is especially difficult because of the difficulty inherent in assigning credit to a vast number of synapses situated deep within the network.
- Error function: An explicit quantitative measure for determining the quality of a network’s output. It is also frequently called a loss or objective function.
- Error signals: Contribution to the error by the activities of neurons situated closer to the output. In backpropagation, these signals are sent backward through the network in order to inform learning.
- ImageNet: A large dataset of images with their corresponding word labels. The task associated with the dataset is to guess the correct label for each image. ImageNet has become a de facto standard for measuring the strength of deep-learning algorithms and architectures.
- Internal representations: Hidden activity of a network that represents the network’s input data. ‘Useful’ representations tend to be those that efficiently code for redundant features of the input data and lead to good generalization, such as the existence of oriented edges in handwritten digits.
- Learning: The modification of network parameters, such as synaptic weights, to enable better performance according to some measure, such as an error function.
- Reinforcement learning: Learning in an interactive trial-and-error loop, whereby an agent acts stochastically in an environment and uses the correlations between actions and the accumulated scalar rewards to improve performance.
- Supervised learning: Learning in which the error function involves an explicit target. The target tends to contain information that is unavailable to the network, such as ground truth labels.
- Target: The desired output of a network, given some input. Deviation from the target is quantified with an error function.
- Unsupervised learning: Learning in which the error function does not involve a separate output target. Instead, errors are computed using other information readily available to the network, such as the input itself or the next observation in a sequence.
- Weights: Network parameters that determine the strength of neuron–neuron connections. A presynaptic neuron connected to a postsynaptic neuron with a high weight will greatly influence the activity of the postsynaptic neuron, and vice versa.
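To make several of the glossary terms concrete (error function, error signals, credit assignment, weights and learning), here is a minimal Python sketch, not drawn from the article, of backprop in a hypothetical two-layer network with tanh hidden units. The network size, weight values, learning rate and function names are all assumptions chosen for illustration. The sketch checks backprop's credit assignment for one deep weight against a finite-difference perturbation, then takes a single gradient step.

```python
import math

def forward(W1, w2, x):
    """Two-layer network: h = tanh(W1 x), y = w2 . h."""
    h = [math.tanh(sum(W1[i][j] * x[j] for j in range(len(x))))
         for i in range(len(W1))]
    y = sum(w2[i] * h[i] for i in range(len(h)))
    return h, y

def error(W1, w2, x, target):
    """Error function: half the squared deviation of the output from the target."""
    _, y = forward(W1, w2, x)
    return 0.5 * (y - target) ** 2

def backprop(W1, w2, x, target):
    """Return (dE/dW1, dE/dw2) by sending the error signal backward."""
    h, y = forward(W1, w2, x)
    e_out = y - target  # error signal at the output
    grad_w2 = [e_out * h[i] for i in range(len(h))]
    # Error signal at each hidden unit: output error weighted by w2, times tanh'.
    e_hid = [e_out * w2[i] * (1.0 - h[i] ** 2) for i in range(len(h))]
    grad_W1 = [[e_hid[i] * x[j] for j in range(len(x))] for i in range(len(h))]
    return grad_W1, grad_w2

# Arbitrary illustrative weights, input and target.
W1 = [[0.2, -0.4], [0.7, 0.1]]
w2 = [0.5, -0.3]
x, target = [1.0, 2.0], 0.8

grad_W1, grad_w2 = backprop(W1, w2, x, target)

# Credit assignment check: perturb one first-layer weight and compare the
# measured change in error with the value backprop assigned to that weight.
eps = 1e-6
W1_plus = [row[:] for row in W1]
W1_plus[0][1] += eps
W1_minus = [row[:] for row in W1]
W1_minus[0][1] -= eps
numeric = (error(W1_plus, w2, x, target) - error(W1_minus, w2, x, target)) / (2 * eps)

# Learning: one small step against the gradient, compared with the starting error.
lr = 0.1
E_before = error(W1, w2, x, target)
W1_new = [[W1[i][j] - lr * grad_W1[i][j] for j in range(2)] for i in range(2)]
w2_new = [w2[i] - lr * grad_w2[i] for i in range(2)]
E_after = error(W1_new, w2_new, x, target)
print(grad_W1[0][1], numeric, E_before, E_after)
```

The finite-difference check perturbs weights one at a time, which is why it scales poorly to a vast number of synapses; backprop obtains every entry of `grad_W1` and `grad_w2` from a single backward sweep.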
Cite this article
Lillicrap, T.P., Santoro, A., Marris, L. et al. Backpropagation and the brain. Nat Rev Neurosci 21, 335–346 (2020). https://doi.org/10.1038/s41583-020-0277-3
DOI: https://doi.org/10.1038/s41583-020-0277-3
- Springer Nature Limited