Introduction

The cerebellum is involved in the acquisition of procedural memory, and several attempts have been made to link cerebellar learning to the underlying neuronal circuit mechanisms. The first hypothesis was proposed within the Motor Learning Theory, which indicated that some form of long-term depression (LTD) or long-term potentiation (LTP) [1, 2] had to occur at the parallel fiber–Purkinje cell (PF-PC) synapse under guidance of the climbing fibers (CFs), which were assumed to convey an error signal. Following the demonstration that a PF-PC LTD compatible with theory actually existed [3], many other works have reported that several forms of synaptic and nonsynaptic plasticity exist in the cerebellum. Now, synaptic plasticity is known to be distributed in the granular layer, molecular layer, and deep cerebellar nuclei (DCN) [4--6], involving both excitatory and inhibitory synaptic transmission as well as neuronal intrinsic excitability. Most of these different forms of plasticity eventually impinge on three main neurons, namely granule cells (GrCs), Purkinje cells (PCs), and DCN cells, which act as nodes integrating excitatory and inhibitory plasticity.

These new findings have complicated rather than clarified the issue of how the cerebellum might learn and store information using its internal circuitry. At present, there is still no agreement about the type of information conveyed by the climbing fibers into the cerebellum or about their potential role. The Marr-Albus theory maintains that climbing fibers carry either an error signal related to directional information [7] or a binary teaching signal [8, 9]. Conversely, considering the periodic nature of climbing fiber activity, others [10] maintain that inferior olive (IO) activity is related to the timing of movement. However, investigations in which this periodicity was not observed [11] suggested that the climbing fiber activity was correlated with the onset of movements. The controversy extends to IO functional properties, which are not yet univocally defined [12--14]. Finally, the different plasticity mechanisms recently observed in the cerebellum and related nuclei suggest that motor learning may not be exclusively related to climbing fiber activity [6, 15--17]. Alternative hypotheses have suggested an important role for plasticity in the DCN [18] or in the vestibular nuclei [19]. However, no clues were given on how to integrate the roles of all the different plasticity mechanisms into a coherent view.

When trying to address this issue, a fundamental question emerges: how could the role of multiple plasticity mechanisms be determined within a complex system of circuit loops transporting feedback signals related to ongoing behavior? Recently, the problem has been addressed through two series of experiments, in which the cerebellar circuit was engaged in learning tasks during closed-loop signal processing.

In a first set of tests, eyeblink classical conditioning (EBCC) was elicited in humans, and its effectiveness was impaired using transcranial magnetic stimulation (TMS) [20, 21], which proved able to alter specific learning components and cerebellar subcircuits. In the second set of tests, the cerebellar circuit was reconstructed using detailed models of neurons and synapses [22]. Then, the models were adapted and inserted into robotic control systems capable of reproducing the same behaviors that are known to engage cerebellar learning in living beings [23--28]. These robotic tests allowed a direct assessment of the way the cerebellum might use distributed plasticity to process incoming information and generate an internal memory useful to drive sensori-motor adaptation.

Distributed Plasticity in the Cerebellar Network

Recent reviews have dealt with the multiple forms of long-term plasticity (at least 15 synaptic and 3 of intrinsic excitability) discovered in the cerebellar circuit [4--6, 29], which are briefly summarized here (Fig. 1):

Fig. 1

Distributed plasticity in the olivo-cerebellar circuit. This schematic view shows the main architecture of cerebellar microcircuits. Inputs from mossy fibers (MFs, in red), parallel fibers (PFs, in red), and inferior olive (IO) projections (climbing fibers, CFs, in orange) provide the excitatory drive, while the inhibitory connections are shown in blue. In particular, the granular layer and the molecular layer include an inhibitory loop mediated by local interneurons (Golgi cell, GoC, and molecular layer interneuron, MLI, respectively), while the whole cerebellar cortex acts as the inhibitory loop to the deep cerebellar nuclei (DCN) neurons, through the Purkinje cell (PC) connection. MFs and CFs project both to the cerebellar cortex and to the DCN neurons. MFs contact granule cells (GrCs) and send collaterals to inhibitory GoCs. GrCs originate the PFs that make synaptic contact with PCs, MLI, and GoCs (originating a granular layer feedback loop). The figure highlights the major forms of plasticity reported experimentally in the cerebellar network: synaptic long-term potentiation (LTP), synaptic long-term depression (LTD), and plasticity of intrinsic excitability (ie). At the PF-PC connection, the forms of presynaptic LTP or LTD (pre LTP, pre LTD) and postsynaptic LTP or LTD (post-LTP, post-LTD) are indicated (color figure online)

  • In the granular layer, synaptic plasticity has been reported to occur at the mossy fiber (MF)–granule cell (GrC) relay as LTP [30--32] and LTD [33, 34]. LTP and LTD have also been observed in vivo [35, 36]. MF-GrC LTP proved dependent on NMDAR activation [30] and showed a presynaptic expression probably mediated by NO release from GrCs [37, 38]. According to the Bienenstock-Cooper-Munro (BCM) plasticity rule [39], LTP and LTD induction correlated with stimulus duration and frequency through a postsynaptic calcium-dependent mechanism [33, 34] with a sliding threshold controlled by neuromodulators [40]. Forms of plasticity in the Golgi cell inhibitory loop remain hypothetical at present (except for some evidence for LTD at the PF-GoC synapse [41]), although modeling predictions suggested that they may provide a powerful regulatory mechanism [27].

  • In the molecular layer, synaptic plasticity has been described in multiple forms at the parallel fiber to Purkinje cell (PF-PC) synapse, the climbing fiber to Purkinje cell (CF-PC) synapse, the parallel fiber to molecular layer interneuron (PF-MLI) synapse, and the molecular layer interneuron to Purkinje cell (MLI-PC) synapse.

    At the PF-PC connection, several forms of plasticity have been observed: presynaptic LTP [42, 43], presynaptic LTD [44], postsynaptic LTP [45, 46], and postsynaptic LTD [47--49]. The postsynaptic forms of LTP and LTD have been reported to be bidirectional according to an inverse BCM plasticity rule. Moreover, while postsynaptic PF-PC LTD is generally assumed to require paired climbing fiber (CF) activation, this may not be an absolute requirement in some cases [17]. Although PF-PC plasticity has been observed in vivo [50--52], it is not clear whether all these forms of plasticity are present in vivo and cooperate in regulating PC activity state.

    Climbing fiber–Purkinje cell (CF-PC) plasticity has been suggested to play a pivotal role in controlling the PF-PC state of activity. CF-induced complex spikes in PCs are an important source of intracellular calcium that can determine the direction of plasticity at the PF-PC synapse. Indeed, CF-PC LTD [53] was shown to affect the probability of postsynaptic LTP and LTD induction at the PF-PC synapses [45].

    PF-MLI LTP [54] and LTD [55], respectively pre- and postsynaptically expressed, have been described. Interestingly, PF-MLI LTP may be induced by paired activation of PFs and CFs in vivo [56]. A form of MLI-PC LTP has been reported to depend on the CF-induced rebound potentiation of inhibitory currents in PCs [57].

    As far as the molecular mechanisms of molecular layer plasticity are concerned, the involvement of NMDARs and NO was reported at PF-PC, CF-PC, and PF-MLI synapses [58--61].

  • In the DCN, several forms of synaptic plasticity have been described, at MF-DCN and PC-DCN synapses. A MF burst that precedes a DCN post-inhibitory rebound depolarization (consequent to PC activation) induces a synapse-specific MF-DCN LTP [62]. This induction protocol mimics the predicted time-course of excitation and inhibition during eyeblink conditioning. Interestingly, MF-DCN LTP has been shown to depend on the timing of the two different signals acting independently, rather than being a coincidence detector enabled by reaching a calcium threshold. This mechanism is likely to be adequate to allow adaptive plasticity during associative learning tasks [18, 63]. Moreover, a form of calcium-dependent MF-DCN LTD has been described [64].

    At the PC-DCN connection, both LTP [65, 66] and LTD [67] have been observed. LTP and LTD appeared to depend on NMDAR activation and on an increase in postsynaptic intracellular calcium. As a consequence, plasticity at these synapses strongly depends on the activation level of excitatory (MF and CF) synapses [65--67].
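
The BCM-type behavior mentioned above for the MF-GrC relay can be illustrated with the textbook form of the BCM rule. The sketch below is not the formalism used in the cited studies: the function name `bcm_update`, the learning rate `lr`, and the threshold time constant `tau_theta` are hypothetical, and the scalar `post` stands in loosely for the postsynaptic calcium-dependent signal. In the classical rule the threshold slides with the recent average of squared postsynaptic activity, whereas the text reports a threshold controlled by neuromodulators, so the threshold dynamics here are an assumption.

```python
# Textbook BCM rule (illustrative only; names and values are hypothetical).
def bcm_update(w, pre, post, theta, lr=0.01, tau_theta=10.0):
    """Return the updated weight and sliding threshold after one step.

    w      : current synaptic weight
    pre    : presynaptic activity (e.g., mossy fiber rate, arbitrary units)
    post   : postsynaptic activity (stand-in for the calcium-dependent signal)
    theta  : modification threshold separating LTD (post < theta) from LTP
    """
    # The weight change is positive (LTP) above threshold, negative (LTD) below it.
    dw = lr * pre * post * (post - theta)
    # The threshold relaxes toward the squared postsynaptic activity.
    theta_new = theta + (post ** 2 - theta) / tau_theta
    return w + dw, theta_new


if __name__ == "__main__":
    w, theta = 1.0, 1.0
    for post in (0.5, 2.0):
        w_new, theta_new = bcm_update(w, pre=1.0, post=post, theta=theta)
        print(f"post={post}: dw={w_new - w:+.4f}, theta -> {theta_new:.3f}")
```

With a fixed threshold, weak postsynaptic activation yields depression and strong activation yields potentiation; the sliding threshold then shifts the crossover point with the history of activity.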

Special Issues in Plasticity Regulation and Control

The identification of the different forms of plasticity, mostly through experiments carried out in brain slices, is fundamental to understanding the possible mechanisms at work. However, understanding how plasticity is controlled is critical to determining when these mechanisms are called into play and, in most cases, this requires experiments in vivo. The identification of mechanisms in vivo is less precise than in brain slices but, in return, the interplay of numerous distributed mechanisms can be appreciated. The integrated analysis of these results is beginning to provide a picture of the potential impact of plasticity in the cerebellar network.

Plasticity mechanisms in the granular layer may serve to improve spatiotemporal recoding of MF inputs into specific GrC spike patterns (expansion recoding [68]). Overall, synaptic plasticity in the molecular layer may serve to store correlated granular layer spike patterns, conveyed through PF activation, under the CF teaching signal [69]. Synaptic plasticity in the DCN may serve to store MF spike patterns [62, 70] depending on control signals generated through the cerebellar cortical loop. Recent works [18, 71, 72] have suggested the importance of MF-DCN and PC-DCN plasticity in controlling cerebellar learning in eyeblink conditioning and the vestibulo-ocular reflex (VOR). Moreover, long-term changes in intrinsic excitability in GrCs [73], PCs [74, 75], and DCN cells [76, 77] could further regulate the global activity level in these neurons, contributing to homeostasis and plasticity in the circuit (e.g., see [78]).

The temporal input patterns could play a relevant role in determining where and how plasticity is generated in the cerebellar circuit. The cerebellar neurons are designed to accurately process temporal patterns, and the synapses can decode these patterns through various forms of spike-timing-dependent plasticity (STDP). Granule cells are designed for high temporal precision [79] and can control output spike patterns in the millisecond range [68, 80], also exploiting their own plasticity mechanisms [35, 79, 81]. These patterns in turn are critical for regulating PC activation and plasticity [68]. Additional control over PC plasticity can be exerted by the IO, reflecting the variability of burst duration [82, 83]. Finally, PCs are endowed with complex mechanisms of coincidence detection, which integrate the burst patterns conveyed by granule cells, the inhibitory control of MLIs, and the signals conveyed by the IO, fine-tuning spike bursts and pauses at their output [84, 85]. Another important but still puzzling aspect is the role of activity oscillations and resonance, from low to high frequencies, which could be instrumental in implementing STDP rules in cerebellar subsections [86].

Complex Spatiotemporal Dynamics of Cerebellar Learning

The nature of cerebellar learning is complex, and different components and properties have been revealed in experiments in animals and humans. A leading hypothesis is that cerebellar learning is composed of two phases [87, 88]: the fast reversible learning phase is thought to occur in the cerebellar cortex, while persistent memory should then be stored in deeper structures, for example, the DCN. A useful test for exploring cerebellar learning is the EBCC reflex. An unconditioned stimulus (US, like a corneal touch or an electrical stimulus on the supraorbital nerve) elicits an eyeblink. This can be associated with a conditioned stimulus (CS, like a tone) to elicit a blink with a precise time relationship to the US. The EBCC is useful as it involves prediction of an event with precise timing through associative learning, thereby summarizing in an elementary form the essential elements of cerebellar functioning [89].

The involvement of the cerebellar cortex in EBCC was previously suggested by experiments in which the GABA-A receptor agonist muscimol was infused to transiently inactivate local circuit functions in rats. Infusion of muscimol in the posterior cerebellar cortex (lobule HVI) was effective after short (5–45 min) [90] but not after longer delays (90 min) [91]. Conversely, muscimol infusion in the anterior interpositus nucleus just after training was poorly effective. These experiments suggested that learning was transferred quite early from a cortical into a nuclear neuronal site.

In recent experiments (Fig. 2), EBCC has been elicited, and then its components have been disrupted using TMS in humans [20, 21]. Consistent with animal experiments, TMS applied just after training (5–10 min) affected the transient phase of learning. The cellular mechanisms of EBCC learning are thought to depend on long-term synaptic plasticity at cortical and DCN synapses [90--92]. The parallel fiber–Purkinje cell synapse is strategically located at the convergence between the mossy fiber–parallel fiber pathway (carrying the CS) and the climbing fiber pathway (carrying the US). Another site of convergence is the DCN, which collects both mossy fiber and climbing fiber signals, in addition to being modulated by Purkinje cells [4].

Fig. 2

EBCC in two-session protocols reveals multiple learning mechanisms. EBCC was induced in human subjects in a two-session protocol. The first EBCC training was followed by a second identical session 1 week later. Just after the first session, in a group of subjects, a theta-burst TMS protocol was applied on the cerebellum. a The EBCC is a reflex in which the olivo-cerebellar system operates in closed loop. The unconditioned stimulus (US) is an electrical stimulus to the supraorbital nerve and is conveyed to the sensory trigeminal nucleus (V). The conditioned stimulus (CS) corresponds to a tone. CS and US coterminate (“delay” EBCC). The olivo-cerebellar circuit learns to produce conditioned responses (CRs), i.e., an eyelid blink anticipating the US onset. In this system, the movement is triggered by the stimulus and can be subsequently corrected in the nuclei of the facial nerve (VII) by the cerebellar intervention. The US is conveyed to the IO and generates CF signals, and the CS is conveyed through the auditory system and generates MF signals. No loop between cerebellum and cerebral cortex is required. The eyelid muscles and skin also convey proprioceptive and exteroceptive signals to MFs. b Number of CRs (%) along trials (six acquisition blocks followed by an extinction block). Subjects progressively learnt to generate CRs anticipating the US, to rapidly extinguish them, and to consolidate the learnt association to be exploited in the retest session (sham indicates an ineffective stimulation)

At both sites, long-term synaptic plasticity has been suggested to play important roles in EBCC [93]. In particular, cortical plasticity has been associated with the fast learning process and DCN plasticity with the slow learning process [26, 72]. Thus, the effect of TMS is compatible with disruption of cortical rather than DCN plasticity. Given the distributed nature of cerebellar cortical plasticity, a working hypothesis is that TMS acted at multiple cortical sites [4]: (i) in the granular layer, on N-methyl-d-aspartate (NMDA) receptor-dependent LTP and LTD at the MF-GrC synapses as well as on long-lasting changes in granule cell intrinsic excitability; and (ii) in the molecular layer, on various forms of NMDA receptor-independent LTP and LTD at PF-PC synapses, at climbing fiber–Purkinje cell synapses, and at molecular layer interneuron synapses, as well as on long-lasting changes in Purkinje cell intrinsic excitability.

Models of Cerebellar Synaptic Plasticity

In order to conceptualize the different forms of cerebellar plasticity, a set of four simplified rules has recently been proposed following the main biological properties reported above (cf. Fig. 1). All these plasticity rules were conceived to be bidirectional and have been based on simplified formalisms. These rules have been rescaled and assigned to specific synapses and cerebellar microcomplexes (i.e., the morpho-functional units into which the cerebellar model has been subdivided) in order to deal with the complexity that neurorobotic tasks imposed (see below). The equations are constructed to generate a variation of synaptic strength depending on the difference between the LTP and LTD terms, which have their own maximum size and rate of change. Additional terms represent the dependence of LTP and LTD on other critical processes, like activity in certain neurons and synapses. In the convention used in the simulations, LTPmax and LTDmax represent the maximum percentage changes of LTP and LTD and are related to the corresponding changes in synaptic currents. The time constant α represents the rate of decay of plasticity after having been established and is related to the physiological time-course of plasticity based on observations in vitro and in vivo. The time-dependent terms (e.g., O(t)) are related to the average firing rate of a given neuronal population during the simulation.

  1.

    PF-PC synaptic plasticity is, by far, the most investigated cerebellar plasticity mechanism, as evidenced by the large number of studies supporting the existence of multiple forms of LTD [3, 45, 94] and LTP [6, 45, 94]. Evidence of this PF-PC plasticity trace was recently found in both anesthetized (Ramakhrishan and D’Angelo, unpublished observations) and alert animals [95]. The best-known form of LTD is heterosynaptically driven by CF activity and therefore by the complex spikes (CSs) elicited in PCs, whereas the main form of LTP does not require CF activity and is therefore related to the simple spikes generated by PF activity. The specific formalism developed to describe the PF-PC plasticity rule depended on the model adopted to describe the cerebellar granular layer. Assuming that PFs were active following a certain time sequence during movement [96--98], PF-PC synaptic plasticity could be implemented as follows [26, 28]:

    $$ \begin{array}{l} \varDelta W_{PF_j-PC_i}(t)=\begin{cases} \dfrac{LTP_{Max}}{\left(IO_i(t)+1\right)^{\alpha}}-LTD_{Max}\cdot IO_i(t), & \text{if } PF_j \text{ is active at } t \\ 0, & \text{otherwise} \end{cases} \\ \text{where } i\in \left\{1,2,\dots,\text{Number of microcomplexes}\right\} \end{array} $$

    where \( \varDelta W_{PF_j-PC_i}(t) \) represents the weight change between the jth PF and the target PC associated with the ith microcomplex, \( IO_i(t) \) stands for the current activity coming from the associated climbing fiber, \( LTP_{Max} \) and \( LTD_{Max} \) are the maximum long-term potentiation/long-term depression (LTP/LTD) values, and α is the LTP decaying factor. This rule assumes that LTP and LTD coexist at the same PF-PC synapse. Since LTP and LTD, by definition, induce opposite effects in relation to CF activity, providing the mathematical expression with appropriate parameters makes the synaptic weight variation positive (LTP) when CF activity approaches 0 (low error levels in the movement) and negative (LTD), dominated by a term linearly proportional to CF activity, otherwise. In previous approaches, a linear function was used [99] when the synaptic weights were modified according to a teaching signal, but this implied the inability of the synaptic learning rule to fully remove the manipulation error, since LTD was always counterbalanced by “unsupervised” LTP. The present rule overcomes the linearity problem by inserting the α decaying factor.

  2.

    MF-DCN synaptic plasticity has been shown to depend on the intensity of DCN cell excitation [18, 64, 100, 101] and could be implemented as follows [26, 28]:

    $$ \begin{array}{l} \varDelta W_{MF-DCN_i}(t)=\dfrac{LTP_{Max}}{\left(PC_i(t)+1\right)^{\alpha}}-LTD_{Max}\cdot PC_i(t), \\ \text{where } i\in \left\{1,2,\dots,\text{Number of microcomplexes}\right\} \end{array} $$

    where \( \varDelta W_{MF-DCN_i}(t) \) represents the weight change between the active MF and the target DCN cell associated with the ith microcomplex, \( PC_i(t) \) is the current activity coming from the associated PCs, \( LTP_{Max} \) and \( LTD_{Max} \) are the maximum LTP/LTD values, and α is the LTP decaying factor. The MF-DCN learning rule, despite its resemblance to the PF-PC learning rule, differs from it in two significant ways. The first difference is a consequence of the limited capability of MFs, compared with PFs, to generate sequences of nonrecurrent states [98, 102, 103]. The second difference involves the connection driving LTD and LTP. Whilst PF-PC plasticity is driven by CF activity, MF-DCN plasticity is driven by PC activity. This mechanism is capable of optimizing the activity range in the whole inhibitory pathway comprising MF-PF-PC-DCN connections: high PC activity causes MF-DCN LTD, whereas low PC activity causes MF-DCN LTP. This mechanism implements an effective cerebellar gain controller able to adapt its output activity range in order to minimize the amount of inhibition generated in the MF-PF-PC-DCN inhibitory loop.

  3.

    PC-DCN synaptic plasticity was reported to depend on the intensity of DCN cells and PC excitation [65--67, 99] and could be implemented as follows [26, 28]:

    $$ \begin{array}{l} \varDelta W_{PC_i-DCN_i}(t)=LTP_{Max}\cdot PC_i(t)^{\alpha}\cdot \left(1-\dfrac{1}{\left(DCN_i(t)+1\right)^{\alpha}}\right)-LTD_{Max}\cdot \left(1-PC_i(t)\right), \\ \text{where } i\in \left\{1,2,\dots,\text{Number of microcomplexes}\right\} \end{array} $$

    where \( \varDelta W_{PC_i-DCN_i}(t) \) is the synaptic weight adjustment at the PC-DCN connection reaching the DCN cell associated with the ith microcomplex, \( PC_i(t) \) is the current activity coming from the associated PCs, and \( DCN_i(t) \) is the current activity of the DCN cells. \( LTP_{Max} \) and \( LTD_{Max} \) are the maximum LTP and LTD values, and α is the LTP decaying factor. This learning rule drives the PC-DCN synapses toward a synaptic weight appropriate to match the activity from the cortex (MF-PF-PC-DCN) and the activity from the excitatory pathway (MF-DCN). According to this learning rule, LTP occurs only when both the PCs and their target DCN cell are simultaneously active.

  4.

    Finally, it has recently been proposed that IO-DCN synaptic plasticity may provide an efficient way to embed the feedback controller predicted by Ito [104] within the cerebellar circuitry. This controller was able to generate a proper command in motor cortex capable of tuning the viscoelastic properties of the musculo-skeletal system. This was conceived as a fast mid-term adaptation mechanism to cope with the initial control phase when plasticity has not yet progressed in the rest of the cerebellar circuit. Within this hypothesis, IO-DCN plasticity was implemented to regulate the initial synaptic strength of DCN cells driven by the IO as follows [28]:

    $$ \begin{array}{l} \varDelta W_{IO-DCN,i}(t)=MTP_{Max}\cdot IO_i(t)-\dfrac{MTD_{Max}}{\left(IO_i(t)+1\right)^{\alpha}}, \\ \text{where } i\in \left\{1,2,\dots,\text{Number of microcomplexes}\right\} \end{array} $$

    where \( \varDelta W_{IO-DCN,i}(t) \) represents the differential synaptic weight factor related to the active connection at time t (whose associated activity state corresponds to \( IO_i(t) \)). The connection corresponds to the DCN cell associated with the ith microcomplex. \( IO_i(t) \) stands for the current activity coming from the associated climbing fiber. \( MTP_{Max} \) and \( MTD_{Max} \) are the maximum midterm potentiation and depression, and α is the MTD decaying factor. \( MTP_{Max} \) and \( MTD_{Max} \) are large in comparison to LTP and LTD at the other synapses, ensuring a fast response and a negligible contribution to the learning process in the long term.

    Whilst these equations appropriately address the learning process of the cerebellar network, some parameters, including the plasticity decaying rates (α) and the LTPmax and LTDmax scaling factors, are the “condensed” expression of multiple mechanisms, so that their correspondence with real synaptic parameters needs to be worked out. Moreover, the variety of biological mechanisms is not fully represented by these equations. In fact, there are many more plasticity rules at the PF-PC synapses than considered here, plasticity mechanisms within the granular layer were sidestepped, and the plasticity mechanism at the IO-DCN connection was predicted but has not yet been demonstrated.
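
To make the behavior of the four rules concrete, they can be transcribed directly into a few functions. The following is a minimal sketch, not the implementation used in the cited simulations: it assumes that the activities \( IO_i(t) \), \( PC_i(t) \), and \( DCN_i(t) \) are normalized to the range [0, 1], and the function names and all parameter values are hypothetical and chosen only for illustration. The helper `pf_pc_balance_point` is an addition that locates, by bisection, the CF activity at which the LTP and LTD terms of the PF-PC rule cancel.

```python
# Illustrative transcription of the four simplified plasticity rules.
# Assumption: io, pc, dcn are normalized activities in [0, 1].
# All parameter values used below are hypothetical.


def pf_pc_dw(io, pf_active, ltp_max, ltd_max, alpha):
    """PF-PC rule: weight change at one PF-PC synapse given CF (IO) activity."""
    if not pf_active:
        return 0.0  # inactive parallel fibers are left unchanged
    return ltp_max / (io + 1.0) ** alpha - ltd_max * io


def mf_dcn_dw(pc, ltp_max, ltd_max, alpha):
    """MF-DCN rule: same form as PF-PC, but driven by PC activity."""
    return ltp_max / (pc + 1.0) ** alpha - ltd_max * pc


def pc_dcn_dw(pc, dcn, ltp_max, ltd_max, alpha):
    """PC-DCN rule: LTP needs joint PC and DCN activity, LTD grows as PC falls."""
    ltp = ltp_max * pc ** alpha * (1.0 - 1.0 / (dcn + 1.0) ** alpha)
    ltd = ltd_max * (1.0 - pc)
    return ltp - ltd


def io_dcn_dw(io, mtp_max, mtd_max, alpha):
    """IO-DCN rule: midterm potentiation grows with IO activity."""
    return mtp_max * io - mtd_max / (io + 1.0) ** alpha


def pf_pc_balance_point(ltp_max, ltd_max, alpha, tol=1e-9):
    """Bisection for the IO activity in [0, 1] where PF-PC LTP and LTD cancel."""
    lo, hi = 0.0, 1.0
    if pf_pc_dw(hi, True, ltp_max, ltd_max, alpha) > 0.0:
        raise ValueError("the rule does not change sign on [0, 1]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # The rule decreases monotonically with io, so keep the sign change bracketed.
        if pf_pc_dw(mid, True, ltp_max, ltd_max, alpha) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    params = dict(ltp_max=0.01, ltd_max=0.02, alpha=10.0)
    for io in (0.0, 0.05, 0.2, 1.0):
        print(f"IO={io:.2f}  dW_PF-PC={pf_pc_dw(io, True, **params):+.5f}")
    print("PF-PC balance point:", pf_pc_balance_point(**params))
```

With these illustrative values, the PF-PC weight change is positive when CF activity is near zero and negative at high CF activity, with the crossover set jointly by α and the ratio of the two maxima. This is the sense in which the α factor lets LTD dominate whenever an error signal is present.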

Closed-Loop Robotic Simulations Embedding Multiple Plasticity Rules

In a recent series of papers, we have explored the impact of distributed cerebellar plasticity using a reverse engineering approach, i.e., making a biologically plausible reconstruction of the system to explore its internal mechanisms of function. Since the classical long-term synaptic plasticity between PFs and PCs, which is driven by the IO, can only account for limited aspects of learning, we have used distributed forms of plasticity in the molecular layer and DCN [23, 26, 28, 105]. In the model, the CFs provide a teaching signal driving long-term synaptic plasticity both at the IO-PC and IO-DCN connections. We have developed analog and spiking robotic controllers. An example of a spiking robotic controller with reversible PF-PC plasticity is shown in Fig. 3, and an example of simulations obtained with the same controller equipped with an analog cerebellar module with reversible plasticity at the PF-PC, PC-DCN, and MF-DCN synapses is shown in Fig. 4.

Fig. 3

Closed-loop simulations using an olivo-cerebellar model with single plasticity. An olivo-cerebellar spiking-neural network (OC-SNN) model was coupled to a robotic control system through a radial basis function (RBF) interface to simulate an obstacle collision avoidance task, an associative Pavlovian-like behavior emulating EBCC (cf. Fig. 2). In this task, the OC-SNN operated as a forward controller by regulating the firing pattern in DCN neurons under PC control. a The OC-SNN was operated in closed loop. The conditioned stimulus (CS) represents a warning signal, detected by the optical tracker, activating at a given distance threshold between the moving robot end-effector and the fixed obstacle placed along its trajectory. The unconditioned stimulus (US) corresponds to the collision event (crash). CS and US coterminate (as in the “delay” EBCC). The olivo-cerebellar model learns to produce conditioned responses (CRs), i.e., a stop of the robotic arm (collision avoidance) anticipating the US onset. In this system, the trajectory planner generates a movement that is subsequently corrected in the motor controller by the cerebellar intervention. No loop is active between cerebellum and trajectory planner. The US is generated by collision during the task and conveyed by the sensory controller to the IO. The CS is generated by the optical tracker. The sensory controller also conveys proprioceptive signals from the robotic arm sensors to MFs of the OC-SNN. b Number of CRs (%) along trials (80 acquisition trials and 20 extinction trials for two sessions in a row; CR% is computed as the percentage of CR occurrences within blocks of 10 trials each). The black curve is the median over 15 tests, and the gray area is the interquartile interval. Despite uncertainty and variability introduced by the direct interaction with a real environment, the OC-SNN progressively learnt to generate CRs anticipating the US, to rapidly extinguish them, and to consolidate the learnt association to be exploited in the retest session. Note the similarity with EBCC acquisition in Fig. 2. c PC and DCN spike distribution along trial time (500 ms from CS onset, t0) for all trials. Each pixel represents one time-bin of 10 ms, within which the number of spikes of the corresponding group is computed (first column PC cell population, second column DCN cell population). After learning, the response of PCs to MF inputs decreased, and this increased the discharge in DCN neurons. The process was better exemplified in the adaptation of the EBCC, in which a precise time relationship between the events can be established. Since the DCN spike pattern changes occurred before the US arrival, the DCN discharge accurately predicted the US and therefore could facilitate the release of an anticipatory behavioral response. At the same time, the IO signal carrying the US decreased. The prediction of a noxious stimulus triggers an anticipatory motor command. The inhibition of the IO by the DCN translates the motor command into a sensory prediction signal, allowing a single cerebellar area to simultaneously tackle both motor execution and sensory prediction
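
The CR% measure described in panel b of the caption (percentage of CR occurrences within blocks of 10 trials, with the median taken across repeated tests) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, and each trial is assumed to be scored as a binary flag (1 if a CR occurred, 0 otherwise), a coding the caption does not specify.

```python
from statistics import median


def cr_percent_per_block(cr_flags, block_size=10):
    """Percentage of trials with a CR in each consecutive block of trials.

    cr_flags   : sequence of 0/1 (or False/True) values, one per trial
    block_size : number of trials per block (10 in the caption above)
    """
    blocks = [cr_flags[i:i + block_size]
              for i in range(0, len(cr_flags), block_size)]
    return [100.0 * sum(block) / len(block) for block in blocks]


def median_across_tests(per_test_curves):
    """Block-wise median of CR% curves obtained from repeated tests."""
    return [median(values) for values in zip(*per_test_curves)]


if __name__ == "__main__":
    # Hypothetical session: no CRs at first, then increasingly frequent CRs.
    trials = [0] * 10 + [0, 1] * 5 + [1] * 10
    print(cr_percent_per_block(trials))
```

Applying `cr_percent_per_block` to each test and then `median_across_tests` to the resulting curves gives a learning curve of the kind plotted as the black line; the interquartile band would be computed analogously from the same per-block values.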

Fig. 4

Dynamic plasticity processing in closed-loop robotic simulations using an olivo-cerebellar model with distributed plasticity. An olivo-cerebellar analog neural network (OC-ANN) model embedded with distributed plasticity was coupled to a robotic control system as in Fig. 3 to simulate an obstacle collision avoidance task, an associative Pavlovian-like behavior emulating EBCC (cf. Fig. 2). In this task, the OC-ANN operated as a forward controller by regulating the firing pattern in DCN neurons under PC control. Plasticity was implemented at the PF-PC, MF-DCN, and PC-DCN synapses. The figure demonstrates that learning of the CR occurs with either one or three plasticity rules in the OC-ANN. However, with three plasticity rules, there is faster acquisition and dynamic plasticity transfer from PF-PC to MF-DCN and PC-DCN synapses, generating the two-phase learning predicted by theory and observed experimentally in EBCC

The robotic simulations revealed that PF-PC plasticity is fundamental for relating cerebellar plasticity to motor errors but is insufficient per se to make the cerebellum an effective adaptive controller. LTD and LTP had to coevolve dynamically in order to control PF-PC transmission, making it reversible so that the synapse could be reset and reused. The memory stored at the PF-PC synapse was then transferred to the DCN, allowing consolidation. This memory transfer was controlled by feedback signals arriving through extracerebellar loops and proved critical for self-rescaling and automatic gain adjustment, preventing PF-PC saturation. This operation required a double adjustment of MF-DCN and PC-DCN synapses in order to balance memory deposition in DCN neurons. Moreover, the closed-loop robotic simulations suggested that, in order to accelerate and stabilize learning, cerebellar gain control could be adjusted through MF-DCN and PC-DCN synaptic plasticity working in equilibrium with IO-DCN plasticity. IO-DCN connections ensure stable outputs in the early learning stages, when the strength of the MF-DCN and PC-DCN connections has not yet been set by the learning process. When the synaptic weights of the MF-DCN and PC-DCN connections begin to stabilize, the synaptic strength of the IO-DCN connection diminishes. Therefore, at the end of the learning process, the contribution of the IO-DCN connection to the cerebellar output is negligible. Nonetheless, the IO-DCN connection remains ready to act when new unexpected patterns have to be learnt. In addition, a proper synaptic weight adjustment at DCN synapses allows the PFs to operate over their complete frequency range, enhancing the precision of the cerebellar output. To sum up, the IO-DCN pathway could allow a global reduction of the feedback error, facilitating early and fast error corrections, whereas the MF-PF-PC-DCN system would achieve more accurate corrections in the long term but would require slower learning [28].
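The interplay between a fast, bounded PF-PC rule and a slower DCN rule can be illustrated with a deliberately minimal sketch. It is not the OC-ANN used in the simulations: the scalar task, the function name `simulate`, the learning rates, the resting PC level `pc_rest`, and the specific update rules are all assumptions chosen for illustration, the PC-DCN weight is held fixed, and the IO-DCN pathway is omitted. PF-PC weight changes are driven by the error (LTD for positive error, LTP for negative error), while the MF-DCN weight changes whenever PC activity departs from its resting level.

```python
def simulate(trials=2000, target=1.5, lr_pf=0.2, lr_mf=0.02, pc_rest=0.5):
    """Toy scalar closed loop (illustrative assumptions, not the OC-ANN).

    Returns (errors per trial, final PF-PC weight, final MF-DCN weight).
    """
    w_pf = pc_rest  # PF-PC weight; PF input fixed at 1, so PC rate equals w_pf
    w_mf = 1.0      # MF-DCN excitatory weight; MF input fixed at 1
    w_pc = 1.0      # PC-DCN inhibitory weight, held fixed in this sketch
    errors = []
    for _ in range(trials):
        pc = w_pf                  # Purkinje cell activity, bounded in [0, 1]
        dcn = w_mf - w_pc * pc     # DCN output: MF excitation minus PC inhibition
        err = target - dcn         # error signal (assumed to be conveyed by the IO)
        errors.append(err)
        # PF-PC rule: bidirectional and error-driven, clipped to [0, 1]
        w_pf = min(1.0, max(0.0, w_pf - lr_pf * err))
        # MF-DCN rule: potentiate while PC is below rest, depress while above
        w_mf += lr_mf * (pc_rest - pc)
    return errors, w_pf, w_mf


# Cortical plasticity only (MF-DCN rule switched off) vs. distributed plasticity
errs_cortex, w_pf_cortex, w_mf_cortex = simulate(lr_mf=0.0)
errs_dist, w_pf_dist, w_mf_dist = simulate()
print("cortex only:", errs_cortex[-1], w_pf_cortex, w_mf_cortex)
print("distributed:", errs_dist[-1], w_pf_dist, w_mf_dist)
```

In this toy setting, the run with PF-PC plasticity alone pins the PF-PC weight at its lower bound and leaves a residual error, because the bounded PC activity cannot cover the required output range. With the MF-DCN rule enabled, the error is driven toward zero while the PF-PC weight drifts back to its resting value and the MF-DCN weight takes over the stored gain, which is one simple way to picture memory transfer and the avoidance of PF-PC saturation.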

An interesting aspect of the robotic simulations was that they could be successfully applied to different behaviors known to involve the cerebellum, including VOR, EBCC, force field correction, and arm trajectory control [23, 105], indicating that the implicit algorithm of the cerebellar network is of general applicability. With reference to the EBCC case illustrated above, the simulations supported the concept that memory transfer between PF-PC and DCN synapses has to occur rapidly after the beginning of learning, helping to define the possible patterns of alterations leading to the EBCC impairment caused by cTBS (Casellato et al., unpublished).

The robotic simulations provide a series of major conceptual advancements. First, the synaptic plasticity rules can be observed at work inside an entire sensory-motor system, even operating in closed loop. This is a privileged way to understand how the properties revealed by physiological experiments in isolation (e.g., in brain slices) can affect learning and behavior. Secondly, several plasticities can be seen at work simultaneously while full control is maintained over their individual evolution. Thirdly, the nature of the changes in synaptic transmission and neuronal firing occurring during learning can be predicted and later tested for biological validation. Fourthly, the quantitative nature of the data can be exploited for developing theoretical models of cerebellar function. Clearly, the precision of model predictions depends on the precision and completeness of the model's internal mechanisms. At the moment, these are rather simplified in terms of neuronal and synaptic dynamics but complete enough to generate a coherent picture. A challenge for the future will be to make network and robotic models more realistic so that their predictions become increasingly reliable.

Distributed Plasticity: New Perspectives for Cerebellar Learning

TMS–EBCC experiments in humans in vivo and closed-loop robotic simulations have provided new insight into how the sensory-motor control system could exploit distributed plasticity in the cerebellar network to generate biologically plausible learning. TMS–EBCC experiments have indicated that memory has to be transferred from the cerebellar cortex to the DCN in order to stabilize learning, although the exact time constant of memory transfer is still unknown. Robotic simulations have implemented this memory transfer by allowing the cerebellar circuit to dynamically adjust synaptic weights between the PF-PC synapse and the DCN, exploiting the sensory feedback deriving from ongoing activity in closed loop.

Robotic simulations revealed that a supervised mechanism relating cerebellar learning to motor errors at the PF-PC synapse remains a critical constraint. However, PF-PC plasticity also proved insufficient per se to make the cerebellum an effective adaptive controller, and other forms of plasticity distributed throughout the network appeared to be critical. These include plasticity not only in the DCN but probably also in the granular and molecular layers, although plasticity in these latter two subcircuits has not yet been tested in robotic simulations. Plasticity in the granular layer is indeed expected to determine and store the large variety of spatiotemporal patterns required to implement the expansion recoding of MF signals to be presented to PCs, and it could become critical when multiple forms of input signals from extended sensory-motor structures are considered.

Several predictions about the nature of plasticity mechanisms in the cerebellar circuit follow from these investigations. First, all plasticities should be reversible, that is, each should express both LTP and LTD. Secondly, since the memory transferred into downstream structures (e.g., from PF-PC into DCN) is controlled by feedback signals arriving through extracerebellar loops, understanding distributed plasticity requires the whole system working in closed loop. Thirdly, there are forms of plasticity that may not last for long in the freely behaving animal (e.g., PF-PC LTD itself), and this should be taken into account when searching for such plasticities experimentally. Fourthly, there could be forms of plasticity that have not yet been identified experimentally but could have a remarkable impact on cerebellar learning (e.g., IO-DCN plasticity). Finally, DCN neurons could process not just two but three forms of plasticity, at MF-DCN, PC-DCN, and potentially also IO-DCN synapses. Therefore, further experimental investigation of the plasticity of synapses impinging on DCN neurons is needed.

The robotic cerebellar models themselves need to be improved by implementing more realistic spiking networks and learning rules. For example, the variety of plasticities impinging on PC and DCN synapses is not yet represented in the models. Moreover, the granular layer still needs to be fully implemented. Finally, robotic simulations will have to be cross-checked by performing biological experiments in which critical plasticities are selectively switched off to see whether comparable alterations emerge in animal behavior. Genetic mutant mice with inducible cell-specific alterations may be used to selectively block one or more plastic mechanisms. Alternatively, optogenetics may be used to switch plasticity on and off at specific synapses.

In conclusion, distributed plasticity is opening a new perspective for interpreting the complex processes underlying cerebellar learning. Understanding it will require the new tools provided by neural circuit modeling and neurorobotics, in combination with advanced biological techniques for selective control and monitoring of brain circuits. It is also envisaged that new robotic controllers and robots embedding distributed plasticity rules will demonstrate improved versatility and self-adapting properties, which in turn should help clarify how the forward/feedback controller operations of the cerebellum take place in nature.