
1 Introduction

Besides its involvement in motor control, the cerebellum plays a role in cognitive processing, including music processing. After damage to the cerebellum, adults show deficits in various cognitive and affective domains, in particular in executive, attentional, memory and behavioural functions, which have been subsumed under the term “cerebellar cognitive affective syndrome (CCAS).” With regard to emotion, CCAS may involve deficiencies in emotion identification as well as in emotion regulation (see an overview in Adamaszek et al. 2017). Music is therefore also a topic for research on the cerebellum with regard to these cognitive and affective issues. In particular, the role of the cerebellum in relation to emotions in music is a hitherto little considered but nevertheless exciting area of neuroscience research on higher-order functions of the cerebellum, and it raises specific questions about the underlying mechanisms. In this chapter, we aim to review the different links between music perception or music production and the cerebellum.

2 Music Perception Network and the Cerebellum

Music perception always involves a network of different brain structures, depending on the type of music and the circumstances of perception. The cerebellum has frequently been identified as part of these networks, although details of the underlying mechanisms remain unclear. According to the data available so far, the cerebellum has a specific function in the temporal dimension of sensory signal processing, which in turn influences motor execution as well as associative cognitive and affective perception. Given that time information processing is essential for motor control as well as for music perception (Molinari et al. 2003), the investigation of musical skills in people with cerebellar damage seems not only sensible but also promising. Such investigations are also quite practicable in musically untrained patients, since non-musicians have basic musical skills.

Nevertheless, experimental approaches in brain research on music are difficult to carry out because music is a complex cognitive task, and its perception also depends on pedagogical and cultural aspects. Therefore, the different aspects of music processing have been examined as separate parameters and with different methods. We follow this approach in this narrative review and summarise the findings on the link between the cerebellum and music perception for the different parameters of music.

2.1 Rhythm

In neuroimaging studies, Parsons described distinct neural structures that were activated when musicians had to detect specific melodic, harmonic or rhythmic errors (Parsons 2001). In this work, among other brain regions, the cerebellum showed activation during all three tasks, but cerebellar activation in the rhythm task was about twice as high as in the harmony and melody tasks. Moreover, when musicians and non-musicians were asked to discriminate pairs of rhythms, non-musicians showed higher cerebellar activity in meter, tempo and pattern discrimination, whereas musicians displayed stronger cerebellar activation in duration discrimination. This finding presumably reflects differences in strategy, skill and cognitive representation between musicians and non-musicians, which in turn might indicate that brain function at cerebral and cerebellar sites depends on specific skill acquisition.

A further study by Molinari et al. (2003) compared a group of patients with cerebellar atrophy and a group with focal cerebellar damage with healthy control subjects. Participants had to detect changes in rhythm frequency. The patients with cerebellar atrophy performed significantly worse, suggesting that cerebellar pathologies may differentially influence the ability to appreciate rhythm changes. In addition, when both patient groups were tested on their ability to tap in synchrony with an auditory stimulus, cerebellar damage seemed to affect the variability, and thus the stability, of the motor response. An involvement of the anterior and posterior lobes of the cerebellum in memory for rhythm has also been reported in a functional magnetic resonance imaging (fMRI) study (Sakai et al. 1999).

The cerebellum has been shown to be involved in the perception and synchronisation of musical beat. In an MRI study by Paquette and collaborators, the researchers used voxel-based morphometry to correlate inter-individual differences in the performance of the Harvard Beat Assessment Test (H-BAT) with local inter-individual variations in grey matter volumes across the entire brain in 60 individuals (Paquette et al. 2017). The analysis of the obtained data identified significant co-variations between performances on two perceptual tasks of the H-BAT in association with the beat interval change discrimination (faster, slower) and grey matter volume variations in the cerebellum. The threshold for discrimination in the Beat Finding Interval Test (quarter note beat) was positively associated with grey matter volume variation in cerebellum lobule IX in the left hemisphere and crus I in both hemispheres. Moreover, the discrimination threshold for the Beat Interval Test (simple series of tones) showed a positive association with grey matter volume variations in left cerebellar crus I/II. Summarising these results, the cerebellum was shown to take part in beat interval discrimination tasks, whereby the cerebellar grey matter and overall cerebellar integrity are presumably important for temporal discrimination abilities.

Rhythm, as the basic time structure of music, is composed of distinct temporal components such as meter, pattern and tempo. Each component requires different computational processes: meter involves repeating cycles of strong and weak beats, whereas pattern relies on intervals at each local time point, which vary in length across segments and are linked hierarchically. Finally, tempo depends on the frequency rate of the underlying pulse structure. A positron emission tomography (PET) study by Thaut et al. (2014) assessed the neural patterns of brain activity in response to rhythmic elements in adult musicians and non-musicians. The study focused on covert same-different discriminations of (a) pairs of rhythmic, monotonic tone sequences representing changes in pattern, tempo and meter, and (b) pairs of isochronous melodies. Pattern, meter and tempo tasks were associated with focal activity in right, or bilateral, areas of frontal, cingulate, parietal, prefrontal, temporal and cerebellar cortices. In more detail, meter processing alone activated areas in the right prefrontal and inferior frontal cortex, which are associated with more cognitive and abstract representations. Pattern processing alone recruited right cortical areas involved in different kinds of auditory processing. Tempo processing alone engaged mechanisms of somatosensory and premotor information (e.g. posterior insula, postcentral gyrus). Finally, melody processing resulted in activity different from that in the rhythm conditions (e.g. right anterior insula and various cerebellar areas). These findings outline distinct neural components underlying the components of rhythmic structure (Thaut et al. 2014).
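The distinction between tempo, pattern and meter drawn above can be made concrete with a small sketch. It is purely illustrative: the function names, the interval values and the simplification that a meter is a cycle with one strong beat followed by weak beats are assumptions made here, and none of it is taken from the stimuli of Thaut et al. (2014).

```python
# Illustrative sketch only; all names and numbers are assumptions,
# not the stimuli or methods of Thaut et al. (2014).

def tempo_bpm(pulse_interval_s):
    """Tempo: rate of the underlying pulse, in beats per minute."""
    return 60.0 / pulse_interval_s

def scale_tempo(intervals_s, factor):
    """Play the same sequence of intervals 'factor' times faster."""
    return [d / factor for d in intervals_s]

def relative_pattern(intervals_s):
    """Pattern: interval lengths expressed relative to the shortest interval."""
    shortest = min(intervals_s)
    return [d / shortest for d in intervals_s]

def meter_accents(n_beats, beats_per_cycle):
    """Meter: repeating cycle with a strong first beat (a simplification)."""
    return ["strong" if i % beats_per_cycle == 0 else "weak"
            for i in range(n_beats)]

intervals = [0.5, 0.25, 0.25, 0.5, 0.5]  # hypothetical inter-onset intervals (s)
faster = scale_tempo(intervals, 2)        # tempo changes, pattern does not

print(tempo_bpm(0.5))
print(relative_pattern(intervals) == relative_pattern(faster))
print(meter_accents(6, 3))
```

In this toy representation a tempo change rescales every interval but leaves the relative pattern untouched, whereas a meter change only alters the accent cycle, which is one way to see why the three components can be probed by separate discrimination tasks.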

2.2 Timing Aspects

Several studies have examined the role of the cerebellum in timing. Cerebellar patients were reported to be impaired in discriminating time intervals (Nichelli et al. 1996). In addition, the cerebellum was shown to contribute to the production of timed motor responses, especially when they are complex and/or novel (Ivry et al. 1988; Jueptner et al. 1995; Penhune et al. 1998).

Precise movement control, such as for tapping, is required in addition to certain timekeeping functions to perform rhythm and meter tasks. In particular, meter tasks are impaired in patients with cerebellar lesions (Ivry and Keele 1989). Although the rhythm determines the meter, maintaining a stable meter demands targeted motor timing as well as precise motor control. As shown in previous research, patients with cerebellar damage may develop impairments of these abilities. A meta-analysis of tasks related to music and timing (Keren-Happuch et al. 2014) examined whether these two domains, which are inherently similar in that both rely on knowledge of temporal order, recruit the same cerebellar regions. Based on the theory that parts of the cerebellum process via an internal timing mode, and considering the evidence that timing is associated with knowledge of temporal order, it was expected that the cerebellar activations for timing would include subsets of activations for the rhythmic aspects of music processing. However, the meta-analysis revealed recruitment of distinct regions: music-related tasks consistently showed significant activation in right lobule IV/V and bilateral lobules VI and VIII, whereas timing uniquely showed activation of right lobule VI (Keren-Happuch et al. 2014).

2.3 Pitch and Timbre

Besides rhythm, meter and timing, pitch and timbre are the other important basic parameters of music processing in which cerebellar structures are presumably involved. For example, a study by Parsons found that patients with cerebellar degeneration were strongly impaired in a pitch discrimination task, with the degree of impairment correlating with the severity of their ataxia (Parsons 2001). Another study by Parsons et al. assessed the performance of patients with cerebellar degeneration in an ordinary pitch discrimination task (Parsons et al. 2009) and revealed strongly impaired discrimination abilities. Interestingly, the extent of this impairment correlated with the degree of cerebellar atrophy.

With regard to the underlying structural aspects, Gaab et al. (2003) stated that the supramarginal gyrus of the parietal lobe may function as a storage site for short-term pitch information. Moreover, the authors suggested an important role of the cerebellum in performing a pitch memory task. Higher cognitive processes such as auditory information retention might depend on a mechanism separate from that of comparing two successive tones. In a positron emission tomography (PET) study by Zatorre et al. (1994), changes in cerebral blood flow revealed that a wider range of cortical and subcortical areas was involved when pitch retention was required than when there was no memory load. In summary, the present literature indicates that the cerebellum is involved, via multiple connections to cortical and subcortical areas, in tasks that require the acquisition of sensory data such as pitch.

The putative function of the cerebellum in sound processing was also studied by Lega et al. using a pitch discrimination and a timbre discrimination task (Lega et al. 2016). In their study, healthy subjects performed both tasks before and after receiving offline low-frequency transcranial magnetic stimulation (TMS) over the right cerebellum. When activity within the right side of the cerebellum was suppressed by inhibitory 1 Hz TMS, the participants’ ability to discriminate pitches, but not timbres, was impaired. These findings point, at least in some respects, to a causal role of the cerebellum in sound processing, which might be important for understanding the impact of cerebellar lesions on sensory functions, in particular from a clinical perspective.

Congenital amusia is a lifelong neurodevelopmental disorder of music-related pitch processing. An fMRI study by Zhang et al. focused on the neuronal network in patients with congenital amusia who speak a tonal language (Cantonese) (Zhang et al. 2017). The rationale for this study was that previous studies of speakers of non-tonal languages had suggested that the neural deficits of congenital amusia lie in the music-selective neural circuitry of the right inferior frontal gyrus (IFG). However, it had remained unclear whether this finding could be generalised to congenital amusics who speak tonal languages. To address this question, the research group investigated the neural circuits that underlie the processing of relative pitch intervals in pitch-matched Cantonese level tones and musical stimuli in Cantonese speakers with congenital amusia and musically healthy controls. Cantonese-speaking amusic subjects showed abnormal brain activity in a widely distributed neuronal network during the processing of lexical tones and musical stimuli. Furthermore, while control subjects showed significant activation in the right superior temporal gyrus (STG) and cerebellum for both lexical tones and music, amusic subjects showed no activation in these regions at all. According to the authors, this observation reflects a functional deviation of the neural mechanism of relative pitch processing in the amusic subjects. Interestingly, the authors found no significant group difference in the right IFG. Taken together, the findings of Zhang et al. (2017) imply that the neuronal deficits in tonal language speakers might differ from those in non-tonal language speakers, and overlap partly with the neuronal circuitry for processing lexical tones and musical stimuli (e.g. right STG).

In summary, evidence from activation studies, lesion studies and congenital amusia suggests that the cerebellum is involved in pitch processing, but not in timbre processing, in a clinically pertinent way.

2.4 Music Identification

The cerebellum is also involved in higher cognitive functions such as identifying and categorising sounds as music. The most effective cues for music identification, i.e. rhythm or melody, were studied in three experiments by Hébert and Peretz (1997). In their study, the combination of pitch and duration, i.e. features that also draw on cerebellar functions, was most relevant for retrieving music from long-term memory (Hébert and Peretz 1997). Accordingly, several neuroimaging studies demonstrated activations of specific brain regions while stored pieces of music were accessed (Platel et al. 1997; Rauschecker 2005; Satoh et al. 2006). Satoh et al. (2006) suggested that brain regions responsible for retrieval from long-term memory as well as for verbal and emotional processing (i.e. bilateral anterior parts of the temporal lobe, superior temporal regions and parahippocampal gyri) are involved when the familiarity of a composition is evaluated. In fact, the process of recognising familiar melodies seems to proceed in several steps. After sensory auditory information has been acquired, a melody image is formed. In subsequent steps, melodies stored in long-term memory are retrieved and compared with this melody image. Satoh et al. also point out that during the recognition of familiar melodies, brain areas involved in the formation of a melodic image are active, as are those areas involved in the retrieval of melodies from long-term memory (Satoh et al. 2006).

Nevertheless, the involvement of the cerebellum in these pathways of music recognition remained uncertain after the first studies. In several PET series, a contralateral coactivation of the auditory temporal cortex and the lateral cerebellum suggested that they form a distributed circuit of auditory processing (Parsons 2001). According to neuroimaging (Habas et al. 2009) and anatomical (Strick et al. 2009) findings, there is clear evidence of cerebro-cerebellar connectivity that supports cerebellar involvement in this cognitive function (Petacchi et al. 2011). The PET study by Petacchi et al. detected cerebellar activity during passive listening in healthy subjects. This activity increased further during pitch discrimination and was associated with the difficulty of the task (Petacchi et al. 2011).

Of particular interest in the study of cerebral mechanisms in music processing is the identification of those areas of the cerebral cortex that control the encoding of musical elements. In a study by Rauschecker, the brain areas involved in the encoding and retrieval of melodies were investigated with functional MRI (Rauschecker 2005). In this experimental study, the brain activation of healthy subjects was investigated during the silent anticipation of familiar music, on the assumption that the first sequence of a familiar melody allows the following part to be anticipated. The regional activations during silent anticipation of a melody were recorded and compared with those during simple listening to music. In addition to activations of several cerebral areas, namely the anterior part of the right superior temporal cortex, the right inferior frontal cortex and anterior insula, the left anterior prefrontal cortex and the anterior cingulate, a significant bilateral activation of the posterior lateral cerebellum was observed during the anticipation task. These results showed which brain areas are involved in the perception of music via bottom-up and top-down control pathways, with the lateral cerebellar areas being involved in the anticipation of musical sequences and thus in the construction of melodic strands.

The importance of a melodic image for the recognition of melodies has been explained above. As Herholz and colleagues found, the generation of musical images is accompanied by integrated activations within distinct networks of predominantly cerebral, but partly also cerebellar, brain structures (Herholz et al. 2012). In this study of the neural underpinnings of the generation of a melodic image, participants were presented with the titles and lyrics of well-known songs, which prompted them to vocalise mentally. The main results of this functional MRI study were as expected, i.e. co-activations of motor and premotor areas were detected during imagined vocalisation. Interestingly, the area of the cerebellum that represents tongue and lip movements was activated too. These activations during imagery underline the importance of an auditory-motor loop and, in addition, of the cerebellum for mental vocalisation during musical imagery.

With regard to the emotional and memory mechanisms at work in music, imaging studies of the responsible brain structures are of particular interest. Altenmüller’s research group addressed this question in a dedicated paradigm (Altenmüller et al. 2014). In their study, subjects were presented with short excerpts of film music with emotional features during an fMRI recording. A particular question was which brain structures were activated by those film music excerpts that were successfully retrieved from episodic long-term memory (Altenmüller et al. 2014). Sequences of musical stimuli, in contrast to sequences of silence, activated parts of the left anterior cerebellum in addition to various cerebral cortical regions. Interestingly, old vs. new music pieces led to activations within the left medial dorsal thalamus and the left midbrain. Another finding was that regions within the right inferior frontal gyrus and the left cerebellar hemisphere showed specific activation for recognised vs. unrecognised old pieces. Pleasant pieces of music, compared with less pleasant pieces, activated the left medial frontal gyrus and the right superior frontal gyrus, the left precuneus, the left posterior cingulate cortex (PCC), the middle temporal gyri bilaterally and the left thalamus. This fMRI study thus identified interrelated brain networks for the retrieval of musical memories and the emotional processing of symphonic film music. These results highlight the importance of the valence of a piece of music for memory performance and thus for its rapid recognition.

Another functional imaging study, by Demorest et al. (2010), investigated cultural factors that shape music perception and music memory. The hypothesis of this fMRI study was that listeners show different activation patterns related to music processing when encoding and retrieving culturally familiar and unfamiliar stimuli. Specifically, it was hypothesised that culturally unfamiliar musical sequences would elicit broader neural activation, which in turn would correspond to a more complex memory task (Demorest et al. 2010). US and Turkish subjects were presented with novel music examples from their own and a foreign culture and were then asked to identify a series of short excerpts taken from these longer examples. With this paradigm, it was shown that subjects in both groups were more successful in remembering music from their own culture. In the analysis of the fMRI data, stronger activation when listening to culturally unfamiliar music was indeed found within the left cerebellar hemisphere, the right angular gyrus, the posterior precuneus and the right middle to inferior frontal areas. Stronger activations were also recorded in the cingulate gyrus and the right lingual gyrus when subjects recalled culturally unfamiliar music.

Another study demonstrating cerebellar involvement in the networks for the perception and processing of musical information was undertaken by Griffiths (2000). This PET study involved six subjects with musical hallucinations following acquired deafness. The aim was to identify brain areas in which activity increased with the severity of the hallucination. A group analysis did not identify any effects in the primary auditory cortex. Instead, clusters of correlated activity were detected in the posterior temporal lobe, the right basal ganglia, the cerebellum and the inferior frontal cortices. Interestingly, these clusters formed functional networks that bear a striking resemblance to those involved in the normal perception and imagery of segmented sound patterns described in earlier paradigms. These results are therefore consistent with the proposed neuropsychological and neural mechanisms of music perception.

An interesting question in the investigation of cerebellar mechanisms in music perception is to what extent the known involvement of the cerebellum in spatial processing (Argyropoulos et al. 2020) should be taken into account. A paper by Picazio et al. (2013) addressed this question using non-invasive transcranial magnetic stimulation. In this work, the role of the cerebellum in linking spatial and musical domains was investigated with embodied (EMR) and abstract (AMR) mental rotation tasks performed during listening to Mozart’s sonata KV 448, with or without prior continuous theta burst stimulation (cTBS) of the left cerebellar hemisphere (Picazio et al. 2013). The Mozart sonata was chosen because it has been reported to enhance spatiotemporal reasoning (one of the so-called “Mozart effects”). In the absence of cerebellar cTBS, however, listening to music did not influence either of the two mental rotation tasks, so that a specific “Mozart effect” was not supported by these data. In contrast, cerebellar theta-burst stimulation before listening to music caused the subjects to perform the EMR task, but not the AMR task, faster and less accurately. Inhibition of the cerebellum by cTBS thus unmasked an effect of music listening on motor imagery. These results provide evidence that cerebellar networks couple music listening with the sensory-motor integration needed for embodied representations.

In summary, the studies described above suggest an important role of the cerebellum in the recognition of music and song. The cerebellar areas, mainly of the posterior lobes, seem to play an important role in pitch discrimination. Interestingly, the function of the cerebellum in specific music perception has also been demonstrated in musical hallucinations. According to the literature reviewed here, the left anterior parts of the cerebellum in particular can be ascribed a significant activity for the processes of music recognition.

2.5 Emotion Processing

According to previous findings, the cerebellum is active at various levels of emotion processing, in addition to the well-studied prefrontal cortex, insula, amygdala and hippocampus (Baumgartner et al. 2006). These cerebellar mechanisms of emotion processing, which are increasingly understood in detail, suggest a particular importance for music perception. An early and pioneering study on cerebellar contributions to emotion processing was published by Schmahmann and Sherman (1998). In this study of patients with acquired or congenital cerebellar damage, behavioural changes were recorded that affected executive functions, spatial perception, language and also personality (Schmahmann and Sherman 1998). In addition to these predominantly cognitive domains, disturbances of affect regulation were observed, which led the authors to coin the term cerebellar cognitive-affective syndrome. Because there was no evidence of additional cerebral injuries that could plausibly explain these clinical changes, Schmahmann and Sherman attributed the observed cognitive-affective disorders to the neuronal connectivity of the cerebellum with cerebral regions. Their assessment of specific cerebellar mechanisms within affective domains was based on earlier work identifying the cerebellum as part of limbic circuits, as elaborated by Anand et al. (1959) and Snider and Maiti (1976).

Given the growing evidence for specific cerebellar inputs within neural networks of emotion recognition and processing, further work has emerged that examines individual details such as features of emotional valence. An investigation by Hopyan et al. (2010) of emotion processing in children treated for a benign (e.g. astrocytoma) or malignant (e.g. medulloblastoma) cerebellar tumour did not show any significant impairment compared with healthy control subjects in tasks requiring immediate recognition of the basic emotions of joy and sadness in pieces of music. In the detailed analysis, however, a weakness in the recognition of sad musical moods was found in the children with medulloblastoma. A further observation of this study, which is well documented in the literature, was that cross-modal perception and processing of specific emotions depend on an intact cerebellar-cerebral connection: the children with a cerebellar tumour showed deficits in two control tasks in which they were asked to compare the emotions of joy and sadness in music and in poetry (Hopyan et al. 2010). These results correspond well with comparable work (for an overview, see Adamaszek et al. 2017) indicating that, on the one hand, the cerebellum contributes more strongly to the perception and processing of negative basic emotions and, on the other hand, the cerebello-prefrontal axis plays a differently pronounced role in the cross-modal matching of emotions. The particular involvement of the cerebellum in negative emotions has also been elaborated in various animal models of emotional learning, in which cerebellar lesions produced deficits especially for negative emotions such as anger or fear (Sacchetti et al. 2009).

In addition to an early study by Reiman et al. (1997), in which the cerebellum was shown to be involved in emotional reactions to exteroceptive sensory stimuli, a similarly early study by Imaizumi and colleagues demonstrated a cerebellar function for different negative and positive valences (Imaizumi et al. 1997). Using PET, the authors examined the brain regions that are significantly involved in identifying emotions associated with surprise, disgust, pleasure and fear in spoken words. Significant activation was recorded within the cerebellum, but also in parts of the frontal lobe, in line with the strong functional relationship between these brain regions reported elsewhere for such tasks. Other neuroimaging studies investigated the cerebellar connections to neural networks during the processing of emotional images (Bermpohl et al. 2006; Styliadis et al. 2015). Specific cerebellar contributions to the recognition of emotional speech, i.e. especially the affective prosody of spoken sentences, were characterised in the work of Wildgruber et al. (2005) and Adamaszek et al. (2017), among others.

The findings to date on specific cerebellar functional features within the networks that process the experience of emotions are diverse, and yet still leave some questions open with regard to individual aspects such as emotional learning (Sacchetti et al. 2009) or emotional valence discrimination (Styliadis et al. 2015). With regard to the particular human capacity for emotional valence discrimination, cerebellar patients tend to have deficits in recognising basic emotions such as joy or anger, i.e. they do not reliably grasp pleasant and unpleasant emotions (Adamaszek et al. 2017). In relation to this dichotomy of pleasant and unpleasant emotions, research has shown that the neural systems contributing to these categories are closely linked (Lane et al. 1997). With regard to cerebellar characteristics in this categorical distinction of basic emotions, Turner et al. (2007), for example, found significantly increased cortical activity in the left medial temporal lobe as well as in the occipito-temporal cortex, and increased activity in the cerebellum, when unpleasant images were viewed compared with neutral or pleasant images. In line with the findings of this work, it was assumed that the brain structures usually responsible for processing unpleasant emotions were impaired as a result of the cerebellar lesions (Turner et al. 2007). Similar to the work of Adamaszek et al. (2015), in which emotion processing relied increasingly on the prefrontal cortex in the case of cerebellar lesions, the formation of alternative neuronal circuits was also discussed for Turner’s findings as a way to preserve fear experiences, which are important from an evolutionary perspective, even in the case of cerebellar damage. In line with neuroscientific considerations of neuronal plasticity, these findings and the discussions triggered by them underline that the cerebellum is part of dynamic networks, most of which are still the subject of ongoing research (Turner et al. 2007).

3 Music Production and the Cerebellum

It is obvious that the cerebellum is an important part of the network regulating music production. Which specific mechanisms have been detected to date to better understand the role of the cerebellum in music production?

3.1 Physiology

To further understand the physiological features of cerebellar processes within music perception and processing, techniques that highlight the haemodynamic changes during cerebellar activations in specific paradigms may be considered. Koeneke et al. (2004) investigated precisely these cerebellar haemodynamic responses in highly skilled keyboard players and control subjects during complex tasks requiring unimanual and bimanual finger movements. In this fMRI study, which employed a classical box-car design with alternating rest and activation blocks of 20 s each, strong haemodynamic responses were recorded in motor and supplementary motor areas, but also in the cerebellar hemispheres (anterior and intermediate zones) in both groups during the task conditions. However, non-musicians generally showed stronger haemodynamic responses in both cerebellar hemispheres, while skilled keyboard players activated essentially the right cerebellar hemisphere (Koeneke et al. 2004). From these striking differences in cerebellar activations, it was concluded that many years of motor practice lead to a different cortical activation pattern in keyboard players. In accordance with the neural plasticity associated with motor learning, these results suggest that fewer neurons are recruited for trained keyboard playing movements. The different motor effort required of such differently trained individuals would explain the observed difference in the volume of activated cortical areas between the two groups. Interestingly, Koeneke et al. also recorded activation of the vermis, which is thought to reflect emotional aspects or the motor aspects of eye movements when playing the keyboard (Koeneke et al. 2004).
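The box-car design mentioned above can be pictured as a simple on/off time course. The following is a minimal sketch only: the 20 s block length is taken from the text, whereas the repetition time of 2 s, the number of cycles and the function name `boxcar_regressor` are assumptions introduced for illustration and are not taken from Koeneke et al. (2004).

```python
def boxcar_regressor(block_s=20.0, tr_s=2.0, n_cycles=3):
    """Return a 0/1 list: 0 during rest blocks, 1 during activation blocks.

    block_s comes from the text (20 s blocks); tr_s and n_cycles are
    hypothetical values chosen only for illustration.
    """
    scans_per_block = int(round(block_s / tr_s))
    # One cycle = one rest block followed by one activation block.
    one_cycle = [0] * scans_per_block + [1] * scans_per_block
    return one_cycle * n_cycles


regressor = boxcar_regressor()
print(len(regressor))    # total number of scans in this hypothetical run
print(regressor[8:12])   # samples around the first rest-to-activation switch
```

In an actual analysis, such a regressor would typically be convolved with a haemodynamic response function before model fitting; that step is omitted here.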

Neural plasticity within motor systems of the central nervous system as a function of motor learning has been demonstrated many times. Indeed, several studies have shown that training motor skills over long periods of time leads to a reorganisation of neural networks, which is reflected in changes in brain morphology (Sato et al. 2015; Vaquero et al. 2016). These processes of neural reorganisation are complicated and many details are still poorly understood, especially for changes in the vocal system, within which largely intrinsic reflex mechanisms determine these changes. Kleber et al. (2010) used neuroimaging to investigate these processes in highly gifted opera singers, in conservatoire-level voice students and, as a control, in amateurs during the singing of an Italian aria. In this fMRI study, vocal skill training was associated with increased functional activation of the bilateral primary somatosensory cortex representing the articulators and larynx. In addition to these cortical activations, experienced singers also showed increased activations in the basal ganglia, thalamus and cerebellum. Interestingly, a regression analysis demonstrated a correlation between functional activation and the amount of singing practice, suggesting that training singing skills increases the activity of a cortical network along with the involvement of implicit motor memory areas at the subcortical and cerebellar levels (Kleber et al. 2010).

Sung and spoken language, notwithstanding their differences in certain vocal and prosodic aspects, share many similarities in terms of their physiology of articulation and perception, but also in terms of the phonology, phonotactics, syntax and semantics of the underlying language. In addition to the numerous cerebral regions, cerebellar involvement in song and speech is also of increasing interest, and was investigated in more detail by Callan et al. (2007). In this work, the authors highlighted considerable overlap in the lateral aspect of lobule VI of the posterior cerebellum in the literature on the perception and production of song and speech. The involvement of this region is plausible because of its somatotopic representation of the lips and tongue; this region may use internal models of vocal tract articulation that simulate well-learned phonological and/or segmental articulatory-auditory and oral sensory mappings used in both speech and singing (Callan et al. 2007). In addition, studies have shown a specialisation of the left cerebellar hemisphere in the processing of song and of the right cerebellar hemisphere in the processing of speech, although this lateralisation seems to be less pronounced in lobule VI, where the representations of speech and song overlap (Callan et al. 2007). Taking into account the crossed pattern of anatomical connectivity between the cerebellum and the cerebral cortex, these results are consistent with the hypothesis that the right cerebellum differentially processes high-pass filtered information (segmental properties) and the left cerebellum differentially processes low-pass filtered information (prosodic, melodic properties) (Callan et al. 2007).

With regard to specific haemodynamic effects in cortical and subcortical areas during music perception and processing, a study by González’s research group should also be mentioned (González et al. 2020). In this neuroimaging study, functional magnetic resonance images of 13 professional cellists were acquired in an MRI scanner during their interpretation of excerpts of baroque and contemporary music. For both styles of interpretation, common cortical motor and sensory regions were identified in the maps of cortical activations and connectivity, but these showed different hemispheric intensity levels. Interestingly, only certain auditory and motor regions, i.e. Heschl’s gyri, the superior frontal gyrus, the planum temporale and the caudate nucleus, were activated during the interpretation of baroque music. In contrast, during the interpretation of contemporary music, increased activations occurred in the vermis, the insular cortex and the parietal operculum. These discrepancies between the interpretation of baroque and contemporary music were attributed to the different cognitive, sensory and motor requirements of the individual styles underlying musical interpretation (González et al. 2020).

Using functional MRI, Segado et al. (2018) investigated which brain activations underlie coherent auditory-motor music perception and processing. This study compared functional brain activations during singing and during cello playing in the same individuals, on the assumption that cello playing involves voluntary auditory-motor associations similar to those of singing (Segado et al. 2018). The background to this study was that both playing an instrument and singing require highly specific associations between sounds and movements, and that, according to the literature, strikingly similar neural networks are assumed for the production of musical sounds in both cases. The study design also took into account that singing is an evolutionarily relatively old human trait whose auditory-motor associations are also used in speech and non-linguistic vocalisations. The pitch range of the cello, in turn, parallels that of the human voice, although cello playing is ultimately completely independent of the vocal apparatus and can therefore be used to separate the auditory-vocal network from the auditory-motor network, even if musicians tend to produce a vocalisation when playing an instrument (Segado et al. 2018). In their fMRI paradigm, Segado et al. found that brain activity during cello playing overlapped in many areas with that of the auditory-vocal network during singing, namely in the primary motor cortex, the dorsal premotor cortex, the supplementary motor cortex and the primary and periprimary auditory cortex in the superior temporal gyrus, including Heschl’s gyrus. Further overlap between the two networks was identified in the anterior insula, anterior cingulate cortex, intraparietal sulcus and cerebellum, but not in the periaqueductal grey and basal ganglia (Segado et al. 2018).

Taken together, these studies indicate that the cerebellum has a significant role in music perception and processing, closely related to the neural networks of the cerebral cortex. The neural plasticity of the cerebellum also seems to provide a special physiological basis for training individual patterns in certain skills. Instrumental training in particular leads to specific activations of certain regions in the cerebellum depending on the type and intensity of the training and the musical experience, which will be the subject of further research.

3.2 Cerebellum Morphological Plasticity

With regard to the morphological aspects of the brain in relation to music, numerous cross-sectional and longitudinal studies have shown correlations between musical expertise and regional brain anatomy, in which the cerebellum also plays an important part. For example, a discordant monozygotic (identical) twin design was used to investigate expertise-dependent effects on neuroanatomy using music training as a model behaviour, mainly to control for genetic factors and the shared environment of upbringing (de Manzano and Ullén 2018). In this study of identical twins highly discordant for piano playing, the musically active twin showed greater cortical thickness in the auditory-motor network of the left hemisphere and a better developed white matter microstructure in relevant pathways in both hemispheres and in the corpus callosum. In addition, a larger volume of grey matter in the left cerebellar region comprising lobuli I-IV and V was found in the piano players. This finding was considered clear evidence that a significant part of the differences in brain anatomy between experts and non-experts is due to causal effects of training.

A similar observation was made by Hutchinson’s research group in their examination of high-resolution T1-weighted MR images from a large, prospectively collected database, which revealed larger cerebellar volumes in professional keyboard players than in non-musicians (Hutchinson et al. 2003). Interestingly, in male musicians this increase was found for both absolute and relative cerebellar volume, but not for whole-brain volume. In the further analysis, a positive correlation was found between the lifelong intensity of music practice and the relative cerebellar volume in the group of male musicians. In the separate analysis of female musicians and non-musicians, no significant differences in cerebellar volumes were found. Based on these results, a structural adaptation of the cerebellum to long-term motor and cognitive functional demands was assumed, which would explain the significantly larger cerebellar volume in male musicians and the positive correlation between relative cerebellar volume and lifelong practice intensity. The cause of the noticeable gender effect remained uncertain, with a higher plasticity of the brain under the influence of testosterone being discussed for this phenomenon in male musicians.

Whether early music training and timing ability are also related to cerebellar structure was investigated in a further study that focused on regional differences in structural volumes (Baer et al. 2015). In this study, cerebellar grey and white matter volumes were evaluated using a novel automatic multi-atlas segmentation pipeline in adult musicians and non-musicians, who also completed a standard finger-tapping task. Data analysis revealed lower volumes of bilateral cerebellar white matter and right lobules IV, V and VI in early trained musicians compared to late trained musicians. An interesting finding was a smaller cerebellar volume in those musicians who had better timing performance, greater musical experience and an earlier age at onset of musical training, with better timing performance particularly associated with smaller volumes of right lobule VI. From these imaging results, it could in turn be concluded not only that the age of onset of musical training influences cerebellar volume, but also that lobule VI plays a role in timing. Finally, the observation of smaller cerebellar volume associated with music training and timing performance is likely to reflect more efficient implementation of low-level timing and sensorimotor processes (Baer et al. 2015).

The hypothesis of a connection between the cerebellum and musical learning was investigated by Bruchhage et al. (2020), who examined the volumes of individual areas of the cerebellum during specific training in drumming. More precisely, this study compared the volumes of cerebellar lobules, white matter microstructure and cortical thickness in healthy non-musicians before and after a demanding multimodal motor training to learn drumming with those of age-matched control participants. After 8 weeks of drumming training, significant changes were identified, with increases in left VIIIa grey matter, relative decreases in VIIIb and vermis crus I volume, and changes in white matter microstructure in the inferior cerebellar peduncle. In addition to these plastic changes in the cerebellum, an increase in the cortical thickness of the left paracentral, right precuneus and frontal portions of the right (but not left) superior frontal gyrus was seen, suggesting an interplay of cerebellar learning with cortical structures via specific cerebello-prefrontal pathways.

In summary, this substantial work has shown that musical training induces morphological changes in the cerebellum independent of the type of musical performance.

4 Cerebellar Disorders and Music Perception: An Experiment

4.1 Setting

A separate study (Tölgyesi and Evers 2014) on cerebellar characteristics influencing music perception should be considered here in detail, in order to give an impression of the clinical research on the role of the cerebellum in music perception. The results of this study, which are still quite recent, make it possible to understand the specific effects of cerebellar disorders on music perception, especially with regard to individual characteristics such as rhythm, metre and melody structure, but also emotional aspects. The study used specific clinical tests of musical ability in patients with ischaemic (focal infarction; n = 11) or genetic (Machado-Joseph disease; n = 4) cerebellar disease, and compared them with 30 healthy controls (Tölgyesi and Evers 2014).

In the experimental study, Tölgyesi and Evers applied a clinical test of musical ability, which had previously been used in patients with cerebral infarctions (Lorenz 2000) and is divided into five parts:

  1. Rhythm and metre: Participants had to reproduce short rhythmic sequences by tapping a pencil on a table. Correctness of rhythm and metre were scored separately, so that a maximum of 16 points could be achieved for rhythm and 16 points for metre.

  2. Comparison of melodies: The subject heard two consecutive melodic sequences and was asked to decide whether the two melodies were the same or different, with a total of 16 points achievable for correct comparison.

  3. Emotions: This subtest consisted of 12 short improvised pieces of three to four bars in duration. Each piece represented a certain emotion, which the respondent had to name correctly. A total of 12 points were attainable in this subtest.

  4. Pitch discrimination: 12 different pairs of notes were played on a piano and the participant had to determine whether the second note was higher or lower than the first. A maximum of 12 points could be achieved in this subtest.

  5. Melody recognition: The subjects listened to 14 short pieces, each taken either from the beginning of a familiar song (10×) or from an unfamiliar improvisation (4×), and had to identify the melody correctly as familiar or unfamiliar. A maximum of 14 points could be achieved in this last subtest.

Taken together, the individual subtests add up to a maximum total score of 86 points. Most tasks in the subtests are very easy to complete, which means that considerable ceiling effects have to be kept in mind, i.e. abnormal results are expected only in clearly and severely affected persons.
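The total of 86 points follows directly from the maximum scores of the individual subtests listed above. As a minimal sketch, the score sheet could be represented as follows; the dictionary keys and the helper name `total_score` are illustrative assumptions and not part of the published test.

```python
# Maximum points per scored component, as listed in the text.
MAX_POINTS = {
    "rhythm": 16,
    "metre": 16,
    "melody comparison": 16,
    "emotions": 12,
    "pitch discrimination": 12,
    "melody recognition": 14,
}


def total_score(scores):
    """Sum a participant's component scores after range-checking each one."""
    for name, value in scores.items():
        if not 0 <= value <= MAX_POINTS[name]:
            raise ValueError(f"{name}: {value} is outside 0..{MAX_POINTS[name]}")
    return sum(scores.values())


print(sum(MAX_POINTS.values()))  # 86
```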

4.2 Results

The test results of the patients, who had an average age of 56 years, are shown in Table 13.1. Significant differences between the patients and the control subjects were obtained in the three subtests on metre, melody comparison and emotion, as well as for the total score of the test. In the two subtests on rhythm and pitch, the patients also achieved lower scores, but these differences were not statistically significant. Taking the mean total score of the control subjects (i.e. 69.7 points) as the average normal score, the lower limit of a normal test result is 52 points (i.e. the mean minus two standard deviations). According to this calculation, seven patients (47%) but no control subject showed abnormal results in this test. A differentiated consideration of the patients with regard to the aetiology of the cerebellar disease showed individual differences in the respective subtests. In the comparison with the relatively small subgroup of patients with Machado-Joseph disease, we found significant differences in two subtests: patients with cerebellar infarction performed significantly better in the melody comparison task (11.3 ± 2.4 vs. 9.3 ± 2.2; p < 0.05) and in the melody recognition task (12.8 ± 0.9 vs. 10.8 ± 2.1; p = 0.007) than did patients with Machado-Joseph disease.
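The arithmetic behind the lower limit of normal and the proportion of abnormal results can be retraced in a few lines. This is only a sketch: the standard deviation of the control group is not stated in the text and is derived here from the reported mean and cut-off, so it is approximate (the cut-off of 52 points is presumably rounded). Treating scores strictly below the limit as abnormal is likewise an assumption.

```python
control_mean = 69.7   # mean total score of the controls (from the text)
lower_limit = 52.0    # reported lower limit of normal (from the text)

# Standard deviation implied by "mean minus two standard deviations";
# derived here for illustration, not reported in the text.
implied_sd = (control_mean - lower_limit) / 2


def is_abnormal(total, cutoff=lower_limit):
    """Flag a total score below the lower limit of normal (assumed strict)."""
    return total < cutoff


n_patients = 11 + 4   # cerebellar infarction + Machado-Joseph disease
n_abnormal = 7        # patients with abnormal total scores (from the text)

print(round(implied_sd, 2))                   # approximate implied SD
print(round(100 * n_abnormal / n_patients))   # 47
```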

Table 13.1 Results of testing musical ability in the patients and the control subjects. Data are presented as arithmetic mean and standard deviation. Statistical comparison by Mann-Whitney U test (ns denotes not significant)
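For readers unfamiliar with the Mann-Whitney U test named in the caption of Table 13.1, the following minimal sketch shows how the U statistic is obtained by comparing every score of one group with every score of the other. The function name and the toy numbers are placeholders invented purely for illustration; they are not data from the study.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b): each pair with a > b adds 1 to U_a, a tie adds 0.5."""
    u_a = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    # The two U values always sum to the number of pairs.
    u_b = len(sample_a) * len(sample_b) - u_a
    return u_a, u_b


# Placeholder scores invented for illustration only; NOT study data.
toy_patients = [9, 11, 12, 10]
toy_controls = [13, 12, 14, 15, 13]

u_patients, u_controls = mann_whitney_u(toy_patients, toy_controls)
print(u_patients, u_controls)
```

A small U for one group indicates that its scores tend to lie below those of the other group. In practice, a library routine such as `scipy.stats.mannwhitneyu` would normally be used to obtain the corresponding p value.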

These results show that patients with a structural cerebellar disorder were impaired in the perception of musical parameters in the subtests of metre, emotion and melody comparison. If a musical rhythm is understood as a movement with regular and irregular impulses, keeping time is clearly an indispensable prerequisite for a successful performance. According to the results of this study, the regular and tightly focused movements required to maintain the exact metre in time may be impaired in cerebellar dysfunction. The results also reveal the complexity of relationships and interactions in the recognition of melodies. In the study, the strategic features of individual subjects’ task performance when presented with the opening melody of a familiar or improvised composition, such as internal anticipation of the upcoming melodic pattern, were not affected. Interestingly, these impairments were found essentially in the subgroup of patients with Machado-Joseph disease, and thus in those with a chronic degenerative rather than an ischaemic cause of cerebellar dysfunction. Further studies are needed to characterise more precisely the neurotopographic relationships of the cerebellum to the cerebral networks responsible for perception or active memory of familiar melodies.

In humans, music is associated with emotions in a variety of aspects, such as the tempo (e.g. the number of beats per minute) and the mode (e.g. major/minor) of the music (Dalla Bella et al. 2001). The improvised pieces in the emotion identification subtest were each designed to evoke a clearly defined emotion. In the present study, patients with a cerebellar disorder were less successful in this emotion subtest than healthy control subjects. In four patients (three with an ischaemic, one with a chronic degenerative aetiology), anger was conspicuously often not recognised, which corresponds well with observations of a special importance of the cerebellum in negative basic emotions. What was striking in the study was the patients’ lack of awareness of their errors in the musical tasks tested. Since severely disabled patients did not participate in the study, the clinical relevance of the results is difficult to assess. Regardless of this, the test protocol used can certainly serve as an additional instrument for recognising specific musical limitations in patients with brain dysfunction, including dysfunction of the cerebellum. It is certainly important to distinguish the aetiology of cerebellar dysfunction, which had not been done in this way in the present work. In fact, the study results cannot be interpreted with certainty in detail, especially since other parts of the cerebellum may also be affected in genetically determined, chronic degenerative cerebellar diseases such as Machado-Joseph disease, which might also influence the test results of musical abilities.

5 Therapeutic Implications

Despite the large number of pioneering and plausible studies on the neural background of music perception and the neurotopographical aspects of the processing of music in the brain, i.e. the cerebral cortex, the basal ganglia and the cerebellum, only a few studies have so far worked out the role of the cerebellum in more detail. This is surprising in view of the well-studied therapeutic effects of music in neurorehabilitation (see Chatterjee et al. 2021), which also suggest promising approaches for diseases of the cerebellum. In accordance with the aspects of cerebellar involvement in music perception elaborated here, a targeted activation of the cerebellum, for example via the use of music in neurorehabilitative settings to support therapeutic processes aimed at neural reorganisation within the cerebellum or the affected cerebello-cerebral connections, could indeed be important. This assumption rests on the fact that music is typically used to facilitate or support motor movements and is increasingly used in movement rehabilitation (Devlin et al. 2019). In addition, there is some evidence that music imagery, which has been reported to lead to similar brain signatures as music perception, may also support movement (Haire et al. 2021). It remains unclear whether and how imagined or heard musical cues influence the activation of individual motor systems of the human brain during simple movements. Here, a paper by Schaefer et al. (2014) is of interest, in which neuronal activity during wrist flexions to heard or imagined music was compared in an fMRI study with self-directed performance of the same movement without stimuli. In this work, the image data analyses focused predominantly on the motor networks of the brain, applying a mask of BA4, BA6, the basal ganglia, the motor nuclei of the thalamus and the whole cerebellum.
As a result, movement to music, compared to self-directed movement, led to significantly increased activation in the left cerebellar lobulus VI. Movement to imagined music, compared to self-directed movement, in turn activated the pre-supplementary motor area (pre-SMA) and the right globus pallidus significantly more strongly. In a direct comparison of the music and imagination conditions, the music condition showed significantly higher activity in cerebellar lobulus VII as well as in the right hemisphere and the vermal lobulus IX, whereas the imagination condition showed significantly higher activity in the pre-SMA. Based on these results, the stimulation of movement by actual or imagined music appears to affect different network regions, including cerebellar regions, suggesting a subtle differential modulation of movement by heard and imagined cues.

Notwithstanding the as yet unclear neurophysiological mechanisms, personalised music programmes have been proposed as a complementary therapy in cognitive rehabilitation, particularly for patients with Alzheimer’s disease (AD), as clinical studies have demonstrated improvements in agitation, anxiety and behavioural symptoms through music perception (Garrido et al. 2017). This recommendation is supported by an fMRI paper by King et al. (2019), which found specific effects on brain connectivity in individuals with clinically diagnosed AD following a period of training with a personalised music listening programme. In this work, patients with AD demonstrated specific activation of the supplementary motor area (SMA), a region associated with memory for familiar music and typically spared in early AD, when listening to music they preferred. Interestingly, imaging data analysis for the condition of preferred musical stimuli revealed an increase in functional connectivity not only in cortical but also in cortico-cerebellar networks. These results again illustrate the complexity of brain connectivity in music perception and processing, in which the cerebellum also contributes to the benefit of a therapeutic music training programme in patients with Alzheimer’s disease (King et al. 2019).

A particular challenge in studying the effectiveness of elements of music therapy is the interaction and therapeutic relationship between the patient and the therapist. A pilot study by Steinhoff et al. (2015) highlights this issue; it observed a reduction in specific pattern changes in the brains of four individuals with unresponsive wakefulness syndrome during music therapy. In this imaging study, three PET scans were performed in each patient: (1) at rest, (2) during the first music therapy exposure and (3) during the last music therapy exposure. To compare treatment effects, two patients in the music therapy group received music therapy for 5 weeks between the second and third PET examinations, and two other patients in the control group received no music therapy in the interim. An analysis of tracer uptake focusing on the frontal, hippocampal and cerebellar regions showed that, with some differences between these three regions, tracer uptake after 5 weeks was higher (34%) in the music therapy group than in the control group. These preliminary results, with concordant activation of the cerebellum as well, provide evidence of cerebellar involvement in a specific music therapy treatment setting, which should inform further research. In summary, the few but promising studies available emphasise the involvement of the cerebellum in the therapeutic application of music, whether in passive or active music therapy. Further clinical and neurofunctional studies are therefore awaited that investigate the specific characteristics of the cerebellum in detail, in particular the individual cerebellar regions and their connections, above all to the (pre)frontal, parietal and temporal cortex, in connection with specific music therapy settings.

6 Perspectives

With the work presented here and the considerations associated with it, we conclude that, contrary to the traditional assumption, learning and maintaining essential components of culture (here: music) is the responsibility not of the cerebral cortex alone, but in part also of the cerebellum. This specific performance of individual cerebellar structures, still to be clarified in detail, is learned through the repeated improvement of predictions and through control by internal models in the cerebellum, and is made available to the cerebral networks responsible for this (Ito 2008). According to Vandervert (2016), the following new explanations for music learning can be discussed:

  • how the recent evolutionary expansion of the cerebellum was involved in the co-evolution of earliest stone tools and language leading to the cerebellum-driven origin of culture;

  • how cerebellar internal models are blended to produce the creative, forward advances in culture;

  • how the blending of cerebellar internal models led to human, multi-component, infinitely partitionable and communicable working memory.

In summary, previous research clearly shows that the cerebellum is involved in both the hearing of music and the production of music, and thus in music processing. In clinical and neuroimaging research, different parts of the cerebellum have been identified for different aspects of these processing tasks, and their connections to the cerebral cortex as well as the basal ganglia have been elaborated. In this still rather young field of research, relatively little is known about the effects of disorders of or damage to the cerebellum on music processing, or about the therapeutic influence of functional modulation of individual cerebellar target areas on music processing. This last aspect in particular will certainly play an important role in future neuroscientific research on music therapy.