1 Introduction

Studies have found that musicians outperform nonmusicians behaviorally in many respects, probably because years of musical training improve perceptual skills and working memory [1, 2]. Changes in brain function and structure explain these effects of music training. Neurobiological evidence shows that musicians have better information encoding and stronger cross-modal auditory-motor integration, which supports the musician advantage in the corresponding aspects of perception [3, 4].

Music perception involves cognition, emotion, and other underlying brain mechanisms [5]. Studies of music perception explain the musician advantage over nonmusicians at a more fundamental level. Various neuroimaging techniques have been used in studies of music perception to examine different properties of cortical responses to musical stimuli [6]. Electroencephalography (EEG) allows continuous processes to be analyzed in the time, frequency, and other domains, and has therefore been used frequently in studies of music perception. EEG provides researchers with a range of methods and evidence for understanding music perception. Here, we focus on EEG methods and findings on music perception in musicians and nonmusicians.

EEG studies of music perception are conditioned by the musical content of interest and the form of the stimulus. The EEG features that are typically reported, which reveal the cortical decoding of music, depend on the experimental tasks and paradigms. The perception of basic musical elements, such as pitch and timbre, has been widely studied in recent years, but these studies have not adequately captured the effects of systematic music training [7]. Rhythm, musical phrases, and syntax are complex musical materials for which EEG paradigms must be designed carefully, and investigating them involves more complex brain activity. Cortical processing of these musical elements would explain the differences between musicians and nonmusicians at a deeper level. In this review, EEG studies of the effects of music expertise are summarized according to these three forms of musical stimulus.

2 Effects of Music Training on Music Perception

2.1 Perception of Rhythm

Temporal pattern or structure is essential to the perception of music. A rhythm is a sequence of musical sound groups temporally patterned by duration or stress [8]. In studies of rhythm perception, subjects perceive repeating rhythmic units or beats, and their responses to violations of rhythmic patterns or durations are analyzed using EEG or behavioral measures [9]. Most EEG studies of rhythm perception have investigated either (1) event-related potentials (ERPs) extracted from EEG recordings or (2) neural entrainment to a steady beat stimulus [10]. The study of rhythm perception requires high temporal resolution, and ERPs enable a close examination of the time course of a changing rhythmic pattern. The ERP P3 component was found to correlate with performance in detecting target beats that sped up or slowed down after a four-beat sequence [11]. Subcomponents of the P3 can further distinguish the two kinds of rhythm violation.
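The ERP analysis described above can be sketched in a few lines. The following minimal Python example, assuming a single-channel recording and known stimulus onsets, averages stimulus-locked epochs into an ERP and reads off the latency of the largest positive peak in a P3-like window. The sampling rate, window limits, function names (`average_epochs`, `peak_latency_ms`), and the synthetic data are illustrative assumptions, not values taken from the cited studies.

```python
import math
import random


def average_epochs(signal, onsets, pre, post):
    """Average stimulus-locked epochs and subtract the pre-stimulus baseline.

    signal: list of samples from one channel
    onsets: sample indices of stimulus onsets
    pre, post: number of samples kept before and after each onset
    """
    # Keep only epochs that lie completely inside the recording.
    epochs = [signal[o - pre:o + post] for o in onsets
              if o - pre >= 0 and o + post <= len(signal)]
    if not epochs:
        raise ValueError("no complete epochs in the signal")
    erp = [sum(ep[i] for ep in epochs) / len(epochs) for i in range(pre + post)]
    baseline = sum(erp[:pre]) / pre if pre > 0 else 0.0
    return [v - baseline for v in erp]


def peak_latency_ms(erp, pre, fs, window_ms):
    """Latency (ms after onset) of the largest positive value in window_ms."""
    start = pre + round(window_ms[0] * fs / 1000)
    stop = pre + round(window_ms[1] * fs / 1000)
    peak = max(range(start, stop), key=lambda i: erp[i])
    return (peak - pre) * 1000 / fs


# Synthetic demo (assumed values): noise plus a positive bump
# roughly 350 ms after each onset.
random.seed(0)
fs = 250  # sampling rate in Hz (assumption)
onsets = [100 + 500 * k for k in range(60)]
signal = [random.gauss(0.0, 1.0) for _ in range(30500)]
for o in onsets:
    for k in range(200):
        signal[o + k] += 5.0 * math.exp(-0.5 * ((k - 88) / 12.5) ** 2)

erp = average_epochs(signal, onsets, pre=50, post=200)
print(peak_latency_ms(erp, pre=50, fs=fs, window_ms=(250, 500)))
```

Averaging over many epochs suppresses activity that is not time-locked to the stimulus, which is why a peak latency can be read from the averaged waveform even when single trials are noisy.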

Geiser et al. investigated the perception of meter and rhythm under two attention modes in musicians and nonmusicians using early ERP components within 300 ms [12]. Subjects had to detect and distinguish changes in meter (the number of beats in a loop) and rhythm (the temporal pattern within a fixed loop) in repeating standard 3-beat rhythmic units. The musicians showed significantly better behavioral performance. In the attended condition, both rhythm and meter deviants elicited an early negative deflection, whereas in the unattended condition only rhythm deviants did. No group effects were found in the ERP results across all experimental conditions, indicating that rhythm and meter are processed similarly at the early stage, with or without attention, in both musicians and nonmusicians. Effects of musicianship were found in a more complicated experiment, revealed by the P3a component [13]. Two rhythm-processing hypotheses were considered: a sequential processing hypothesis, under which individuals expect incoming beats at a fixed position, and a hierarchical processing hypothesis, under which individuals expect incoming beats at all possible positions. The target beat appeared at one of six positions, only a few of which fit the meter, and subjects had to determine whether there was a meter violation. Musicians showed significantly longer P3a latency than nonmusicians. Moreover, the correlation coefficients of the P3a, which express the resemblance between two P3a waveforms from different conditions, agreed fairly well in nonmusicians with the coefficients predicted by the sequential hypothesis. In contrast, the musicians’ results fit the hierarchical hypothesis better. The P3a results of this study suggest that temporal patterns are processed sequentially in nonmusicians and hierarchically in musicians.
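As a rough illustration of how the resemblance between two P3a waveforms can be quantified, the sketch below computes a Pearson correlation coefficient between two averaged waveforms sampled at the same time points. Whether the cited study used exactly this coefficient, and how it derived the correlations predicted by each hypothesis, is not stated here; the function name `pearson` and the toy waveforms are assumptions for illustration only.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long waveforms."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("waveforms must have the same length (at least 2)")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant waveform has no defined correlation")
    return cov / math.sqrt(var_x * var_y)


# Toy P3a-like waveforms for two target positions (made-up values).
p3a_position_1 = [0.0, 0.8, 2.1, 3.0, 2.2, 0.9, 0.1]
p3a_position_2 = [0.1, 0.5, 1.6, 2.8, 2.6, 1.2, 0.2]
print(round(pearson(p3a_position_1, p3a_position_2), 3))
```

A coefficient near 1 means the two waveforms have nearly the same shape regardless of their overall amplitude, which is what makes it usable as a resemblance measure across conditions.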

Neural oscillations entrain to a stimulus with a fixed time period [14]. In music perception, such neural entrainment usually takes the form of synchronization between neural excitability and the underlying musical rhythm [15]. It is analyzed in a similar way to steady-state evoked potentials (SSEPs): the amplitude peaks of the signals’ frequency spectra are plotted and compared. Differences between musicians and nonmusicians in rhythm perception are revealed by neural entrainment to musical rhythms [16]. Stupacher et al. designed a complex experiment in which subjects listened to a quadruple rhythm and a combined quadruple-triple polyrhythm. After a silent period, subjects were asked to determine whether the upcoming stimulus was early, on time, or late. EEG signals were recorded throughout the whole process. The normalized frequency-spectrum amplitudes of the EEG responses during the silent periods before the target stimulus showed a significant effect of music expertise: neural entrainment at the 3-beat frequencies (1.5 Hz and 3 Hz) was significantly larger in the musician group. This difference during the silent period indicated that beat-related, top-down controlled neural oscillations can persist without a continuous stimulus and that these endogenous oscillations were enhanced by music expertise. Another study of rhythm perception found stronger neural entrainment to a 3-beat rhythm in musicians than in nonmusicians [17]. In that study, rhythmic cues from different spatial angular positions induced a sense of 3-beat rhythm. The stronger responses to the 3-beat rhythm in musicians were observed only in the attentive condition, suggesting that top-down attentional mechanisms were at play in rhythm perception and that neural entrainment to spatially cued rhythms was strengthened by music expertise.
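The frequency-domain comparison described above can be illustrated with a small sketch. It estimates the spectral amplitude of a signal at a beat-related frequency by projecting it onto a complex sinusoid, then subtracts the mean amplitude of nearby frequency bins. This neighbor-bin subtraction is one common way to normalize such spectra and is an assumption here, not necessarily the normalization used in the cited study; the sampling rate, function names, and synthetic signal are likewise illustrative.

```python
import cmath
import math


def amplitude_at(signal, fs, freq):
    """Amplitude of the sinusoidal component of `signal` at `freq` (Hz)."""
    n = len(signal)
    acc = sum(x * cmath.exp(-2j * math.pi * freq * i / fs)
              for i, x in enumerate(signal))
    return 2.0 * abs(acc) / n


def normalized_amplitude(signal, fs, freq, skip=1, width=3):
    """Amplitude at `freq` minus the mean amplitude of neighboring bins.

    Neighbors are `width` bins on each side, skipping the `skip` closest bins.
    """
    df = fs / len(signal)  # frequency resolution in Hz
    offsets = range(skip + 1, skip + width + 1)
    neighbors = [freq + s * k * df for k in offsets for s in (-1, 1)]
    noise = sum(amplitude_at(signal, fs, f) for f in neighbors) / len(neighbors)
    return amplitude_at(signal, fs, freq) - noise


# Synthetic 20 s signal (assumed values) with components at 1.5 Hz and 3 Hz.
fs = 100
t = [i / fs for i in range(20 * fs)]
eeg = [1.0 * math.sin(2 * math.pi * 1.5 * x) + 0.5 * math.sin(2 * math.pi * 3.0 * x)
       for x in t]
for f in (1.5, 2.0, 3.0):
    print(f, round(normalized_amplitude(eeg, fs, f), 3))
```

In a group comparison, the normalized amplitude at each beat-related frequency would be computed per subject and then compared between musicians and nonmusicians.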

2.2 Perception of Music Phrase and Syntax

Like language, music develops on the basis of phrases. Melodies and harmonic progressions are composed into musical phrases and, according to music theory, may serve different functions within the phrase. The perception of musical phrases and syntax has been studied [18–20]. It may correlate with music experience and be reflected in EEG responses. The influences of long-term musical experience on the processing of musical phrases and syntax are summarized below.

A commonly used experimental paradigm for music syntax perception is a chord sequence that includes violations of the expected chord progression [21]. Changing the final chord violates musical syntactic appropriateness and expectancy, which elicits ERP components related to music syntax perception. One such component, the early right anterior negativity (ERAN), was found to be specifically sensitive to violations of musical regularities. The ERAN has a latency of around 200 ms, similar to the mismatch negativity (MMN), and is usually followed by an N5. The ERAN evoked by harmonically inappropriate chords in a sequence was larger in musicians than in nonmusicians [22]. These results indicate that the automatic neural mechanisms for processing musical syntax irregularities, as reflected by the ERAN, can be modulated by music expertise. Portabella et al. investigated dissonance perception in chord progressions in the two groups. Chords that violated the regularities in the middle or at the end of the sequence evoked P3 and N5 components, and the amplitude of these components differed between musicians and nonmusicians. Moreover, they found that the predictability of dissonant chords did not modulate the ERP responses [23].
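A violation-related component such as the ERAN is typically quantified on a difference wave. The sketch below, under the assumption of one averaged ERP per chord type sampled at a common rate, subtracts the regular-chord ERP from the irregular-chord ERP and takes the mean amplitude in a window around 200 ms. The 150–250 ms window, the sampling rate, and the function names are illustrative assumptions rather than parameters from the cited studies.

```python
def difference_wave(erp_irregular, erp_regular):
    """Pointwise difference: irregular-chord ERP minus regular-chord ERP."""
    if len(erp_irregular) != len(erp_regular):
        raise ValueError("ERPs must have the same number of samples")
    return [a - b for a, b in zip(erp_irregular, erp_regular)]


def mean_amplitude(wave, fs, window_ms):
    """Mean of `wave` between window_ms = (start, stop), measured from onset.

    The first sample of `wave` is assumed to coincide with chord onset.
    """
    start = round(window_ms[0] * fs / 1000)
    stop = round(window_ms[1] * fs / 1000)
    segment = wave[start:stop]
    if not segment:
        raise ValueError("window lies outside the waveform")
    return sum(segment) / len(segment)


# Toy ERPs sampled at 100 Hz (made-up values, in microvolts).
fs = 100
regular = [0.0] * 50
irregular = [0.0] * 15 + [-2.0] * 10 + [0.0] * 25  # negativity at 150-250 ms
diff = difference_wave(irregular, regular)
print(mean_amplitude(diff, fs, (150, 250)))
```

A group effect such as a larger ERAN in musicians would then appear as a more negative mean amplitude of the difference wave in that window.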

More complex phrases have also been used as musical syntax stimuli. James et al. studied the brain processes of music syntax perception in different populations using spatiotemporal ERP analysis, microstate analysis, and ERP source imaging [24]. Subjects listened to ten-second expressive string quartet pieces that ended in one regular version or one of two transgressed versions, and then had to indicate whether they judged the stimulus to be satisfactory. The effects of musical expertise on neural processing, as revealed by the ERP responses and microstates, lay in a 300–500 ms window after the onset of the target stimulus. The P3b-like components and characteristic microstates differentiated nonmusicians, amateur pianists, and expert pianists from each other. The underlying sources of these microstates were localized in the right middle temporal gyrus, anterior cingulate, and right parahippocampal areas. Ma et al. investigated the neural responses of subjects with different levels of musical experience to musical syntax that follows a finite state grammar (FSG) or a phrase structure grammar (PSG) [25]. In the FSG condition, different final chord types were not reflected in the ERP responses of the nonmusician group, whereas irregular final chords evoked ERAN-N5 responses in the medium- and pro-musician groups. In the PSG condition, ERAN-N5 responses evoked by irregular final chords were observed only in the pro-musician group. These results suggested that the effects of musical expertise on the perception of musical phrases were reflected in syntactic complexity and in the early ERAN and late N5 components. EEG responses were also studied during aesthetic judgment of chord sequences with different closures [26]. Musicians showed negative potentials during the beauty judgment task relative to the correctness judgment task, whereas this effect was not seen in the nonmusician group. However, during the listening stage, differences between the correctness and aesthetic judgment tasks were observed in the nonmusician group. The neural correlates of aesthetic music processing were thus modulated by musical expertise.

Table 1. EEG features used to study complex music perception in musicians and nonmusicians

Another aspect of musical phrase perception that has been studied is the perception of phrase boundaries [27]. A positive ERP wave with a latency of approximately 550 ms after the phrase boundary offset was found. The observed peak was similar to a positive component in response to prosodic phrase boundaries in speech perception [28], which was termed the closure positive shift (CPS) because it correlates with prosodic phrase closure. The musical CPS was studied and found to be modulated by music expertise [27]. Neuhaus et al. used EEG and MEG with melodic stimuli that contained phrase boundaries in the middle. Musicians showed an electric CPS and a magnetic CPS (CPSm) evoked by the phrased melody versions, while nonmusicians showed an early negativity and a smaller CPSm. Zhang et al. designed a complex musical stimulus containing boundaries at three hierarchical levels. All three boundaries evoked a CPS in the musician group, with amplitudes modulated by hierarchical level. In nonmusicians, only the period boundary (the most obvious one) elicited a CPS, and a negativity that could not be distinguished across the three boundaries was induced. These findings suggest that musical phrasing ability could be enhanced by music expertise, or that the two groups use different perceptual strategies for phrasing.

3 Summary

In this paper, we reviewed EEG studies of rhythm, musical phrase, and syntax perception in musicians and nonmusicians to show the effects of music training at the behavioral and cortical levels.

In summary, the perception of rhythm has usually been studied using ERPs or neural entrainment. Neural responses of musicians and nonmusicians were similar at the early stage of rhythm processing and differed at the later stage. The P3a component evoked by a meter violation in the beat expectation task demonstrated the effect of music expertise, with musicians showing a later P3a. Neural oscillations entrain to a rhythmic beat, and this entrainment was stronger in musicians than in nonmusicians. Moreover, endogenous, beat-related, top-down controlled neural oscillations were modulated by music expertise. These findings suggest that musicians and nonmusicians process complex rhythmic temporal patterns differently.

In addition, musicians and nonmusicians process musical phrases and syntax differently, as revealed by evoked responses (ERAN, P3, and N5) to violations in chord sequences. Microstate analysis and ERP source imaging also revealed differences between musicians and nonmusicians in the mid-latency mechanisms and the cortical areas involved in the perception of music syntax. The CPS related to phrase boundaries indicated better musical phrasing ability in musicians.

The above studies revealed that cortical processing of rhythm, musical phrases, and syntax differs between musicians and nonmusicians, reflecting the effects of music training on the perception of complex musical elements. The EEG features mentioned and the corresponding results are listed in Table 1. It should be noted that the reported results depend strongly on the experimental paradigm because of the complexity of the stimuli. Further EEG studies of music perception are recommended to investigate the interplay of these mechanisms, for example, in attentive and inattentive conditions.