Introduction

Playing a musical instrument is a complex multisensory experience requiring several skills, including reading abstract musical notation and translating it into fine, coordinated motor movements in order to produce a sound. Mastering this rich and demanding process requires regular and intense practice, often from a young age, and the combination of such demands is likely to influence the differential development, maintenance, and operation of certain brain structures. Over the past two decades, many studies comparing adult musicians and non-musicians have shown that music training is associated with anatomical brain differences (for comprehensive reviews see Gaser and Schlaug 2003; Herholz and Zatorre 2012; Jäncke 2009). Although the precise brain coordinates and lateralization of the regions affected by music training vary among these studies, there is general agreement that learning and mastering musical skills can lead to changes in the cortical tissue (including cortical volume, density and thickness) of auditory structures. These findings include evidence for larger gray matter volume of the anteromedial portion of Heschl’s gyrus in musicians compared to non-musicians (Schneider et al. 2002), as well as thicker cortex and higher gray matter density of the right posterior superior temporal gyrus, corresponding to the planum temporale (Bermudez, Lerch, Evans and Zatorre 2009). Other findings indicate a strong association between the anatomy of auditory structures and relevant musical skills: a positive correlation between musical proficiency and gray matter density of the left Heschl’s gyrus has been reported (Gaser and Schlaug 2003), and cortical thickness in the right auditory cortex has been shown to be directly associated with performance level on a melodic transposition task (Foster and Zatorre 2010).

Given the cross-sectional nature of the above-mentioned studies, it is not possible to conclude whether the reported anatomical differences result from pre-existing biological traits, lengthy musical training, or an interaction of the two factors. It is also not clear whether such changes might be found in children undergoing music training, given that brain development is a dynamic process shaped by genetic and experiential factors (Rosenzweig 2003). Longitudinal studies in which children undergo music training and are assessed before and after such training can help address these points more directly. In a first longitudinal study investigating structural changes related to the learning of a musical instrument in typically developing children, Hyde and colleagues demonstrated that 6-year-olds receiving instrumental musical training for 15 months showed increased gray matter density in the right primary auditory cortex (Heschl’s gyrus), while age-matched children receiving no musical training did not (Hyde et al. 2009a, b). Our previous findings in a cohort of typically developing children involved in a longitudinal study revealed an asymmetric reduction of cortical thickness and volume of the posterior segment of the superior temporal gyrus (larger reduction in left pSTG than right) (Habibi et al. 2017). These children had been involved in music training for 2 years and were compared to age-matched control groups who did not engage in music learning. Importantly, alongside these structural changes, we also observed related improvements of musical skills in the same music-trained group, including enhanced ability to detect changes in tonal environment (Ilari, Keller, Damasio and Habibi 2016) and enhanced maturity of auditory processing, as evidenced by accelerated development of adult-like cortical auditory evoked potentials (Habibi, Cahn, Damasio and Damasio 2016).

Given that change in cortical thickness reflects brain maturation, which is also influenced by learning and experience, we interpreted the right-lateralized reduction in the rate of cortical thinning of the pSTG as the result of an interaction between the expected thinning of auditory association cortices and a thickening due to auditory stimulation induced by early music training. These findings align with the observations that asymmetries associated with music processing, specifically at the level of melody and pitch, favor the right temporal regions (Bermudez et al. 2009; Burke 2010; Herholz, Lappe, Knief and Pantev 2008; Hyde, Peretz and Zatorre 2008; Tervaniemi 2006; Zatorre, Evans and Meyer 1994).

Based on these findings, it is reasonable to assume that early exposure to music training in childhood may be responsible for the thicker cortex of the right versus left auditory areas previously reported in adult musicians (Bermudez et al. 2009). Such an assumption would be concordant with the experience-based thickening of cortex previously reported in language-related areas in association with fine tuning and mastery of linguistic skills during late childhood (Sowell 2004). To further explore this question, we sought to assess whether thicker cortex of the right auditory regions could be detected in an independent sample of children who had received at least 2 years of intensive music training, compared to age-matched children with no music training. We also sought to determine, in our longitudinal sample, whether the asymmetric reduction in cortical thinning that we observed after 2 years of music training is maintained as the children continue their music engagement. In other words, would the rightward asymmetry of a lesser reduction of cortical thickness in auditory association areas continue with two more years of training, and could we replicate our findings in another cohort of children?

We describe here findings from what amounts to two separate studies. In study 1, with a cross-sectional design, we compared cortical thickness in three bilateral auditory regions (Heschl’s gyrus, anterior superior temporal gyrus and posterior superior temporal gyrus) in a group of children with at least 2 years of intense music training and an age-matched control group with no music training. We also compared the two groups on three additional regions of interest within the cingulate gyrus: anterior cingulate, anterior dorsal cingulate, and posterior dorsal cingulate. These regions are not known to be directly influenced by music training, but their maturation can be influenced by other factors, notably socio-economic status, and they were therefore well suited to serve as comparison regions for this study. In study 2, we compared cortical thickness in the same brain regions between children undergoing music training and children without such training, using data from an ongoing longitudinal study investigating brain development in children involved in a systematic music program and children without such music training. We report here results at 4 years of training; findings at 2 years of training have been published previously (Habibi et al. 2017). Of note, we elected to examine cortical thickness because it is a sensitive metric of dynamic brain changes throughout development and has been shown to be influenced by both maturational pruning and experiential neuroplasticity (Vijayakumar et al. 2016). Other morphometric measures of cortical gray matter, such as gray matter density, which provides a comparison of gray matter concentration on a voxel-wise basis, do not have a biological correspondence and are difficult to interpret; they are therefore not well suited to evaluating brain maturation, particularly in a longitudinal setting.

We hypothesized that, in the cross-sectional comparison, children with music training would show the previously reported thicker cortex in the right auditory association regions. We also hypothesized that, in the longitudinal comparison, children with music training would continue to show a right-lateralized lower rate of cortical thinning in the auditory association areas. Finally, we hypothesized that the asymmetry in thinning would occur largely in the auditory association areas of the superior temporal gyrus posterior to Heschl’s gyrus (pSTG), Heschl’s gyrus being the region where early auditory processing takes place.

Study 1

Materials and methods

Participants

Participants in this study came from two groups. The first was a group of 15 children (9 girls, mean age = 9.93 years, SD = 1.30) who had been playing at least one musical instrument for a minimum of 2 years at a private conservatory or a music school. Child participants in this group came from affluent households with access to private music programs. The second group was composed of 15 children (6 girls, mean age = 10.02 years, SD = 1.27) who were not involved in any systematic music training. Child participants in this group came from underprivileged communities (as measured by the Hollingshead Four-Factor Index of Social Position; Hollingshead 1975), were of Latino background and were being raised in bilingual households; these children are part of a larger longitudinal study on child development. All attended English-speaking schools that did not offer music education programs.

All participants were screened with the Wechsler Abbreviated Scale of Intelligence (WASI-II; Wechsler 1999) and through interviews with their parents to exclude any history of developmental or neurological problems.

Participants in the non-music group were selected to match the participants in the music group, one to one, based on age. After age selection, inclusion in the study was guided by matching IQ. If more than one child was available, we selected the participant from the non-music group who had the closest composite score in the IQ test to the participant in the music group. Once age and IQ were matched, we further selected the participant with the matching gender. All demographics are summarized in Table 1.
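The matching procedure described above can be sketched as a greedy one-to-one selection. The following Python is only an illustration under stated assumptions: the `Child` record, the `match_controls` function, the age tolerance `max_age_gap`, and the exact tie-breaking order are hypothetical and are not taken from the study protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Child:
    child_id: str      # hypothetical identifier
    age_years: float
    iq: int            # composite IQ score
    gender: str        # e.g. "F" or "M"


def match_controls(music_group, control_pool, max_age_gap=0.5):
    """Greedy one-to-one matching: age window, then closest IQ, then gender.

    max_age_gap (in years) is an illustrative tolerance, not a study parameter.
    Returns a dict mapping each music child_id to the matched control child_id.
    """
    available = list(control_pool)
    pairs = {}
    for child in music_group:
        # Step 1: keep only controls whose age is close enough.
        candidates = [c for c in available
                      if abs(c.age_years - child.age_years) <= max_age_gap]
        if not candidates:
            continue  # no acceptable match; leave this child unpaired
        # Steps 2 and 3: closest IQ first, matching gender as the tie-breaker
        # (False sorts before True, so a same-gender control wins a tie).
        best = min(candidates,
                   key=lambda c: (abs(c.iq - child.iq), c.gender != child.gender))
        pairs[child.child_id] = best.child_id
        available.remove(best)  # each control is used at most once
    return pairs
```

A greedy pass of this kind depends on the order in which the music-group children are processed; an optimal assignment method could be substituted if the order of selection mattered.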

Table 1 Demographic (mean and SD) for participants in study 1

Recruitment and induction protocols were approved by the University of Southern California Institutional Review Board. Informed consent was obtained in writing from the parents/guardians in their preferred language, on behalf of the child participants and verbal assent was obtained from all children individually. Either the guardians or the children could end their participation at any time. Participants (parents/guardians) received monetary compensation ($15 per hour) for their child’s participation and children were awarded small prizes (e.g., toys or stickers).

Experimental procedures

Handedness

Handedness for the music group relied on self-report. Eleven participants were right-handed, and four participants did not report being right- or left-handed. Handedness for the non-music group was assessed as part of the Bruininks-Oseretsky Test of Motor Proficiency (BOT 2-Brief). Children were asked to write their name, throw a ball to the experimenter, and kick a ball to the experimenter. Children were classified as right- or left-handers if they used the same side (right or left hand/foot) for all three tasks. They were classified as mixed-handers if they used one side for only one of the tasks. In the non-music group, there was one left-handed boy and one left-handed girl.
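The three-task classification rule described above can be written as a short function. This is a minimal sketch, assuming each task is recorded as the string "left" or "right"; the function name and the encoding are illustrative and are not part of the BOT 2-Brief.

```python
def classify_handedness(write, throw, kick):
    """Classify a child from the side used for writing, throwing and kicking.

    Returns "right" or "left" when all three tasks used the same side,
    and "mixed" otherwise. With three tasks and two possible sides, any
    non-unanimous pattern means one side was used for exactly one task.
    """
    sides = (write, throw, kick)
    for side in sides:
        if side not in ("left", "right"):
            raise ValueError(f"unexpected side: {side!r}")
    if len(set(sides)) == 1:
        return sides[0]
    return "mixed"
```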

Assessment of musical performance

Musical performance was assessed for the music group. Participants were asked to bring their musical instrument to the testing session and perform a piece of their choice. Performances were approximately 1–3 min long. All pieces were classified in relation to eight grade levels and five dimensions (pitch, time, shape, tone and performance), as proposed by the Associated Board of the Royal Schools of Music (ABRSM). Children could receive between 0 and 30 points for each dimension. A composite score for children’s performance was obtained by adding scores for each dimension and dividing the total by five. Two experienced music teachers listened to all recordings and rated all student performances.
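The composite score described above is the arithmetic mean of the five dimension scores. A minimal sketch follows; the function name and the dictionary layout are assumptions made for illustration, and the example scores are invented. How the two teachers' ratings were combined is not stated in the text, so no combination rule is assumed here.

```python
DIMENSIONS = ("pitch", "time", "shape", "tone", "performance")


def composite_score(scores):
    """Mean of the five dimension scores, each expected to lie in 0..30."""
    missing = [d for d in DIMENSIONS if d not in scores]
    if missing:
        raise ValueError(f"missing dimensions: {missing}")
    for dimension in DIMENSIONS:
        if not 0 <= scores[dimension] <= 30:
            raise ValueError(f"{dimension} score out of range 0..30")
    # Add the five dimension scores and divide the total by five.
    return sum(scores[d] for d in DIMENSIONS) / len(DIMENSIONS)


# Invented example ratings for a single performance.
example = {"pitch": 20, "time": 25, "shape": 15, "tone": 30, "performance": 10}
print(composite_score(example))
```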

Brain imaging

The brain imaging sessions included anatomical T1 (MPRAGE) and functional MR imaging. Data collection and analysis for the MPRAGE are described below; the functional imaging data are not used in the present report and have been reported elsewhere (Sachs, Habibi and Damasio 2018). For every child, a T2-weighted scan was also obtained for review by a neuroradiologist. Had an incidental finding been detected, the neuroradiologist would have contacted the family’s designated physician and suggested further evaluation if needed. No incidental findings were reported.

We designed a child-friendly protocol for the scanning session that started with a training session. Children learned about the scanner by watching a video and became acquainted with scanning in a mock scanner, which reproduced the different types of sounds to be encountered in the real scanner. During the actual scanning session, if children wished, one of the investigators remained in the scanner room and held the child’s hand. To help children remain motionless during scanning, they could watch a video of their choice. After the scanning session, children were shown and given an actual image of their brain.

For all participants, high-resolution T1-weighted MPRAGE MRI images were acquired using a Siemens 3 T MAGNETOM Prisma System equipped with a 20-channel head coil; the following parameters were used: 1 mm × 1 mm × 1 mm resolution over a 256 mm × 256 mm × 256 mm FOV; TI/TE/TR = 850/32.05/2300 ms; flip angle = 8°; GRAPPA acceleration factor R = 2.

Selection of ROIs

We hypothesized, based on our prior findings and other related reports, that a priori chosen regions of interest (ROIs) within the primary and secondary auditory cortices would be influenced by music training. These regions include (1) Heschl's gyrus (HG, corresponding to primary auditory cortex), (2) anterior superior temporal gyrus, the section in front of HG, and (3) posterior superior temporal gyrus, the section behind HG. These regions are known to be of critical importance in the processing of acoustic information, including spectral and temporal cues; in the control of attention-related auditory processes (Jäncke, Gaab, Wüstenberg, Scheich and Heinze 2001); and in providing continuous feedback for appropriate motor execution during musical performance, an important skill in playing music (Peretz and Zatorre 2005; Zatorre and Belin 2001; Zatorre, Belin and Penhune 2002). Because the specific a priori predictions were motivated by multiple previous reports in children and adult cohorts (Bailey, Zatorre and Penhune 2014; Hyde et al. 2009a, b; Schlaug, Norton, Overy and Winner 2005; Schneider et al. 2002), we did not control for multiple comparisons for any of the statistical tests in the ROI analysis. We also included three additional ROIs within the cingulate gyrus, (1) anterior cingulate, (2) anterior dorsal cingulate, and (3) posterior dorsal cingulate, regions that are not known to be directly influenced by music training. Their development may, however, be influenced by other factors, such as socio-economic status, bilingualism or maturation variability in general; therefore, they seemed well suited to serve as control regions for this study.

Analysis

Cortical analysis

To analyze high resolution T1 images, we used the BrainSuite software (Shattuck and Leahy 2002; https://brainsuite.org/) which incorporates a multiple step cortical surface extraction and labeling sequence for analyzing T1-weighted MR images (see Habibi et al. 2017). MRI data were visually, and individually, reviewed to identify movement artifacts. After this quality control step, data from one participant from the music group was excluded due to excessive head motion. Using a surface-constrained volumetric registration technique (SVReg, see Joshi, Shattuck and Leahy 2012; Joshi, Shattuck, Thompson and Leahy 2007), a customized child atlas was co-registered to each individual participant’s data set. For each subject, transfer of region labels from the atlas yielded labeled cortical surfaces and volumes for 31 individual cortical regions in each hemisphere. Each brain was examined after automated labeling and manual correction of edge mislabeling was applied whenever necessary.

A segmentation of the whole brain into regions of interest (ROIs) was automatically transferred from the atlas onto each individual brain. Because automated transfer of ROI boundaries is not sufficiently accurate, all sulcal-based boundaries of ROIs were manually corrected using BrainSuite’s built-in ‘curve tool’, which allows drawing of ROI limits on the “inner surface” (the surface created by the gray/white separation). Macroscopic anatomical landmarks (sulci) were used to identify and label the cortical surfaces, and the accuracy of the choice at each point was verified by cross-referencing the original orthogonal volume planes with the gray/white separation surface. All ROIs were inspected and corrected by an expert anatomist (H. Damasio).

The ROIs used for this study, and their limits, are itemized below and shown in Fig. 1.

Fig. 1

Top: Segmentation of the posterior superior temporal gyrus (pSTG) in blue, the anterior superior temporal gyrus (aSTG) in yellow, and the Heschl’s gyrus (HG) in red from the sagittal view of left and the right hemispheres. Bottom: Segmentation of the anterior cingulate in green, anterior dorsal cingulate in dark green, and posterior cingulate in light green from the mesial view of left and the right hemispheres. Center: Segmented ROIs in both hemispheres from axial view

1. Cingulate Cortex, delineated by the cingulate and pericallosal sulci and subdivided into the following:

1a) Anterior Cingulate Cortex (ant.Cing): with a posterior limit drawn at the level of the most anterior edge of the corpus callosum as seen in a midsagittal slice;

1b) Anterior Dorsal Cingulate: extending from the posterior limit of the ant.Cing to the level of the posterior limit of the superior frontal gyrus (the para-central sulcus);

1c) Posterior Dorsal Cingulate: extending from the posterior limit of the Anterior Dorsal Cingulate to the juncture of the posterior segment of the pericallosal sulcus and the inferior edge of the cingulate sulcus.

2. Superior Temporal Gyrus (STG), limited anteriorly by the posterior limit of the temporal pole (a perpendicular plane at the level of the separation of the horizontal and ascending branches of the Sylvian fissure); inferiorly by the superior temporal sulcus; mesially by the circular sulcus; posteriorly by the connection of the end of the horizontal sector of the superior temporal sulcus to the end of the horizontal sector of the Sylvian fissure and its connection to the mesial end of the transverse temporal sulcus. We divided the STG into three sectors:

2a) transverse temporal, or Heschl’s, Gyrus (HG) limited posteriorly by the transverse temporal sulcus; antero-mesially by the circular sulcus; whenever there was a doubling of HG, the anterior section was designated HG, while the posterior segment was included in 2b.

2b) a segment posterior to HG, the posterior superior temporal gyrus (postSTG) separated from HG by the transverse temporal sulcus and its connection towards the superior temporal sulcus;

2c) an anterior segment, the anterior superior temporal gyrus (antSTG) separated from HG and the postSTG by their respective anterior limits.

Cortical thickness was calculated for each individual region of interest in each hemisphere using thickness PVC (Joshi, Bhushan, Salloum, Shattuck and Leahy 2014), which incorporates tissue fraction and measures the distance between the inner cortical surface (the gray/white separation line) and the pial surface. This measure is computed at every vertex on the surface mesh defining the two surfaces. The vertex-wise cortical thickness values for all the vertices within a region of interest are averaged to produce the final cortical thickness value for that region of interest (Joshi et al. 2014).
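The final averaging step described above (vertex-wise thickness averaged within each region of interest) can be illustrated as follows. This sketch is not BrainSuite code and assumes nothing about BrainSuite's file formats: the inputs are taken to be two plain Python sequences of equal length, one with a thickness value per vertex and one with the ROI label of each vertex. The function name and the example values are invented.

```python
from collections import defaultdict


def mean_thickness_by_roi(vertex_thickness, vertex_labels):
    """Average vertex-wise thickness (mm) over all vertices sharing a label."""
    if len(vertex_thickness) != len(vertex_labels):
        raise ValueError("thickness and label sequences differ in length")
    totals = defaultdict(float)
    counts = defaultdict(int)
    for thickness, label in zip(vertex_thickness, vertex_labels):
        totals[label] += thickness
        counts[label] += 1
    # One mean thickness value per region of interest.
    return {label: totals[label] / counts[label] for label in totals}


# Invented toy mesh with four vertices in two regions.
print(mean_thickness_by_roi([2.0, 3.0, 4.0, 1.0], ["HG", "HG", "pSTG", "pSTG"]))
```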

Statistical analysis

Statistical analysis and graphical representations of data were performed using R (RStudio Version 1.0.153). For each variable, Shapiro–Wilk tests and Levene’s test were used to check for normality and homogeneity of variances. Univariate one-way ANOVA was used to test differences in cortical thickness between the music and control group. Correlations between music group performance scores and cortical thickness were examined using Pearson’s product-moment correlation. For all statistical analysis, the alpha level was set at p < 0.05.
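For readers who wish to see what the two main tests compute, the sketch below implements the F statistic of a one-way ANOVA and Pearson's product-moment correlation using only the Python standard library. The analyses in this report were run in R; this code is an independent illustration, the function names are our own, and it does not compute p values or the Shapiro–Wilk and Levene checks.

```python
import math


def one_way_anova(*groups):
    """Return (F, df_between, df_within) for two or more groups of values."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)
```

With two groups, as in the comparisons reported here, the one-way ANOVA F statistic equals the square of the equal-variance independent-samples t statistic.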

Results

There were no significant differences between the two groups in terms of gender or cognitive abilities as measured by the Wechsler Abbreviated Scale of Intelligence (WASI-II; Wechsler 1999) (all p > 0.1). There was a significant difference in socioeconomic status (SES), as measured by the Hollingshead Four-Factor Index of Social Position, between the music and control groups (F (1,27) = 245.661, p < 0.0001, ηp2 = 0.9). The average Hollingshead score was 57.79 for the music group and 24.2 for the control group. Cognitive abilities, gender, and SES were not included as factors in subsequent analysis.

We found a difference (albeit not significant) in the average cortical thickness in left Heschl’s gyrus between the musician and non-musician groups (F (1,27) = 4.011, p = 0.0553, ηp2 = 0.129), whereby the cortex in this region was thicker for the music group. The group difference was not significant in right Heschl’s gyrus (F (1,27) = 0.107, p = 0.746, ηp2 = 0.004), left anterior superior temporal gyrus (F (1,27) = 0.177, p = 0.677, ηp2 = 0.007), or right anterior superior temporal gyrus (F (1,27) = 2.798, p = 0.106, ηp2 = 0.094). A significant difference in the same direction (musicians > non-musicians) was observed in the right posterior superior temporal gyrus (F (1,27) = 4.516, p = 0.0428, ηp2 = 0.143), but not in the left posterior superior temporal gyrus (F (1,27) = 0.635, p = 0.432, ηp2 = 0.023). There were no significant differences in the cortical thickness of any segment of the cingulate cortex in the left or the right hemisphere between the two groups (all p > 0.2; see Table 2). A positive correlation with music performance scores, albeit not significant at alpha = 0.05, was observed for cortical thickness in the right posterior superior temporal gyrus (r = 0.44, p = 0.12) and right Heschl’s gyrus (r = 0.46, p = 0.099); see Fig. 2. No significant correlation was observed in the other regions of interest.
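The effect sizes reported above can be related to the F statistics through the standard identity ηp2 = (F × df_effect) / (F × df_effect + df_error). The short sketch below applies this identity to two of the reported values as a consistency check; the function name is our own and the code is illustrative rather than part of the original analysis.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Right posterior superior temporal gyrus: F (1,27) = 4.516
print(round(partial_eta_squared(4.516, 1, 27), 3))  # 0.143
# Left Heschl's gyrus: F (1,27) = 4.011
print(round(partial_eta_squared(4.011, 1, 27), 3))  # 0.129
```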

Table 2 Mean and SD of cortical thickness (CT) by group for participants in study 1
Fig. 2

Correlation between musical performance and cortical thickness of the right Heschl’s gyrus (left) and right posterior superior temporal gyrus (right) for the music group participants in Study 1

Study 2

Materials and methods

To assess how cortical thickness changes in the auditory cortices over the course of development and in response to music training, we examined two groups of children currently involved in a multi-year longitudinal study at our laboratory (Habibi et al. 2014). Participants in this ongoing longitudinal study were first assessed between the ages of 6 and 7 years and were followed for the 4 subsequent years. They completed a variety of behavioral tests, along with EEG recordings and MR imaging (see Habibi, Damasio, Ilari, Sachs and Damasio 2018). Children who met MRI safety requirements were scanned three times: at induction at age 6–7, at age 9–10, and at age 11–12. Here we include only participants who had successfully completed a scan at age 6–7 and at age 11–12.

Participants

The first group consisted of 12 children (4 girls) who had been participating in a community-based youth orchestra music program (YOLA at HOLA), which provides free music instruction 4–5 days a week to children from underserved areas of Los Angeles. The second group consisted of children from the non-musician group of study 1 who had not been involved in any systematic music training (n = 11, 5 girls). All participants from both groups were from underprivileged communities, as measured by the Hollingshead Four-Factor Index of Social Position (Hollingshead 1975), were mostly of Latino backgrounds, and were being raised in bilingual households. They attended English-speaking schools. The screening and compensation processes were the same as in study 1 and were approved by the University of Southern California Institutional Review Board. All demographics are summarized in Table 3.

Table 3 Demographic (mean and SD) for participants in study 2 at two times of testing

Experimental procedures

Handedness

Handedness was assessed for all participants using the Bruininks-Oseretsky Test of Motor Proficiency (BOT 2-Brief) as described in study 1. In the musician group, there was one left-handed boy; in the non-musician control group, there was one left-handed boy and one left-handed girl.

Assessment of musical performance

Musical performances were assessed in the music group in their fifth year of training. Assessment of performance followed the same method described in study 1.

Imaging

The scan at age 6–7 years was acquired on a Siemens 3 T Trio system equipped with a 12-channel head coil. We obtained an MPRAGE sequence with the following parameters: 1 mm × 1 mm × 1 mm resolution over a 256 mm × 256 mm × 208 mm FOV; TI/TE/TR = 800/3.09/2530 ms; flip angle = 10°; GRAPPA acceleration factor R = 2. Because young children have difficulty remaining still for extended periods of time, we acquired two separate shorter MPRAGE sequences (~3 min) instead of the single longer MPRAGE acquisition we would typically use for adults. These two images were visually assessed for quality and motion artifacts, then registered and averaged. The scan at age 11–12 followed the protocol described above for study 1. MRI scans were individually visually assessed to check for movement artifacts.

Analysis

Cortical analysis

The high-resolution T1 images were analyzed using the BrainSuite software as described in study 1. For the first scan (at age 6–7), all participants were co-registered to the same customized child atlas, using a surface-constrained volumetric registration technique (SVReg, see Joshi et al. 2007, 2012). Each brain was examined after automated labeling, and manual correction of edge mislabeling was applied whenever necessary using BrainSuite’s built-in “curve tool”. The processing sequence for the scan at age 11–12 was similar, with the exception that instead of using a customized common child atlas for registration purposes, we used each individual’s manually label-corrected first scan as an atlas for co-registration. This method allowed improved registration of surfaces and labels for the individual participant. Cortical thickness was calculated for the same six ROIs, (1) Heschl's gyrus, (2) anterior superior temporal gyrus, (3) posterior superior temporal gyrus, (4) anterior cingulate, (5) anterior dorsal cingulate, and (6) posterior dorsal cingulate, in the same way as described above for study 1. Change in cortical thickness in each ROI was calculated for each participant by subtracting the average cortical thickness at age 11–12 from the average cortical thickness at age 6–7 (ΔCT = CT1 − CT3), so that positive values indicate cortical thinning.
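The change score defined above is a per-participant, per-ROI subtraction of the later thickness from the earlier one. A minimal sketch, assuming thickness values are held in dictionaries keyed by ROI name (the function name, ROI keys and numbers below are invented for illustration):

```python
def thickness_change(ct_scan1, ct_scan3):
    """Per-ROI change CT1 - CT3 in mm; positive values indicate thinning."""
    if ct_scan1.keys() != ct_scan3.keys():
        raise ValueError("both scans must provide the same ROIs")
    return {roi: ct_scan1[roi] - ct_scan3[roi] for roi in ct_scan1}


# Invented thickness values (mm) for one hypothetical participant.
scan_age_6_7 = {"R_pSTG": 3.5, "L_pSTG": 3.25}
scan_age_11_12 = {"R_pSTG": 3.25, "L_pSTG": 2.75}
print(thickness_change(scan_age_6_7, scan_age_11_12))
```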

Statistical analysis

All statistical analyses were performed as described for study 1.

Results

There were no significant differences between the two groups in gender, SES or cognitive abilities as measured by the Wechsler Abbreviated Scale of Intelligence (WASI-II; Wechsler 1999) (all p > 0.1). There was a significant difference in age between the two groups at the time of scan 1 (F (1,21) = 16.82, p = 0.0005, ηp2 = 0.445): children in the music group were on average 7 months younger than the children in the control group. At the time of the last scan, children in the music group were still on average 7 months younger than the children in the control group, and the main effect of age was significant (F (1,21) = 10.52, p = 0.0039, ηp2 = 0.334). Cognitive abilities, SES, gender, age, and time interval between the two scans were not included as factors in subsequent analysis.

There were no significant group differences in cortical thinning in the left (F (1, 21) = 0.19883, p = 0.66, ηp2 = 0.009) or the right Heschl’s gyrus (F (1, 21) = 1.62, p = 0.21, ηp2 = 0.075); similarly, there were no significant differences in cortical thinning in the left (F (1, 21) = 1.61, p = 0.21, ηp2 = 0.071) or the right (F (1, 21) = 0.19, p = 0.66, ηp2 = 0.009) anterior portion of the superior temporal gyrus. In the posterior superior temporal gyrus, there was no significant group difference in cortical thinning on the left side (F (1, 21) = 0.06741, p = 0.79, ηp2 = 0.003); however, there was a strong trend towards a significant difference on the right (F (1, 21) = 3.3252, p = 0.08, ηp2 = 0.137; non-musicians > musicians), see Fig. 3. There were no significant differences between the music and control groups in cortical thinning from scan 1 to scan 3 in any segment of the cingulate cortex (all p > 0.2). Mean and standard deviation of the average cortical thinning for each region of interest are reported in Table 4. There was no significant correlation between music performance score and change in cortical thickness in any of the regions of interest.

Fig. 3

a Mean cortical thickness of the right posterior superior temporal gyrus in the music and control group in Study 1. b Mean cortical thickness change (Time 1–Time 2) of the right posterior superior temporal gyrus in the music and control group in Study 2

Table 4 Mean and SD of change in cortical thickness (CT) by group for participants in study 2

Discussion

We investigated the impact of music training on the development of the auditory cortex using measurements of cortical thickness in two studies. In study 1, with a cross-sectional design, we examined the effects of childhood music training on cortical thickness in bilateral Heschl’s gyrus, the anterior superior temporal gyrus and the posterior superior temporal gyrus, using a group of children, ages 10–11, who had several years of music training and a non-musician group of children matched for age, sex and cognitive abilities. In study 2, with a longitudinal design, we assessed development-related changes in cortical thickness of auditory cortices and the effect of childhood music training on cortical maturation of these regions in a group of children assessed first at ages 6–7 (prior to their music training) and again 4 years later at ages 10–11, after 4 years of formal musical instruction. These children were compared to a group of children without involvement in any formal music training program, also evaluated at a 4-year interval.

We report two main findings. (1) In study 1, we observed thicker cortex in the right posterior superior temporal gyrus and, at trend level (p = 0.0553), in the left Heschl’s gyrus in the children who had music training compared to those who did not. Additionally, in the music group, music proficiency (as measured by performance skills) was correlated, albeit not significantly at the standard alpha level of 0.05, with cortical thickness in the right posterior superior temporal gyrus and the right Heschl’s gyrus. (2) In study 2, although all children showed some degree of cortical thickness reduction in all ROIs, as expected in healthy development, we observed a strong trend (p = 0.08) towards smaller reduction of cortical thickness specifically in the right posterior segment of the superior temporal gyrus in children who had music training. The results from both studies suggest that, among the regions within the primary and secondary auditory cortices, the maturation of the posterior superior temporal gyrus, particularly on the right, is influenced by music training during childhood. We discuss these results below in light of findings reported previously in the relevant literature.

Musicians versus non-musicians—brain auditory regions

Our findings are in line with previous reports that intensive and long-term music training can induce systematic anatomical changes in the auditory-related structures of the brain. In an adult population, Bermudez and colleagues, using measures of both cortical thickness and gray matter density, compared musicians and non-musicians and reported greater cortical thickness and denser gray matter in musicians in auditory association areas. The difference, although observed bilaterally, was more pronounced in the right hemisphere, including parts of Heschl’s gyrus, portions lying anterior to Heschl’s gyrus, and the posterior sector of the superior temporal gyrus corresponding to the planum temporale (Bermudez et al. 2009), the same location we have identified in our observations in children. Other researchers working mainly with adult populations have reported increases of gray matter volume and density in auditory cortex associated with music training (e.g., Gaser and Schlaug 2003; Hyde et al. 2009a, b; Schlaug et al. 1995; Schneider et al. 2002). Schneider et al. (2002) observed a larger gray matter volume in the anteromedial portion of Heschl’s gyri in musicians when compared to non-musicians and showed that this volume increase was positively correlated with musical proficiency. Using voxel-based morphometry (VBM), Gaser and Schlaug reported a positive correlation between musical skills and gray matter density of the left Heschl’s gyrus when comparing three groups of adults: professional musicians, amateur musicians and non-musicians (Gaser and Schlaug 2003).
Further, Foster and Zatorre compared adults with varying amounts of formal music training and showed that cortical thickness and gray matter density of the right auditory cortex can predict participants’ performance on a melodic transposition task (Foster and Zatorre 2010), suggesting that learning one of the fundamental skills of musical perception (i.e., relative pitch) is related to changes in the anatomy of the auditory cortex. The above-mentioned studies have made clear that music training can lead to changes in the anatomy of the auditory cortex. They generally implicate areas along the superior temporal gyrus (containing Brodmann areas 22, 41 and 42), predominantly in the right hemisphere, as being influenced by music training.

The role of the right auditory cortex in music processing, in particular when it comes to pitch and melody, has been shown in lesion studies as early as 1962 (Milner 1962). Milner, who studied patients after temporal lobectomies, showed that these patients experienced greater difficulty with auditory abilities that involved timbre and tonal discrimination. Her original contribution was supported by a number of subsequent studies showing that patients with lesions in the superior temporal gyrus experience specific impairments in perceptual processing of melodies (Liégeois-Chauvel, Peretz, Babaï, Laguitton and Chauvel 1998; Zatorre 1985) and that the degree of impairment is more severe after damage to the right hemisphere. Functional imaging studies have provided further evidence of engagement of right auditory regions in tasks addressing changes in tonal structure (Zatorre and Belin 2001), melodic patterns (Patterson, Uppenkamp, Johnsrude and Griffiths 2002), and tonal modulations (Schonwiesner, Rübsamen and Von Cramon 2005). Our anatomical findings are consistent with the location and lateralization of these previously reported pitch processing areas. Therefore, it is reasonable to interpret the anatomical changes we found in the posterior segment of the right superior temporal gyrus as likely being related to music training.

Development of brain auditory regions and influence of music training

Longitudinal investigations in which participants undergo training and are assessed before and after such training can help address the issue of causality more directly. Hyde and colleagues, in the first longitudinal study pertaining to structural changes related to the learning of a musical instrument, demonstrated that 6-year-old children receiving instrumental musical training for 15 months showed increased gray matter density in the right primary auditory cortex (Heschl’s gyrus), while age-matched children without musical training did not (Hyde et al. 2009a, b). Importantly, they also showed that the structural change was predictive of performance (i.e., changes in the morphology of the auditory regions correlated with better performance on music tasks). Our previously reported results from a multi-year longitudinal study are in agreement; they showed an asymmetric reduction of cortical thickness and cortical volume of the posterior segment of the superior temporal gyrus (less on the right than on the left) in children undergoing music training for 2 years when compared to two age-matched groups of children without music training (Habibi et al. 2017).

The findings reported here for study 2 confirm and extend the results we published previously. Over a four-year period, children engaged in a music program showed a slower rate of cortical thinning in the right posterior superior temporal gyrus than an age-matched group of children who did not participate in any music training. In contrast to the results of Hyde et al. (2009a, b), the music training-related structural differences we observed in both studies are outside of Heschl’s gyrus, in the right posterior segment of the superior temporal gyrus, an area that has been shown to have an important role in auditory feature extraction and the processing of complex sounds, including music (Koelsch 2011; Koelsch, Fritz, Schulze, Alsop and Schlaug 2005). The difference between the two sets of results may be related to differences in methods of reconstruction and analysis. Whereas the analysis in the current report was performed using surface-based cortical reconstruction and examined 12 ROIs (six on each side, three in the auditory regions and three in the cingulate), Hyde and colleagues used a VBM approach for whole-brain analysis. The surface-based cortical reconstruction method identifies the borders between tissue types (gray matter, white matter and CSF) and allows for examination of cortical thickness, a measure that more closely reflects cytoarchitectural properties than gray matter density does. Also, unlike in VBM, each ROI measure was obtained in the actual brain space of each individual. Of the regions examined, only the right posterior segment of the superior temporal gyrus (pSTG) showed a different rate of cortical thinning at the end of 2 and 4 years of music training.
This asymmetry favoring the right hemisphere in our results is well supported by the previous findings, mentioned earlier, of a rightward asymmetry of auditory association areas engaged in the processing of music (Bermudez et al. 2009; Burke 2010; Hyde et al. 2008).

The typical course of cortical maturation is characterized by an overall decline in cortical thickness (Sowell et al. 2004), which is greatest in childhood and follows a linear and gradual slope during adolescence and early adulthood (Mills and Tamnes 2014). However, this maturation-related thinning of cortex is not uniform across all brain regions (Mills and Tamnes 2014). There is a growing body of evidence demonstrating that the brain also undergoes changes related to experience and learning during that same period. For example, during late childhood (ages 5–11), when children are mastering fine language skills, areas in the temporal lobe associated with processing linguistic information have been shown to thicken (Sowell et al. 2004). Participants in study 2 were first scanned between ages 6 and 7 and again between ages 10 and 11. We observed an overall decrease in cortical thickness in all ROIs, in all children, consistent with the expected course of healthy maturation. However, for the participants with music training, we observed less thinning in the right pSTG. We believe the reduction in the rate of cortical thinning in the right auditory association area is related to the interaction of the normal course of cortical thinning with the effects of auditory stimulation induced by early music training in auditory association areas. Such an interaction would influence the change in cortical thickness in the direction opposite to the natural maturational thinning process.

We note that the participants in both studies were string players who were learning violin, viola, or cello. String instruments inherently place a strong emphasis on tonal learning and melodic skills, which in turn can engage right auditory association areas more specifically. The differences observed may therefore also be attributed, in part, to the specific type of instrument used during training.

We also found a positive, but non-significant, correlation between music proficiency and cortical thickness in this region in participants of study 1. This finding is in agreement with observations by Foster and Zatorre (2010), who showed that the density and thickness of the cortex in right Heschl’s gyrus and bilateral supramarginal gyrus can predict musical abilities on a pitch discrimination task. However, we must note that cross-sectional comparisons are weak predictors of causality and that factors beyond training, such as pre-existing differences and biological predisposition, might have contributed to the differences seen at a later age. We did not observe a correlation between music proficiency and change in cortical thickness in musicians in our longitudinal cohort (study 2). There are a few possible explanations for this discrepancy, including the fact that children in the longitudinal cohort were learning music in a group setting, whereas children in study 1 were taking private, one-on-one lessons, even if most of them also took part in musical ensembles. It was noticeable that children in study 1 played a more advanced repertoire and obtained significantly higher scores (p < 0.05) than the children in the longitudinal cohort (study 2). Therefore, it is also possible that differences in musical skills might be related to the intensity of training.

The findings from the two studies also suggest that the thicker cortex of the right posterior superior temporal gyrus observed in study 1 (in the children engaged in intense musical activity) may be related to early exposure to music training and its influence in slowing the process of cortical thinning in this region. Following the same rationale, it can be hypothesized that the thicker cortex previously reported by others in adult musicians in the same area, posterior and lateral to Heschl’s gyrus, particularly in the right hemisphere (Bermudez et al. 2009), may be related to the influence of music training: first, by slowing the rate of cortical thinning in this region; and second, by active thickening later, during adulthood. However, only continuous observation and evaluation of individuals involved in music training from childhood to adulthood will be able to clarify whether the slower rate of initial cortical thinning observed with early music training leads to the anatomical difference observed in adult musicians compared to non-musicians, or whether, on the other hand, these adult differences are related to other mechanisms of experience-based neuroplastic change in adulthood.

Limitations of the current study

We note that assignment to music training in either study was not random; thus, preexisting factors, home environment, and parental motivations may have played a role. Children who choose or are encouraged to take music lessons may have parents who motivate them to enroll in music programs early on, offering encouragement and support throughout (see Ilari 2018). Therefore, even in our longitudinal study, we cannot rule out the possibility that a different home environment with respect to music, or even biological predisposing factors, might be in part responsible for the results reported here.

Another limitation is that in study 1 children in the music group came from more affluent social and economic backgrounds. As noted, there were significant differences in socioeconomic status (SES) between the musician and non-musician groups. It has been reported (Farah et al. 2006; Noble et al. 2012) that brain development can be affected by socioeconomic influences. To address this, in addition to auditory brain regions, which were the main interest of the current study, we investigated cortical thickness differences between the groups in the cingulate cortex, a region that has been described as affected by socioeconomic conditions. Despite differences in SES, we did not observe any differences between the groups in these non-auditory regions; in fact, the only differences observed were in the posterior superior temporal gyrus and Heschl’s gyrus, regions that are primarily affected by auditory (music) training. We therefore conclude that SES disparity, in this case, was probably not contributing to the differences observed in the auditory regions between the two groups.

Finally, we note the relatively small number of participants. This is the result of significant logistical challenges in developmental neuroimaging studies, specifically problems related to retention of participants, which is particularly important in a longitudinal approach (study 2) that entailed following the same children over 4 years. We also note that, individually, these results do not survive multiple comparison corrections (i.e., Bonferroni), possibly because of the limited sample size. However, the consistency of our findings across studies 1 and 2, which included different groups of participants subject to different methods of music training and both pointed to morphological changes in the right posterior segment of the superior temporal gyrus, may mitigate the fact that the differences did not survive Bonferroni correction. The fact that we also observed a positive, albeit non-significant, relationship between cortical thickness in this region and musical proficiency seems to strengthen our conclusions.
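
To make the effect of the Bonferroni correction concrete, the short sketch below computes the per-test threshold obtained by dividing the alpha level by the number of comparisons. It assumes, purely for illustration, that the family of comparisons consists of the 12 ROIs examined; the text does not state the exact number of tests that were corrected for, and the function names are hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def survives_bonferroni(p_value: float, alpha: float, n_tests: int) -> bool:
    """True if an uncorrected p-value stays below the corrected threshold."""
    return p_value < bonferroni_threshold(alpha, n_tests)


ALPHA = 0.05   # standard alpha level, as used in the text
N_TESTS = 12   # assumption: one test per ROI (12 ROIs were examined)

threshold = bonferroni_threshold(ALPHA, N_TESTS)
print(f"Corrected per-test threshold: {threshold:.4f}")

# Hypothetical uncorrected p-values, chosen only to illustrate the rule;
# they are not results from either study.
for p in (0.03, 0.001):
    print(p, survives_bonferroni(p, ALPHA, N_TESTS))
```

Under this assumption the corrected threshold is roughly 0.004, so an effect that is nominally significant at 0.05 can fail the corrected test, which is the situation described above for small samples.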

In conclusion, we believe that the data presented here, in conjunction with that of previous studies, support the notion that music learning may enhance regional brain maturation during childhood. Specifically, we provide evidence that music training, during childhood, leads to enhancement of specialized skills, such as pitch processing, that may be related to the cortical development of the posterior superior temporal gyrus. Our findings are also consistent with previous evidence showing that music training is associated with right hemispheric lateralization.