Introduction

Spinocerebellar ataxia type 3 (SCA3) is the most common autosomal dominant ataxia worldwide. SCA3 is a polyglutamine disease caused by an abnormal CAG repeat expansion within the coding region of ATXN3 (Hersheson et al. 2012; Klockgether et al. 2019). SCA3 is characterized by degeneration of the cerebellum and its connections. The predominant symptoms in SCA3 are cerebellar ataxia with progressive external ophthalmoplegia, dysarthria, dysphagia, dystonia, rigidity, pyramidal signs, and peripheral neuropathy (Guimarães et al. 2013; Rezende et al. 2018).

To evaluate disease severity and clinical outcome, several well-validated scales, such as the Scale for the Assessment and Rating of Ataxia (SARA) (Schmitz-Hübsch et al. 2006) and the International Cooperative Ataxia Rating Scale (ICARS) (Trouillas et al. 1997), have been developed. As semiquantitative assessments of cerebellar symptoms at the impairment level, these scales are now widely used in clinical trials (Perez-Lloret et al. 2021). Moreover, previous neuroimaging studies showed that damage to infratentorial structures is significantly associated with SARA scores, ICARS scores, and disease duration (D’Abreu et al. 2012; Fahl et al. 2015; Jacobi et al. 2012; Kang et al. 2014; Schulz et al. 2010).

Although previous voxel-based morphometry (VBM) studies have revealed a correlation between atrophy of the cerebellar cortex and ataxia severity, these analyses were typically based on a whole-brain template (D’Abreu et al. 2012; Guimarães et al. 2013; Schulz et al. 2010). Because of the limited contrast of cerebellar structures, whole-brain VBM analysis is particularly affected by inter-subject misregistration in the infratentorial space. To circumvent this issue, a high-resolution atlas template of the human cerebellum and brainstem (SUIT, spatially unbiased infratentorial template) has been developed, which preserves more anatomical detail of the cerebellum and has less spatial variance across individuals (Diedrichsen 2006). Recent studies have shown that SUIT is more sensitive than the whole-brain template in identifying cerebellar morphological changes (Hirjak et al. 2015; Lindig et al. 2019; Wolf et al. 2015). In addition, conventional VBM analysis is usually performed with a mass-univariate approach, in which each voxel is tested individually to detect statistically abnormal brain regions. This approach ignores covarying and spatially distributed effects across voxels. Moreover, the univariate approach describes differences at the group level and cannot make predictions at the individual level, which is more desirable in clinical practice.

Recently, machine learning techniques, such as multi-voxel pattern analysis (MVPA), have been increasingly used to discover potential biomarkers within neuroimaging data (Mateos-Pérez et al. 2018). As a multivariate approach, MVPA provides a framework for investigating the relationship between spatially distributed patterns in brain images and clinical measurements, which can subsequently be used to make predictions for individual subjects (Weaverdyck et al. 2020).

In the current study, we investigated the associations between alterations of cerebellar gray matter (GM) and clinical measurements (disease duration, SARA scores, and ICARS scores) in SCA3 patients by using the MVPA approach. We hypothesized that multivariate analysis would be sensitive in identifying associations between cerebellar GM atrophy and clinical measurements, and that patterns of cerebellar GM atrophy can predict an individual’s clinical measurements in SCA3 patients.

Methods

Subjects

A group of 66 genetically confirmed SCA3 patients and 58 age- and gender-matched healthy individuals were recruited between December 2018 and March 2020. Demographic and clinical data of the study population are given in Table 1. All participants were right-handed. None of them had a history of alcohol abuse, previous neurologic disorders, or contraindications for MRI examination. The local ethical committee approved the study protocol. All participants signed informed consent before the study procedure.

Table 1 Demographics and clinical characteristics of SCA3 patients and controls

Clinical assessment

The severity of cerebellar ataxia was assessed with the SARA and ICARS scales by an experienced neurologist within 3 days before the MRI scan. Disease duration was defined as the time from the onset of motor symptoms to the MRI examination.

Image acquisition

All participants underwent an MRI examination using a 3-Tesla Siemens Skyra scanner with a 20-channel head-neck coil. High-resolution anatomical scans were acquired using a T1-weighted 3D magnetization prepared rapid gradient-echo (MPRAGE) sequence with the following parameters: repetition time = 2300 ms, echo time = 2.3 ms, inversion time = 900 ms, flip angle = 8°, field of view = 256 × 256 mm², matrix = 256 × 256, bandwidth = 200 Hz/Px, 192 slices per slab, voxel size = 1.0 mm × 1.0 mm × 1.0 mm, acquisition time = 5.18 min.

Image Processing

Structural MRI data analysis was undertaken using Statistical Parametric Mapping software (SPM12, http://www.fil.ion.ucl.ac.uk/spm) running on MATLAB R2017b. Cerebellar data processing was performed using the SUIT toolbox (v3.2) (http://www.diedrichsenlab.org/imaging/suit.htm). Before preprocessing, all subjects’ images were visually checked to ensure acceptable image quality and were manually reoriented to the anterior commissure of each subject in order to minimize errors during image processing. The processing steps were as follows. Firstly, the cerebellum and brainstem structures were isolated from the surrounding tissues, and tissue probability maps of cerebellar gray matter and white matter were obtained. The masks of cerebellar gray matter, cerebellar white matter and cerebellum were created with an absolute threshold of 0.2. Secondly, the cerebellar GM segmentation maps were normalized and resliced to the SUIT template using the DARTEL algorithm. Finally, the probability maps of cerebellar GM were smoothed with a 4-mm FWHM isotropic Gaussian filter in SPM12.
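The two numerical settings in this pipeline, the 0.2 mask threshold and the 4-mm FWHM kernel, can be illustrated with a short sketch. It is not the SPM12/SUIT implementation: the function names are hypothetical, the strict ">" comparison at the threshold and the 1-mm voxel size used for the conversion are assumptions, and only the standard relation FWHM = 2·sqrt(2·ln 2)·sigma is taken as given.

```python
import math


def fwhm_to_sigma(fwhm_mm, voxel_size_mm):
    """Convert a Gaussian FWHM in mm to a standard deviation in voxel units."""
    sigma_mm = fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return sigma_mm / voxel_size_mm


def threshold_mask(prob_values, threshold=0.2):
    """Binarize tissue probabilities: 1 where the probability exceeds the threshold.

    Whether the boundary value itself is included is an assumption here.
    """
    return [1 if p > threshold else 0 for p in prob_values]


# Assumed 1-mm isotropic voxels, for illustration only.
sigma_vox = fwhm_to_sigma(4.0, 1.0)

# Toy probabilities for four voxels (hypothetical values).
mask = threshold_mask([0.05, 0.2, 0.35, 0.9])
```

Under these assumptions a 4-mm FWHM kernel corresponds to a standard deviation of roughly 1.7 voxels.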

T1-weighted images were also processed using SPM12/DARTEL to obtain total intracranial volume (TIV), which was calculated for each subject by adding the volume of gray matter, white matter, and cerebrospinal fluid. These TIV values were then used as covariates to account for individual differences in whole-brain volume in the following analysis.
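As a rough illustration of the TIV calculation, the sketch below integrates each tissue probability map into a volume and sums the three compartments. This is one common way to derive tissue volumes from probability maps and is an assumption rather than a description of the exact SPM12 routine; the function names and the four-voxel toy maps are hypothetical.

```python
def tissue_volume_ml(prob_map, voxel_volume_mm3):
    """Integrate a tissue probability map into a volume in millilitres."""
    return sum(prob_map) * voxel_volume_mm3 / 1000.0


def total_intracranial_volume(gm_ml, wm_ml, csf_ml):
    """TIV as the sum of gray matter, white matter and CSF volumes."""
    return gm_ml + wm_ml + csf_ml


# Toy probability maps for four voxels (hypothetical values).
gm_map = [0.9, 0.8, 0.1, 0.0]
wm_map = [0.1, 0.2, 0.8, 0.1]
csf_map = [0.0, 0.0, 0.1, 0.9]
voxel_volume = 1.0 * 1.0 * 1.0  # mm^3, matching the 1-mm isotropic acquisition

gm_ml = tissue_volume_ml(gm_map, voxel_volume)
wm_ml = tissue_volume_ml(wm_map, voxel_volume)
csf_ml = tissue_volume_ml(csf_map, voxel_volume)
tiv_ml = total_intracranial_volume(gm_ml, wm_ml, csf_ml)
```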

Anatomical localizations were identified by the probabilistic MRI atlas of the human cerebellum (Diedrichsen and Zotow 2015; Diedrichsen et al. 2009).

Univariate SPM analysis

A voxel-wise two-sample t-test based on the smoothed and normalized cerebellar GM images was conducted to identify cerebellar GM differences between SCA3 patients and healthy controls. The sex, age, and TIV of each participant were entered as covariates of no interest. Results were considered significant at p-values < 0.05 after FWE correction at voxel-level with a minimum cluster extent of 499 voxels (corresponding to p-values < 0.05, FWE corrected at cluster-level).

For comparison with the MVPA results, we also investigated the relationship between cerebellar GM and clinical measurements (SARA scores, ICARS scores, and disease duration) in our SCA3 group by using a univariate VBM analysis. A multiple regression model was fitted to look for regions with either a linear increase or decrease in cerebellar GM associated with each clinical measurement. Sex, age, and TIV were entered as covariates of no interest. Statistical inferences were made at p-values < 0.001 uncorrected at voxel-level with a minimum cluster extent of 20 voxels. The results were also examined at a more conservative statistical threshold (p-values < 0.05 after FWE voxel-level correction with a minimum cluster extent of 20 voxels).

Multi-voxel pattern analysis

Regression-based MVPA was performed using the Pattern Recognition for Neuroimaging Toolbox (PRoNTo v2.1) (http://www.mlnl.cs.ucl.ac.uk/pronto/). The smoothed and normalized cerebellar GM images were entered as input features, and each clinical measurement was entered in turn as the regression target. We tested three pattern regression models implemented in PRoNTo: Relevance Vector Regression (RVR), Gaussian Process Regression (GPR) and Kernel Ridge Regression (KRR). The results were similar across these regression models, but RVR showed slightly better results. Therefore, we only present the results of RVR for the sake of brevity (the results of the other models are listed in the Supplementary Material). RVR is a sparse kernel-based pattern recognition method based on a probabilistic Bayesian framework. The model weights are initially assigned a Gaussian prior with mean zero and are then iteratively optimized through the training process. The optimized posterior distribution over the model weights can then be used to predict the target value for a previously unseen input vector by computing the predictive distribution (Tipping 2001). Although age, gender and TIV were considered potential confounders affecting the patterns of cerebellar gray matter, removing these confounds is likely to remove not only the variability related to the confounds but also variability associated with the labels in the data (Portugal et al. 2019). We therefore performed the main analysis without removing confounds. We also repeated the analysis with age, gender and TIV as confounds (covariates) and report those results in the Supplementary Material.

The model performance was evaluated with two different cross-validation strategies (leave-one-subject-out cross-validation and tenfold cross-validation), with sample normalization and mean-centering of features computed on the training data. In leave-one-subject-out cross-validation, one sample is left out as the testing data and the other samples are used to train the model; this is repeated N times so that each sample is left out once. In tenfold cross-validation, the original sample is randomly divided into 10 subgroups. Each of the 10 subgroups is used in turn as the testing data, with the remaining subgroups used as training data, until all subgroups have served as testing data. The Pearson correlation coefficient and mean squared error (MSE) between the actual and predicted values across all subjects were computed to quantify the prediction accuracy. The significance of both the correlation coefficient and the MSE was estimated using a permutation test with 1000 permutations. A permuted p-value < 0.05 was considered significant.
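The evaluation scheme described above can be sketched in plain Python. This is a minimal illustration, not the PRoNTo implementation: a one-feature least-squares line stands in for the RVR model, the data are synthetic, and the function names and the (count + 1)/(n_perm + 1) p-value convention are assumptions. Only 200 permutations are used to keep the example fast.

```python
import random


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def mse(actual, predicted):
    """Mean squared error between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def kfold_indices(n, k, rng):
    """Randomly partition indices 0..n-1 into k test folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_line(x, y):
    """Least-squares slope and intercept (a stand-in for the RVR model)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return slope, my - slope * mx


def cross_val_predict(x, y, folds):
    """Predict every sample from a model trained on the other folds only."""
    preds = [None] * len(y)
    for test in folds:
        held_out = set(test)
        train = [i for i in range(len(y)) if i not in held_out]
        slope, intercept = fit_line([x[i] for i in train], [y[i] for i in train])
        for i in test:
            preds[i] = slope * x[i] + intercept
    return preds


def permutation_p(x, y, folds, n_perm, rng):
    """Observed cross-validated correlation and its permutation p-value."""
    observed = pearson_r(y, cross_val_predict(x, y, folds))
    count = 0
    for _ in range(n_perm):
        y_perm = y[:]
        rng.shuffle(y_perm)  # break the link between features and targets
        if pearson_r(y_perm, cross_val_predict(x, y_perm, folds)) >= observed:
            count += 1
    return observed, (count + 1) / (n_perm + 1)


# Synthetic data: one GM-like feature and an ataxia-like score (hypothetical).
rng = random.Random(0)
n = 30
x = [rng.uniform(0.3, 0.8) for _ in range(n)]
y = [30.0 - 20.0 * xi + rng.gauss(0.0, 1.0) for xi in x]

loo_folds = [[i] for i in range(n)]      # leave-one-subject-out
ten_folds = kfold_indices(n, 10, rng)    # tenfold

r_loo, p_loo = permutation_p(x, y, loo_folds, 200, rng)
pred_10 = cross_val_predict(x, y, ten_folds)
r_10, mse_10 = pearson_r(y, pred_10), mse(y, pred_10)
```

The key design point is that each prediction comes from a model that never saw the held-out subject, and that the permutation test reruns the whole cross-validation on shuffled targets rather than reusing the fitted model.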

In addition to the main aim of the current study, classification-based MVPA was also performed to investigate whether healthy individuals and SCA3 patients can be distinguished based on the alteration pattern of their cerebellar GM. Two pattern classification models, binary support vector machines (SVM) and Gaussian process classification (GPC), were applied with the same cross-validation schemes and the same sample (124 subjects: n = 66 SCA3 patients and n = 58 healthy individuals). The results were similar for the SVM model and the GPC model. Therefore, we only present the results of the GPC model without removing confounds and report the results of the SVM classification model in the Supplementary Material. The classification performance was evaluated using balanced accuracy and accuracies per class. A permutation test with 1000 permutations was used to determine the significance of the classification performance measures.
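The two classification measures can be made concrete with a small sketch. Balanced accuracy is taken here as the unweighted mean of the per-class accuracies; the function names and the ten toy labels are hypothetical and are not data from the study.

```python
def class_accuracies(true_labels, predicted_labels):
    """Fraction of correctly classified samples within each true class."""
    accuracies = {}
    for label in sorted(set(true_labels)):
        idx = [i for i, t in enumerate(true_labels) if t == label]
        correct = sum(1 for i in idx if predicted_labels[i] == label)
        accuracies[label] = correct / len(idx)
    return accuracies


def balanced_accuracy(true_labels, predicted_labels):
    """Unweighted mean of the per-class accuracies."""
    accuracies = class_accuracies(true_labels, predicted_labels)
    return sum(accuracies.values()) / len(accuracies)


# Toy example: 6 patients (5 classified correctly), 4 controls (3 correct).
true_labels = ["SCA3"] * 6 + ["HC"] * 4
predicted_labels = ["SCA3"] * 5 + ["HC"] + ["HC"] * 3 + ["SCA3"]

per_class = class_accuracies(true_labels, predicted_labels)
bal_acc = balanced_accuracy(true_labels, predicted_labels)
```

Because each class contributes equally, unequal group sizes (66 patients versus 58 controls in this study) do not bias the measure toward the larger group.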

To visualize the predictive model results, weight maps were built at both the voxel level and the region of interest (ROI) level. At the ROI level, each cerebellar subregion was defined by the SUIT atlas, and the mean of the absolute voxel weights within each cerebellar subregion was computed. Finally, all labeled subregions were ranked according to their normalized weights, that is, their contribution to the pattern recognition model.
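The ROI-level summary can be sketched as follows. The code averages absolute voxel weights within each atlas label, expresses each regional mean as a percentage of the sum over regions, and ranks the regions. The percentage normalization is an assumption about how the normalized weights are defined, and the function name, weights and labels are hypothetical.

```python
def roi_contributions(voxel_weights, voxel_labels):
    """Rank atlas regions by their normalized mean absolute weight (in percent)."""
    sums, counts = {}, {}
    for weight, label in zip(voxel_weights, voxel_labels):
        sums[label] = sums.get(label, 0.0) + abs(weight)
        counts[label] = counts.get(label, 0) + 1
    means = {label: sums[label] / counts[label] for label in sums}
    total = sum(means.values())
    normalized = {label: 100.0 * m / total for label, m in means.items()}
    # Highest contribution first.
    return sorted(normalized.items(), key=lambda item: item[1], reverse=True)


# Toy weight map: six voxels in three regions (hypothetical values).
voxel_weights = [0.4, -0.2, 0.1, -0.1, 0.0, 0.2]
voxel_labels = ["I_IV", "I_IV", "VIIIb", "VIIIb", "CrusI", "CrusI"]

ranked = roi_contributions(voxel_weights, voxel_labels)
top_region, top_percent = ranked[0]
```

Taking absolute values before averaging means that a region's rank reflects how strongly it contributes to the model, regardless of the sign of its weights.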

Results

Participants

All participants’ characteristics are presented in Table 1. There were no differences in age or sex between SCA3 patients and healthy controls.

Univariate SPM analysis

Compared to the control group, SCA3 patients demonstrated extensive GM volume reduction involving almost all cerebellar lobules, except for small lateral portions, far from the midline, of bilateral lobules VIIb, VIIIa, and VIIIb (Fig. 1). There was no evidence of increased cerebellar GM volume in the SCA3 group.

Fig. 1 Comparison between SCA3 patients and the control group. Anatomical locations of the cerebellar lobules on the cerebellar flatmap for orientation (A). Group differences between SCA3 patients and the control group are displayed on the cerebellar flatmap (B) and SUIT template (C)

The cerebellar GM patterns associated with the SARA and ICARS scores were very similar (Fig. 2). With a less conservative statistical threshold (p-values < 0.001 uncorrected at voxel-level with a minimum cluster extent of 20 voxels), cerebellar subregions showing significant negative associations with SARA scores and ICARS scores were mainly found in bilateral lobules I_IV, VIIIb, VIIIa, VIIb, CrusII, and IX. However, under a conservative statistical threshold (p-values < 0.05 after FWE voxel-level correction with a minimum cluster extent of 20 voxels), only small significant clusters involving bilateral lobules VIIb remained (right: cluster size = 346 voxels, peak MNI coordinates x/y/z = 12/-68/-39, peak T = -6.31, left: cluster size = 49 voxels, peak MNI coordinates x/y/z = -4/-74/-39, peak T = -5.76 for SARA scale; right: cluster size = 308 voxels, peak MNI coordinates x/y/z = 12/-68/-39, peak T = -6.24, left: cluster size = 95 voxels, peak MNI coordinates x/y/z = -5/-74/-40, peak T = -5.73 for ICARS scale). There was no significant positive association between cerebellar GM volume and clinical measurements (SARA scores and ICARS scores). There were also no significant positive or negative associations between cerebellar GM volume and disease duration, even with the less conservative statistical threshold (p-values < 0.001 uncorrected at voxel-level with a minimum cluster extent of 20 voxels).

Fig. 2 The result of univariate regression analysis. Cerebellar regions showing significant negative association with SARA scale (A) and ICARS scale (B) at p-values < 0.001 uncorrected with a minimum cluster extent of 20 voxels. Clusters enclosed by the red circle remained significant at a more conservative statistical threshold (p-values < 0.05, FWE correction with a minimum cluster extent of 20 voxels)

Multi-voxel pattern analysis

Applying the RVR model to cerebellar GM allowed quantitative prediction of SARA scores with statistically significant accuracy (leave-one-subject-out cross-validation: correlation = 0.56, p-value = 0.001; mean squared error = 20.51, p-value = 0.001; tenfold cross-validation: correlation = 0.52, p-value = 0.001; mean squared error = 21.00, p-value = 0.001), and also allowed quantitative prediction of ICARS scores with statistically significant accuracy (leave-one-subject-out cross-validation: correlation = 0.59, p-value = 0.001; mean squared error = 139.69, p-value = 0.001; tenfold cross-validation: correlation = 0.57, p-value = 0.001; mean squared error = 145.371, p-value = 0.001). In contrast, the RVR model did not allow accurate prediction of disease duration (leave-one-subject-out cross-validation: correlation = 0.18, p-value = 0.089; mean squared error = 18.25, p-value = 0.098; tenfold cross-validation: correlation = 0.17, p-value = 0.089; mean squared error = 18.86, p-value = 0.13).

The voxel-based and ROI-based weight maps of patterns contributing to the RVR model predictions are displayed on the cerebellar flatmap (Diedrichsen and Zotow 2015). For the sake of brevity, we display the results of the model based on leave-one-subject-out cross-validation in the main manuscript (the results of the model based on tenfold cross-validation and of the other regression models can be found in the Supplementary Material: Fig. S01-S11, Table S01-S12). The weight distributions of the predictive models for SARA scores and ICARS scores were similar. The top 10 cerebellar subregions contributing most to the RVR model for predicting the clinical scales (SARA scores and ICARS scores) are displayed in Fig. 3 and Table 2. Together, these cerebellar subregions accounted for nearly 45% of the regression functions’ total weights. Generally, the cerebellar lobules with the highest contributions to the predictions were bilateral lobules I_IV, bilateral lobules VIIIb, right lobule VIIIa, right lobule IX, left lobule V, Vermis VIIIa, and Vermis IX for both the SARA scale and the ICARS scale, left lobule VIIb for the SARA scale, and Vermis VIIIb for the ICARS scale.

Fig. 3 The result of multi-voxel pattern analysis. Top panel: Scatter plot showing the actual score vs. the corresponding predicted score for the SARA scale (A) and the ICARS scale (B). Bottom panel: Voxel-based predictive pattern maps for the RVR model predicting SARA scores (C) and ICARS scores (D) based on leave-one-subject-out cross-validation. The color bar indicates the weight of voxels for decoding the clinical scale. ROI-based pattern map based on the top 10 cerebellar subregions contributing most to the RVR model for predicting SARA scores (E) and ICARS scores (F). The color bar indicates the percentage of the total normalized weights of each subregion. SARA: Scale for the Assessment and Rating of Ataxia; ICARS: International Cooperative Ataxia Rating Scale; r: Pearson correlation coefficient; RVR: Relevance Vector Regression

Table 2 The top 10 cerebellar subregions contributing most to the RVR model based on leave-one-subject-out cross-validation for prediction of SARA scores and ICARS scores

Applying the GPC model to cerebellar GM allowed accurate discrimination of SCA3 patients from healthy individuals, with statistically significant accuracy (leave-one-subject-out cross-validation: balanced accuracy = 91.90%, p-value = 0.001, SCA3 accuracy = 92.42%, p-value = 0.001 and healthy accuracy = 91.38%, p-value = 0.001; tenfold cross-validation: balanced accuracy = 90.39%, p-value = 0.001, SCA3 accuracy = 89.39%, p-value = 0.001 and healthy accuracy = 91.38%, p-value = 0.001). The voxel-based and ROI-based weight maps of patterns contributing to these classification model predictions are presented in the Supplementary Material (Fig. S12-S19, Table S13-S17). Generally, the top 10 cerebellar subregions that contributed most to classification were very similar to those that contributed most to the regression-based prediction.

Discussion

The current study used an MVPA approach to investigate associations between the pattern of cerebellar GM loss and ataxia severity in SCA3 patients. Applying the RVR model to cerebellar GM images allowed the prediction of ataxia severity in individual patients. Moreover, the cerebellar subregions contributing most to the predictive model also demonstrated significant associations with these clinical scores in the univariate regression analysis, which in turn supports our multivariate pattern analysis.

The cerebellum is the main structure affected by SCA3. Neuropathological and structural MRI studies have confirmed neuronal loss and structural or functional degeneration in the cerebellar cortex in SCA3 (Lukas et al. 2006; Reetz et al. 2013; Scherzed et al. 2012; Stefanescu et al. 2015). In line with the findings of previous studies, we also detected marked GM loss in the cerebellum compared with healthy controls in the univariate analysis. Moreover, owing to the use of the SUIT template to improve normalization of the cerebellar structures and registration of the infratentorial space, more extensive loss of cerebellar GM volume, involving almost all lobules of the cerebellum, was identified in our study. In the present study, the cerebellar regions with the largest weights for both regression-based and classification-based MVPA were consistent with the results of the current univariate analysis and the previous literature, suggesting a close relationship between cerebellar GM atrophy and the progression of SCA3.

Cerebellar ataxia is the most prominent symptom of SCA3 and is characterized by progressive incoordination of body movement and gait. As the two main scales for assessing the severity of cerebellar ataxia, both the SARA scale and the ICARS scale have previously been validated as practical and efficient in SCA3 patients (Perez-Lloret et al. 2021). Although their assessment items and assessment times differ slightly, the SARA scale and the ICARS scale have been reported to be highly correlated (Yabe et al. 2008). Accordingly, in the present study, similar patterns of altered cerebellar GM were observed for the SARA score and the ICARS score in both the MVPA and the univariate regression analysis.

In the MVPA, we found that the cerebellar subregions that contributed most to the RVR model included lobules I_IV, lobules VIIIa, lobules VIIIb, and lobules VIIb. The damage of cerebellar subregions is highly symptom-specific. Lobules I_IV belong to the cerebellum’s anterior lobe, which is considered a somatotopic representation of the superior cerebellar cortex (Guell & Schmahmann 2020; Lehman et al. 2020). Previous studies based on voxel-based lesion symptom mapping analysis have indicated that lesions located in lobules I_IV were correlated with ataxia of movement, posture, and gait (Drijkoningen et al. 2015; Gellersen et al. 2017). In addition, as an important part of the sensorimotor network, lobules VIIIa/b are thought to contain the secondary somatosensory representation, and lobule VIIb, which belongs to the executive network, is involved in the execution of complex motor tasks (Buckner et al. 2011; Habas et al. 2009; Stoodley & Schmahmann 2009). In some studies, these cerebellar subregions have also been reported to be associated with motor impairments and cerebellar ataxia (Goel et al. 2011; Lukas et al. 2006; Reetz et al. 2013). More interestingly, similar to the result of the MVPA, we found that cerebellar GM loss in these subregions was also negatively associated with the SARA scores and ICARS scores in the univariate regression analysis, which further strengthens the evidence linking cerebellar GM loss with ataxia severity.

In the univariate analytical approach, each voxel in the cerebellar GM images is treated as a spatially independent unit and tested individually against the ataxia score. Although univariate analysis is well suited to detecting robust and localized effects, it is not very sensitive to differences in spatially distributed patterns. In contrast to the univariate approach, MVPA, as a multivariate machine learning method, focuses on whether the spatial pattern of alterations across the brain is correlated with clinical symptoms (Mateos-Pérez et al. 2018; Weaverdyck et al. 2020). In recent neuroimaging studies, MVPA based on the RVR model has also been successfully applied to predict illness severity in patients with psychiatric or neurological disorders (Abela et al. 2019; Tognin et al. 2013). Cerebellar ataxia is a neurological dysfunction of motor coordination that can affect limb movement, balance, and gait, as well as oculomotor control, depending on the cerebellar regions involved (Schoch et al. 2006). Because different cerebellar subregions are involved, and to different degrees, patients with cerebellar ataxia can present with different manifestations. This indicates that the pattern of cerebellar cortex atrophy is important for predicting ataxia severity in SCA3 patients. In addition, it should be noted that most of the significant clusters in the univariate regression analysis were eliminated after correction for multiple comparisons to control Type I error. Taken together, these results suggest that MVPA might be more suitable for detecting subtle and spatially distributed alterations across the cerebellum in SCA3 patients, which could not be identified using a univariate approach.

In this study, we found no association between cerebellar cortex atrophy and disease duration, either in the univariate regression analysis or in the MVPA. One possible explanation for this result is a floor effect (D’Abreu et al. 2012). Because the average disease duration in our SCA3 group was about 8.49 years, atrophy of the cerebellar cortex was usually already so pronounced that minor progressive changes could not be identified. In addition, previous studies have reported inconsistent associations between disease duration and cerebellar cortex atrophy in SCA3 patients (D’Abreu et al. 2012; Goel et al. 2011; Schulz et al. 2010).

The present study has some limitations. Firstly, it is a single-center study with a relatively small sample size. More importantly, although we applied two different cross-validation strategies (leave-one-subject-out cross-validation and tenfold cross-validation) to demonstrate the reliability of the predictive model, ideally, the predictive model should be validated with truly independent samples. Thus, multi-center studies with larger sample sizes are necessary to confirm the robustness of the predictive models. Secondly, previous studies have indicated that ataxia severity is also negatively associated with cerebellar white matter, brainstem, and supratentorial cerebral structures (de Rezende et al. 2015; Kang et al. 2014). In further research, it would be interesting to improve the prediction model by using multi-modal image data (for example, combining T1-weighted and DTI images) covering these brain structures.

Conclusions

In summary, unlike univariate analysis, MVPA focuses on a distributed pattern of alteration across the brain associated with clinical symptoms. Our results suggest that MVPA is a valid approach for predicting ataxia severity at the individual-subject level in SCA3 patients. This result also offers a novel perspective for elucidating cerebellar pathophysiological alterations in SCA3 patients.