Introduction

Parkinson’s disease (PD) is a neurological disorder caused by the degeneration of dopaminergic neurons, leading to clinical features characterized by bradykinesia, rigidity, tremor and postural instability. Atypical parkinsonian syndromes (APS) such as progressive supranuclear palsy (PSP) and multiple system atrophy (MSA) differ from PD by more widespread neuronal involvement, resulting in additional clinical signs, more rapid disease progression and poor response to dopamine replacement therapy [1]. The majority of PSP and MSA patients develop clinical features that overlap those of PD and thus the correct diagnosis can be challenging in early stages of the disease. However, an accurate, early diagnosis is essential not only in assessing prognosis and making decisions regarding treatment, but also for understanding the underlying pathophysiology and for the development of new therapies [2]. Currently, a variety of imaging techniques such as Magnetic Resonance Imaging, Diffusion Tensor Imaging, Positron Emission Tomography, Single-photon Emission Computed Tomography and Transcranial Sonography may be used in the assessment of various parkinsonian syndromes [3]. In particular, automatic image-based classification based on metabolic patterns is highly accurate in distinguishing between PD, PSP and MSA patients at early stages of the disease, with more than 84 % sensitivity and 94 % specificity [4]. However, metabolic imaging is burdened by the invasive application of radiopharmaceuticals, whilst technical demands and financial costs may limit the application of other imaging methods.

Speech assessment is an inexpensive, non-invasive, quick and simple technique that could potentially be used in the evaluation of subjects with initial parkinsonism [5]. Speech disorder is a common clinical manifestation occurring in 70–100 % of patients with PD, PSP and MSA [6–8], and tends to emerge at an early stage [9, 10]. Whilst the majority of PD patients develop a clear form of hypokinetic dysarthria [6], PSP and MSA patients typically develop mixed dysarthria with various combinations of hypokinetic, spastic and ataxic components [7, 8] due to the involvement of the basal ganglia, corticobulbar pathways and the cerebellum. Analyses of motor speech disorders may thus provide important clues to the diagnosis and pathophysiology of the underlying disease. However, perceptual dysarthria assessment may be difficult in early disease stages when speech impairment is often imperceptible [11]. In this respect, acoustic analyses have the unique potential to provide objective, sensitive and quantifiable information for the precise assessment of various deviant speech dimensions [10, 12].

Previous descriptions of speech in PSP and MSA have been mainly limited to the perceptual estimation of dysarthria type [7, 8], where spastic components appear to be more dominant in PSP and hypokinetic components in MSA. Considering individual speech aspects, only the occurrence of stuttering-like behaviour was reported to be specific for PSP [7, 8]. Few studies have provided more accurate objective descriptions of dysarthria in APS [9, 13–15]. In general, these studies have shown that the impairment of specific speech dimensions is more pronounced in APS than in PD [13–15]. Speech velocity, maximum phonation time, intonation variability and articulation precision were reduced and pauses were prolonged in PSP in comparison with PD [13–15], whilst MSA patients manifested voice perturbations and slow and variable alternating motion rates (AMR) [9]. However, little effort has been put into the investigation of complex speech impairment in APS. A direct, objective comparison between individual speech patterns in PSP and MSA patients has never been performed and distinctive speech markers that would be suitable for the differentiation of various forms of parkinsonism remain generally unknown.

Therefore, the specific speech characteristics allowing discrimination between dysarthria in PD, PSP and MSA should first be determined in clinically probable patients, with the future goal of evaluating speech analysis as an instrument for early-stage differential diagnosis. In particular, we quantitatively assessed 16 key speech dimensions using objective acoustic analyses with the following aims:

  1. To characterize the type and severity of dysarthria in PSP and MSA.

  2. To determine specific dysarthric patterns and estimate their reliability in differentiating between PD, PSP and MSA.

  3. To explore the relationship between speech and clinical manifestations to provide greater insight into the pathophysiology of dysarthria in APS.

Methods

Subjects

From 2011 to 2014, 12 consecutive patients with the clinical diagnosis of probable PSP (10 men, 2 women) and 13 patients with the diagnosis of probable MSA (6 men, 7 women) were recruited for the present study. In this series, 9 PSP patients were diagnosed with Richardson’s syndrome (PSP-RS), 2 with PSP-parkinsonism (PSP-P) and 1 with PSP-pure akinesia with gait freezing (PAGF), whereas 10 MSA patients were diagnosed with the parkinsonian type (MSA-P) and 3 with the cerebellar type (MSA-C). Additionally, 15 patients with idiopathic PD (9 men, 6 women) were investigated. The PD patients were selected to match the PSP and MSA groups according to disease duration, which was estimated based on the self-reported occurrence of first motor symptoms. The diagnosis of PSP was established by the NINDS-PSP clinical diagnosis criteria [16], MSA according to consensus diagnostic criteria for MSA [17] and PD based on the UK Parkinson’s Disease Society Brain Bank Criteria [18]. The diagnosis was further confirmed by two neurologists (CB, JK) with experience in movement disorders. At the time of examination, all PD subjects had been on stable dopaminergic medication for at least 4 weeks, consisting of levodopa and different dopamine agonists. In the PSP and MSA groups, medication consisted of various doses of levodopa alone or in combination with different dopamine agonists and/or amantadine. None of the patients received antipsychotic therapy. PSP and MSA patients were further rated by the Natural History and Neuroprotection in Parkinson Plus Syndromes–Parkinson Plus Scale (NNIPPS) [19] whilst PD patients were scored according to the Unified Parkinson’s Disease Rating Scale motor subscore (UPDRS III). Item 18 of the UPDRS III was used for perceptual description of speech severity. Patient characteristics are summarized in Table 1.

Table 1 Clinical characteristics of patients

The healthy control (HC) group consisted of 37 age-matched subjects (21 men, 16 women; mean age 63.1, SD 7.9, range 50–75 years) with no history of neurological or communication disorders. All subjects recruited were Czech native speakers.

Speech recordings

Speech recordings were performed in a quiet room with a low ambient noise level using a head-mounted condenser microphone (Beyerdynamic Opus 55, Heilbronn, Germany) situated approximately 5 cm from the mouth of each subject. Speech signals were sampled at 48 kHz with 16-bit resolution. Each participant was instructed to perform sustained phonation of the vowel /a/ in one breath as long and steadily as possible, fast /pa/-/ta/-/ka/ syllable repetition at least seven times in one breath and a monologue on a given topic for approximately 90 s. All participants performed the sustained phonation and syllable repetition tasks twice with a relatively high test–retest reliability (r = 0.77–0.93, p < 0.001).

Dysarthria assessment

Quantitative acoustic vocal assessment was performed to investigate 16 deviant speech dimensions associated with hypokinetic, spastic or ataxic dysarthria [20, 21], which correspond to previous descriptions of speech and neuropathological findings in patients with PSP and MSA [7, 8]. The deviant speech dimensions investigated were selected considering the possibility of their objective assessment using acoustic analyses. In addition, these speech dimensions were chosen in order to be gender independent [10, 12]; there were no significant differences between male and female healthy participants across all investigated acoustic variables.

We evaluated eight dimensions widely observed in hypokinetic dysarthria of PD, including airflow insufficiency, harsh voice, rapid AMR, inappropriate silences, reduced loudness, monopitch, imprecise vowels and dysfluency. Considering elements of spastic dysarthria, we assessed strained-strangled voice quality, slow AMR and slow rate. To capture components related to ataxic dysarthria, we examined excess pitch fluctuations, vocal tremor, irregular AMR, prolonged phonemes and excess intensity variations. See Table 2 and Supplementary Material Online for comprehensive details on acoustic speech analyses.

Table 2 List of speech dimensions for hypokinetic, spastic and ataxic dysarthria

Statistical analyses

Final values used for statistical analyses were calculated by averaging the data for each participant obtained in two vocal task runs. To assess group differences, each acoustic metric was compared across all three groups (PSP, MSA, PD) using a Kruskal–Wallis test with post hoc Bonferroni adjustment. Effect sizes were measured with Cohen’s d, with d > 0.5 indicating a medium effect and d > 0.8 indicating a large effect. The Spearman coefficient was calculated to determine correlations between speech variables in APS and NNIPPS subscales. The level of significance was set to p < 0.05.
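The group comparison described above can be illustrated with a minimal sketch. It assumes a pooled-standard-deviation form of Cohen's d and a Kruskal–Wallis H statistic without tie correction, neither of which is specified in the text; the function names and the numbers are hypothetical and are not study data.

```python
import math
from statistics import mean, stdev


def cohens_d(a, b):
    """Cohen's d for two samples, using the pooled standard deviation (assumed form)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var)


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def kruskal_wallis_3(g1, g2, g3):
    """H statistic (no tie correction) and p-value for exactly three groups."""
    groups = [g1, g2, g3]
    pooled = [v for g in groups for v in g]
    ranks = average_ranks(pooled)
    n = len(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    # Three groups give a chi-square reference with 2 degrees of freedom,
    # whose survival function is exp(-h / 2).
    return h, math.exp(-h / 2)


def bonferroni(p, n_comparisons):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p * n_comparisons)


# Invented values for illustration only.
psp, msa, pd_group = [7.0, 8.0, 9.0], [4.0, 5.0, 6.0], [1.0, 2.0, 3.0]
h, p = kruskal_wallis_3(psp, msa, pd_group)
print(round(h, 2), round(p, 4))
print(round(cohens_d(psp, msa), 2), bonferroni(p, 3))
```

The closed-form p-value holds only for the three-group case considered here; other group counts would need a general chi-square survival function.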

Estimation of the type and severity of dysarthria across individual patients was inspired by previous research on dysarthria in PSP and MSA [7, 8]. First, as the reference interval, the 5th and 95th percentiles were calculated from the probability distribution of healthy controls for each acoustic measurement. The speech performance of each subject was then compared with the reference interval across all speech dimensions. If a subject’s speech performance fell outside the reference interval, it was considered affected. Weighting factors in percentages were then applied to all affected speech performances in order to enhance the impact of distinctive dimensions according to specific dysarthria type (Table 2) [7, 8, 20, 21]. A total score was obtained reflecting the degree of hypokinetic, spastic and ataxic dysarthria components; possible scores ranged from 0 to 100 % for each type of dysarthria.
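The scoring procedure can be sketched as follows. This is an illustration under stated assumptions: empirical percentiles of the control sample stand in for the probability distribution mentioned above, and the dimension names, weights and values are hypothetical placeholders rather than the weighting factors of Table 2.

```python
from statistics import quantiles


def reference_interval(control_values):
    """Empirical 5th and 95th percentiles of the healthy-control values."""
    cuts = quantiles(control_values, n=20, method="inclusive")
    return cuts[0], cuts[-1]


def is_affected(value, interval):
    """A performance is affected when it falls outside the reference interval."""
    low, high = interval
    return value < low or value > high


def dysarthria_score(subject, controls, weights):
    """Sum of the weights (in %) of all affected dimensions for one dysarthria type."""
    score = 0.0
    for dimension, weight in weights.items():
        if is_affected(subject[dimension], reference_interval(controls[dimension])):
            score += weight
    return score


# Hypothetical weights for one dysarthria type; they sum to 100 %.
weights = {"monopitch": 60.0, "slow_rate": 40.0}
# Hypothetical control measurements and one hypothetical subject.
controls = {"monopitch": list(range(1, 101)), "slow_rate": list(range(1, 101))}
subject = {"monopitch": 2.0, "slow_rate": 50.0}
print(dysarthria_score(subject, controls, weights))
```

Because the weights of one dysarthria type sum to 100 %, the score is 0 % when no dimension is affected and 100 % when all are.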

We additionally introduced a classification experiment to determine the best combination of acoustic features and estimate their sensitivity and specificity in differentiating between PD, PSP and MSA groups. A support vector machine (SVM) with a Gaussian radial basis kernel was applied to every combination of acoustic features. Subsequently, a cross-validation scheme was used to validate the reproducibility of the SVM classifier, where the original data were randomly separated into a training subset composed of 75 % of the data and a testing subset containing 25 % of the data; this cross-validation process was repeated twenty times for each combination. The overall classification performance of the SVM-based model was computed as the average percentage of subjects correctly classified into the appropriate group across all twenty cycles. Comprehensive details on the classification procedure have been published previously [22].
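The validation scheme, twenty random 75/25 splits per feature combination, can be sketched in a few lines. To stay self-contained, the sketch swaps the SVM for a simple nearest-centroid classifier, so it illustrates only the search and validation loop and not the classifier used in the study; all names and data are hypothetical.

```python
import random
from itertools import combinations
from statistics import mean


def nearest_centroid_predict(train_x, train_y, test_x):
    """Stand-in classifier (not an SVM): pick the class with the closest mean vector."""
    centroids = {}
    for label in sorted(set(train_y)):
        rows = [x for x, y in zip(train_x, train_y) if y == label]
        centroids[label] = [mean(column) for column in zip(*rows)]

    def squared_distance(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    return [min(centroids, key=lambda c: squared_distance(x, centroids[c])) for x in test_x]


def repeated_holdout_accuracy(x, y, features, repeats=20, test_fraction=0.25, seed=0):
    """Mean accuracy over `repeats` random train/test splits (75/25 by default)."""
    rng = random.Random(seed)
    n_test = max(1, round(len(x) * test_fraction))
    selected = [[row[f] for f in features] for row in x]
    accuracies = []
    for _ in range(repeats):
        order = list(range(len(x)))
        rng.shuffle(order)
        test, train = order[:n_test], order[n_test:]
        predicted = nearest_centroid_predict(
            [selected[i] for i in train], [y[i] for i in train], [selected[i] for i in test]
        )
        accuracies.append(sum(p == y[i] for p, i in zip(predicted, test)) / n_test)
    return mean(accuracies)


def best_feature_combination(x, y):
    """Try every non-empty feature subset; keep the first one with the highest accuracy."""
    n_features = len(x[0])
    best = (None, -1.0)
    for size in range(1, n_features + 1):
        for features in combinations(range(n_features), size):
            accuracy = repeated_holdout_accuracy(x, y, features)
            if accuracy > best[1]:
                best = (features, accuracy)
    return best


# Hypothetical data: feature 0 separates the groups, feature 1 is noise.
x = [[0.0, 5.0], [0.1, 1.0], [0.2, 9.0], [0.3, 3.0],
     [1.0, 4.0], [1.1, 8.0], [1.2, 2.0], [1.3, 6.0]]
y = ["PD"] * 4 + ["APS"] * 4
print(best_feature_combination(x, y))
```

An exhaustive search grows exponentially with the number of features, so in practice the subset size would probably need to be capped.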

Results

Table 3 provides numerical data and comparison between PD, PSP and MSA across all 16 speech dimensions investigated. In comparing PSP and PD groups, statistical analyses revealed significant alterations in three hypokinetic dimensions of harsh voice, inappropriate silences and imprecise vowels, one spastic dimension of slow rate and two ataxic dimensions of excess pitch fluctuations and irregular AMR. Comparison between MSA and PD groups revealed significant differences in all five ataxic dimensions but only in one hypokinetic dimension of inappropriate silences and one spastic dimension of strained-strangled voice. Notably, only one dimension of speech dysfluency was able to significantly separate PSP and MSA groups.

Table 3 Results of acoustic speech analyses

At least one deviant speech dimension was found in all PD and APS speakers. The severity of dysarthria was similar in PSP and MSA patients but considerably greater than in the PD group (Fig. 1a). Eight PSP (67 %) and 12 MSA (92 %) patients exhibited dysarthria with a combination of all hypokinetic, ataxic and spastic components, whereas all PD patients (100 %) manifested pure hypokinetic dysarthria. Conversely, pure hypokinetic dysarthria was found only in one PSP patient (8 %) and was more severe than in any PD patient investigated, whereas the remaining PSP and MSA patients showed a combination of at least one affected hypokinetic and one spastic or ataxic component. Speech in PSP was primarily characterized by the occurrence of hypokinetic components (51 %) followed by spastic components (43 %), whereas speech in MSA was characterized by the occurrence of ataxic components (56 %) followed by spastic components (45 %) (Fig. 1b). The majority of PSP patients (83 %) showed predominant hypokinetic, spastic or hypokinetic-spastic dysarthria. MSA patients manifested either predominant ataxic dysarthria (46 %) or showed ataxic dysarthria with various combinations and severity of hypokinetic and spastic components (Fig. 1c). Table 4 summarizes our findings and details the percentage of affected patients across individual speech dimensions.

Fig. 1 Characteristics of dysarthria: (a) percentage of affected patients according to dysarthria severity; (b) percentage of deviant speech dimensions according to individual dysarthric components; (c) percentage of patients according to predominant type of their dysarthria

Table 4 Characteristics of deviant speech dimensions

The combination of six acoustic features related to five deviant speech dimensions, including harsh voice (jitter), inappropriate silences (percent pause time and number of pauses), slow AMR (diadochokinetic rate), excess intensity variation (intensity variation) and excess pitch fluctuation (pitch variation), was able to separate PD from APS with a very high classification accuracy of 95.3 ± 6.4 %, with a sensitivity of 93.4 ± 8.7 % and specificity of 99.5 ± 4.1 %. Furthermore, the four deviant speech dimensions including harsh voice (harmonics-to-noise ratio), fluency (percent dysfluent words), slow rate (articulation rate) and vocal tremor (frequency tremor intensity index) were able to discriminate PSP from MSA with an accuracy of 75.2 ± 13.3 % (sensitivity of 74.3 ± 15.3 %, specificity of 81.2 ± 17.7 %).

Acoustic assessment of the extent of dysarthria severity in APS showed significant correlation to overall NNIPPS score (r = 0.54, p = 0.006). In addition, the bulbar/pseudobulbar NNIPPS subscore correlated with the severity of spastic dysarthria components (r = 0.42, p = 0.04) and the cerebellar NNIPPS subscore showed a correlation trend with severity of ataxic dysarthria components (r = 0.36, p = 0.07). From individual speech patterns, only slow rate showed negative correlation to the bulbar/pseudobulbar NNIPPS subscore (r = −0.47, p = 0.02). There were no other significant correlations between speech parameters and NNIPPS subscores.

Discussion

The current study is the first quantitative, objective investigation attempting to broaden our knowledge concerning speech disorder in PSP and MSA. Our results show that the characteristics of speech disorder may reflect the underlying neuropathology of PD and APS. Dysarthria was uniformly present in all patients with PSP and MSA and generally consisted of a combination of hypokinetic, spastic and ataxic components, whereas PD patients manifested pure hypokinetic elements. Therefore, using objective speech measurements, we were able to discriminate between APS and PD with 95 % accuracy. Moreover, the speech of PSP patients was characterized by the predominant occurrence of hypokinetic-spastic dysarthria whereas MSA patients manifested predominantly ataxic dysarthria, resulting in a discrimination accuracy of 75 % in the differentiation between PSP and MSA groups.

In contrast to previous perceptual examinations suggesting predominant spastic components in PSP and hypokinetic in MSA [7, 8], we objectively detected predominant hypokinetic components in PSP and ataxic in MSA. Interestingly, ataxic components were predominant even though the majority of our patients were MSA-P, probably reflecting great sensitivity of speech to minor cerebellar deficits. Furthermore, dysarthria was perceptually estimated to be less severe in MSA than PSP [23], whereas dysarthria was more severe in our MSA patients, probably as a result of greater disease disability. On the other hand, we may hypothesize that predominant ataxic dysarthria in MSA is perceptually more intelligible than hypokinetic dysarthria in PSP. Indeed, listeners who heard and subsequently transcribed ataxic speech benefited more from its exposure than did listeners who heard and then transcribed hypokinetic speech [24].

Recognizing characteristic deviant speech dimensions may have important implications in improving the accuracy of early clinical diagnosis [7, 8]. Dysarthria in PSP and MSA differed from that in PD due to greater severity and the presence of spastic and ataxic components. In the present study, at least one spastic or ataxic deviant speech dimension was detected in almost every PSP and MSA patient, including those with short disease duration. In comparing PSP and MSA, in addition to hypophonic monotony of parkinsonian speech, dysarthria in our PSP patients was dominated by increased dysfluency, slow rate, inappropriate silences, deficits in vowel articulation and harsh voice quality, whereas patients with MSA more frequently manifested pitch fluctuations, excess intensity variations, prolonged phonemes, vocal tremor and strained-strangled voice quality.

Dysfluency was the only single speech aspect distinctive for PSP but was rarely observed in MSA. In particular, only two of our MSA patients showed increased dysfluencies, which were rather associated with cluttering in one case and poor working memory in the second case, as opposed to the stuttering-like behaviour typically observed in PSP and later stages of PD [7, 25]. The occurrence of stuttering-like behaviour may be due to involvement of the globus pallidus and primary motor cortex, which represent regions of the brain commonly affected in PSP [26]. In fact, stuttering was reported as a consequence of pallidal deep brain stimulation in patients with dystonia [27] and was widely present in manganese-induced ephedrone parkinsonism associated with toxic and neurodegenerative damage to globus pallidus [12]. In addition, motor planning responsible for control of fluency has recently been suggested to be coded in the left primary motor cortex whereas this speech motor-related asymmetry was missing in stuttering [28]. Yet, it has been shown that increased dopamine levels in PD may lead to the emergence of stuttering [29, 30], where the motor cortex may play a similar role as in the case of levodopa-induced dyskinesia [31].

In addition, MSA patients showed overall poorer voice control in comparison with PSP. The strained-strangled voice quality, excess pitch fluctuation and vocal tremor observed in MSA patients may together give the perceptual impression of quivery-croaky strained speech with increased pitch, whereas severe harshness in the voice of PSP subjects may resemble growling dysarthria. These aspects contributing to decreased quality of voice probably arise due to uncontrolled movements of the laryngeal muscles, fluctuation of vocal fold tension and incomplete vocal fold closure, representing a rather non-specific marker of neuronal dysfunction. Speech in PSP may be further characterized by a slower rate accompanied by inappropriate silence intervals, which was also noted in patients with Huntington’s disease [32] and thus it may be hypothesized as a result of damage to the striatum and generally more widespread neuronal atrophy. Furthermore, PSP patients manifested more affected vowel articulation than MSA, which may also contribute to a perceived reduction in intelligibility in PSP in comparison with MSA [23]. Conversely, speech in MSA exhibited more prolonged phonemes and excess intensity variations that substantially contributed to the perceptual impression of scanning dysarthria.

Predominant hypokinetic-spastic dysarthria with fewer ataxic components in our PSP group is consistent with observed widespread neurodegeneration involving the midbrain as well as the globus pallidus, striatum, subthalamic nucleus, pons, superior cerebellar peduncle and cerebellar dentate nucleus [26]. The clinical features of the dysarthria in our MSA patients showing predominant ataxic dysarthria with fewer spastic and hypokinetic components conform to the known neuropathological changes which include degeneration of the cerebellum, middle cerebellar peduncle, striatum, substantia nigra, inferior olivary nucleus and pons [33]. However, only one previous neuropathological study identified a relationship between the severity of hypokinetic components in PSP and the degree of neuronal loss and gliosis in the substantia nigra [34]. Our current findings support the role of corticobulbar pathways and the cerebellum in the development of mixed dysarthria in APS as we observed relationships between the severity of spastic components and bulbar/pseudobulbar manifestations, as well as between the severity of ataxic components and cerebellar signs.

The results of the present study indicate the potential of speech analyses in the differentiation of PD from APS, with 93 % sensitivity and 100 % specificity in patients with an average symptom duration longer than 2 years. These results are similar to recent neuroimaging studies reporting comparable sensitivity and specificity in metabolic pattern analysis or Diffusion Tensor Imaging in the differential diagnosis of parkinsonism [4, 35]. In addition, our classification results between PD and APS seem to be superior to very recently introduced breath analysis, which showed 88 % sensitivity and 88 % accuracy [36]. However, it should be noted that our speech-based classification between PSP and MSA provided only 74 % sensitivity and 81 % specificity, whilst previous neuroimaging studies have reported 90 % sensitivity and 100 % specificity [4, 35].

Certain limitations of the present study should be noted. As our PD patients were investigated in their ON condition, we cannot exclude that some differences between PD and APS were more pronounced due to the beneficial effect of dopaminergic therapy. However, it is assumed that short-term dopaminergic therapy has no or very little effect on speech in PD [37]. We did not differentiate between speech in the various subtypes of PSP and MSA due to the limited opportunity in recruiting a larger number of participants. Nevertheless, at least in PSP patients, different subtypes of disease seem to have no substantial effect on global speech performance [14].

Objective identification of deviant speech dimensions can be diagnostically helpful in a number of neurological disorders and may provide measures of treatment response and disease progression. Future studies should further elaborate and extend our findings as well as show the sensitivity of speech in the differentiation between PD, PSP and MSA in very early disease stages.