Introduction

Parkinson’s disease (PD) is the second most common neurodegenerative disease in Europe [1]. Although PD was first described as a movement disorder, it is now known that patients also suffer from a variety of non-motor symptoms. Cognitive decline is very common in PD, even in the early disease stages [2, 3]. In the non-demented PD population, around 27% show signs of mild cognitive impairment (PD-MCI) [4]. Early and valid detection of PD-MCI is increasingly regarded as very important in clinical practice because of its predictive value for Parkinson’s disease dementia (PDD) [5]. Furthermore, cognitive decline has a strong effect on patients’ quality of life [6], often leading to nursing home placement and an increased risk of mortality [7]. An early and valid diagnosis of cognitive impairment in the daily clinical routine is therefore crucial. However, a clinical interview alone seems to be inadequate for identifying PD-MCI accurately [8]. A sensitive and time-efficient test is needed.

Several screening tools exist, and one of the most favored for identifying PD-MCI is the Montreal Cognitive Assessment (MoCA) [9]. However, the MoCA was originally designed as a global cognitive assessment to detect mild cognitive impairment in Alzheimer’s disease (AD). Studies validating the MoCA in PD show a clear benefit of the MoCA over the Mini–Mental State Examination (MMSE) for detecting PD-MCI and PDD [10,11,12]. Compared to the initial validation study in AD, some studies with PD cohorts suggest a different MoCA cut-off score to classify cognitive impairment [10, 11]. However, the MoCA also has limitations when applied to PD patients. Fengler et al. [13] criticized that the scoring system of the MoCA does not consider the power of each subtest to discriminate between cognitive impairment and no cognitive impairment in PD. Considering the importance of executive dysfunction in PD, it should be noted that the three subtests of visuospatial and executive functions together contribute only 5 of the 30 points (about 17% of the total score), whereas orientation contributes 6 points (20%). Yet executive functions are known to be the most prominent and often the first noticeable cognitive deficits in PD [14] and are therefore a crucial domain for diagnosing cognitive impairment in PD. Fengler et al. [13] developed a weighting system for the MoCA subtests by assessing the diagnostic accuracy of each subtest for cognitive impairment in PD (PD-MCI and PDD) with a receiver operating characteristic (ROC) analysis. Based on the area under the curve (AUC) of each subtest, the authors weighted visuospatial and executive functions higher than before. For example, the Trail Making Test receives four points instead of the single point on the original MoCA, and word list learning performance, which is not scored on the original MoCA, receives three points. In contrast, the weight of orientation was reduced from six points to one point out of 30. When testing their weighted scoring system in a small PD patient group, they found that the sensitivity of the weighted MoCA was higher than that of the original version, without loss of specificity for discriminating PD patients with any cognitive impairment from those with normal cognition.

Because that study had a small sample, which also included patients with PDD, it is still unclear whether this novel weighted MoCA score discriminates between PD-MCI and PD patients with no cognitive impairment. Thus, the aim of the present study was to validate the weighted MoCA scoring algorithm in a large non-demented PD cohort. We hypothesized that the weighted MoCA score would have a better diagnostic accuracy for PD-MCI and would correlate more strongly with results in other neuropsychological assessments than the unweighted original version.

Methods

Participants

Two hundred and forty-one PD patients were recruited from the outpatient clinic of the University of Tübingen as part of the “Amyloid-Beta in cerebrospinal fluid as a risk factor for cognitive dysfunction in Parkinson’s disease” (ABC-PD) study. Patients between 50 and 85 years of age who were diagnosed with PD according to the United Kingdom PD Brain Bank criteria [15] and agreed to a lumbar puncture were recruited. Exclusion criteria for participation in the ABC-PD study were: diagnosis of PDD according to Level II consensus guidelines [16], severe concomitant diseases affecting patients’ judgement for informed consent, and history of substance abuse (except for nicotine). In addition, patients with deep brain stimulation (DBS) were excluded. In the present study, only patients with a complete MoCA assessment were analyzed; data of 12 (5%) patients with a missing MoCA were excluded from the analysis. Furthermore, 7 (2.9%) patients with concomitant neurological diseases (e.g., history of stroke, epilepsy) and 23 (9.5%) PD patients with moderate or severe depression, defined by a cut-off of ≥ 20 points on the Beck Depression Inventory II (BDI-II) [17], were also excluded to ensure that cognitive dysfunctions were primarily caused by PD [18, 19]. In total, 202 PD patients were included in the present data analysis. The study was approved by the local ethics committee of the University of Tübingen. All patients gave written informed consent.

Assessments

Demographics and medication intake to express the levodopa equivalent daily dose (LEDD) [20] were collected. The Unified Parkinson’s Disease Rating Scale part III (UPDRS-III) and the Hoehn and Yahr Stage (H&Y) were used to rate severity of PD-related motor symptoms. The BDI-II was applied to screen for signs of depression [17].

The MoCA is a cognitive screening tool developed to define mild cognitive impairment, assessing executive and visuospatial functions, abstraction, naming, orientation, attention, language, and memory performance. In this study, the original subtest scoring of the MoCA [9], as well as a new weighted scoring algorithm, was calculated [13]. The new scoring algorithm evaluates each domain by its individual discriminant power for PD with cognitive impairment. A maximum of 30 points can be reached in both versions. The MoCA was conducted before the neuropsychological test battery on the same day.

To distinguish between patients with and without cognitive impairment, a comprehensive neuropsychological battery was applied. Executive functions were assessed by semantic and phonemic fluency, and the Trail Making Test part B of the Consortium to Establish a Registry for Alzheimer’s Disease—Plus (CERAD-Plus) [21]. Memory performance was tested using the following three CERAD-Plus subtests: word list learning, word list recall, and praxis recall. Scores of the praxis (CERAD-Plus) and the fragmented words test (Leistungsprüfsystem, LPS 50+) [22] constituted visuospatial abilities. Attention was assessed with the digit-number and letter-number-sequencing subtest of the Wechsler Adult Intelligence Scale (WIE) [23]. The Boston naming test (CERAD-Plus) and the similarities subtest of the WIE evaluated language function. The CERAD-Plus corrects for education, age, and gender, while the LPS 50+ and WIE are normed for age.

Diagnosis of PD-MCI was made according to the MDS Level-II criteria [24]. Impairment in at least two neuropsychological tests (performance at least 1.5 standard deviations (SD) below the population mean reported in the test handbooks), either within one cognitive domain or across two cognitive domains, was required for diagnosis of PD-MCI. PD patients who did not meet these criteria were classified as having normal cognitive function (PD-NC). Additionally, we classified PD-MCI with cut-offs of at least 1 SD and at least 2 SD below the population means to capture cognitive impairment at an early and at an advanced stage in PD, respectively.
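The classification rule described above can be illustrated with a minimal sketch. It assumes that each test result is already expressed as a norm-based z value and that a patient is labeled PD-MCI when at least two tests fall at or below the chosen SD cut-off. The function name, the test names, and the z values are hypothetical and are not study data; the sketch also ignores the remaining MDS Level-II requirements (for example, the distribution of tests across domains).

```python
def classify_cognition(z_scores, cutoff_sd=1.5):
    """Return 'PD-MCI' if at least two tests are at or below -cutoff_sd, else 'PD-NC'.

    z_scores: dict mapping a (hypothetical) test name to its norm-based z value.
    """
    impaired = [name for name, z in z_scores.items() if z <= -cutoff_sd]
    return "PD-MCI" if len(impaired) >= 2 else "PD-NC"


# Illustrative values only, not study data.
patient = {
    "semantic_fluency": -1.6,
    "trail_making_b": -1.2,
    "word_list_recall": -2.1,
    "boston_naming": 0.3,
}

# Apply the three cut-offs compared in this study to the same patient.
labels = {sd: classify_cognition(patient, sd) for sd in (1.0, 1.5, 2.0)}
print(labels)
```

In this toy case the same patient counts as PD-MCI under the 1 SD and 1.5 SD cut-offs but as PD-NC under the 2 SD cut-off, which is why group sizes shift with the chosen cut-off.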

Statistical analysis

Study data were collected and managed using REDCap electronic data capture [25]. Data analysis was performed with IBM SPSS Statistics version 23 and the statistical software MedCalc (Version 17.1, MedCalc Software). Figures were created using Microsoft Excel 2013. Except for the UPDRS-III score, data were not normally distributed, as verified by the Shapiro–Wilk test. Accordingly, between-group comparisons of PD-NC and PD-MCI were conducted with Pearson’s Chi-squared test (gender and Hoehn & Yahr stage), the independent-samples t test (UPDRS-III score), and the Mann–Whitney U test (all other variables including MoCA). The Wilcoxon test was used to compare the original and weighted MoCA scores in all PD patients and cognitive subgroups. We also calculated the score difference between the two MoCA versions by subtracting the weighted MoCA scores from the original MoCA scores, and compared this difference between PD-NC and PD-MCI with the Mann–Whitney U test.

A ROC analysis was conducted to validate the diagnostic accuracy of the original and weighted MoCA by means of sensitivity and specificity. The Youden’s index was calculated to define the optimal cut-off for the original and weighted MoCA for PD-MCI.
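As a minimal sketch of how an optimal cut-off can be derived from Youden's index (J = sensitivity + specificity − 1), the following Python function scans every observed score as a candidate cut-off. It assumes that a score strictly below the cut-off counts as a positive screen; this convention, the function name, and the toy data are assumptions made for illustration and do not reproduce the MedCalc procedure or the study data.

```python
def youden_optimal_cutoff(scores, impaired):
    """Find the cut-off c that maximizes J = sensitivity + specificity - 1.

    scores:   total test scores (higher = better performance)
    impaired: booleans, True if the reference standard says impaired
    A score strictly below c counts as a positive screen (an assumption).
    Returns (cutoff, sensitivity, specificity, J); ties keep the lowest cut-off.
    """
    n_pos = sum(impaired)
    n_neg = len(impaired) - n_pos
    best = None
    for c in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, impaired) if y and s < c)
        tn = sum(1 for s, y in zip(scores, impaired) if not y and s >= c)
        sens, spec = tp / n_pos, tn / n_neg
        j = sens + spec - 1
        if best is None or j > best[3]:
            best = (c, sens, spec, j)
    return best


# Toy data for illustration only, not study data.
scores = [22, 24, 25, 25, 26, 27, 27, 28, 29, 30]
impaired = [True, True, True, False, True, False, False, False, False, False]

best = youden_optimal_cutoff(scores, impaired)
print(best)
```

On this toy sample the maximum of J is reached at a cut-off of 27, where all four impaired cases and five of the six unimpaired cases are classified correctly.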

All group comparisons and ROC analyses were applied independently to each of the three PD-MCI classifications using a cut-off of either ≤ 1 SD, 1.5 SD, or 2 SD below the appropriate norms. The Spearman’s correlation coefficient (rs) was used to evaluate the strength of the association between the two MoCA scores. To identify the congruent validity of both MoCA scores, the scores were correlated with the average z values of all neuropsychological tests assigned to their respective cognitive domain.
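For readers unfamiliar with the rank correlation used here, Spearman's rs can be sketched as the Pearson correlation of the rank-transformed values, with tied values receiving the mean of their ranks. The function names and the five paired scores below are hypothetical and serve only as an illustration; they are not the SPSS implementation or study data.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j share one rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rs(x, y):
    """Spearman's rs as the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical paired total scores, for illustration only.
original = [30, 27, 26, 24, 22]
weighted = [30, 26, 27, 21, 18]

rs = spearman_rs(original, weighted)
print(round(rs, 2))
```

Because only the rank order enters the calculation, the coefficient is unaffected by the differently spaced point values of the two scoring schemes.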

Results

All 202 PD patients were classified according to the three PD-MCI classification approaches. The 1 SD cut-off led to 74 (36.6%) PD-NC and 128 (63.4%) PD-MCI patients, the 1.5 SD cut-off to 125 (61.9%) PD-NC and 77 (38.1%) PD-MCI patients, and the 2 SD cut-off to 162 (80.2%) PD-NC and 40 (19.8%) PD-MCI patients. In general, the PD-MCI patients suffered from more severe motor problems (see Table 1 for details) and performed significantly worse on all neuropsychological tests and cognitive domains than PD-NC patients (p ≤ 0.001) (see Online Resource Table 1 for details). According to the 1, 1.5, and 2 SD cut-offs, 93.8%, 93.5%, and 95.0%, respectively, of all PD-MCI patients were classified as multi-domain PD-MCI.

Table 1 Characteristics of Parkinson’s disease patients with normal cognition (PD-NC) and the classifications of mild cognitive impairment (PD-MCI) according to different cut-off levels for cognitive dysfunction

The correlation between the original and weighted MoCA score was high (rs = 0.89, p < 0.001). In the total PD sample, the score range of the original MoCA varied between 16 and 30 (Median, Mdn = 26) points and on the weighted MoCA score between 11 and 30 (Mdn = 26) points.

For all PD-MCI classifications, PD-MCI patients had significantly lower original and weighted MoCA scores than the PD-NC group (p < 0.001). In the PD-MCI groups, weighted MoCA scores were significantly lower than original MoCA scores across all classifications (p < 0.001) (see Fig. 1 for details). In the PD-NC groups, the original and weighted MoCA scores did not differ (1 SD: p = 0.06; 1.5 SD: p = 0.13), except for the 2 SD cut-off, where PD-NC patients scored significantly lower on the weighted than on the original MoCA (Mdn = 27, range 14–30 vs. Mdn = 27, range 18–30; p = 0.005) (see Fig. 1 for details). The score difference between the original and weighted MoCA was significantly larger in PD-MCI than in PD-NC for all classifications (Mdn; 1 SD: 1 vs. 0; p = 0.029; 1.5 SD: 1 vs. 0; p < 0.001; 2 SD: 2 vs. 0; p < 0.001). Both MoCA versions were moderately associated with each cognitive domain (0.38 ≤ rs ≤ 0.52) and did not differ statistically in the strength of their association with each cognitive domain (p > 0.05) (see Table 2 for details).

Fig. 1 Clustered boxplots for original and weighted Montreal Cognitive Assessment (MoCA) total scores for both Parkinson’s disease patient groups, divided by the three classification cut-offs with different standard deviations (SD). a refers to the patient group with no cognitive impairment (PD-NC) and b to the PD-MCI patient group. *Significant difference with p < 0.01

Table 2 Spearman’s rank correlation coefficients (rs) between each of the two MoCA scores and the cognitive domain scores including statistical comparison between these two correlation coefficients

AUC values of the original (0.76, 95% confidence interval, CI 0.70–0.82) and weighted (0.81, CI 0.75–0.86) version differed significantly in the 2 SD classification (p = 0.044), but not for the classification of PD-MCI according to the 1 SD (p = 0.32) and 1.5 SD cut-offs (p = 0.15). The ROC analysis identified different cut-offs maximizing both sensitivity and specificity for the original and weighted MoCA for the diagnosis of PD-MCI (Table 3). Using the 1 SD cut-off to define PD-MCI, an optimal cut-off of 26 was revealed for both MoCA versions. With the weighted MoCA, sensitivity showed a tendency to increase from 57.8% to 64.1% and specificity decreased from 86.5% to 77.0%, leading to a slightly increased positive predictive value (PPV) from 54.2% to 55.3% and a decreased negative predictive value (NPV) from 88.1% to 82.8% compared to the original version. With the 1.5 SD cut-off for defining PD-MCI, an optimal cut-off of 27 was revealed for the original MoCA and 26 for the weighted version. Here, with the weighted MoCA, sensitivity slightly decreased from 77.9% to 75.3% and specificity showed a tendency to increase from 60.8% to 67.2% compared to the original MoCA. The PPV remained stable (81.7% vs. 81.6%) and the NPV increased from 55.0% to 58.6% with the weighted MoCA. Using a 2 SD cut-off to classify PD-MCI, an optimal cut-off of 26 was revealed for the original MoCA and 24 for the weighted version. With the weighted MoCA, sensitivity slightly increased from 70.0% to 72.5% and specificity also increased from 65.4% to 79.0% compared to the original MoCA. Accordingly, the PPV increased from 89.8% to 92.1% and the NPV increased from 33.3% to 46.0%.

Table 3 Diagnostic values of the original and weighted MoCA at an optimal cut-off score for the three classifications of mild cognitive impairment in Parkinson’s disease (PD-MCI)

Discussion

The purpose of the present study was to validate a novel weighted MoCA scoring algorithm for the diagnosis of PD-MCI in a large sample of non-demented PD patients.

The main results are that (i) both the original and the weighted MoCA scores differed significantly between PD-NC and PD-MCI patients, (ii) within PD-MCI patients, the weighted MoCA scores were significantly lower than those for the original MoCA, (iii) diagnostic accuracy of the two MoCA versions was found to be highly dependent on the cut-off score used to classify PD-MCI, and (iv) the association of both MoCA scores to the neuropsychological domain scores was comparable.

In the present study, the cut-offs for the original and weighted MoCA score were determined by maximizing the sum of sensitivity and specificity (Youden’s index). For each version, the optimal cut-off was determined to ensure the highest diagnostic accuracy for PD-MCI of each MoCA score. With a 1.5 SD cut-off to define PD-MCI, we found an optimal cut-off of 26 for the weighted MoCA and 27 for the original MoCA. Therefore, our proposed cut-off for the original MoCA version is slightly higher than that of Nasreddine et al. [9], who recommended a score of 26. However, their suggestion applies to AD patients and is, therefore, not necessarily applicable to PD patients. Other studies have already discussed an optimal cut-off in PD. Using a 1.5 SD cut-off to identify patients with any cognitive impairment, Hoops et al. [11] found a cut-off score of 25, which is two points lower than ours. However, their cut-off was defined not only for PD-MCI but also for PDD patients (summarized as any cognitive impairment), which might explain the lower cut-off score. Dalrymple-Alford et al. [10] suggest a cut-off at 26 points for PD-MCI (also defined by a 1.5 SD cut-off). Our results for the original MoCA do not support these findings, as they suggest a slightly higher MoCA cut-off at 27 points. Defining PD-MCI in our sample with a 1 SD cut-off revealed optimal cut-off scores of 26 for both the original and weighted MoCA. With a 2 SD cut-off, a cut-off of 26 was identified for the original MoCA and a distinctly lower cut-off of 24 for the weighted MoCA. However, there are no studies confirming these cut-offs for early and later stages of PD-MCI. More studies in large PD samples are needed to confirm the diagnostic cut-off of the MoCA.

With a 1.5 SD cut-off for PD-MCI, we did not find a significant difference between the AUC of the ROC analysis for the two MoCA versions. With the weighted MoCA, sensitivity was lower by 2.6 percentage points and specificity higher by 6.4 percentage points compared to the original MoCA. Compared to the initial study, both sensitivity and specificity of the weighted MoCA were lower. We also did not find a significant difference of the AUC with a 1 SD cut-off. This does not support the notion of a superior discriminant power of the weighted MoCA score. However, with a 2 SD cut-off to define PD-MCI, the weighted MoCA seems to be superior to the original MoCA. The AUC of the weighted MoCA was significantly higher than the AUC of the original version (AUC: 0.81 vs. 0.76; p = 0.044), which was accompanied by a higher sensitivity (72.5% vs. 70.0%) and specificity (79.0% vs. 65.4%). This improvement is also represented by a high PPV of 92.1% and a moderate NPV of 46.0%. This was an unexpected finding as the weighted MoCA places a higher priority on executive function, which is considered to be highly dysfunctional in early stages of PD-MCI. However, the weighted MoCA also takes visuospatial deficits more into account. Lower visuospatial cognitive function has been found to be associated with a faster cognitive decline and progression to PDD [26].

The fact that patients with PD-MCI not only scored significantly lower on the weighted than on the original MoCA version, but that the score differences between the original and weighted version were also substantially larger in PD-MCI than in PD-NC, further indicates that the weighted MoCA score reflects cognitive impairment associated with PD-MCI to a greater degree. This effect was found to be independent of the applied PD-MCI classification cut-off. Cognitive impairment in PD is highly heterogeneous and its severity might reflect a continuum rather than a sudden onset of dysfunction. So far, progression of the cognitive decline is only partly understood; while some patients develop PDD within a short time period, others remain stable or even return to a non-impaired level [2, 27, 28]. However, no reliable, purely cognitive predictor has been identified to delineate a risk group for PDD among PD-MCI patients. In summary, our findings show that the weighted MoCA detects cognitive dysfunction in PD-MCI to a greater degree, especially in more advanced stages of cognitive impairment. It is possible that PD-MCI patients scoring lower on the weighted MoCA version might be at higher risk for conversion to PDD than PD-MCI patients scoring higher on the weighted MoCA. In the PD-NC group, the weighted MoCA did not differ significantly from the original for the 1 SD and 1.5 SD cut-offs for PD-MCI. However, at a 2 SD cut-off, PD-NC patients scored significantly lower on the weighted MoCA than on the original MoCA. This suggests that, when this cut-off is applied, the PD-NC group contains at least some patients with cognitive impairment at a very mild stage. In our sample, the deficits of those persons could be better detected by the weighted than by the original MoCA. To further investigate the notion of a possible risk group, longitudinal studies are needed to monitor patients’ disease progression in large PD samples. In our study, the low MoCA scores raise the question of whether those patients might already have PDD. However, patients did not show any signs of impairment in activities of daily living, which is the core criterion for PDD diagnosis. Correlations between both MoCA versions and each cognitive domain were moderate and did not differ significantly, indicating that both versions reflect the cognitive domains to a comparable degree.

As a limitation, the present study did not include patients with PDD; therefore, we do not know whether our failure to replicate the results of the initial validation study is due to the absence of PDD patients or to a possible invalidity of the scoring algorithm. Another important difference compared to the study of Fengler et al. [13] is the exclusion of patients with moderate or severe depression in our analysis. In PD, depression is very common [29] and it is well known that depression has a negative influence on cognitive functions [18]. Hence, it is possible that the development of the new weighted scoring algorithm was, at least partly, affected by the presence and severity of depression. In our cohort, cases of moderate and severe depression were excluded and BDI-II total scores did not differ significantly between the remaining PD-NC and PD-MCI patients. Therefore, we concluded that cognitive dysfunction could not be ascribed to severity of depression in our sample. In addition, in contrast to the initial study, we excluded patients with DBS. Whether cognitive decline follows DBS surgery in PD is debated [30, 31]. To diminish this possible cause of cognitive impairment, we did not include patients with DBS.

It is important to mention that differences in the data between Fengler et al. [13] and our study might result from differences in the neuropsychological tests used to classify PD-MCI. Both studies used the CERAD-Plus test battery to assess memory, executive functions, and, to some extent, visuospatial and language impairment. However, some tests differ, especially in the attention domain. For more details see Online Resource Table 2. This might lead to varying interpretation of cognitive impairment in patients, even though both studies used the MDS Level II criteria to identify PD-MCI patients. Another noteworthy difference to the study by Fengler et al. [13] is that they used a 1.28 SD cut-off for the CERAD subtests, as suggested by the test manual, and a percentile rank for the two tests not related to the CERAD test battery. In our study, the diagnostic values of different cut-offs (1, 1.5 and 2 SD) for PD-MCI were compared; therefore, our study and the initial study are not entirely comparable.

All neuropsychological tests applied were standardized; however, subtests of the WIE and LPS 50+ only correct for age, whereas the CERAD-Plus additionally corrects for education and gender. Therefore, we cannot rule out that education and gender may have, at least partly, affected our PD-MCI classification. As the proportion of males in our PD-NC and PD-MCI groups was comparable and the educational level did not differ statistically between groups, we conclude that between-group effects of the MoCA can be interpreted in our sample. Furthermore, the groups in our cohort differed in motor symptoms, which might partly influence some test results. There is also some evidence regarding the influence of dopaminergic therapy on cognitive test performances [27]. Normative values, especially for the MoCA, that take such confounders into account would be valuable for the assessment of cognitive impairment in PD. This also applies to the use of the MoCA in clinical routine as a screening tool for a first impression of patients’ cognitive status. Further comprehensive diagnostic Level II testing can then be applied after a conspicuous MoCA score. In our study, the MoCA was conducted separately before the neuropsychological tests. However, we cannot exclude an influence of variability between individual investigators or of the time of day of the assessment on test performances.

In summary, we conclude that the weighted MoCA has an advantage over the original version for detecting cognitive impairment in more advanced stages. However, we can only confirm a better overall discriminant power of the novel scoring algorithm for PD-MCI patients classified with a 2 SD cut-off, leading to a high PPV and increased NPV compared to the original version. Therefore, the weighted MoCA might have a higher potential to identify those PD patients at risk for future conversion to PDD, which needs to be verified with longitudinal studies.