Introduction

Cognitive impairments are present in both late and early stages of Parkinson’s disease (PD) (Chaudhuri et al. 2006) and have been observed in the domains of executive functioning, memory, and psychomotor speed (Aarsland et al. 2010; Koerts et al. 2011). As the disease progresses, PD patients also exhibit a cognitive decline (Muslimovic et al. 2007).

Measures often used to assess executive functioning in PD are verbal fluency tests (VFT) (Henry and Crawford 2004; Lange et al. 2003). VFT are very sensitive in PD (Williams-Gray et al. 2009), easy to apply, and not time consuming. Furthermore, impairments in VFT can already be seen in the early stages of PD (Henry and Crawford 2004). VFT require a time-restricted generation of multiple response alternatives under constricted search conditions and include an associative retrieval of words based on semantic or phonemic criteria. It is assumed that VFT require an efficient organization of retrieval from memory as well as the ability to keep track of the responses already given (i.e., short-term memory). Furthermore, they require initiation of behavior, inhibition of inappropriate responses, and, subsequently, flexibility to shift to appropriate responses (Henry and Crawford 2004). Many cognitive functions, including memory, executive functioning, and psychomotor speed, are thus involved in VFT performance. Consequently, impairments in various cognitive functions may impair performance on VFT. However, since the scoring of VFT mostly depends on the number of correct responses, it is not clear which cognitive functions or impairments are reflected by this score. Furthermore, since PD is a progressive neurodegenerative disease, it is also not clear whether scores on VFT reflect impairments in the same cognitive functions throughout different stages of the disease. This study aims to investigate which cognitive functions (i.e., executive functions, psychomotor speed, and memory) are measured with VFT in PD, in particular at different stages of the disease.

Methods

Participants

Eighty-eight PD patients participated in this study. All patients were recruited from the Department of Neurology of the University Medical Center Groningen (UMCG) and were diagnosed with idiopathic PD according to the UK Parkinson’s-Disease-Society-Brain-Bank criteria. Disease severity was assessed with the unified Parkinson’s disease rating scale and the Hoehn and Yahr scale (H&Y). Symptoms of depression were assessed with the Montgomery–Åsberg depression rating scale (MADRS) (Leentjens et al. 2000). A levodopa equivalent daily dose (LEDD) was calculated for all patients (Esselink et al. 2004). All patients were assessed in their regular on-state after medication. In order to determine what is measured with VFT at different disease stages, PD patients were divided into two groups according to the H&Y staging: H&Y stage 1–2 (i.e., mild PD; n = 60) and H&Y stage 2.5–3 (i.e., moderate PD; n = 25). Three patients were not allocated to the H&Y groups since their H&Y score was unknown.

In addition, 65 healthy participants were included who were recruited from the Groningen community or were related to other participants. Level of education was rated for all participants with a Dutch education scale ranging from 1 (elementary school not finished) to 7 (university degree). PD patients and healthy participants did not differ in age (t = 0.42; p = 0.68), gender (Chi-square = 0.87; p = 0.35), and education level (z = −1.05; p = 0.29). PD patients in H&Y 2.5–3 were, however, older (t = −3.04; p = 0.003) and had a lower education level than PD patients in H&Y 1–2 (z = −2.00; p = 0.045). These groups did not differ with regard to gender (Chi-square = 2.00; p = 0.57). Furthermore, PD patients in H&Y 2.5–3 reported more symptoms of depression than PD patients in H&Y 1–2 (t = −3.34; p = 0.002). No differences were found between these groups with regard to LEDD (t = −1.16; p = 0.25). Descriptive and disease characteristics of groups are reported in Table 1. Participants with a mini mental state examination (MMSE) score of <24 were excluded. This study was approved by the medical ethical committee of the UMCG. All participants signed an informed consent prior to study inclusion.

Table 1 Demographic and clinical characteristics of all PD patients (n = 88), PD patients in H&Y 1–2 (n = 60), PD patients in H&Y 2.5–3 (n = 25), and healthy controls (n = 65)

Stimulus material and procedure

All participants were assessed with semantic and phonemic VFT and tests of executive functioning and psychomotor speed. Fifty patients (57 %) and thirty-two healthy participants (49 %) were also assessed with a verbal memory test. Since the completion of this memory test was added at a later point in time, not all participants performed this test.

VFT

Semantic VFT required participants to produce as many words as possible belonging to a semantic category within 1 min. Two categories were used, animals and professions. For both tests, five measures were calculated: (1) number of correct responses (errors were defined as repetitions or words other than animals or professions), (2) number of clusters, (3) size of the largest cluster, (4) number of extra-dimensional shifts, and (5) number of intra-dimensional shifts. Clusters were defined as groups of at least two successively generated words belonging to the same semantic category. With regard to the animal test, clusters were based on a previously specified classification (van Beilen et al. 2004). Possible clusters were birds, fish, insects, pets, rodents, reptiles, farm animals, wild animals, foreign animals, fantasy animals (e.g., unicorn), and associations (e.g., baby-dog, baby-bird). Concerning the profession test, no a priori cluster scheme was available. Therefore, a cluster scheme was derived from the actual patterns of words generated by participants. Nine possible clusters were defined including professions in administration, commerce, health care, agriculture, building, education, transport, hotels and catering industry, and police or military.

For each VFT, the list of words generated by each participant was evaluated by two trained examiners. First, clusters were identified. When a word could be allocated to two or more clusters, a decision was made by means of consensus between the two examiners. For example, if the word sequence was giraffe–elephant–shark, it is not clear whether shark should be included in a cluster of foreign animals or in the cluster fish. The decision was based on the word that followed the word shark (i.e., if lion followed shark then shark would be allocated to the cluster of foreign animals; if salmon followed shark it would be allocated to the cluster of fish) (van Beilen et al. 2004).

After the identification of the clusters, the number of extra- and intra-dimensional shifts was determined. Extra-dimensional shifts were defined as transitions between clusters (e.g., from insects to pets); intra-dimensional shifts were defined as transitions within a cluster (e.g., from African to Australian foreign animals). Errors were taken into account when calculating the number of extra- and intra-dimensional shifts.
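The scoring rules described above can be illustrated with a minimal Python sketch. It assumes that each generated word has already been labelled by the examiners with a cluster category and, optionally, a subcategory; the function name `score_fluency`, the label format, and the example labels are hypothetical. It also assumes that every change of category between adjacent words counts as an extra-dimensional shift, including transitions to or from a single unclustered word, which the text does not specify.

```python
from itertools import groupby


def score_fluency(labels):
    """Score one word list given as (category, subcategory) tuples in order."""
    # Maximal runs of consecutive words that share the same category.
    runs = [list(group) for _, group in groupby(labels, key=lambda pair: pair[0])]
    # A cluster is a run of at least two successively generated words.
    clusters = [run for run in runs if len(run) >= 2]
    largest = max((len(run) for run in clusters), default=0)
    # Assumption: each change of category between adjacent words is one
    # extra-dimensional shift.
    extra = len(runs) - 1 if runs else 0
    # Intra-dimensional shift: same category, different subcategory.
    intra = sum(
        1
        for run in runs
        for previous, current in zip(run, run[1:])
        if previous[1] != current[1]
    )
    return {
        "clusters": len(clusters),
        "largest_cluster": largest,
        "extra_dimensional_shifts": extra,
        "intra_dimensional_shifts": intra,
    }


# Hypothetical labelled sequence: giraffe, elephant, kangaroo, shark, salmon, dog.
words = [
    ("foreign", "african"),
    ("foreign", "african"),
    ("foreign", "australian"),
    ("fish", None),
    ("fish", None),
    ("pets", None),
]
print(score_fluency(words))
```

The sketch only automates the counting step; allocating ambiguous words to a cluster remains a matter of examiner consensus, as described above.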

Phonemic VFT required participants to produce as many words as possible starting with the letter D, A, or T within 1 min each (Schmand et al. 2008). Again, five measures were determined for each phonemic VFT: (1) number of correct responses (excluding errors, i.e., names, words starting with letters other than D, A, or T, and repetitions), (2) number of clusters, (3) size of the largest cluster, (4) number of extra-dimensional shifts, and (5) number of intra-dimensional shifts. Troyer’s classification (Troyer et al. 1997) was used to rate the number of clusters. Possible clusters were words starting with the same first two letters, rhymes, first and last sounds, i.e., words differing only by a vowel sound, regardless of the actual spelling and homonyms. Since this classification did not entirely cover the clusters that were encountered during the rating, two possible clusters were added: verb inflections and semantic associations (e.g., apricot, apple). After clustering, the number of extra- and intra-dimensional shifts was also determined for the phonemic VFT.
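As an illustration of the phonemic scoring, the following Python sketch counts correct responses and detects clusters based on one of the criteria named above (words starting with the same first two letters). The function names and the example word list are hypothetical. Proper names, rhymes, homonyms, verb inflections, and semantic associations are not handled here, since these require examiner judgment.

```python
from itertools import groupby


def correct_responses(words, letter):
    """Keep words that start with the target letter and are not repetitions."""
    seen = set()
    correct = []
    for word in words:
        normalized = word.strip().lower()
        if normalized.startswith(letter.lower()) and normalized not in seen:
            correct.append(normalized)
        seen.add(normalized)
    return correct


def first_two_letter_clusters(words):
    """Return runs of at least two successive words sharing their first two letters."""
    runs = (list(group) for _, group in groupby(words, key=lambda w: w[:2]))
    return [run for run in runs if len(run) >= 2]


# Hypothetical response list for the letter T.
responses = ["table", "tap", "tree", "train", "apple", "tap", "tiger"]
correct = correct_responses(responses, "t")
print(len(correct), first_two_letter_clusters(correct))
```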

Executive functions

The Stroop color word test (Hammes 1978) was used to measure inhibition. Cognitive flexibility was assessed with the trail making test part B (Reitan 1958) and the odd man out (Flowers and Robertson 1985). Finally, the digit span forward and backward of the Wechsler memory scale—revised (Wechsler 1987) were used to assess working memory.

Memory

The Rey auditory verbal learning test (Saan and Deelman 1986) was used to assess verbal memory.

Psychomotor speed

The trail making test part A (Reitan 1958) and the Stroop word card (Hammes 1978) were used to measure psychomotor speed.

Statistical analyses

Performance on the two semantic VFT was combined in a summary score. Likewise, performance on the three phonemic VFT was summed. Mean scores were calculated for each of the five different measures, separately for semantic and phonemic fluency. A multivariate analysis of variance (MANOVA), which protects against the inflation of type-I error (Leary and Altmaier 1980), was calculated to compare the performance of PD patients and healthy participants on VFT. Furthermore, a MANOVA was performed to compare groups regarding other cognitive functions. In addition, healthy participants, PD patients in H&Y 1–2, and PD patients in H&Y 2.5–3 were compared. Since the latter two groups differed with regard to age, education, and the score on the MADRS, two multivariate analyses of covariance (MANCOVAs) (with age, education, and the score on the MADRS as covariates) were used to compare groups regarding both their performance on VFT and other cognitive tests. Post hoc analyses using Tukey tests were performed when significant differences were found between groups. Effect sizes were calculated for all comparisons and a Bonferroni correction was applied to correct for the use of multiple MAN(C)OVAs. Based on previous research, impairments of cognitive functioning (i.e., VFT) were expected in patients with PD; one-tailed p values were therefore used for these analyses (Henry and Crawford 2004; Muslimovic et al. 2005; Troyer et al. 1998). Finally, to determine what is measured with VFT at different stages of the disease, four regression analyses were performed (method: stepwise). Two regression analyses were performed within the group of PD patients in H&Y 1–2 to determine to what extent the mean total scores of both the semantic VFT and the phonemic VFT could be predicted by executive functions, memory, and psychomotor speed. The same regression analyses were performed within the group of PD patients in H&Y 2.5–3. Since symptoms of depression, antiparkinsonian medication, and motor symptoms can influence the performance on VFT, these factors were also taken into account in the regression analyses. A Bonferroni correction was also applied to the regression analyses.
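Two of the steps described above, averaging a measure across the individual fluency tests and adjusting the significance threshold with a Bonferroni correction, can be sketched in Python as follows. The nominal alpha of 0.05, the number of analyses (4), and the example scores are assumptions for illustration only; the text does not state the threshold that was used.

```python
from statistics import mean


def summary_score(scores_per_test):
    """Mean of one measure across the individual fluency tests."""
    return mean(scores_per_test)


def bonferroni_alpha(alpha, n_analyses):
    """Per-analysis significance threshold after Bonferroni correction."""
    return alpha / n_analyses


# Hypothetical number of correct responses on the animal and profession tests.
semantic_correct = summary_score([20, 16])
# Assumed nominal alpha of 0.05 spread over four regression analyses.
adjusted_alpha = bonferroni_alpha(0.05, 4)
print(semantic_correct, adjusted_alpha)
```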

Results

Comparison between PD patients and healthy participants

Multivariate analysis of variance (MANOVA) revealed a significant difference between groups, indicating that PD patients showed a significant impairment in VFT (F(10, 100) = 2.87; p = 0.002). PD patients also performed more poorly in other aspects of cognition than healthy participants (F(10, 73) = 2.82; p = 0.003). The effect sizes of these differences were large (VFT: η² = 0.22; other cognitive tests: η² = 0.28). Subsequent analysis showed that PD patients produced fewer words during VFT than healthy participants as denoted by a significant difference in semantic VFT and a trend toward significance (p = 0.07) in phonemic VFT (Table 2). During semantic VFT, PD patients also produced significantly fewer clusters and extra-dimensional shifts than healthy participants. With regard to phonemic VFT, again a trend toward a significant difference between PD patients and healthy participants was found for the number of intra-dimensional shifts (p = 0.07). PD patients also showed significant impairments in verbal memory, inhibition, cognitive flexibility, and psychomotor speed (Table 2).
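The reported effect sizes can be related to the F statistics with the standard conversion from F and its degrees of freedom to partial eta squared. The text does not state how the effect sizes were computed, so the following Python sketch is an assumption; it does, however, reproduce the two reported values to two decimals.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F(10, 100) = 2.87 for VFT and F(10, 73) = 2.82 for the other cognitive tests.
eta_vft = partial_eta_squared(2.87, 10, 100)
eta_other = partial_eta_squared(2.82, 10, 73)
print(round(eta_vft, 2), round(eta_other, 2))  # 0.22 0.28
```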

Table 2 Performance of PD patients (n = 88) and healthy controls (n = 65) on semantic and phonemic fluency tests and tests of executive functions, memory, and psychomotor speed (one-tailed)

Comparison between healthy participants and PD patients at different stages of the disease

Multivariate analysis of covariance (MANCOVA) with age, education, and a depression rating as covariates revealed a significant and large difference between groups with regard to the performance on VFT (F(20,194) = 1.67; p = 0.02; η² = 0.15). However, no differences were found between groups regarding the other cognitive tests (F(20,140) = 1.12; p = 0.17), even though a large effect size was found (η² = 0.14). Groups differed in particular with regard to the size of the largest cluster and the number of intra-dimensional shifts in phonemic VFT (Table 3). Subsequent post hoc analysis using the Tukey test indicated that the size of the largest cluster was significantly smaller in PD patients in H&Y 2.5–3 compared to PD patients in H&Y 1–2 (p = 0.02) and healthy participants (p = 0.01). No difference was found between PD patients in H&Y 1–2 and healthy participants (p = 0.99). The same pattern was found for the number of intra-dimensional shifts in phonemic VFT. PD patients in H&Y 2.5–3 showed fewer intra-dimensional shifts compared to both PD patients in H&Y 1–2 (p = 0.01) and healthy participants (p = 0.002), while no difference was found between the latter two groups (p = 0.79). With regard to semantic VFT, no significant differences were found between groups.

Table 3 Performance of healthy control participants (n = 65), PD patients in H&Y 1–2 (n = 60), and PD patients in H&Y 2.5–3 (n = 25) on semantic and phonemic fluency tests and tests of executive functions, memory, and psychomotor speed (one-tailed)

Regression analyses

The performance on semantic VFT of PD patients in H&Y 1–2 was predicted for 49 % by psychomotor speed (trail making test part A; F = 19.06; p < 0.001; Fig. 1a). The performance of PD patients in H&Y 2.5–3 was, however, predicted for 49 % by cognitive flexibility (trail making test part B; F = 19.48; p < 0.001; Fig. 1b). With regard to phonemic VFT, performance of PD patients in H&Y 1–2 was predicted for 27 % by psychomotor speed (trail making test part A; F = 9.58; p = 0.001; Fig. 1c). The performance of PD patients in H&Y 2.5–3 was predicted for 47 % by cognitive flexibility (trail making test part B; F = 13.25; p < 0.001; Fig. 1d).
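Each of the final models contains a single predictor. How such a model yields a proportion of explained variance and an F value can be sketched in Python as follows. This is only an illustration of simple least-squares regression, not the stepwise selection procedure used in the study, and the function name and data (completion times on the trail making test part A in seconds, and numbers of words) are hypothetical.

```python
def simple_regression(x, y):
    """Least-squares fit of y on one predictor x; returns slope, intercept, R^2, F."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Proportion of variance in y explained by x.
    r_squared = sxy ** 2 / (sxx * syy)
    # F statistic with 1 and n - 2 degrees of freedom.
    if r_squared >= 1:
        f_value = float("inf")
    else:
        f_value = r_squared / (1 - r_squared) * (n - 2)
    return slope, intercept, r_squared, f_value


# Hypothetical data: slower trail making performance, fewer words generated.
tmt_a_seconds = [30, 40, 50, 60, 70]
words_generated = [22, 20, 17, 16, 12]
slope, intercept, r_squared, f_value = simple_regression(tmt_a_seconds, words_generated)
print(slope, r_squared, f_value)
```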

Fig. 1 Scatterplots of associations between the performances of PD patients in H&Y 1–2 and H&Y 2.5–3 on semantic and phonemic VFT, and the significant predictors of these performances. (a) Association between the performance of PD patients in H&Y 1–2 on semantic VFT and trail making test part A. (b) Association between the performance of PD patients in H&Y 2.5–3 on semantic VFT and trail making test part B. (c) Association between the performance of PD patients in H&Y 1–2 on phonemic VFT and trail making test part A. (d) Association between the performance of PD patients in H&Y 2.5–3 on phonemic VFT and trail making test part B

Discussion

PD patients showed an impaired performance on VFT compared to healthy participants. In particular, PD patients generated fewer words than healthy participants, which is consistent with previous research (Henry and Crawford 2004; Muslimovic et al. 2005; Williams-Gray et al. 2007). The analysis of strategies applied during the tests revealed that PD patients generated significantly fewer clusters and extra-dimensional shifts during semantic VFT compared to healthy participants. No differences were found between these groups with regard to clustering on phonemic VFT and the number of shifts (both extra- and intra-dimensional) that were made during both semantic and phonemic VFT. These results are in line with previous studies suggesting that semantic fluency is typically more impaired in PD patients than phonemic fluency (Henry and Crawford 2004). However, it has also been found that clustering strategies are in general not often applied by elderly people and patients with neurodegenerative diseases (McDowd et al. 2011). This might explain the inconsistency of findings reported in the literature with some studies finding a decreased clustering in PD patients (Raskin et al. 1992) and others not revealing such a deficit (McDowd et al. 2011; Troyer et al. 1998). Furthermore, this might also explain why a decreased clustering has not been found consistently throughout all VFT as performed in the present study.

The main focus of this study was to determine what VFT measure at different stages of PD. Therefore, the performance of mild PD patients (H&Y 1–2) was compared to that of patients with moderate PD (H&Y 2.5–3) and healthy participants. No differences were found between groups with regard to the number of words generated during both VFT. This is somewhat surprising, since earlier research showed that the clinical severity of PD was negatively associated with word production during VFT (Flowers et al. 1995). Furthermore, a longitudinal study reported that newly diagnosed PD patients progressed in 3 years from a mild to a moderate disease severity, which was accompanied by a decreased performance on VFT (Muslimovic et al. 2009). A possible explanation for this discrepancy is that we used a cross-sectional approach instead of a longitudinal approach, which might not have been sensitive enough to detect subtle differences between mild and moderate PD patients. However, data analyses showed that moderate PD patients generated fewer words during semantic VFT than mild PD patients, as revealed by a non-significant but medium to large difference. Furthermore, patients with moderate PD also used a different strategy to perform phonemic VFT. This was reflected by a significantly smaller size of the largest cluster and significantly fewer intra-dimensional shifts in moderate PD patients compared to mild PD patients. It is not surprising that moderate PD patients showed impairments on both measures, because these measures are not independent of each other: when the size of the clusters becomes smaller, there is consequently less opportunity to shift intra-dimensionally (Troyer 2000). Thus, the present study revealed differences between mild and moderate PD patients in verbal fluency, but only at the level of the strategies underlying fluency performance. A similar effect of disease severity on clustering and switching was reported in a previous study (Troyer et al. 1998) in which demented PD patients showed fewer shifts during both semantic and phonemic VFT, and generated smaller clusters during phonemic VFT than non-demented PD patients. Therefore, the analysis of clustering and switching strategies might provide a measure of cognition that is more sensitive to disease severity in PD than the total number of words. The present study also demonstrated that this measure is not only sensitive to the difficulties of patients with very advanced cognitive impairments, i.e., PD patients with dementia (Troyer et al. 1998), but also to the cognitive decline of non-demented PD patients.

In order to further elucidate what VFT determine at different stages of PD, the underlying cognitive functions which account for the performance on VFT in mild and moderate PD patients were assessed. Moderate PD patients did not show a significantly different performance than mild PD patients on tests of executive functions, memory, and psychomotor speed. However, the cognitive functions associated with the performance on VFT did differ between patient groups. The performance of mild PD patients on both VFT was most strongly predicted by psychomotor speed. It is known that impairments in psychomotor speed are already present at the early stages of PD (Muslimovic et al. 2005; Muslimovic et al. 2009). Also, the mild PD patients who were included in the present study tended to show a worse performance on the psychomotor speed measures than healthy control participants as indicated by medium to large effect sizes. Furthermore, research on healthy individuals demonstrated that psychomotor speed is crucial for successful performance on VFT (Unsworth et al. 2011). These results thus suggest that performance of mild PD patients on VFT is influenced by different cognitive functions which should be taken into consideration when interpreting VFT. Impairments of verbal fluency may not necessarily represent a specific deficit of executive functioning but may be the consequence of other cognitive difficulties, in particular a reduced psychomotor speed.

In contrast to patients with mild PD, the performance of moderate PD patients on both VFT was most strongly predicted by cognitive flexibility. It has previously been found that PD patients showed the greatest degree of decline over a 3-year period on the same cognitive flexibility test (trail making test part B) as used in this study (Muslimovic et al. 2009). This may indicate that aspects of functioning underlying this test become more prominent during the course of the disease and may, therefore, have a stronger impact on VFT performance of moderate PD patients than of mild PD patients. This suggestion is consistent with the result that moderate PD patients tended to show lower cognitive flexibility than mild PD patients, as indicated by a large effect size, and with the small to medium differences in cognitive flexibility between mild PD patients and healthy control participants. Furthermore, studies on the effects of l-Dopa on cognitive functioning showed that l-Dopa alleviates cognitive flexibility impairments in PD patients in mild stages of the disease (Cools et al. 2001; Cools 2006). As the disease progresses, the cognitive flexibility impairments might become less reversible by the use of l-Dopa and, therefore, appear to be more prominent later in the course of the disease. In addition, recent studies show that the cholinergic neurotransmitter system, which is strongly related to cognition, including executive functioning, degenerates as the disease progresses (Bohnen et al. 2006; Hilker et al. 2005). This might also explain why functions such as cognitive flexibility start to play a more prominent role in the performance on VFT as the disease progresses. It has to be pointed out, however, that the test of cognitive flexibility used in the present study is strongly influenced by psychomotor speed. Considering that psychomotor speed is also a strong predictor of VFT performance in mild PD patients, one can argue that psychomotor speed is a consistent factor determining VFT performance of both mild and moderate PD patients. This is in line with studies reporting that PD patients show a decline in psychomotor speed as the disease progresses (Muslimovic et al. 2009). Furthermore, the spatio-temporal progression of dopaminergic degeneration also provides support for this finding, since the degeneration from the dorsal to the ventral fronto-striatal circuitry hampers the supplementary motor area, and the motor and premotor cortex already in the beginning of the disease (Cools 2006). Based on the findings of the present study, it can be concluded that in the moderate stages of PD both cognitive flexibility and psychomotor speed influence the performance on VFT.

The present results must be viewed in the context of some limitations. First, not all participants were assessed with the verbal memory test. Second, the group of moderate PD patients was relatively small. Therefore, the results of some regression analyses should be viewed with caution and a replication of the present findings on a larger group of patients would be desirable. Finally, it needs to be considered that the neuropsychological tests used in the current study were described as tests that capture one aspect of cognitive functioning, e.g., psychomotor speed. Neuropsychological measures are, however, not pure measures of one cognitive function. For example, the trail making test part A is not only measuring psychomotor speed but is also a test for visual search and scanning. Therefore, other cognitive functions may also have influenced test performances of participants.

In conclusion, different cognitive functions underlie the VFT performance of PD patients at different stages of the disease. At the mild stages of PD, psychomotor speed accounts for the performance on VFT, whereas at the moderate stages cognitive flexibility and psychomotor speed are both strong predictors. This indicates that impairments in VFT do not necessarily represent a specific deficit of executive functioning in patients with PD but should rather be interpreted in the context of disease severity and dysfunctions in other domains of cognition.