Introduction

Semantic memory is the component of declarative memory that supports the retrieval and use of cumulative knowledge about the meaning of words, people, places, and things without reference to the source, temporal context, or experience that produced that knowledge. The most common clinical tests that measure aspects of semantic memory are vocabulary tests, general knowledge tests, object naming tests, and word fluency tests. The Pyramids and Palm Trees Test (PPT), originally developed by Howard and Patterson [1], is another popular task routinely used in clinical settings to assess semantic memory. This task measures the capacity to access detailed semantic information about words and pictures, which is necessary for identifying the analogies that conceptually link two perceptually and functionally distinct entities. More precisely, in the word version of this task, the participant is presented with three written words, one above the other two, and must match the top word (e.g., bottle) to whichever of the other two words (e.g., cup and grass) is most closely associated with it. The picture version of the PPT is similar to the word version except that pictures are used rather than words.

The PPT, together with other semantic memory tasks, has proved particularly useful for the cognitive assessment of semantic dementia (SD) and Alzheimer’s disease (AD). These neurodegenerative pathologies predominantly affect the temporal lobe and are associated with semantic memory impairments. Individuals affected by SD have modest to severe semantic disorders that are observable across both visual and verbal modalities, and for both production and comprehension [24]. Alzheimer’s disease also causes major semantic memory problems, albeit somewhat milder than those observed in SD, in addition to problems with episodic memory, executive functions and, in some cases, visual perception [4, 5]. It is important to note that, although SD and AD patients show the most severe, pure and consistent semantic disorders, patients who do not show typical semantic deficits may also perform poorly on some semantic memory tasks. For instance, Rogers et al. [4] recently showed that patients with SD, AD, progressive nonfluent aphasia (PNFA), frontal variant frontotemporal dementia (fvFTD) and posterior cortical atrophy (PCA) all had abnormal scores (relative to controls) on different semantic measures (i.e., category fluency, confrontation naming, PPT). However, these groups of patients showed different patterns of performance across measures.
These results suggest that the apparent semantic impairments in some neurodegenerative disorders are secondary to deficits in other cognitive abilities, such as verbal production (for spoken stimuli, responses, or both, as in PNFA), visual perception (for pictorial stimuli, as in PCA), and executive functioning (necessary to deal with multiple response alternatives or to generate serial responses, as in fvFTD), all of which are inherently taxed by tests that ostensibly measure semantic memory. More importantly, Rogers et al.’s [4] results suggest that, in order to achieve a better characterization of semantic memory impairments in the different neurodegenerative syndromes, it is important to use multiple tasks that span visual and verbal modalities and tap mechanisms of reception and expression.

To the best of our knowledge, the only normative data on PPT performance are those of Howard and Patterson [1] and the preliminary normative data recently obtained by Gudayol-Ferré et al. [6] and by Rami et al. [7]. However, the former set of data was obtained from a sample of only 60 healthy English-speaking subjects, while the latter refer to a Spanish-speaking population. The present study thus aimed to provide normative data, based on a large sample of the elderly Italian population, for both the word and the picture versions of the PPT. The effects of gender and education on performance in the PPT were also analyzed. The availability of normative data based on a large sample will allow a more reliable use of this neuropsychological tool for clinical assessment in the Italian-speaking dementia population.

Method

Participants

A total of 464 healthy elderly adults (226 males and 238 females) participated in this study on a voluntary basis. Selection criteria for participating in this study were as follows: (a) a score of at least 21 on the Mini Mental State Examination (MMSE; [8]), (b) no history of significant neurological and/or psychiatric disorders, (c) no evidence of metabolic, endocrine or nutritional deficiencies, (d) age of at least 49 years, (e) at least 3 years of education. To meet these selection criteria, data from 101 of the 565 individuals originally recruited were excluded. The final sample had a mean age of 64.52 years (range = 49–94; SD = 9.54) and a mean educational level of 8.5 years of schooling (range = 3–25; SD = 4.69). The distribution of the study group by age, gender, and education is presented in Table 1.

Table 1 Distribution of the 464 participants by age, gender and educational level

In order to assess whether educational level (years of education) differed across age groups and gender, a 4 × 2 ANOVA was performed with age-group (four levels: 49–59; 60–69; 70–79; 80–95) and gender (two levels) as between-subjects factors and education as the dependent variable. Results showed a significant effect of age-group (F (3,456) = 24.27; \( \hat{\eta }_{\text{p}}^{2} \) = 0.138; P < 0.01), no effect of gender and no interaction. Post hoc analysis (Unequal N HSD test) for age-group revealed that older adults were less educated than younger adults. More precisely, adults aged 49–59 (M = 10.53; SD = 5.12) were significantly more educated than all the older groups (60–69, M = 8.48, SD = 4.27; 70–79, M = 6.46, SD = 3.47; 80–95, M = 5.53, SD = 3.38). Similarly, adults aged 60–69 were more educated than the two oldest groups (70–79 and 80–95), whose educational levels did not differ from each other.

Material and procedure

Both the word and the picture versions of the PPT consisted of 52 triads (plus 3 practice triads that were excluded from the analyses), presented on an A4 page (see Appendix for the Italian words used in the word version of the PPT). The target stimulus was always presented above the other two. For each triad, participants were asked to select, by pointing with a finger, the bottom item that was semantically related to the item at the top. The maximum score was 52. Items for which the participant remained hesitant were scored 0.5 points, as suggested by Howard and Patterson [1]. All participants performed both the word and picture versions of the PPT as part of a broader research program on the effects of healthy aging on different aspects of memory functioning. The word and picture versions of the PPT were administered on two different days, about 2 weeks apart. The order of presentation of the word and picture versions was counterbalanced across participants.
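The scoring rule described above can be illustrated with a minimal Python sketch. The three response categories and all names (`Response`, `ppt_total`) are assumptions introduced here for illustration; in particular, treating "hesitant" as a separate category alongside correct and incorrect is one reading of the rule, not a detail given in the text.

```python
from enum import Enum


class Response(Enum):
    """Hypothetical coding of a participant's answer to one triad."""
    CORRECT = "correct"
    HESITANT = "hesitant"    # participant remained undecided
    INCORRECT = "incorrect"


# Points per response; 0.5 for hesitant items follows the rule in the text.
POINTS = {Response.CORRECT: 1.0, Response.HESITANT: 0.5, Response.INCORRECT: 0.0}

N_TEST_TRIADS = 52  # the 3 practice triads are not scored


def ppt_total(responses):
    """Return the raw PPT score (0 to 52) for the 52 scored triads."""
    if len(responses) != N_TEST_TRIADS:
        raise ValueError(f"expected {N_TEST_TRIADS} scored triads, got {len(responses)}")
    return sum(POINTS[r] for r in responses)


# Example: 50 correct answers, 1 hesitation, 1 error.
example_responses = (
    [Response.CORRECT] * 50 + [Response.HESITANT] + [Response.INCORRECT]
)
example_total = ppt_total(example_responses)
```

Under these assumptions the example participant obtains a raw score of 50.5 out of 52.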

Statistical methods

The statistical analyses included various methods: analysis of variance (ANOVA) and covariance (ANCOVA), and simple and multiple linear regressions.

To assess the effect of age on semantic memory performance, an ANOVA was performed; an ANCOVA was then run in order to control for educational level.

As a first step in the standardization of the test, simple linear regression analyses were conducted as a preliminary check of which demographic variables had sufficient predictive value to be entered in the final models. Finally, a multiple regression analysis was run in order to generate the prediction equation, which is essential for obtaining the correction factors for the test scores.

Results

Effects of age and educational level on performance in the PPT

Raw scores from the word and the picture versions of the PPT were analyzed in two different ways.

Initially, to assess whether semantic memory performance decreases with increasing age, two different ANOVAs (one for each version of the test) with age-group (four levels: 49–59; 60–69; 70–79; 80–95) as a between-subject factor and subjects’ test scores as a dependent variable were run. Results showed a significant effect of age-group both in the word (F (3,460) = 5.24; \( \hat{\eta }_{\text{p}}^{2} \) = 0.033; P < 0.01) and picture (F (3,460) = 8.30; \( \hat{\eta }_{\text{p}}^{2} \) = 0.051; P < 0.01) versions of the test. Trend analysis showed a decreasing linear trend for both the word (F (1,460) = 8.94; \( \hat{\eta }_{\text{p}}^{2} \) = 0.019; P < 0.01) and picture (F (1,460) = 17.39; \( \hat{\eta }_{\text{p}}^{2} \) = 0.036; P < 0.01) test versions.
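As a reading aid, the partial eta squared values reported next to each F ratio can be recovered from F and its degrees of freedom through the standard identity \( \hat{\eta }_{\text{p}}^{2} = F \cdot df_{1} / (F \cdot df_{1} + df_{2}) \). The short Python sketch below applies this identity to the four F ratios reported above; the function name is an illustrative choice, not part of the original analysis.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F ratio and its degrees of freedom.

    Uses the identity eta_p^2 = (F * df_effect) / (F * df_effect + df_error).
    """
    explained = f_value * df_effect
    return explained / (explained + df_error)


# F ratios reported in the text for the one-way ANOVAs and trend analyses.
reported = {
    "word, age-group": (5.24, 3, 460),
    "picture, age-group": (8.30, 3, 460),
    "word, linear trend": (8.94, 1, 460),
    "picture, linear trend": (17.39, 1, 460),
}

effect_sizes = {
    label: round(partial_eta_squared(f, df1, df2), 3)
    for label, (f, df1, df2) in reported.items()
}
```

Rounded to three decimals, the computed values agree with the effect sizes reported in the text (0.033, 0.051, 0.019 and 0.036).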

As education was found to differ across age-groups, the effect of age on semantic memory was assessed after controlling for educational level. Specifically, two separate ANCOVAs (one for each version of the test) were conducted, with age-group as a between-subjects factor, subjects’ test scores as the dependent variable, and education (years of school) as a covariate. For both versions of the test, results showed a significant effect of the covariate (word version: F (1,459) = 10.06; \( \hat{\eta }_{\text{p}}^{2} \) = 0.021; P < 0.01; picture version: F (1,459) = 15.51; \( \hat{\eta }_{\text{p}}^{2} \) = 0.033; P < 0.01). Results also indicated that, after controlling for educational level, age-group differences in performance were still significant in the picture version (F (3,459) = 3.51; \( \hat{\eta }_{\text{p}}^{2} \) = 0.022; P < 0.05) but not in the word version (F (3,459) = 2.14; \( \hat{\eta }_{\text{p}}^{2} \) = 0.014; P = 0.09) of the PPT. Trend analyses showed a significant decreasing linear trend of age-group for the picture version (F (1,459) = 8.19; \( \hat{\eta }_{\text{p}}^{2} \) = 0.018; P < 0.01) and a marginal trend for the word version (F (1,459) = 3.73; \( \hat{\eta }_{\text{p}}^{2} \) = 0.008; P = 0.054).

Vocabulary knowledge is known to be related to years of schooling: several studies have observed that fewer years of schooling are usually associated with lower vocabulary knowledge [9]. Therefore, given that older adults were less educated than younger adults, it seems reasonable to assume that the greater impact of education on the word version, as compared to the picture version, can be accounted for by between-group differences in vocabulary knowledge.

Standardization of the word and picture versions of the PPT

To standardize the two versions of the PPT, the statistical methodology proposed by Spinnler and Tognoni [10] and by Carlesimo et al. [11] was adopted. As a preliminary step, simple linear regression coefficients between the scores obtained in the two versions of the PPT and age, education, and gender were calculated. Then, only those independent variables that significantly predicted the dependent variable (P < 0.01) were entered in the final models. As a result, only education and age were considered for the final regression equations, one for each version of the PPT, while gender was excluded as it was not significant. Before computing the regression coefficients, the most convenient transformations were sought. Education scores were transformed into quadratic scores. Such a transformation was necessary in order to avoid the application of non-linear regression coefficients, which are less easy to interpret and use [11]. On the other hand, transformation of age scores was not necessary. Indeed, transformations of age scores did not yield regression coefficients higher than those obtained with untransformed age scores.

In order to obtain the correction factors, regression coefficients were calculated by fitting the final linear model (one for each version of the PPT), which considered the influence of age and education on semantic performance. Correction tables for the observed scores (Tables 3, 5) were then created from the prediction equation by entering into it the values of age and education corresponding to each cell of the table.
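The logic of a regression-based correction can be sketched as follows. This is only an illustration of the general approach, assuming that the adjusted score is the raw score minus the predicted effect of age and squared education relative to the sample means. The regression coefficients below are hypothetical placeholders (the fitted coefficients are not reported in the text), and the mean of squared education is likewise an illustrative value; only the mean age (64.52 years) is taken from the sample description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CorrectionModel:
    """Linear model with age and squared education as predictors."""
    b_age: float         # score change per year of age
    b_edu_sq: float      # score change per unit of (years of education)^2
    mean_age: float      # sample mean of age
    mean_edu_sq: float   # sample mean of squared education

    def correction(self, age, education_years):
        """Value to add to a raw score to remove age and education effects."""
        predicted_deviation = (
            self.b_age * (age - self.mean_age)
            + self.b_edu_sq * (education_years ** 2 - self.mean_edu_sq)
        )
        return -predicted_deviation

    def adjusted_score(self, raw_score, age, education_years):
        """Raw score corrected for age and education."""
        return raw_score + self.correction(age, education_years)


# HYPOTHETICAL coefficients, for illustration only; not the study's values.
example_model = CorrectionModel(
    b_age=-0.05,        # assumed: scores decrease with age
    b_edu_sq=0.01,      # assumed: scores increase with education
    mean_age=64.52,     # reported sample mean age
    mean_edu_sq=94.25,  # assumed placeholder for the mean of education squared
)

# An 80-year-old with 5 years of schooling and a raw score of 45.
example_adjusted = example_model.adjusted_score(45.0, age=80, education_years=5)
```

With these placeholder coefficients, an older and less educated participant receives a positive correction, whereas a participant at the sample means receives none. A correction table is simply this function evaluated over a grid of age and education values.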

Afterwards, non-parametric tolerance limits were calculated. The lowest limit of the tolerance interval was established with the ranks method [11]. Setting a confidence level of 95% and a one-tailed tolerance limit of 95% [12] for the 464 observations, the rank of 16 was identified. After correction, the observations for each version of the PPT were ranked in increasing order. The score corresponding to the 16th observation was set as the lowest tolerance limit. Non-parametric limits were chosen because the shape of the population distribution is unknown, so its normality cannot be assumed. By contrast, non-parametric techniques do not need to assume that the data are normally distributed and are based exclusively on the rank order of the observations.
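The rank used for the tolerance limit can be illustrated with a short calculation. The sketch below assumes the usual exact binomial criterion for a one-sided non-parametric tolerance limit: the chosen rank r is the largest one for which, with the stated confidence, at least the stated proportion of the population lies above the r-th smallest observation. This criterion and the function names are assumptions introduced here; the text only states that the ranks method of [11] was used.

```python
from math import comb


def binomial_upper_tail(r, n, p):
    """P(X >= r) for X ~ Binomial(n, p), computed exactly."""
    below = sum(comb(n, k) * p ** k * (1.0 - p) ** (n - k) for k in range(r))
    return 1.0 - below


def lower_tolerance_rank(n, coverage=0.95, confidence=0.95):
    """Largest rank r such that, with the given confidence, at least
    `coverage` of the population lies above the r-th smallest observation.
    Returns 0 when the sample is too small for any rank to qualify.
    """
    p_tail = 1.0 - coverage
    rank = 0
    while rank < n and binomial_upper_tail(rank + 1, n, p_tail) >= confidence:
        rank += 1
    return rank


def lower_tolerance_limit(corrected_scores, coverage=0.95, confidence=0.95):
    """Score at the tolerance rank of the sorted corrected scores."""
    rank = lower_tolerance_rank(len(corrected_scores), coverage, confidence)
    if rank == 0:
        raise ValueError("sample too small for the requested tolerance limit")
    return sorted(corrected_scores)[rank - 1]


rank_for_sample = lower_tolerance_rank(464)
```

For 464 observations this criterion yields a rank of 16, which matches the rank reported above.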

To sum up, the best linear model was first extracted for each version of the PPT. Observations were then corrected for age and education according to the correction tables derived from the linear model. Finally, corrected scores were ranked on an ordinal scale and the cut-off was established.

Word version of the PPT

Results from the final multiple regression analysis showed that both age (F (1,461) = 7.18; \( \hat{\eta }_{\text{p}}^{2} \) = 0.015; P < 0.01) and education (F (1,461) = 9.90; \( \hat{\eta }_{\text{p}}^{2} \) = 0.021; P < 0.01) were significant predictors of performance in this version of the PPT. The mean scores as a function of age and education for the word version of the PPT are presented in Table 2. The correction table for the word version (Table 3) shows the adjustments to be added to the original score, whenever appropriate, to account for the effects of age and education. The lowest tolerance limit for this version of the PPT corresponds to 40.78.

Table 2 Mean number (and standard deviation) of correct responses as a function of age for the word version of the PPT (N = number of participants)
Table 3 Correction values for age and education for raw scores on the word version of the PPT

Picture version of the PPT

Results from the final multiple regression analysis showed that both age (F (1,461) = 9.22; \( \hat{\eta }_{\text{p}}^{2} \) = 0.020; P < 0.01) and education (F (1,461) = 16.96; \( \hat{\eta }_{\text{p}}^{2} \) = 0.035; P < 0.001) were also significant predictors of performance in this version of the PPT. The mean scores as a function of age and education for the picture version of the PPT are presented in Table 4. The correction table for the picture version (Table 5) shows the adjustments to be added to the original score, whenever appropriate, to account for the effects of age and education. The lowest tolerance limit for this version of the PPT corresponds to 40.15.

Table 4 Mean number (and standard deviation) of correct responses as a function of age for the picture version of the PPT (N = number of participants)
Table 5 Correction values for age and education for raw scores on the picture version of the PPT

Discussion

The aim of this study was to provide standardization and normative data for the word and the picture versions of the PPT. The analyses revealed that both education and age influenced performance.

The influence of education on semantic memory has not yet been clearly defined. Some earlier studies reported a significant effect of education on some semantic memory measures, such as the picture version of the PPT ([6, 7]; the word version was not administered in these studies), the Boston Naming Test [13], semantic verbal fluency [14, 15], vocabulary and general knowledge [15, 16], whereas others did not [17, 18]. Similarly, extant results on the effects of age on semantic memory are somewhat controversial. Within the bulk of the existing literature, there is in fact evidence for age invariance in a variety of semantic memory tasks, including object descriptions [19], semantic priming [20], lexical decisions [21], classification speed [22], and category access [23]. At the same time, there is evidence for age-related deficits on some other semantic memory tasks, such as retrieval of category information [24], semantic fluency [25], and picture naming [13, 26]. More recent studies have, however, shown that adjustment for educational factors reveals a convex age-related trend, consistent with small improvements from age 35 to age 65, followed by a late deficit [27, 28]. Education therefore appears to be more important than adulthood per se for semantic memory functioning [27].

Being based on data from a large sample, the present study provides convincing evidence that performance on both the word and the picture versions of the PPT is influenced by demographic variables, and that individual scores should therefore be adjusted accordingly before any inference is made about the performance of individual patients.

The PPT can be used in the clinical assessment of the Italian-speaking dementia population. It is important to note, however, that the PPT should be administered in clinical contexts and for diagnostic purposes only after a thorough evaluation of the specificity and sensitivity of the task [11]. Such indexes of specificity and sensitivity can be obtained by comparing the performance of the normative sample included in the present research with that of large groups of patients affected by various dementia syndromes. Future research might therefore identify more reliable values for the diagnosis of diverse forms of dementia.