Introduction

Aging is associated with declining cognitive performance, including a decline in executive function and episodic memory (Nyberg et al. 2012; Wilson et al. 2006). These cognitive impairments are associated with synaptic loss and neural cell death (brain atrophy) (Morrison and Baxter 2012; Westman et al. 2013). The brain regions showing the greatest volumetric reduction in grey matter in healthy elderly individuals are the frontal and temporal lobes, followed by the hippocampus (Bourisly et al. 2015; Lemaitre et al. 2012). Quantification of global and regional grey matter volume loss in the earlier stages of adult life could enable identification of persons at high risk of substantially accelerated cognitive decline and offer opportunities for preventive interventions.

Longitudinal structural Magnetic Resonance Imaging (MRI) has been widely used to assess how brain atrophy contributes to cognitive decline (Frisoni et al. 2010; McDonald et al. 2012; Sluimer et al. 2008). However, there is a paucity of neuroimaging studies that examine the association of regional grey matter volumes with domain-specific cognitive decline in healthy adults. Whilst previous longitudinal studies have supported the relationship between brain atrophy and cognitive decline, they have been limited by short follow-up duration (1–5 years) (Carmichael et al. 2012; Kramer et al. 2007; Mungas et al. 2005; Rusinek et al. 2003). Others have included participants with a wider age range or at older ages (> 65 years old) (Burgmans et al. 2009; Resnick et al. 2003; Sluimer et al. 2008), or individuals at risk of Alzheimer’s disease (AD) with subjective memory complaints or Mild Cognitive Impairment (MCI) (Cardenas et al. 2011; Driscoll et al. 2009; Henneman et al. 2009; Jack et al. 2005; McDonald et al. 2012). These factors may limit the applicability of proposed brain volume predictors for cognitive decline in healthy adults.

The role of hippocampal atrophy in aging and cognitive decline has received significant attention in both neuroimaging and pathological research (Burgess et al. 2002; Fleischman et al. 2013; Hackert et al. 2002; Raz et al. 2005; Yavuz et al. 2007). The hippocampus is known to be affected early in the neurodegenerative process due to the early formation of neurofibrillary tangles (Johnson et al. 2012; Rémy et al. 2005). Although the rate of hippocampal atrophy reflects the severity of cognitive decline and AD progression (Henneman et al. 2009; Yavuz et al. 2007), its relationship with domain-specific cognitive decline in healthy adults, in both cross-sectional and longitudinal studies, is less clear. A large number of studies found an association between hippocampal volume and episodic memory performance in healthy adults (Cardenas et al. 2011; Carmichael et al. 2012; Hackert et al. 2002; Kramer et al. 2007; Persson et al. 2012; Tupler et al. 2007; Van Petten et al. 2004; Ystad et al. 2009). Fewer studies observed a relationship with working memory, executive function or processing speed (Fleischman et al. 2013; Oosterman et al. 2008; Papp et al. 2014). Most of these studies are limited by short follow-up duration (< 5 years); however, one study has assessed hippocampal atrophy and repeated neuropsychological tests over a 10-year period (den Heijer et al. 2010).

In the present study, we examine longitudinal whole brain and hippocampal volumes and subsequent changes in verbal episodic memory and executive function over a 10-year period in a cohort of healthy elderly women. We also consider baseline hippocampal volume and cortical grey matter volumes of the frontal and temporal lobes in order to test their association with longitudinal changes in cognitive performance over 10 years of follow-up assessments.

Materials and methods

Participants

All data for the present study were collected from the Women’s Healthy Ageing Project (WHAP) cohort. In brief, WHAP started in 1992, when 438 Australian women (aged between 45 and 55 years) were selected by random population sampling and enrolled into a prospective longitudinal follow-up study (Szoeke et al. 2013). A detailed description of the inclusion and exclusion criteria is given in a previous publication (Szoeke et al. 2016). To be included in the initial WHAP study, women had to have a determinable baseline menopausal status and an intact uterus, and could not be taking hormonal therapy or oral contraceptives. In the present study, participants who completed a brain MRI scan with cognitive assessments at the baseline (2002) and follow-up (2012) time points were included in the analysis.

In 2002, a total of 60 women (mean age 59.4 ± 2.2 years) underwent a brain MRI scan. Although 60 women were scanned at baseline, the longitudinal analyses were restricted to a subsample of 40 women who had completed cognitive assessments at both time points (2002 and 2012), 23 of whom had follow-up MRI scans in 2012.

The WHAP participants were selected for the present study on the basis of having completed cognitive function assessments and subsequent MRI scans across a decade of longitudinal prospective follow-up. The study protocol for WHAP was approved by the University of Melbourne Human Research Ethics Committee and was fully compliant with the ethical standards of the National Health and Medical Research Council (HREC 931149X2 & 1034765). All subjects provided written informed consent before participation.

MRI acquisition

MR imaging was performed at The Royal Melbourne Hospital, using a 3T MR scanner to acquire high resolution images. In 2002, a 3T GE Signa MR imaging system was used to acquire 3D Axial T1-weighted imaging sequences with voxel dimension = 0.48 × 0.48 × 1.55 mm, repetition time (TR) = 9300 ms; echo time (TE) = 1900 ms and matrix size = 512 × 512 × 110. In 2012, a Siemens 3T Tim Trio MR scanner, with 12-channel phased-array head coil, was used to acquire 3D Magnetisation-Prepared Rapid Gradient Echo (MPRAGE) T1-weighted images with isotropic 1-mm voxel, TR = 1900 ms, TE = 2.98 ms and matrix size = 256 × 256 × 176. Both MR sequences permit global and regional brain volumetric analysis. Scans were collected and transferred to a Linux workstation in the brain imaging lab at the Royal Melbourne Hospital for storage and image processing.

Imaging measures

Baseline cortical and subcortical grey matter volumes

Cortical reconstruction and volumetric segmentation of the frontal cortex, temporal cortex and hippocampus were performed using the FreeSurfer software package, Version 5.3.0 (Laboratory for Computational Neuroimaging, Charlestown, MA, USA). The technical details of this process are described by Fischl (2012). In brief, the procedure includes skull stripping, motion correction, automated Talairach transformation and intensity normalisation. This is followed by automated topological correction of the cortical surface to optimally generate both the grey/white matter and the grey matter/cerebrospinal fluid interfaces (Davatzikos and Bryan 1996; Fischl et al. 1999). Surface parcellation and subcortical segmentation were completed using a predefined anatomically labelled atlas. The reliability of FreeSurfer’s cortical parcellation and whole brain segmentation has been reported previously (Desikan et al. 2006; Destrieux et al. 2010; Fischl 2012; Fischl et al. 2004; Maclaren et al. 2014). Hippocampal volume measures were acquired from the segmentation, and cortical grey matter measurements of frontal and temporal lobes were obtained from cortical parcellation. The FreeSurfer output was visually checked to ensure the cortical reconstruction and subcortical segmentation of the hippocampus were accurate and without topological defects. To account for differences in head size, the output volumes were then divided by the total intracranial volume (ICV) for each subject.
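The head-size correction described above can be illustrated with a minimal sketch. It assumes that regional volumes and ICV are available in the same units and that the ratio is reported as a percentage of ICV, as in the Results. The function name and the numerical values are hypothetical and are not taken from the study data.

```python
def percent_of_icv(volume_mm3: float, icv_mm3: float) -> float:
    """Express a regional volume as a percentage of intracranial volume."""
    if icv_mm3 <= 0:
        raise ValueError("ICV must be positive")
    return 100.0 * volume_mm3 / icv_mm3


# Hypothetical volumes in mm^3 for one subject (illustration only).
subject = {"icv": 1_400_000.0, "frontal": 156_000.0, "hippocampus": 7_000.0}

# Normalise every regional volume by the subject's own ICV.
normalised = {
    region: percent_of_icv(volume, subject["icv"])
    for region, volume in subject.items()
    if region != "icv"
}
```

Dividing by each subject's own ICV makes volumes comparable between subjects with different head sizes, at the cost of assuming that regional volume scales proportionally with ICV.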

Hippocampal atrophy

The FreeSurfer longitudinal stream was run to estimate hippocampal atrophy from the two time points (2002 and 2012). The subcortical segmentation of the hippocampus derived from FreeSurfer agrees well with manual tracing (Cherbuin et al. 2009; Morey et al. 2009). FreeSurfer has shown greater volume overlap with manual tracing of the hippocampus (Dice’s coefficient > 80%) than FSL/FIRST (> 75%) (Wenger et al. 2014). The longitudinal assessments of hippocampal volume were performed in FreeSurfer by creating a base template from the cross-sectional results of the two time points. This method is known to greatly improve the robustness and sensitivity of the overall segmentation (Morey et al. 2010; Reuter et al. 2012). The data were visually inspected at each step of the FreeSurfer process and corrected before any further analysis. The hippocampal atrophy rate was calculated by subtracting the longitudinal outputs from the two time points and was expressed as a percentage of each subject’s ICV.
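As an illustration of the two quantities mentioned in this subsection, the sketch below computes a volume difference between two time points as a percentage of ICV, and a Dice coefficient for two segmentations. It is not the FreeSurfer implementation. The function names, the optional linear annualisation, the representation of a segmentation as a set of voxel indices, and all numerical values are assumptions made for the example.

```python
def atrophy_percent_icv(vol_baseline, vol_followup, icv, years=None):
    """Volume loss between two time points as a percentage of ICV.

    If `years` is given, the loss is divided by it to give a yearly rate
    (simple linear annualisation, an assumption of this sketch).
    """
    if icv <= 0:
        raise ValueError("ICV must be positive")
    loss = 100.0 * (vol_baseline - vol_followup) / icv
    return loss if years is None else loss / years


def dice(voxels_a, voxels_b):
    """Dice coefficient 2|A n B| / (|A| + |B|) for two sets of voxel indices."""
    a, b = set(voxels_a), set(voxels_b)
    if not a and not b:
        return 1.0  # two empty segmentations are treated as identical
    return 2.0 * len(a & b) / (len(a) + len(b))


# Hypothetical hippocampal volumes in mm^3 (illustration only).
total_loss = atrophy_percent_icv(7000.0, 6300.0, 1_400_000.0)
yearly_loss = atrophy_percent_icv(7000.0, 6300.0, 1_400_000.0, years=10)

# Two toy segmentations that share half of their voxels.
overlap = dice({1, 2, 3, 4}, {3, 4, 5, 6})
```

A positive value of the atrophy measure indicates volume loss between baseline and follow-up. A Dice coefficient of 1 indicates identical segmentations and 0 indicates no overlap.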

Whole brain atrophy

Longitudinal whole brain volume changes were measured over a 10-year period using Structural Image Evaluation, using Normalisation, of Atrophy (SIENA), part of the FMRIB-FSL software library, Version 5.0.7 (FMRIB Software Library, Oxford, UK). In brief, SIENA starts with the extraction of skull and brain images from the input data at each time point (Smith et al. 2002). The two brain images are then aligned to each other, using the skull images to constrain the registration, in order to find brain/non-brain edge points at the two time points. The mean edge displacement is converted into a global estimate of the percentage brain volume change (PBVC) between the two time points (Sluimer et al. 2008). This fully automated method provides a robust and accurate estimation of PBVC and is less sensitive to inter-scanner variability between the brain images from the two time points (Cover et al. 2011; Durand-Dubief et al. 2012; Smith et al. 2002; Takao et al. 2011). Moreover, SIENA has been widely used in longitudinal MRI studies of normal aging and neurological disorders (Farzan et al. 2015; Guo et al. 2016; Mak et al. 2015; Sluimer et al. 2008; Takao et al. 2012).

Longitudinal cognitive assessments

Cognitive tests were administered to the WHAP cohort by trained psychologists. Verbal episodic memory was assessed using the delayed recall scores of the California Verbal Learning Test, second edition (CVLT-II) (Delis et al. 1987) and the Consortium to Establish a Registry for Alzheimer’s Disease (CERAD) (Welsh et al. 1994). Executive function was measured by the timed Trail Making Test Part B (TMT-B) (Clark et al. 2004a). As the Trail Making Test is a timed task in which a higher score indicates poorer performance, scores were reversed by subtracting each original score from the maximum group score. Raw scores were then converted to z-scores based on the group mean and standard deviation (SD). The z-scores of the memory tests were averaged into a composite z-score to measure verbal episodic memory. The change in verbal episodic memory and executive function over the 10-year period was calculated by subtracting baseline composite z-scores from the corresponding scores 10 years later. A negative change z-score indicated a decline in cognitive performance. Details on each test’s normative values and administration for this population have been reported previously (Clark et al. 2004a, b; Elkadi et al. 2006a, b; Szoeke et al. 2013). In addition, depressive symptoms were assessed in the WHAP cohort using the Centre for Epidemiological Studies Depression (CES-D) scale. During a mean follow-up of 10 years, three participants developed Mild Cognitive Impairment (MCI).
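The scoring steps above can be sketched as follows. The text does not state which group mean and SD were used at follow-up, nor whether the sample or population SD was used; this sketch assumes that both time points are standardised against the baseline group and that the sample SD is used. All function names and scores are hypothetical.

```python
from statistics import mean, stdev


def reverse_timed(scores, reference):
    """Reverse a timed score so that higher means better performance."""
    top = max(reference)
    return [top - s for s in scores]


def z_scores(scores, reference):
    """Standardise scores against the mean and sample SD of a reference group."""
    m, sd = mean(reference), stdev(reference)
    return [(s - m) / sd for s in scores]


def composite(*z_lists):
    """Average several z-scores per participant into one composite score."""
    return [mean(zs) for zs in zip(*z_lists)]


def change(before, after):
    """Follow-up minus baseline; negative values indicate decline."""
    return [a - b for b, a in zip(before, after)]


# Hypothetical scores for four participants (illustration only).
cvlt_2002, cvlt_2012 = [12, 10, 14, 8], [11, 10, 12, 5]
cerad_2002, cerad_2012 = [8, 7, 9, 6], [8, 6, 8, 4]
tmt_2002, tmt_2012 = [60, 75, 90, 120], [65, 80, 100, 150]  # seconds

# Verbal episodic memory: composite of the two delayed recall z-scores.
memory_2002 = composite(z_scores(cvlt_2002, cvlt_2002),
                        z_scores(cerad_2002, cerad_2002))
memory_2012 = composite(z_scores(cvlt_2012, cvlt_2002),
                        z_scores(cerad_2012, cerad_2002))
memory_change = change(memory_2002, memory_2012)

# Executive function: reverse the timed score, then standardise.
tmt_ref = reverse_timed(tmt_2002, tmt_2002)
exec_2002 = z_scores(tmt_ref, tmt_ref)
exec_2012 = z_scores(reverse_timed(tmt_2012, tmt_2002), tmt_ref)
exec_change = change(exec_2002, exec_2012)
```

Standardising follow-up scores against the baseline group keeps a group-wide decline visible in the change scores. If each time point were standardised against its own group mean, the change scores would average zero by construction.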

Statistical analysis

All statistical analyses were performed using SPSS Version 21.0 (IBM SPSS, Chicago, IL). An independent-samples t-test was run to determine whether baseline demographic characteristics differed between women who agreed to participate in follow-up assessments and women who did not. Using linear regression models, initial analyses related the rates of whole brain and hippocampal atrophy to the z-score changes in verbal episodic memory and executive function that occurred over a decade. Separate regression analyses were computed to model the association of baseline hippocampal, frontal and temporal grey matter volumes with changes in both cognitive functions. These models were used to predict the decline in cognitive performance (outcome variables) as a function of brain volumes (predictor variables at baseline). Linear regression models were then adjusted for age and education. These variables were chosen because they are strongly associated with cognitive functional changes and are considered to potentially confound the association between brain volume and cognition. A two-tailed p-value of < 0.05, unadjusted for multiple comparisons, was considered statistically significant in this study.
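The analyses in this study were run in SPSS. The sketch below only illustrates the form of the adjusted model, in which a cognitive change score is regressed on a brain volume with age and education as covariates. It uses ordinary least squares solved through the normal equations in plain Python. The function names and all data values are hypothetical, and mean-centring of the predictors is a choice made for this example rather than a step reported in the text.

```python
from statistics import mean


def centre(values):
    """Subtract the mean so the intercept is estimated at average covariates."""
    m = mean(values)
    return [v - m for v in values]


def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y.

    Each row of `rows` must start with 1.0 for the intercept.
    """
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(a[idx][col]))
        a[col], a[pivot] = a[pivot], a[col]
        p = a[col][col]
        a[col] = [v / p for v in a[col]]
        for row in range(k):
            if row != col:
                f = a[row][col]
                a[row] = [vr - f * vc for vr, vc in zip(a[row], a[col])]
    return [a[i][k] for i in range(k)]


def r_squared(rows, y, beta):
    """Proportion of outcome variance explained by the fitted model."""
    fitted = [sum(b * x for b, x in zip(beta, r)) for r in rows]
    y_bar = mean(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical data for six participants (illustration only).
hippocampus = [0.50, 0.46, 0.52, 0.44, 0.48, 0.55]  # % ICV at baseline
age = [58, 60, 57, 62, 59, 61]                      # years at baseline
education = [12, 10, 15, 11, 13, 16]                # years
exec_change = [-0.1, -0.6, 0.1, -0.9, -0.3, 0.2]    # change in z-score

rows = [
    [1.0, h, a, e]
    for h, a, e in zip(centre(hippocampus), centre(age), centre(education))
]
beta = ols(rows, exec_change)  # [intercept, volume, age, education]
explained = r_squared(rows, exec_change, beta)
```

Here `beta[1]` is the estimated change in the cognitive z-score per unit of baseline volume with age and education held constant, and `explained` is the proportion of variance accounted for by the full model. Centring does not alter the slopes and improves the numerical conditioning of the normal equations.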

Results

A descriptive summary of demographic characteristics and cognitive measures for the study population at baseline is given in Table 1. The mean age of our study participants at baseline was 59.58 ± 2.25 years. They had a mean education of 13 years. Women who did not participate in cognitive assessments at follow-up were excluded from the analysis. No differences were observed between women who were included in and those who were excluded from the analysis in brain volumes, cognitive scores or depression scores at baseline (Table 2). However, women who participated in the follow-up analysis were better educated than those who did not participate. Therefore, education was included in the adjusted analysis. At baseline, 11 women (18.33% of our participants) carried at least one APOE ε4 allele. An independent-samples t-test showed that whole brain and hippocampal atrophy rates over 10 years were higher for APOE ε4 carriers than for non-carriers, but the differences were not significant (supplementary Table S1). A larger sample of APOE ε4 carriers would likely be required to further investigate this finding.

Table 1 Descriptive statistics for imaging cohort
Table 2 Demographic characteristics, cognitive measures and MRI volumetric findings for imaging cohort at baseline

Baseline brain volumes and cognition (n = 40)

The mean total frontal cortical volume at baseline was 11.17 ± 0.72% ICV, and the mean total temporal cortical volume was 5.86 ± 0.47% ICV. Smaller baseline grey matter volumes of frontal and temporal lobes were associated with greater decline in verbal episodic memory a decade later (p = 0.002 and p = 0.02, respectively). These findings remained significant after adjusting for age and education (Table 3). When the cortical grey matter volumes of both frontal and temporal lobes were combined in the linear regression model, 21% of the variance in verbal episodic memory scores was explained by the baseline frontal volume. In addition, smaller baseline total hippocampal volume was associated with lower performance in executive function over a decade (p = 0.02), adjusted for age and education (Table 3). This indicates that smaller hippocampal volume predicts poorer future performance in executive function.

Table 3 Linear regression for association between baseline brain volume and changes in cognitive function over 10 years (n = 40)

Longitudinal brain volume measurements and cognition (n = 23)

The mean whole brain atrophy was 0.8 ± 0.2% and hippocampal atrophy was 0.6 ± 0.4% per year. Atrophy rates were significantly correlated (r = 0.61, p = 0.002). Higher whole brain and hippocampal atrophy rates were significantly associated with decline in verbal episodic memory across a decade (p = 0.001 and p = 0.03, respectively) (Table 4). When whole brain and hippocampal atrophy rates were combined in the linear regression model, 32% of the variance in the change in verbal episodic memory scores was explained by the whole brain atrophy rate. Thus, these results showed a slightly stronger relationship between whole brain atrophy and changes in verbal episodic memory. These findings remained significant when corrected for age and years of education for whole brain and hippocampal atrophy rates (p = 0.008 and p = 0.02, respectively) (Table 4). This indicates that both higher whole brain and higher hippocampal atrophy rates were associated with poorer verbal memory performance over time. In contrast, no significant association was found between atrophy rates and change in executive function.

Table 4 Linear regression for association between rates of brain atrophy and cognitive function changes over 10 years (n = 23)

Discussion

In this prospective analysis of WHAP participants, the major findings are that baseline hippocampal volumes significantly predict changes in executive function performance, baseline grey matter volumes in frontal and temporal lobes are associated with changes in verbal episodic memory, and higher whole brain and hippocampal atrophy rates are correlated with poorer performance in verbal episodic memory across one decade.

Although smaller hippocampal volume at baseline is known to be sensitive in predicting episodic memory decline in older adults (Mungas et al. 2005; Tupler et al. 2007; Ystad et al. 2009), some studies have found no association with any of the cognitive domains (Cardenas et al. 2011; Carmichael et al. 2012). In our study, we observed that smaller hippocampal volume at baseline is correlated with decline in executive function over a decade of follow-up. These findings are in accordance with previous studies that have examined the relationship between hippocampal volume and executive function and processing speed, which suggest that the hippocampus may have a role beyond memory and might be involved in executive function (Oosterman et al. 2008; Papp et al. 2014). As in our study, both studies were conducted in samples of non-demented adults, but the WHAP cohort was younger at baseline (age range 56–65) and included only women. In addition, evidence from population-based studies of elderly women has shown that decline in executive function precedes decline in memory and that this was associated with increased risk of late life cognitive impairment (Carlson et al. 2009; Johnson et al. 2007). Our findings imply that early measures of hippocampal volume at midlife might be a useful biomarker for predicting changes in late life executive function in elderly women.

We also observed that smaller baseline grey matter volumes in frontal and temporal lobes were associated with greater decline in verbal episodic memory a decade later. Whilst both frontal and temporal grey matter volumes are vulnerable to change with aging (Crivello et al. 2014; Driscoll et al. 2009; Lemaitre et al. 2012; Rusinek et al. 2003), their relationship with domain-specific cognitive decline is inconsistent in cognitively intact subjects. In cross-sectional studies, memory performance has been found to be negatively associated with temporal and prefrontal grey matter volumes in cognitively healthy adults compared to demented individuals (Duarte et al. 2006; Van Petten 2004), whereas an opposite pattern was found in other studies examining longitudinal changes in cognition (Cardenas et al. 2011; Tisserand et al. 2004). These longitudinal studies found that smaller grey matter volume in frontal and temporal lobes was correlated with lower performance in both memory and executive function in elderly adults. They were limited by short follow-up of cognition (1–3 years); in our study, we conducted neuropsychological tests over 10 years and observed similar findings. The role of both frontal and temporal cortices in memory is being increasingly recognised, with evidence of functional connectivity between the prefrontal cortex and the medial temporal lobe during memory tasks (Fjell et al. 2015; Maillet and Rajah 2013). Our results thus suggest that early quantification of frontal and temporal grey matter volumes could help to identify individuals at greater risk of late life memory impairment among healthy elderly women.

A strong relationship between whole brain and hippocampal atrophy and cognitive decline has been demonstrated in MCI and demented individuals (Archer et al. 2010; den Heijer et al. 2010; Fjell et al. 2009; Henneman et al. 2009; Sluimer et al. 2008). Whilst several studies have examined whole brain and hippocampal atrophy in relation to domain-specific cognitive decline in non-demented elderly subjects (Kramer et al. 2007; Mungas et al. 2005; Schmidt et al. 2005; Söderlund et al. 2004), they have been limited by short follow-up (1–6 years). Only one study assessed hippocampal atrophy and cognitive function over a 10-year period (den Heijer et al. 2010). That study examined a group of healthy elderly adults (male and female) with a wider age range (60–90 years old), and also found an association between hippocampal atrophy and decline in verbal episodic memory over 10 years of follow-up. In our study, we confirm that higher whole brain and hippocampal atrophy rates are associated with greater decline in verbal episodic memory over a decade of follow-up in healthy elderly women. These findings extend the diagnostic window to 10 years before cognitive impairment and highlight the importance of longitudinal methods for early therapeutic intervention in late life cognitive decline in healthy elderly women.

The major strength of the present study is the long-term period of our longitudinal analysis. The data were sampled from a large, prospective, longitudinal study of the WHAP cohort with follow-up assessments of cognitive function that included both verbal episodic memory performance and executive function over a decade. Another strength of this study is that it consists of participants within a narrow age range, reducing the confounding influence of age on brain atrophy and allowing a more accurate association between brain atrophy and changes in cognitive function to be determined. In addition, the automated image segmentation methods used for the longitudinal and cross-sectional volumetric analyses produced results similar to those found in previous studies (Apostolova et al. 2012; Morey et al. 2009; Sluimer et al. 2008; Smith et al. 2007).

Our cohort consisted only of cognitively healthy elderly women; this may limit the extent to which we can generalize our findings to a mixed population (male and female) with cognitive impairment. Although the WHAP cohort comprised a female-only population, studying this population is important, as women show greater brain atrophy and have a higher risk of developing dementia compared to men (Crivello et al. 2014; Guo et al. 2016; Hua et al. 2010). However, the small sample size included in our longitudinal study limited our ability to control for other confounding factors such as genetic variation and vascular diseases. Of the baseline cohort, only 1/3 of the participants underwent longitudinal brain MRI scans, and this may have biased analyses of long-term atrophic changes. Previous neuroimaging studies with 10-year follow-up periods had more than 50% dropout from the original cohort and included participants who developed MCI (den Heijer et al. 2010; Driscoll et al. 2009). Our study had 30% attrition, with no differences between those who were included in and those who were excluded from the analysis. In addition, the use of scanners from different manufacturers at the two time points may also introduce a bias in the mean volume differences. This limitation is common in longitudinal neuroimaging studies over such an extended period because of scanner and software upgrades (Durand-Dubief et al. 2012; Takao et al. 2011). Despite the reliability of the FreeSurfer software for hippocampal volumes (Cherbuin et al. 2009; Morey et al. 2009), manual segmentation techniques, which are the gold standard, would be more accurate. Another possible limitation of our study is that cerebrovascular pathology was not directly assessed and measured. Further research is needed to clarify the relationship between cerebrovascular pathology, brain volume and cognition.

Conclusion

Structural brain volume changes identified in our cross-sectional and longitudinal MRI analysis may allow early prediction of cognitive function with normal aging in elderly women. The present study suggests that midlife measures of hippocampal, cortical frontal and temporal volumes provide domain-specific contributions to the assessment of future cognitive decline. These findings may be relevant to our understanding of the neurocognitive processes with aging and provide a useful biomarker for therapeutic interventions to delay or prevent the progression of dementia.