Introduction

Breast cancer survivors have long reported cognitive complaints during and after ending chemotherapy treatment. Many patients (up to 60–75%) experience mild cognitive deficits in domains mainly involving memory, attention, psychomotor speed, and executive functioning (Janelsins et al. 2014; Wefel et al. 2015). Despite the partial protection by the blood-brain barrier, chemotherapy could affect the brain through direct and/or indirect neurotoxicity (Han et al. 2008; Ahles et al. 2010; Vichaya et al. 2015) potentially leading to cognitive deficits (Wefel and Schagen 2012; Dietrich et al. 2015).

Recently, neuroimaging studies started to examine possible neural correlates of cognitive impairment following chemotherapy. Both functional and structural brain changes have been reported (de Ruiter and Schagen 2013; Deprez et al. 2013b; McDonald and Saykin 2013; Pomykala et al. 2013; Saykin et al. 2013). A limited number of neuroimaging studies have explored possible cerebral white matter (WM) alterations using diffusion tensor imaging (DTI) (Deprez et al. 2013a). Differences in WM microstructure have been reported in breast cancer survivors when comparing standard dose chemotherapy-treated patients with controls up to 5 months (Deprez et al. 2011), 22 months (Abraham et al. 2008) and 5 years (Kesler et al. 2015) after chemotherapy, whereas 10 years (Stouten-Kemperman et al. 2015b) and 20 years (Koppelmans et al. 2014) later, such differences could no longer be observed. One cross-sectional study reported differences in WM microstructure 10 years after treatment with high-dose chemotherapy (de Ruiter et al. 2011).

In our previous study (Deprez et al. 2012), to the best of our knowledge the only longitudinal DTI study to date regarding this topic, we compared DTI measures at baseline and three to four months after chemotherapy. We reported WM microstructure changes in the group of chemotherapy-exposed patients, whereas no changes were found in the group of non-chemotherapy-exposed patients or matched healthy controls. Furthermore, the observed WM changes correlated significantly with performance decreases in verbal memory and attention.

In all of these studies, the DTI measure fractional anisotropy (FA) was used to explore WM changes in the brain. FA measures the presence of a preferred direction of diffusion and may be reduced due to changes in cell density and architectural organization. Therefore, decreased FA is frequently attributed to WM damage (Concha 2014). However, it is a nonspecific measure that cannot differentiate between intra- and extra-axonal changes or assess myelin (Tournier et al. 2011; Jones et al. 2013). More advanced multi-shell diffusion magnetic resonance imaging (dMRI) can provide metrics that are sensitive to both intra- and extra-axonal diffusion. One of them is diffusion kurtosis imaging (DKI) (Jensen et al. 2005). This extension of DTI measures the kurtosis of the diffusion probability distribution function, possibly reflecting tissue ‘complexity’ (presence of tissue compartments with different diffusion properties). Another advanced model is “neurite orientation dispersion and density imaging” (NODDI), which yields indirect measures of axonal dispersion and density (Zhang et al. 2012). A complementary MRI technique, ‘myelin water imaging (MWI)’, which is not based on diffusion, provides an indirect yet more specific estimate of myelin content (MacKay et al. 2006; Laule et al. 2007). Combining these complementary metrics can improve the characterization of WM (Billiet et al. 2014), but this approach has not yet been applied to study the evolution of brain changes following chemotherapy.
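
For readers less familiar with DKI, the relation between kurtosis and the diffusion-weighted signal can be written out. The expression below is the commonly used second-order signal expansion from the DKI literature (Jensen et al. 2005); it is not stated in this paper, and the symbols S_0, D_app and K_app (non-diffusion-weighted signal, apparent diffusivity and apparent kurtosis along one gradient direction) are introduced here for illustration only.

```latex
\ln \frac{S(b)}{S_0} = -\,b\,D_{\text{app}} + \frac{1}{6}\,b^{2}\,D_{\text{app}}^{2}\,K_{\text{app}} + \mathcal{O}(b^{3})
```

With K_app = 0 the expression reduces to the mono-exponential decay assumed by DTI, which is why estimating kurtosis requires more than one non-zero b-value.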

More research is needed including these complementary WM MRI techniques to determine whether WM changes after standard-dose chemotherapy are reversible or whether there is long-term or even delayed WM damage. We therefore invited the same cohort of our longitudinal study for a third MRI scan and neuropsychological evaluation three to four years after therapy. We employed two study designs: (1) in a longitudinal design, we investigated if there was evidence of WM recovery and its relation with cognitive changes, using DTI and neuropsychological testing in the chemotherapy-treated patient group, and (2) in a cross-sectional design, we investigated if subtle microstructural WM differences between groups at the third time point may still be detectable using previously unavailable advanced diffusion MRI techniques and MWI.

Patients and methods

Participants

The study was approved by the local Ethical Commission and conducted in accordance with the Declaration of Helsinki. Participants evaluated in a previous longitudinal DTI study at baseline (t1) and three to four months after treatment (t2), as described in Deprez et al. (2012), were contacted three to four years later for a third MRI scanning and cognitive testing session. Of the initial 34 women with early-stage breast cancer who received adjuvant chemotherapy (C+), 16 who did not receive chemotherapy (C-), and 19 healthy controls (HCs), 25 C+, 14 C- and 15 HCs, respectively, could be included for evaluation at the third time point (t3). See Fig. 1 for details on dropout and exclusion. All imaging and neuropsychological analyses include only people who participated at all three time points.
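
The drop-out implied by these counts can be checked with a few lines of arithmetic. The sketch below uses only the group sizes given in this paragraph; the function and variable names are illustrative and not part of the study's analysis.

```python
# Participant counts (initial cohort, evaluated at t3), taken from the text.
counts = {
    "C+": (34, 25),
    "C-": (16, 14),
    "HC": (19, 15),
}


def dropout_percent(n_start: int, n_end: int) -> float:
    """Percentage of the initial group that was not evaluated at t3."""
    return 100.0 * (n_start - n_end) / n_start


dropout = {group: dropout_percent(a, b) for group, (a, b) in counts.items()}

for group, pct in dropout.items():
    print(f"{group}: {pct:.1f}% not evaluated at t3")
```

Rounded to whole percentages, these values correspond to the drop-out rates quoted in the methodological considerations.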

Fig. 1

Overview of drop-out and exclusion of participants at t3

Acquisition details for magnetic resonance imaging scans

(1) To assess longitudinal changes between the three time points, the same 3 T scanner (Philips Intera, Best, the Netherlands) and eight-channel phased-array head coil were used as in our previous study (Deprez et al. 2012). Whole-brain DTI spin-echo echo planar imaging (SE-EPI) was acquired with the following scanning parameters: 68 contiguous sagittal slices, 112 × 109 matrix size, 220 × 220 mm² FOV, 4956 ms TR, 55 ms TE, 2.5 parallel imaging factor, 2.2-mm slice thickness, 1.96 × 1.96 × 2.2 mm³ voxel size. Diffusion gradients were applied along 45 non-collinear directions with a b-value of 800 s/mm². Additionally, one non-diffusion-weighted (b = 0) set of images was acquired, resulting in a total scan time of 10.34 min. A T1-weighted whole-brain 3D-TFE (182 contiguous coronal slices; 250 × 250 mm² FOV; 4.6 ms TE; 9.7 ms TR; 1.2 mm slice thickness; 256 × 256 matrix; 0.98 × 0.98 × 1.2 mm³ voxel size), a T2-weighted TSE (28 transverse slices; 230 × 184 mm² FOV; 4 mm slice thickness; 3000 ms TR; 80 ms TE), and a FLAIR (28 transverse slices; 230 × 183 mm² FOV; 125 ms TE; 11,000 ms TR; 2800 ms IR delay; 4 mm slice thickness; 256 × 256 matrix; 0.65 × 0.87 × 4 mm³ voxel size) were also acquired to search for primary brain pathology as an exclusion criterion.

(2) For advanced dMRI as well as MWI for in-depth cross-sectional evaluation of the WM at t3, a newer 3 T scanner (Philips Achieva, Best, the Netherlands) with a 32-channel phased-array head coil was used. The diffusion MRI data consisted of high-angular resolution diffusion imaging (HARDI) datasets with b-values 700, 1000 and 2800 s/mm², acquired along 25, 40 and 75 directions, respectively, in addition to 10 b = 0 images. Constant parameters were TR/TE = 7600/65 ms, 58 slices and isotropic voxel size 2.5 mm (Poot et al. 2010). The myelin water imaging data consisted of 32 echoes with first TE and ΔTE = 10 ms, TR = 1 s, reconstructed voxel size 1 × 1 × 2.5 mm³ (Prasloski et al. 2012b).

Image processing and analysis

Longitudinal DTI image analysis

DTI pre-processing at t3 was performed in exactly the same way as described previously (Deprez et al. 2012) for t1 and t2, using ExploreDTI (Leemans et al. 2009), and included (i) visual quality assurance, (ii) subject motion and distortion correction with reorientation of the b-matrix (Leemans and Jones 2009) and (iii) an iterative nonlinear tensor estimation process to generate FA-maps. DTI datasets were non-rigidly registered to a population-based DTI atlas generated from all participants’ DTI data (Van Hecke et al. 2008; Van Hecke et al. 2011). Finally, the spatially normalized maps were smoothed with an anisotropic smoothing kernel (FWHM = 6 mm) (Sage et al. 2009; Van Hecke et al. 2010).

For each group, a whole-brain voxel-based repeated-measures ANOVA model was applied to FA, using SPM8 (Ashburner and Friston 2011), with time as within-subject factor and verbal IQ and depression score as covariates-of-no-interest (Deprez et al. 2012). As before (Deprez et al. 2012), covariates (i.e. scanner upgrade and goodness of tensor-estimation-fit) were added to account for scanner drift, upgrades and maintenance (Harrison et al. 2011; Takao et al. 2012). From the repeated-measures ANOVA model we extracted adjusted mean FA values at all time points in the four regions of interest (ROIs) previously associated with chemotherapy-related changes from t1 to t2 (Deprez et al. 2012) (Fig. 2): a region covering (1) the corona radiata and corpus callosum, (2) frontal (superior longitudinal fasciculus (SLF)), (3) parietal (SLF) and (4) occipital (forceps major) WM tracts. Post-hoc tests with Bonferroni correction were used to compare main effects between timepoints.

Fig. 2

Longitudinal results in ROI. Mean FA values of patients that participated at the 3 timepoints in the 4 regions of interest. t1, assessment at baseline; t2, assessment 3–4 months and t3: 3–4 years after treatment. ROI1: region covering parietal part of corona radiata and corpus callosum; ROI2: region covering frontal part of superior longitudinal fasciculus; ROI3: region covering parietal part of superior longitudinal fasciculus; ROI4: region covering part of forceps major. * corrected p-value <0.05 ** corrected p-value <0.001

During the period in which the t3 scans were acquired, one hardware maintenance event took place. Unfortunately, all scans of the non-chemotherapy-treated breast cancer patients were acquired after this maintenance. It was therefore not possible to build the repeated-measures ANOVA model for this patient group, because the covariate defining the group and the covariate reflecting the scanner upgrade coincide and their effects cannot be separated.

Advanced cross-sectional dMRI and MWI analysis

Diffusion tensors and diffusion kurtosis tensors were estimated from the concatenated diffusion datasets. FA and mean, axial and radial diffusivity (MD, AD and RD) were estimated from the diffusion tensor, and mean kurtosis (MK) from the kurtosis tensor using ExploreDTI (Leemans et al. 2009). The mean kurtosis (MK) from diffusion kurtosis imaging (DKI) is related to the relative presence of compartments with restricted and hindered diffusion properties. The motion and distortion corrected DWIs were also processed using the NODDI toolbox (Zhang et al. 2012). NODDI yields parameters estimating the cerebrospinal fluid fraction (CSF) and neurite density index (NDI). This is the ratio of intracellular-like diffusion (e.g. axons and dendrites) over extracellular-like diffusion (e.g. glia). The misalignment of white matter fibers is modeled through the orientation dispersion index (ODI). This is the NODDI-alternative for FA.
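
As an illustration of how the tensor-derived scalars relate to one another, the sketch below computes FA, MD, AD and RD from the three eigenvalues of a diffusion tensor using the standard textbook definitions. It is not the ExploreDTI implementation used in the study; the function name and the example eigenvalues are hypothetical.

```python
import math


def dti_scalars(ev1: float, ev2: float, ev3: float) -> dict:
    """Scalar DTI metrics from the three diffusion tensor eigenvalues."""
    # Sort so that l1 is the largest (principal) eigenvalue.
    l1, l2, l3 = sorted((ev1, ev2, ev3), reverse=True)
    md = (l1 + l2 + l3) / 3.0          # mean diffusivity
    ad = l1                            # axial diffusivity
    rd = (l2 + l3) / 2.0               # radial diffusivity
    num = (l1 - md) ** 2 + (l2 - md) ** 2 + (l3 - md) ** 2
    den = l1 ** 2 + l2 ** 2 + l3 ** 2
    # FA ranges from 0 (isotropic) to 1 (diffusion along a single axis).
    fa = math.sqrt(1.5 * num / den) if den > 0 else 0.0
    return {"FA": fa, "MD": md, "AD": ad, "RD": rd}


# Hypothetical eigenvalues in mm^2/s, chosen only to show the call.
example = dti_scalars(1.7e-3, 0.3e-3, 0.3e-3)
print(example)
```

Because FA depends only on the spread of the eigenvalues, very different microstructural changes can produce the same FA value, which is one reason the complementary DKI, NODDI and MWI metrics were added.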

All MWI datasets were inspected for motion or other artifacts and discarded in case of a blurry first echo image. A voxel-wise non-negative least squares approach was applied to estimate the inherent T2 distribution of the multi-exponential signal, while accounting for stimulated echoes and applying a smoothness constraint (Whittall and Mackay 1989; Prasloski et al. 2012a). Myelin water fraction (MWF) was obtained as the fraction of T2 components between 10 and 40 ms.
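
The final step, turning an estimated T2 distribution into a myelin water fraction, can be sketched as follows. The sketch assumes the distribution has already been obtained (for example by a regularized non-negative least squares fit, which is not reproduced here); the function name, the inclusive treatment of the 10 and 40 ms limits, and the example distribution are assumptions made for illustration.

```python
def myelin_water_fraction(t2_ms, amplitudes, lo_ms=10.0, hi_ms=40.0):
    """Fraction of the total T2-distribution amplitude with lo_ms <= T2 <= hi_ms."""
    if len(t2_ms) != len(amplitudes):
        raise ValueError("t2_ms and amplitudes must have the same length")
    total = sum(amplitudes)
    if total <= 0:
        return 0.0  # no signal in this voxel
    myelin = sum(a for t, a in zip(t2_ms, amplitudes) if lo_ms <= t <= hi_ms)
    return myelin / total


# Hypothetical single-voxel T2 distribution (arbitrary amplitude units).
t2_grid = [15, 25, 70, 90, 2000]   # T2 times in ms
amps = [5, 5, 50, 35, 5]           # non-negative amplitudes from the fit
mwf = myelin_water_fraction(t2_grid, amps)
print(f"MWF = {mwf:.2f}")
```

In this made-up voxel, the two short-T2 components account for 10 of 100 amplitude units, giving an MWF of 0.10.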

A population-based atlas was constructed in the same space as the longitudinal DTI atlas, based on the b = 2800 s/mm² data of a representative subset of patients and controls (Van Hecke et al. 2008). All metrics were registered to this template, following the pipeline described in Billiet et al. (2015). Individual white matter masks, resulting from segmentation of anatomical images using SPM, ensured all analyses were restricted to white matter.

First, all DTI, DKI, NODDI and MWI metrics were compared between groups in the four ROIs. Second, to explore possible differences outside the ROIs, we performed a whole-brain voxel-based analysis comparing DTI, DKI, NODDI and MWI measures between groups.

For the ROI analysis, mean metric values in the four ROIs were obtained on which one-way ANOVA models were applied with group as a fixed factor. Given the high codependency between diffusion metrics, Bonferroni correction for all comparisons was deemed overly strict. We therefore considered only two modalities (dMRI and MWI), yielding eight comparisons. Hence main effects of group were deemed significant at p < 0.0063.
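
The threshold can be reproduced with a one-line Bonferroni calculation. The sketch assumes a family-wise significance level of 0.05 and reads the eight comparisons as two modalities times four ROIs; neither is spelled out in the text.

```latex
\alpha_{\text{corrected}} = \frac{\alpha}{m} = \frac{0.05}{2 \times 4} = \frac{0.05}{8} = 0.00625 \approx 0.0063
```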

Voxel-based one-way ANOVA models with group as fixed factor were applied to all metrics using SPM8 (Ashburner and Friston 2011). A voxel-threshold was set at p-uncorrected <0.001 and clusters with family-wise-error (FWE) corrected p-value below 0.05 were deemed significant.

Neuropsychological assessment and analysis

All participants were evaluated using the same neuropsychological test battery as described before (Deprez et al. 2012), which included the domains of attention, concentration, memory, executive functioning, and cognitive/psychomotor processing speed. To avoid repetition bias, alternate test forms were used for the AVLT verbal memory task. Self-reported cognitive functioning was assessed with the Cognitive Failure Questionnaire (CFQ) (Broadbent et al. 1982). Subscales were used for distraction, distraction in social situations, names and word finding, orientation, and a total summary score. All participants completed the Spielberger State-Trait Anxiety Inventory (STAI) (Spielberger 1985), the Beck Depression Inventory (BDI) (Bosscher et al. 1986), and Fatigue Assessment Scale (FAS) (De Vries et al. 2004). Verbal IQ was measured using a Dutch version of the National Adult Reading Test (Schmand et al. 1991). Neuropsychological evaluation and MRI scanning took place on the same day.

Statistics were performed with SPSS 18.0 (SPSS, Chicago, IL). Test scores and subjective complaints that showed a significant group-time interaction from t1 to t2 (Deprez et al. 2012) were selected for further analysis. We used SPSS linear mixed effects modeling (ME) with (1) time as repeated effect and subject as within-subject factor to account for correlation among repeated test scores within each subject, (2) an unstructured covariance structure, (3) time, group and group-time as fixed effects and (4) IQ and BDI as covariates. Additionally, within-group analyses were performed to test for an effect of time within the separate groups. When ME modeling with time as repeated measure and fixed effect and IQ and BDI as covariates yielded a significant effect of time, post-hoc tests were applied to assess changes from t2 to t3. No correction for multiple comparisons was performed for the neuropsychological outcome measures.

Pearson’s correlations were used to explore the relationship within the C+ group between (1) changes from t2 to t3 in neuropsychological test scores, self-reported measures, and changes in FA; (2) number of days between t2 and t3 and changes in FA and (3) subjective cognitive complaints (CFQ), depression (BDI) and fatigue, at t3. Statistical significance was assessed at p < 0.05.
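
For reference, the Pearson correlation coefficient used in these analyses can be computed as in the minimal sketch below. It is a plain standard-library implementation of the textbook formula, not the SPSS procedure used in the study, and the example values are made-up placeholders rather than study data.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Placeholder values only: days between t2 and t3, and change in FA (t3 - t2).
days_between = [1100, 1180, 1250, 1320, 1400]
delta_fa = [0.004, 0.009, 0.007, 0.012, 0.015]
r = pearson_r(days_between, delta_fa)
print(f"r = {r:.2f}")
```

Testing r against the p < 0.05 criterion additionally requires the t distribution with n - 2 degrees of freedom, which is left to a statistics package.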

Results

Participant demographic and clinical data

Participant demographic and medical information is summarized in Table 1. All participants were active premenopausal women between 32 and 51 years old when entering the study at t1. At t3, however, 20 C+ (88%), 8 C- (57%) and 3 HCs (20%) were menopausal. Sixteen C+ and 11 C- patients were receiving endocrine treatment at t3. The chemotherapy-treated patients did not differ from the controls with regard to age, education, verbal IQ and anxiety. However, a significant difference at baseline (F = 4.0; p = 0.02) and group-time interaction (F = 7.2; p = 0.002) was found for depression score BDI (Fig. 3a). Table 2 provides details on participants who participated at t1 and t2, but not at t3. No significant differences were seen in terms of difference scores (t2-t1) in neuropsychological test performance and DTI FA values between the C+ patients that were included in the present longitudinal study and the C+ patients that did not participate at t3 (Table 3).

Table 1 Background and clinical characteristics of patients and controls
Fig. 3

Longitudinal evaluation of depression score, verbal memory and subjective complaints. F-statistics and p-values correspond with the group-time effect of a mixed effects model. BDI: Beck Depression Inventory; AVLT: Auditory Verbal Learning Test; CFQ: Cognitive Failure Questionnaire. C+: chemotherapy-treated patients; C-: patients not treated with chemotherapy; HC: Healthy controls

Table 2 Characteristics of participants who participated at t1 and t2 but not t3
Table 3 Comparison of previous performance (t2-t1) of C+ patients that are included in the present longitudinal study and the C+ patients that are no longer participating at t3

Longitudinal assessment of DTI fractional anisotropy

Two C+ patients were excluded from the analysis due to excessive motion and signal dropout in one of the DTI images. Repeated-measures ANOVA showed that mean FA values differed significantly between timepoints (ROI1: F = 7, p = 0.002; ROI2: F = 14.1, p < 0.0001; ROI3: F = 10.5, p < 0.0001; ROI4: F = 18, p < 0.0001). Post-hoc tests showed significant decreases from t1 to t2 followed by significant increases from t2 to t3 in FA (p < 0.02, corrected for 4 ROI) in the chemotherapy-treated group in the four ROIs (Fig. 2). No significant differences in FA values were seen for the group of healthy controls (ROI1: F = 0.9, p = 0.42; ROI2: F = 1.2, p = 0.30; ROI3: F = 1.7, p = 0.20; ROI4: F = 0.9, p = 0.40) and a significant group x time interaction was found in 3 of the 4 ROI (ROI1: F = 0.8, p = 0.45, ROI2: F = 6.5, p = 0.004, ROI3: F = 3.9, p = 0.029; ROI4: F = 5.6, p = 0.008).

Within the C+ group, changes in FA from t2 to t3 did not correlate with changes in neuropsychological test scores or subjective complaints. However, changes in FA from t2 to t3 in ROI1 and ROI2 correlated significantly with the number of days between t2 and t3, reflecting time since treatment (r = 0.40, p < 0.05) (Fig. 4), although this was not corrected for multiple ROIs.

Fig. 4

Correlation between recovery of FA in ROI1 and time since treatment. Scatter plot of mean difference in FA values between t3 and t2 and number of days between the two timepoints. FA: Fractional Anisotropy; C+: patients that received chemotherapy

Cross-sectional imaging analysis at t3

Two healthy volunteers withdrew from the additional scan. Five MWI datasets were discarded due to unsatisfactory data quality, as described in the methods section. One additional MWI dataset was discarded due to a limited field of view not covering all the ROIs. For the multi-shell diffusion analysis, the final sample size was 25 C+, 14 C- and 13 HC. For MWI data this was 20 C+, 14 C- and 12 HC.

There were no significant differences in advanced dMRI or MWI metrics between groups in any of the ROIs (all corrected p > 0.05). Also the voxel-based analysis yielded no clusters of significant differences in any of the DTI, DKI, NODDI or MWI measures (all FWE-corrected cluster p-values > 0.05).

An overview of all cross-sectional MR imaging results is provided in Table 4.

Table 4 Cross-sectional comparison of diffusion metrics between C+, C- and HC at t3

Longitudinal neuropsychological assessment

Mixed effects modeling showed significant group-time interactions for verbal memory (AVLT learning: F = 2.6 (p = 0.047)) and processing speed (9PEG: F = 3.01 (p = 0.027)). BDI was not significant in the models (p-values from 0.09 to 0.95), while IQ was significant for attention (p < 0.015) and memory tests (p < 0.002), but not processing speed (p > 0.09). Figure 3b illustrates the decreased performance from t1 to t2 followed by a performance increase from t2 to t3 for the C+ group, whereas both control groups showed increased performance between those time points consistent with a learning effect. The other test indices showed similar trends (Table 5).

Table 5 Neuropsychological test results

No significant group-time interactions could be identified for the different CFQ indices reflecting subjective cognitive complaints. BDI was a significant covariate in those models. We did, however, see significant effects of time for CFQ distraction (p = 0.003), CFQ names and word finding (p = 0.008) and CFQ total score (p = 0.01), reflecting more subjective complaints at t3 compared to t1 for all three participant groups. While the C+ group showed a pattern of initial increase from t1 to t2 followed by a (minor) decrease in complaints at t3, the C- group showed an increase from t2 to t3 (Fig. 3c). One-way ANOVA analysis at t3 revealed no significant differences in subjective cognitive complaints between the three groups.

Additionally, in the C+ group, longitudinal changes in subjective complaints (CFQ total score) from t2 to t3 did correlate with changes in anxiety (r = 0.60, p = 0.003) and depression (r = 0.75, p < 0.001) but not with changes in neuropsychological test results or FA (p > 0.05).

At t3, subjective complaints correlated with scores of fatigue (r = 0.40, p = 0.002) and depression (r = 0.33, p = 0.01), but not with neuropsychological test results.

Discussion

Our data provide evidence for recovery of previously reported (Deprez et al. 2012) WM microstructure alterations in patients with breast cancer. Patients showed an initial decrease in FA from baseline to t2, linked with cognitive impairment, followed by a return to baseline levels 3–4 years later as well as increased performance on cognitive tests at t3. Interestingly, we found a significant correlation between time after treatment and recovery of WM damage (reflected by increased FA) in two of the four investigated regions, albeit not corrected for multiple ROIs. Furthermore, at t3, no significant differences could be found in mean advanced diffusion (DTI, DKI, NODDI) and MWI metrics between C+ patients, C- patients and healthy controls, confirming the hypothesis of recovery.

What was the WM injury and what drives its recovery?

Based on the results of preclinical studies (Han et al. 2008; Seigers et al. 2013; Briones and Woods 2014), case studies (Moore-Maxwell et al. 2004) and our own results (Deprez et al. 2012; Deprez et al. 2013a), we previously hypothesized that chemotherapy-induced changes in WM could be related to WM demyelination. Based on this hypothesis, the longitudinal recovery of FA could indicate remyelination, although axonal reorganization could also have contributed to the initially observed FA decline (de Ruiter et al. 2011; Deprez et al. 2013a; Stouten-Kemperman et al. 2015b). However, the lack of cross-sectional group differences in MWF and ODI at t3 suggests that, in case the changes from t1 to t2 were indeed related to differences in myelination and/or axonal organization, these are likely no longer present three to four years after chemotherapy.

(Partial) recovery has been observed before, after WM changes related to radiation therapy (Hua et al. 2012), alcohol (Pfefferbaum et al. 2014) and cocaine (Bell et al. 2011) abuse, and traumatic brain injury (Sidaros et al. 2008). Furthermore, from studies on healthy volunteers, we know that WM microstructural properties as measured with DTI can change in response to cognitive exercise (Lovden et al. 2010; Takeuchi et al. 2010). Such neuroplastic changes may also explain the positive outcome of cognitive exercise programs in cancer survivors (Von Ah et al. 2012; Kesler et al. 2013; Morean et al. 2015). As most women included in this study were young and active, cognitive challenges at work and social activities may have contributed to medium-term recovery through neuroplastic mechanisms. Moreover, we detected positive correlations between time since treatment and recovery in terms of FA, suggesting improvement over time.

When does WM recovery occur?

When exactly the process of recovery starts and when it is complete is unclear. Previous longitudinal studies reported (partial) recovery in neuropsychological, volumetric grey matter, and functional brain changes 1 year after ending chemotherapy (Schagen et al. 2002; Ahles et al. 2010; McDonald et al. 2010, 2012; Lepage et al. 2014). Contrary to this, 5 years after standard-dose chemotherapy, Kesler et al. (2015) reported lower FA in an older group of patients with breast cancer (43 to 72 years) compared to healthy controls. Older age might be linked to higher vulnerability to chemotherapy treatment (Ahles et al. 2010), slower recovery of WM damage (Baltan 2015) and less exposure to cognitive challenges stimulating recovery. Additionally, Kesler et al. included patients from 5 months up to 14 years after treatment, and thus may not have been able to capture a two-phase process of initial transient chemotherapy-induced WM injury followed by recovery. In another cross-sectional study, ten years after treatment with standard-dose chemotherapy for breast cancer, Stouten-Kemperman et al. (2015b) did not detect any differences in FA compared to controls, whereas differences were found in patients treated with high-dose chemotherapy. Finally, 14 years post treatment, testicular cancer patients treated with chemotherapy showed a widespread increase in the diffusion kurtosis imaging metric radial kurtosis compared to controls (Stouten-Kemperman et al. 2015a). These studies suggest that age, chemotherapy dose, type of cancer and chemotherapeutic agent may all have an impact on the process of initial WM damage and the timing of subsequent recovery.

Subjective cognitive complaints, affective symptoms, and recovery

While we detected recovery of chemotherapy-related WM changes in the C+ group, changes in FA did not correlate with changes in subjective complaints, which for the C+ group decreased minimally between t2 and t3, but became similar across groups. Furthermore, while longitudinal changes in subjective complaints from t2 to t3 in the chemotherapy-treated group also did not correlate with changes in neuropsychological test scores, they did correlate with changes in anxiety and depression score. Additionally, at t3, subjective complaints correlated with fatigue. These findings align with evidence from others (Shilling and Jenkins 2007; Biglia et al. 2012; Bower and Ganz 2015; Wefel et al. 2015) that self-reported cognitive complaints are more strongly associated with emotional factors (e.g. cancer-related negative feelings and fatigue (Bower and Ganz 2015)) than with objective cognitive performance (Wefel et al. 2015). Whilst these results suggest that latent cognitive complaints could be linked with affective symptoms and fatigue, it is worth noting that an animal study from Han et al. found delayed WM degeneration following administration of 5-FU (Han et al. 2008). Delayed WM degeneration could be linked with later-onset cognitive changes (Wefel et al. 2010) and persistent longer-term cognitive deficits (Kreukels et al. 2008; Weis et al. 2009; Koppelmans et al. 2012).

Methodological considerations

The strengths of this study include (1) its longitudinal design including 3 time points, (2) the use of advanced dMRI and MWI tools to assess microstructural changes at t3 and (3) the use of a population-based atlas and well-validated registration methods to minimize registration errors crucial for multimodal group studies.

Additional longitudinal research is needed using advanced dMRI and MWI to further clarify the nature and the timing of the observed changes in the WM microstructure. Larger sample sizes are needed to assess the differential influence of treatment (including endocrine treatment) and menopausal status. In our sample, almost all patients in the C+ group had become menopausal by t3, which was not the case in either control group. Hormonal changes linked with endocrine treatment and chemotherapy-induced menopause could potentially have influenced cognition and MRI findings. However, possible associated cognitive effects are still not well understood and conflicting results have been reported (Fan et al. 2005; Hermelink et al. 2010; Tager et al. 2010; Vearncombe et al. 2011; Conroy et al. 2013; Zwart et al. 2015). As we observed improved cognitive performance in the C+ group at t3, the impact of chemotherapy-induced menopause on cognition may be negligible (or reversible). In this longitudinal analysis, we chose to only include participants with data available at the three timepoints. The higher drop-out in the chemotherapy-treated patient group (26%) than in the control groups (C-: 12.5% and HC: 21%) may therefore have influenced our results. However, when including all participants in the ME model (also participants that did not participate at t3), we obtained more significant group × time interactions for more neuropsychological tests (data available upon request).

Conclusion

Our results provide evidence for recovery of chemotherapy-induced WM microstructural changes three to four years after treatment. Longitudinal changes in diffusion parameters could therefore serve as a sensitive biomarker for treatment-induced neurotoxicity and follow-up on possible recovery. This recovery was confirmed by advanced diffusion and MWI metrics, which showed no cross-sectional differences between patients and controls.