Androgen deprivation therapy (ADT) is a recommended treatment for prostate cancer patients who are at intermediate or high risk for recurrence or local metastasis [1]. ADT can take the form of surgery to remove the testicles (orchiectomy) or, more commonly, the administration of pharmacological agents such as luteinizing hormone-releasing hormone (LHRH) agonists or antagonists, often in combination with nonsteroidal anti-androgens [2]. The mechanism of action of ADT is elimination of androgens such as testosterone from the body. Although surgical and pharmacological forms of ADT are effective in delaying tumor progression in prostate cancer patients [3], they also have the potential to produce a number of adverse side effects [2]. One such side effect may be cognitive impairment [2, 4]. This possibility is supported by a growing body of research which suggests that naturally occurring reductions in testosterone play a role in age-related declines in cognition [5, 6]. For example, lower free testosterone levels have been found to be associated with worse performance on objective neurological tests of visual memory, verbal memory, visuomotor scanning, and visuospatial rotation in healthy community-based samples of older men [6, 7]. Additional supporting evidence comes from research on sex differences in cognition. Visuospatial ability, in particular, consistently yields differences between the sexes favoring males, suggesting the possible influence of testosterone on this aspect of cognitive performance [8, 9].

A previous review of research on cognition in prostate cancer patients treated with ADT concluded that the majority of patients experience cognitive decline in at least one cognitive domain, with visuospatial ability and executive functioning being the most commonly reported problem areas [10]. The most recent review updated this literature and considered how various study designs affected the ability to detect ADT-related changes in cognition [11]. Across studies using a pre-ADT baseline, prostate cancer patient group, or noncancer group as a comparison, spatial memory was consistently shown to be worse in the ADT-treated group [11]. However, it was noted in both reviews that findings were mixed across studies, with some reports of improved functioning in areas such as verbal memory [10, 11]. Furthermore, no overall effect sizes were reported in either review to help determine the magnitude of the observed cognitive changes. In addition, several studies that examine a broad range of cognitive domains were not included in the most recent review; these studies may help to determine which areas of functioning are most likely affected by ADT.

Accordingly, this study aimed to provide an updated systematic review of the existing literature on the effects of ADT on cognition in men with prostate cancer. In addition to summarizing the results of studies of objective neuropsychological performance during ADT, the magnitude of observed cognitive changes was examined using meta-analysis. Meta-analysis can be used to quantify the degree of cognitive change and determine the reliability of change across study samples, issues that were not addressed in the previous literature reviews [10, 11]. Specifically, this meta-analysis evaluated the effect size of ADT on separate cognitive domains and tested the hypothesis that prostate cancer patients treated with ADT will perform worse than comparisons across cognitive domains. In addition, this review examined various moderators, such as study design and total duration of ADT, to determine if they accounted for some of the discrepant findings observed across studies.

Method

Search strategy

The study was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) Statement [12]. Peer-reviewed articles considered for inclusion in the meta-analysis were collected via electronic searches of English language articles in PubMed Medline, PsycINFO, Cochrane Library, and Web of Knowledge/Science (see Online Resource for search terms). Reference lists from relevant reviews and meta-analyses were also examined to identify articles. The search included articles published between 1950 and June 2012.

Selection strategy

The following criteria were used to determine which studies would be included from the original list of retrieved abstracts. First, all selected articles had to report original data. Accordingly, review papers, meta-analyses, editorials, and letters to the editor were excluded. Second, the studies must have reported on adult males diagnosed with prostate cancer and undergoing some form of pharmacological ADT. Eligible study designs included longitudinal comparisons (an assessment before or within 4 weeks of the initial ADT dose compared with at least one subsequent assessment at least 3 months after ADT initiation), comparisons with a prostate cancer control group, or comparisons with a noncancer control group. Finally, studies must have reported objective neuropsychological data. Studies that only reported on mental status exams and other broad cognitive screening measures such as the Mini Mental Status Exam, Short Portable Mental Status Questionnaire, Cambridge Cognition Examination, and High Sensitivity Cognitive Screen were excluded. Pairs of independent reviewers (H.L.M., J.M.C., M.G.C., and Y.A.) determined which retrieved abstracts were eligible for further review based on the inclusion criteria outlined above. Resulting lists of eligible articles were then compared and any disagreements were settled by discussion among reviewers. Full-text articles for the selected abstracts were reviewed to confirm eligibility.

Review strategy

Relevant information was independently abstracted by two raters using standardized abstraction forms. The following information was abstracted: study characteristics (i.e., study design, sample size, comparison group matching criteria), ADT sample characteristics (i.e., age, education, type of androgen blockade), and timing of first follow-up cognitive assessment after initiation of ADT treatment. In cases where studies had multiple follow-up assessments post-ADT, the first on-treatment follow-up assessment at least 3 months following the start of ADT was chosen for inclusion in the meta-analysis to reduce potential practice effects of repeated cognitive testing and reduce the likelihood of assessing patients during off-treatment phases of intermittent ADT. Focusing on the first on-treatment assessment also has the added benefit of detecting if there are immediate effects of ADT. Objective cognitive data were also abstracted (i.e., group means, standard deviations, and sample sizes). Authors were contacted to provide data in cases where articles did not report sufficient data to calculate effect sizes. Abstracted data were compared between raters and checked for discrepancies. Inconsistencies between raters were resolved after discussion and review of the article or original data submitted by study authors.

Measured outcomes

Various neuropsychological tests were used across studies to determine cognitive functioning of ADT patients and controls. Because the classification of tests into cognitive domains varied widely between included studies, the included neuropsychological tests were divided into seven cognitive domains based on an established neuropsychological reference text [13] and by consensus among the research team. The final classification of tests into cognitive domains is presented in Table 1.

Table 1 Neuropsychological tests by cognitive domain

Statistical analysis

Individual effect sizes for each neuropsychological test were calculated for each available comparison (i.e., longitudinal, prostate cancer control, or noncancer control). Between-subject differences were based on the first assessment after the start of ADT, and within-subject differences were based on the pre-ADT baseline relative to the first post-ADT assessment time point. Effect sizes were calculated using Hedges' g [14], the mean difference between comparison groups divided by the pooled standard deviation. All effect sizes were coded such that negative values indicate worse performance in the ADT group versus baseline or control group. In study comparisons where more than one neuropsychological test was available in the same cognitive domain, an effect size was calculated for each test; effect sizes were then averaged over all tests in the domain for that study. Finally, in studies where the ADT group was separated into different types of treatment regimens, the calculated effect sizes were based on the pooled data across ADT treatment groups.
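
As an illustration of the between-group calculation described above, the following is a minimal Python sketch, not the authors' actual procedure or software. The function names (hedges_g, domain_effect), the small-sample correction factor J = 1 − 3/(4·df − 1), and the variance approximation are assumptions drawn from the standard textbook definition of Hedges' g; the within-subject (longitudinal) case, which also needs the pre-post correlation, is not covered.

```python
import math


def hedges_g(mean_adt, sd_adt, n_adt, mean_ctrl, sd_ctrl, n_ctrl):
    """Return (g, variance of g) for two independent groups.

    Assumes higher raw test scores mean better performance, so a
    negative g indicates worse performance in the ADT group.
    """
    df = n_adt + n_ctrl - 2
    # Pooled standard deviation across the two groups.
    pooled_sd = math.sqrt(
        ((n_adt - 1) * sd_adt ** 2 + (n_ctrl - 1) * sd_ctrl ** 2) / df
    )
    d = (mean_adt - mean_ctrl) / pooled_sd
    # Assumed small-sample correction factor (standard approximation).
    j = 1 - 3 / (4 * df - 1)
    g = j * d
    # Assumed large-sample variance approximation for d, scaled by J^2.
    var_d = (n_adt + n_ctrl) / (n_adt * n_ctrl) + d ** 2 / (2 * (n_adt + n_ctrl))
    return g, j ** 2 * var_d


def domain_effect(test_effects):
    """Average the per-test effect sizes within one domain for one study."""
    return sum(test_effects) / len(test_effects)


# Hypothetical numbers for illustration only (not taken from any study).
g, var_g = hedges_g(45.0, 10.0, 20, 50.0, 10.0, 20)
```

Averaging per-test effects within a domain keeps each study contributing one effect size per domain, so studies that administered many tests do not dominate the pooled estimate.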

Random effects models were used to calculate the effect sizes for each of the seven cognitive domains. Moderator analyses were conducted when significant heterogeneity was found (I² ≥ 65 %) among sample effect sizes within the same domain. Results were stratified by study design comparison (longitudinal, prostate cancer control, or noncancer control) to determine the impact of comparison on the effect of ADT. Mean duration of ADT at first follow-up in months was selected as another potential moderator variable a priori. This moderator analysis was examined using meta-regression with method of moments estimation [15].
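
To make the pooling step concrete, the sketch below shows one common way to fit a random effects model and compute I². It assumes the DerSimonian-Laird (method of moments) estimator of the between-study variance; the text does not state which estimator was used for the domain-level models, so this choice, the function name random_effects, and the example inputs are all illustrative assumptions.

```python
import math


def random_effects(effects, variances):
    """Pool effect sizes with a DerSimonian-Laird random effects model.

    Returns (pooled effect, standard error, tau^2, I^2 in percent).
    """
    w = [1 / v for v in variances]                 # fixed-effect weights
    sw = sum(w)
    fixed = sum(wi * gi for wi, gi in zip(w, effects)) / sw
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (gi - fixed) ** 2 for wi, gi in zip(w, effects))
    df = len(effects) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # I^2: share of total variability attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    w_star = [1 / (v + tau2) for v in variances]   # random-effects weights
    pooled = sum(wi * gi for wi, gi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return pooled, se, tau2, i2


# Hypothetical inputs for illustration only.
pooled, se, tau2, i2 = random_effects([-1.0, 0.0], [0.1, 0.1])
needs_moderator_analysis = i2 >= 65  # threshold stated in the text
```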

The overall average effect sizes for each cognitive domain were assessed for publication bias using funnel plots and trim and fill plots for each domain that exhibited a statistically significant effect size. Orwin’s fail-safe N was also calculated to determine the stability of the significance of the resulting overall effect size [16]. Specifically, the total number of studies with null or opposite findings that would be needed to render the effect size no longer significant was calculated. A trivial effect was set a priori to g = −.10 and the mean point estimate in missing studies was conservatively assumed to be −.005. Larger values for Orwin’s fail-safe N indicate more robust findings [16].
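
For readers unfamiliar with Orwin's fail-safe N, a minimal sketch of the usual formula follows: N = k × (observed − trivial) / (trivial − missing), where k is the number of effect sizes pooled. The defaults mirror the trivial and missing-study values stated above; the function name and the value of k in the example are hypothetical and are not taken from this meta-analysis.

```python
def orwin_failsafe_n(k, observed, trivial=-0.10, missing=-0.005):
    """Number of missing studies needed to pull the mean effect to `trivial`.

    k        -- number of effect sizes in the pooled estimate
    observed -- observed mean effect size
    trivial  -- effect size regarded as trivially small
    missing  -- assumed mean effect size in the missing studies
    """
    if trivial == missing:
        raise ValueError("trivial and missing effect sizes must differ")
    return k * (observed - trivial) / (trivial - missing)


# Hypothetical example: 10 pooled effects with a mean of g = -0.67.
n_missing = orwin_failsafe_n(10, -0.67)
```

Because the missing-study mean is set close to zero rather than to the opposite sign, the resulting N is smaller, and therefore more conservative, than it would be under an assumption of opposite findings.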

Results

Study selection

A total of 157 unique articles were identified for potential inclusion in the current review (see Fig. 1). Based on the stated inclusion criteria, a total of 128 abstracts were deemed ineligible. An additional eight studies were excluded after full-text review, leaving a total of 21 articles abstracted for the meta-analysis. Of those, three were excluded after further review and another two studies were excluded after they were determined to report on the same data already included in the meta-analysis [17]. Finally, we requested data from the authors of seven of the 16 remaining studies. Authors of four of the studies responded and provided the requested data [18–21]; two studies were excluded due to insufficient data, and one study was included with partial data [22]. Consequently, 14 original articles were included in the present meta-analysis. These articles reported on data from a total of 12 nonoverlapping study samples (see Table 2).

Fig. 1 Selection of included studies

Table 2 Characteristics of included studies (k = 14)

Description of study participants

Of the included articles, three (21 %) reported cross-sectional data [23–25]. All three of these studies had noncancer control groups and one also had a prostate cancer control group [24]. Of the cross-sectional study designs, the total duration of ADT ranged from a mean of 23 to 31 months (median = 27 months). The remaining 11 articles (79 %) reported on longitudinal assessments of prostate cancer patients from pre-ADT baseline to a first posttreatment follow-up ranging from 1 to 9 months after the start of ADT (mode = 6 months) [17–22, 26–30]. Five of these longitudinal studies also had a noncancer control comparison group [17, 22, 27–29], one also had a prostate cancer control comparison group [19], two had both noncancer and prostate cancer control comparisons [18, 20], and three had no comparison group [21, 26, 30]. Finally, three studies (21 %) initially separated ADT groups based on type of treatment received (short- or long-term ADT; goserelin or leuprorelin) [19, 20, 24]. Hence, the calculated effect sizes for the ADT group for these studies were based on pooled data across ADT treatment types.

Sample characteristics are shown in Table 2. With regard to sample size, the number of prostate cancer patients per study who received ADT ranged from 14 to 77 (median = 46), with a total of 417 patients across studies. Mean age of the ADT groups ranged from 63.2 years to 71.0 years across study samples. Mean years of education for the ADT groups ranged from 6 to 22 years for the 10 studies that provided this information, with most studies reporting mean education at the college level. Among the four studies that included a non-ADT prostate cancer control group, sample sizes for these groups ranged from 14 to 82 (median = 48), with a total of 122 unique patient controls. Of these studies, two reported on data from patients who were considered for ADT but were randomized to a close monitoring group that did not receive any active treatment at the time of study assessments [19, 20], one study reported that these patients previously underwent surgery or radiation or both [24], and one study did not report information about past or current treatment for the non-ADT prostate cancer control group [18]. Among the ten studies that included a noncancer control group, sample sizes for these groups ranged from 7 to 82 men (median = 45), with a total of 285 unique men without prostate cancer. The noncancer control groups were healthy men recruited from the community in nine studies [17, 20, 22–25, 27–29], with four of these nine specifying that hypogonadal men were excluded [22, 24, 27, 28]. One study recruited the noncancer control group from men with nonmalignant prostatic diseases in urology clinics and found no differences in other comorbidities between the noncancer patient group and the ADT group [18].
Across all included studies, five studies (36 %) excluded patients with metastatic disease [17, 18, 24, 25, 28], three (21 %) excluded patients with bone or central nervous system metastases [22, 27, 30], two (14 %) included patients with metastatic disease [21, 23], and four (29 %) did not specify metastatic status or did not indicate that inclusion was based on metastatic status [19, 20, 26, 29]. Twelve (86 %) of the studies used cognitive, neurologic, or psychiatric impairment as an explicit exclusion criterion [17–23, 25–28, 30] and two (14 %) did not specify if participants were excluded based on these criteria [24, 29].

Meta-analysis

Table 3 displays the weighted average effect size by cognitive domain. When collapsing across all study designs, there was a significant effect of ADT for one cognitive domain. Patients treated with ADT demonstrated significantly worse functioning on the visuomotor ability domain (g = −0.67, p = .008). See Fig. 2 for the forest plot of the study effect sizes in the visuomotor ability domain. Only the visuomotor domain demonstrated significant heterogeneity across studies (I² = 66.79 %). Thus, moderator analyses were conducted just for this domain.

Table 3 Weighted average effect sizes by cognitive domain

Fig. 2 Forest plot of effect sizes (g) for visuomotor ability

Moderator analyses for visuomotor domain data indicated there was no significant effect of study comparison type on the observed effect size of ADT on cognition (QB = .90, p = .64). That is, the deleterious effect of ADT on visuomotor ability did not vary significantly depending on the control comparison used across studies. Duration of ADT treatment was also evaluated as a moderator. Meta-regression indicated that total time on ADT at the time of the follow-up assessment was a significant moderator of the effect of ADT on visuomotor ability (p = .04) such that the magnitude of the deficits was larger in studies with shorter time to follow-up.
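
The duration analysis can be pictured as a weighted regression of each comparison's effect size on months of ADT at follow-up. The sketch below is an assumption-laden illustration of a single-moderator random effects meta-regression with a method of moments estimate of the residual between-study variance; the function name, the normal-approximation z test, and the example data are hypothetical and do not reproduce the authors' software or results.

```python
import math


def meta_regression(effects, variances, moderator):
    """Random effects meta-regression with one moderator (plus intercept)."""
    k = len(effects)
    if k < 3:
        raise ValueError("need at least three effect sizes")

    def wls(weights):
        # Weighted least squares for y = b0 + b1 * x.
        s0 = sum(weights)
        xbar = sum(w * x for w, x in zip(weights, moderator)) / s0
        ybar = sum(w * y for w, y in zip(weights, effects)) / s0
        sxx = sum(w * (x - xbar) ** 2 for w, x in zip(weights, moderator))
        sxy = sum(w * (x - xbar) * (y - ybar)
                  for w, x, y in zip(weights, moderator, effects))
        b1 = sxy / sxx
        return ybar - b1 * xbar, b1, sxx

    # Step 1: fixed-effect fit and residual heterogeneity statistic.
    w = [1 / v for v in variances]
    b0, b1, _ = wls(w)
    q_res = sum(wi * (y - b0 - b1 * x) ** 2
                for wi, x, y in zip(w, moderator, effects))

    # Step 2: method of moments estimate of residual tau^2.
    s0, s1, s2 = (sum(wi * x ** p for wi, x in zip(w, moderator)) for p in range(3))
    t0, t1, t2 = (sum(wi ** 2 * x ** p for wi, x in zip(w, moderator)) for p in range(3))
    trace = (s2 * t0 - 2 * s1 * t1 + s0 * t2) / (s0 * s2 - s1 ** 2)
    tau2 = max(0.0, (q_res - (k - 2)) / (s0 - trace))

    # Step 3: refit with random-effects weights and test the slope.
    b0, b1, sxx = wls([1 / (v + tau2) for v in variances])
    se = math.sqrt(1 / sxx)
    p_value = math.erfc(abs(b1 / se) / math.sqrt(2))  # two-sided normal test
    return {"intercept": b0, "slope": b1, "se": se, "tau2": tau2, "p": p_value}


# Hypothetical data: deficits shrink as months on ADT increase.
fit = meta_regression([-1.0, -0.8, -0.6, 0.4], [0.1, 0.2, 0.1, 0.3], [3, 6, 9, 24])
```

With effect sizes coded so that negative values indicate worse performance on ADT, a positive slope corresponds to smaller deficits at longer follow-up, which is the direction of the moderator effect reported above.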

Publication bias

As shown in the funnel plot in Fig. 3, the trim and fill procedure imputed five studies to the left of the mean. The adjusted effect size after the trim and fill procedure was g = −0.85 (95 % CI −1.36 to −0.34). This suggests that if systematic bias does exist in the meta-analysis, it is very slight and biased towards underestimating the effects of ADT on visuomotor ability. Regarding the robustness of the observed difference in visuomotor ability between patients and controls, Orwin’s fail-safe N indicated that 209 studies would be needed to render the observed group differences trivial.

Fig. 3 Funnel plot of effect sizes by standard errors for visuomotor ability. White circles indicate observed values for each comparison and black circles indicate imputed values

Discussion

The current meta-analysis was conducted on 14 studies examining the effects of ADT on objective cognitive functioning in men with prostate cancer. Results indicated that patients treated with ADT performed significantly worse on visuomotor tasks compared to controls (effect size, g = −0.67). There were no differences in performance on tests of attention/working memory, executive functioning, language, verbal memory, visual memory, and visuospatial ability. These findings suggest that, on average, patients treated with ADT for prostate cancer can anticipate focal cognitive deficits in visuomotor ability. This finding points to the subtlety of the cognitive effects of ADT and the need for researchers to carefully select which measures they use to evaluate cognition in these patient groups.

To the best of our knowledge, this report represents the first meta-analysis of studies of cognitive functioning associated with receipt of ADT. As noted in previous qualitative reviews [10, 11], individual studies have reported deficits in visuospatial skills, spatial memory, and executive functioning as well as possible changes in verbal memory. In contrast, our meta-analysis found no clear-cut evidence of changes in these cognitive domains nor differences compared to control groups. Meta-analytic reviews have the advantage over qualitative reviews of quantifying the magnitude of the observed effect. The conclusions of the previous reviews were based on the observed statistical significance of each individual study which can be influenced by the size of the sample. Our meta-analytic review examined the effect sizes for each cognitive domain across study samples, allowing for an objective comparison across studies and a less biased method for summarizing the overall influence of ADT on cognition.

As noted previously, visuospatial abilities may be particularly vulnerable to changes in testosterone levels. The cognitive domains outlined in the current review distinguished between tasks with visuospatial and spatial memory components and those with a visuomotor component. This allowed for the evaluation of the effects of ADT on different types of spatial tasks. This is an important distinction given previous research showing that ADT administration often results in loss of muscle mass and muscle strength [31–33], which may suggest that other motor abilities could be affected. For example, in a study that evaluated overall physical performance in men undergoing ADT, scores were in the impaired range on measures of balance, walking speed, and quadriceps strength [34]. Using this approach to coding cognitive domains, we observed deficits on visuomotor tasks, such as the Block Design test, a task that involves both cognitive and manual manipulation (e.g., using patterned blocks to reproduce an abstract two-dimensional design), but not on spatial memory and visuospatial tasks, such as mental rotation and route tests, which do not require manual manipulation. Unfortunately, there were not enough studies that conducted tests of motor abilities only, so it is difficult to determine whether ADT is associated with impaired ability to integrate visual and motor abilities or whether motor abilities only are affected.

Regarding moderator analyses, duration of ADT and study design were examined to determine whether they contributed to heterogeneity across studies in the visuomotor domain. Only duration of ADT was found to be a moderator; findings suggested that deleterious effects of ADT on visuomotor skills occur early in the course of treatment and dissipate as time progresses. These findings must be viewed with caution, however, as most studies assessed patients only within 9 months of initiation of ADT. Accordingly, effects of ADT on visuomotor abilities beyond 9 months remain largely unknown. Study design did not emerge as a significant moderator; however, this may have been due to limited power to detect these changes. There were no differences between mean effect sizes for patients treated with ADT when compared to their own baseline assessments or to men with or without prostate cancer who never received ADT. These findings are inconsistent with other meta-analytic reviews of the effects of various cancer treatments on cognitive functioning, which found comparisons with control groups yielded larger differences relative to within-patient longitudinal comparisons [35–37]. The pattern of findings across reviews suggests that cancer itself may negatively affect cognition, which is consistent with several studies documenting cognitive impairments in cancer patients before the start of systemic therapy [38–40]. One possible explanation for the lack of evidence for longitudinal change within patients in other reviews is that declining cognitive functioning may be hidden by the benefits of practice effects that occur on most neuropsychological tests when administrations are repeated [41]. Thus, lack of significant improvement over multiple testing sessions may itself be an indicator of a deficit.

Potential publication bias was evaluated in the current meta-analysis because only published studies were included in the review. Although potential bias was found for the significant effect size for the visuomotor domain, it was slight and not in the usual direction. Specifically, the results of the trim and fill procedure indicated that the observed effect may have been larger had more studies been available. This was also true in another recent meta-analysis of cognitive outcomes in cancer patients [37] and is likely due to the greater likelihood for null results to be published because of their significance for planning treatments and the relative recency of this area of research.

A limitation of the present meta-analysis was that it was not possible to effectively assess the impact of age or education as moderators despite the likelihood that these variables would be related to cognitive outcomes. There was limited range of the mean age of the ADT groups across studies (range = 62.1 to 78.0) and inconsistent matching of control groups on this variable made it difficult to evaluate the impact of age. Regarding education, we encountered a similar problem of inconsistently matched comparison groups across studies and also the difficulty of comparing studies within and outside of the USA given the differences in educational standards internationally. Another limitation was the lack of information about the treatments previously received by the prostate cancer control groups; few studies included details about prior or ongoing treatments and those that did revealed potentially heterogeneous comparison treatment groups within the control group. Future research should more carefully describe and define patient comparison groups. An additional limitation was the limited length of follow-up evaluations in the current meta-analysis. We found that the effects of ADT on cognition may dissipate as time progresses, but future studies are needed to evaluate long-term effects after 9 months. Finally, an additional limitation was that most studies did not include sufficient details about the administration of ADT to detect if there was an effect of intermittent administration versus continuous, or between pharmacological and surgical ADT. Data for all the studies included in this meta-analysis were from patients who were administered ADT continuously prior to the follow-up assessment, so it could not be determined if there are potentially reversible changes in cognition when ADT is discontinued.

In summary, results from the current meta-analysis suggest that ADT-related deficits may occur in visuomotor tasks rather than visuospatial tasks without a motor component. It is unclear whether these deficits are primarily motor in nature and related to loss of muscle mass secondary to ADT [32, 33] or whether spatial aspects of these tasks are negatively affected by ADT, as suggested by previous literature linking testosterone to visuospatial skill [8, 9]. Future studies should include tasks that evaluate visuospatial skills with and without a motor component as well as tasks of motor speed. Clinically, this meta-analysis suggests that patients can expect cognitive functioning after initiation of ADT to be similar in many respects to that prior to ADT. With the exception of visuomotor skills, cognitive functioning will be comparable on average to that of prostate cancer patients without ADT and men without cancer. This information may aid patients considering treatment options for prostate cancer and provides a more comprehensive understanding of the likely side effects of ADT.