Introduction

Prostate cancer is the most common solid malignancy and the second-leading cause of cancer mortality in males in the United States, with the incidence highest in Europe but lowest in Southern/Eastern Asia [1]. A strong genetic dose–effect relationship has been shown in prostate cancer: men with one first-degree relative with prostate cancer have a twofold increased risk, and those with two first-degree relatives have a fivefold increased risk of developing this disorder compared with men without a family history [2]. However, the genetic factors contributing to the large number of sporadic prostate cancer cases remain unclear. A large panel of genes, including the androgen receptor (AR) gene, has been proposed as susceptibility candidates for prostate cancer.

The AR gene, located on Xq11-12, encodes the androgen receptor [3]. The activation of AR is mediated by binding to the androgenic hormones testosterone or dihydrotestosterone in the cytoplasm [4]. Androgens, acting through the DNA-binding transcription factor AR, play crucial roles in developing the male phenotype during embryogenesis, achieving sexual maturation at puberty, and maintaining male reproductive function and behavior in adulthood. Defects of AR lead to genital abnormalities [5]. Furthermore, AR can inhibit p53 accumulation in the nucleus of LNCaP cells, providing a posttranscriptional mechanism whereby androgens control prostate cell growth and survival [6]. A defective, mutated AR can cause prostate cancer and androgen insensitivity syndromes [5]. Notably, the CAG repeat polymorphism in exon 1 of AR may modulate the androgenic effects. Transcription of androgen-target genes is attenuated with increasing length of triplet residues in vitro [6]. The CAG repeat polymorphism modulates androgenicity in various tissues and psychological traits in healthy eugonadal men: the longer the repeat tract, the less pronounced the androgenic effect compared with individuals with a similar testosterone concentration. Additionally, prostate volume and growth in testosterone-substituted hypogonadal men depend on the CAG repeat polymorphism of AR. More importantly, the polymorphism is reported to be associated with prostate cancer risk [7].

Identifying the role of AR genetic variants may provide a clue to the genetic underpinnings of prostate cancer. The CAG repeat polymorphism (rs4045402) in the AR gene has been extensively evaluated [7, 8]. Many genetic association studies have attempted to link the AR CAG repeat to prostate cancer. However, the results are inconsistent. This lack of reproducibility might stem from the study design, the small magnitude of effect of the polymorphism, limited statistical power, or heterogeneity in study populations [9]. In addition, the number of CAG repeats varies widely across ethnic groups [10]. The purpose of our meta-analysis was to investigate whether the AR CAG repeat polymorphism was associated with prostate cancer risk and whether this association was heterogeneous across different geographic regions and different study designs.

Methods

Literature search

We searched three databases, PubMed, Excerpta Medica Database (EMBASE) and Web of Science, for publications available as of July 1, 2010. We restricted the search to papers written in English and performed in humans. The search key words were ‘prostate cancer’ and ‘androgen receptor’ or ‘AR’, combined with ‘CAG’. We further checked the reference lists of all retrieved articles for studies that had not been initially identified. If two or more studies shared the same patients or control subjects, we selected the one with the larger sample size; if more than one geographical or ethnic population was included in one report, we considered each population or group independently.

Inclusion criteria

We identified studies that satisfied the following criteria: (i) evaluation of AR CAG repeat polymorphism association with prostate cancer; (ii) case–control, nested case–control, or cross-sectional studies using either a hospital-based or a population-based design; (iii) sufficient information on CAG repeat distributions between patients and controls for estimating the odds ratio (OR) and its corresponding 95% confidence interval (CI).
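Criterion (iii) amounts to being able to build a 2×2 table for each dichotomous CAG comparison. A minimal sketch of how an OR and its 95% CI could be derived from such a table is given below, assuming the standard logit (Woolf) interval; the function name and the counts are hypothetical and are not taken from any included study.

```python
import math

def odds_ratio_ci(case_long, case_short, ctrl_long, ctrl_short, z_crit=1.96):
    """Return (OR, lower, upper) for one 2x2 table via the logit (Woolf) method.

    'long' means at or above the repeat cut-off, 'short' means below it.
    """
    a, b, c, d = case_long, case_short, ctrl_long, ctrl_short
    if min(a, b, c, d) == 0:
        # Continuity correction (add 0.5 to every cell) to avoid division by zero.
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z_crit * se_log_or)
    upper = math.exp(log_or + z_crit * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts, for illustration only.
example_or, example_lower, example_upper = odds_ratio_ci(120, 80, 150, 70)
print(f"OR = {example_or:.2f} (95% CI: {example_lower:.2f}-{example_upper:.2f})")
```

A study that reports only a summary statistic, without counts above and below the cut-off in both groups, cannot feed this calculation, which is why such studies fail criterion (iii).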

Extracted information

Two authors (M. Gu and W. Niu) of this article independently extracted the following information from all qualified studies: first author’s last name, publication date, population ethnicity, study design, baseline characteristics of the study population, and the genotype distribution in patients and controls. Any discrepancies were resolved by discussion until a consensus was reached. For consistency, continuous variables expressed as mean ± standard error (SE) were converted to mean ± standard deviation (SD).
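The SE-to-SD conversion mentioned above follows from the usual relation SD = SE × √n for the standard error of a mean. A small sketch is shown below; the function name and the example values are hypothetical.

```python
import math

def se_to_sd(standard_error, n):
    """Convert the standard error of a mean to a standard deviation: SD = SE * sqrt(n)."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return standard_error * math.sqrt(n)

# Hypothetical example: a mean age reported with SE = 0.5 in a group of 100 subjects.
print(se_to_sd(0.5, 100))
```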

Statistical analysis

We estimated the contribution of the AR CAG repeat polymorphism to the risk of prostate cancer using Stata software version 11.0. In this meta-analysis, we used the random-effects model, rather than the fixed-effects model, to pool the individual effect-size estimates. In a fixed-effects model, only sampling error contributes to the differences between the observed effect-size estimates across individual studies. In contrast, a random-effects model has two coexisting sources of variance: sampling error and between-study heterogeneity [11]. Given the heterogeneity between studies, a random-effects model is the appropriate choice [12].
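The distinction between the two models can be illustrated with a short sketch of random-effects pooling. It assumes the DerSimonian-Laird moment estimator of the between-study variance, a common choice that the text above does not name explicitly; the function name and inputs are hypothetical.

```python
import math

def pool_random_effects(log_ors, ses):
    """Pool study log odds ratios with a DerSimonian-Laird random-effects model.

    Returns a dict holding the pooled log OR, its SE, Cochran's Q and tau^2.
    """
    # Fixed-effects (inverse-variance) weights reflect sampling error only.
    w = [1 / se ** 2 for se in ses]
    sum_w = sum(w)
    fixed = sum(wi * y for wi, y in zip(w, log_ors)) / sum_w

    # Cochran's Q measures the spread of study estimates around the fixed estimate.
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_ors))
    df = len(log_ors) - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w

    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each study's sampling variance.
    w_star = [1 / (se ** 2 + tau2) for se in ses]
    sum_w_star = sum(w_star)
    pooled = sum(wi * y for wi, y in zip(w_star, log_ors)) / sum_w_star
    return {"log_or": pooled, "se": math.sqrt(1 / sum_w_star), "q": q, "tau2": tau2}

# Hypothetical study-level inputs (log OR and its SE), for illustration only.
result = pool_random_effects([-0.5, 0.1, -0.3, 0.2], [0.1, 0.1, 0.1, 0.1])
print(math.exp(result["log_or"]), result["tau2"])
```

When the estimated between-study variance is zero, the random-effects weights reduce to the fixed-effects weights and the two models give the same pooled estimate.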

We performed subgroup analysis either by removing an individual study each time or by assessing the effect within different ethnic groups. Moreover, meta-regression was employed to estimate the extent to which one or more covariates explain the heterogeneity. The meta-regression model, as an extension of random-effects meta-analysis, relates the treatment effect to the study-level covariates, assuming a normal distribution for the residual errors with both a within-study and an additive between-studies component of variance.
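The meta-regression model described above can be written compactly for a single study-level covariate. The notation is an assumption introduced here for illustration: y_i is the log OR of study i, x_i its covariate value (for example, mean age of controls), σ_i² its within-study variance and τ² the additive between-studies variance.

```latex
y_i = \beta_0 + \beta_1 x_i + u_i + \varepsilon_i,
\qquad u_i \sim N(0, \tau^2),
\qquad \varepsilon_i \sim N(0, \sigma_i^2)
```

Under this formulation, a covariate explains heterogeneity to the extent that including it reduces the estimated τ² relative to the model without covariates.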

Furthermore, we assessed publication bias using the fail-safe number (Nfs), with the significance set at 0.05 for each meta-comparison, together with funnel plots and Egger’s regression asymmetry test. If the calculated Nfs value is smaller than the number of observed studies, the meta-analysis results may be subject to publication bias. We calculated Nfs0.05 according to the formula $N_{fs0.05} = (\sum Z / 1.64)^2 - k$, where k is the number of articles included in the meta-analysis. Egger’s test detects funnel plot asymmetry by determining whether the intercept deviates significantly from zero in a regression of the standardized effect estimates against their precision.
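Both publication-bias checks can be sketched in a few lines. The fail-safe number follows the formula given above; the Egger part is a plain ordinary least-squares fit of the standardized effect (log OR divided by its SE) on precision (1/SE), which returns the intercept and its t statistic but no P value. Function names and inputs are hypothetical.

```python
import math

def fail_safe_n(z_scores, z_crit=1.64):
    """Fail-safe number: (sum of study Z values / 1.64)^2 minus the number of studies."""
    k = len(z_scores)
    return (sum(z_scores) / z_crit) ** 2 - k

def egger_intercept(log_ors, ses):
    """Egger's regression: standardized effect regressed on precision by OLS.

    Returns (intercept, se_of_intercept, t_statistic). Needs at least three
    studies whose SEs are not all identical.
    """
    n = len(log_ors)
    if n < 3:
        raise ValueError("need at least three studies")
    x = [1 / se for se in ses]                    # precision
    y = [e / se for e, se in zip(log_ors, ses)]   # standardized effect
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    # The residual variance gives the standard error of the intercept.
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_int = math.sqrt(rss / (n - 2) * (1 / n + mean_x ** 2 / sxx))
    t_stat = intercept / se_int if se_int > 0 else float("nan")
    return intercept, se_int, t_stat

# Hypothetical inputs, for illustration only.
print(fail_safe_n([2.0, 1.5, 2.5]))
print(egger_intercept([0.1, 0.3, 0.6, 0.9], [0.1, 0.2, 0.3, 0.4]))
```

An intercept close to zero is consistent with a symmetric funnel plot; the t statistic would be compared against a t distribution with n − 2 degrees of freedom to obtain the P value.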

Results

Search of reports

Because not every qualified study had provided the specific distributions of AR CAG repeat counts, we focused on three widely evaluated dichotomous comparisons, viz. ≥23 repeats of CAG sequence versus others, ≥22 repeats versus others and ≥20 repeats versus others. Our search and selection process is described in Fig. 1. After an extensive literature search, we finally identified 27 reports that satisfied our inclusion criteria and conducted at least one of the aforementioned comparisons [13–39]. Of these, five of the 27 reports listed the specific counts of each CAG repeat in patients and controls [13, 17, 19–21]. Additionally, two studies involved more than one study group, and these groups were considered separately [21, 26].

Fig. 1

Flow diagram of search strategy and study selection

Characteristics of qualified studies

In total, there were 14 reports [13–26] involving 16 study groups including 2,972 patients and 3,792 controls comparing ≥20 CAG repeats with others, 17 reports [13–17, 19–23, 26–32] involving 19 study groups including 3,835 patients and 4,774 controls comparing ≥22 repeats with others, and 11 reports [13, 17, 19–21, 33–38] involving 13 study groups including 3,372 patients and 2,631 controls comparing ≥23 repeats with others. Detailed information regarding study design, diagnostic methods of prostate cancer, country or ethnicity, and age is presented in Supplementary Table S1.

Main comparison of AR CAG repeat polymorphism

Compared with carriers of <20, <22 or <23 CAG repeats, carriers of ≥20, ≥22 or ≥23 repeats had a 21% (95% CI: 0.61–1.02; P = 0.076, Fig. 2), 5% (95% CI: 0.81–1.11; P = 0.508, Fig. 3) and 5% (95% CI: 0.76–1.20; P = 0.681, Fig. 4) decreased risk of prostate cancer, respectively, in the random-effects model; none of these reductions reached statistical significance.
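The percentages, confidence limits and P values above are linked through the log odds ratio. As a rough consistency check, the sketch below recovers the SE of the log OR from the 95% CI and then a two-sided P value from the normal approximation. The OR of 0.79 is an assumption inferred from the stated 21% risk reduction, and the function names are hypothetical.

```python
import math

def two_sided_p(z):
    """Two-sided P value for a standard normal Z statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

def p_from_or_ci(odds_ratio, lower, upper, z_crit=1.96):
    """Back-calculate (SE of log OR, Z, P) from an OR and its 95% CI."""
    se_log_or = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(odds_ratio) / se_log_or
    return se_log_or, z, two_sided_p(z)

# Comparison of >=20 repeats versus others: OR assumed to be 1 - 0.21 = 0.79.
se, z, p = p_from_or_ci(0.79, 0.61, 1.02)
print(f"SE = {se:.3f}, Z = {z:.2f}, P = {p:.3f}")
```

The result should land near the reported P = 0.076; a small difference is expected because the OR and confidence limits are rounded to two decimals.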

Fig. 2

The overall and separate contrast of AR CAG polymorphism of greater than or equal to 20 repeats versus others by both study design (the upper panel) and geographic region (the lower panel). The summary treatment effect (odds ratio or OR) is shown by the middle of a solid diamond whose left and right extremes represent the corresponding 95% confidence interval (95% CI)

Fig. 3

The overall and separate contrast of AR CAG polymorphism of greater than or equal to 22 repeats versus others by both study design (the upper panel) and geographic region (the lower panel). The summary treatment effect (odds ratio or OR) is shown by the middle of a solid diamond whose left and right extremes represent the corresponding 95% confidence interval (95% CI)

Fig. 4

The overall and separate contrast of AR CAG polymorphism of greater than or equal to 23 repeats versus others by both study design (the upper panel) and geographic region (the lower panel). The summary treatment effect (odds ratio or OR) is shown by the middle of a solid diamond whose left and right extremes represent the corresponding 95% confidence interval (95% CI)

Subgroup analysis

Considering that prostate cancer tends to develop in men over the age of 50, we removed studies whose controls were younger than 45 years in a sensitivity analysis. For the comparison of ≥20 CAG repeats with others, we then observed a significant protective effect of this polymorphism on prostate cancer (OR = 0.74; 95% CI: 0.60–0.91; P = 0.004).

Moreover, since different study designs (population-based and hospital-based) might bias the association results, we analyzed all studies according to study design. The magnitude of association in population-based studies was slightly stronger than in hospital-based studies for comparisons of ≥20 repeats (Fig. 2) and ≥23 repeats (Fig. 4). For example, compared with carriers of <20 CAG repeats, carriers of ≥20 repeats had a 22% decreased risk (P = 0.16) in population-based studies and a 16% decreased risk (P = 0.313) in hospital-based studies (Fig. 2).

Furthermore, after classifying studies according to main geographic regions (USA, Europe and Asia [and Brazil if available]), carriers of ≥20 repeats had an 11% decreased risk in populations from the USA, a 53% decreased risk in populations from Europe, and a 20% decreased risk in populations from Asia, compared with carriers of <20 CAG repeats, although none of these estimates reached statistical significance (P = 0.446, 0.061 and 0.291, respectively). A similar magnitude was noted in USA populations (OR = 0.90; P = 0.187) for the comparison of ≥22 repeats with others, and this magnitude was attenuated in European populations (OR = 0.97; P = 0.796), whereas this comparison yielded a possible increased risk for populations from Asia (OR = 1.32; P = 0.456) but not from Brazil (OR = 0.71; P = 0.261). In contrast, the comparison of ≥23 repeats with others yielded a significant association with prostate cancer in European populations only (OR = 1.17; P = 0.039).

Meta-regression analysis

Testing the effects of age (in both cases and controls), study design, and geographic region on heterogeneity among the individual ORs showed that none of these confounders was a significant source of between-study heterogeneity for any comparison of CAG repeats (data not shown).

Publication bias

To assess publication bias, we calculated the Nfs at the significance level of 0.05 for each comparison. For the contrasts of ≥20 repeats of CAG sequence versus others, ≥22 repeats versus others, and ≥23 repeats versus others, the Nfs0.05 values were 208, 194 and 135, respectively, which were much greater than the number of included studies (n = 16, 19 and 12). In addition to the apparent symmetry of the funnel plots (Fig. 5), Egger’s test indicated a low probability of publication bias for the allelic comparisons (P = 0.132 for ≥20 repeats; P = 0.838 for ≥22 repeats; P = 0.299 for ≥23 repeats).

Fig. 5

Funnel plots for studies investigating the effect of AR CAG polymorphism on prostate cancer for ≥20 repeats (a), ≥22 repeats (b) and ≥23 repeats (c), respectively. Vertical axis represents the log of OR; horizontal axis represents the SE of log(OR). Funnel plots are drawn with 95% confidence limits. OR odds ratio, SE standard error. The graphic symbols representing the data in the plot are sized in proportion to the inverse variance

Discussion

Our meta-analysis found that the AR CAG repeat polymorphism with ≥20 repeats might confer a protective effect on prostate cancer risk in subjects aged 45 years or older. To improve current therapy and prevention strategies, it is essential to explain inter-individual differences in susceptibility to prostate cancer [39, 40]. Inherited factors such as genetic polymorphisms involved in carcinogenesis might account for these differences. Researchers have paid particular attention to AR exonic CAG repeats and their association with prostate cancer susceptibility [41, 42]. However, the inconsistent results have undermined the predictive value of this genetic variation. We therefore conducted a meta-analysis to systematically address this issue.

We observed that the AR CAG repeat polymorphism with ≥20 repeats might confer a protective effect among subjects aged 45 years or older, but not among all subjects. To avoid a premature conclusion, we acknowledge that the data were based on three widely evaluated dichotomous comparisons. Considering the statistical power, we did not evaluate other comparisons. It has been suggested that, to generate robust data, a much larger sample size including >1,000 subjects in each group is required [43]. Our sample size in each comparison supports our ability to detect moderate effects of the genotypes.

Since prostate cancer is a late-onset disease [44], we removed studies with control groups including subjects less than 45 years old, and found that the magnitude of association was strengthened for the comparison of ≥20 repeats of CAG sequence with others. Given that age is a surrogate for a host of unmeasured attributes, we speculate that the CAG repeat polymorphism might interact with age, so that incorporating a gene-age interaction might resolve part of the inconsistencies. In addition, we cannot exclude the possibility of a moderate effect of the CAG repeat polymorphism on prostate cancer in view of the marginal associations. This polymorphism might be in linkage with other genes or polymorphisms within or near the AR that drive the malignant phenotype. A large, well-performed study focusing on both gene–gene and gene-environment interactions is needed to explore the mechanism of prostate carcinogenesis.

Moreover, the incidence of prostate cancer varies widely across the world, so we conducted subgroup analysis to assess this geographic effect. Of note, we observed a marginally protective effect of ≥20 CAG repeats in European populations. Additionally, although no significance was reached in populations from the USA, the magnitude of the association with the AR CAG repeat polymorphism remained almost the same across different comparisons. This divergence may be due to the different genetic backgrounds among ethnic groups. It is necessary to construct a database of polymorphisms associated with prostate cancer in each racial/ethnic group.

In addition, we separately performed analyses by study design, and found that the associations differed slightly between population-based and hospital-based studies for all comparisons. Further meta-regression analysis did not identify study design, age or geographic region as a significant source of between-study heterogeneity. Although some statistical bias could not be eliminated and there was an indication of significant between-study heterogeneity, there was no evidence of publication bias for any comparison in this meta-analysis, as reflected by both the fail-safe number and Egger’s test, which supports the robustness of our results.

The strengths of our study include a large sample size and no indication of publication bias. However, some limitations should be considered. First, all included studies had a cross-sectional design, which precludes further comment on cause-effect relationships. Second, due to the relatively small sample size of some studies or the lack of necessary information, we were unable to perform further subgroup analyses. Third, we could not retrieve common information on confounding factors such as smoking and drinking from all the original publications. Thus, no firm conclusion can be drawn until our finding is confirmed and validated.

In summary, this meta-analysis has extended previous findings on the association between the AR CAG repeat polymorphism and prostate cancer risk. We found that the AR CAG repeat polymorphism with ≥20 repeats might confer a protective effect among subjects aged 45 years or older, but not among all subjects. Further studies are required to investigate genetic markers adjacent to AR and to determine whether the present association is causal or due to linkage disequilibrium.