Introduction

Polycystic ovary syndrome (PCOS), first reported in 1935 [1], is a common endocrine disorder in reproductive-age women [2], with three major features: hyperandrogenism, ovulatory dysfunction, and polycystic ovaries. Women with PCOS may also develop related endocrine and metabolic diseases and are at risk of impaired glucose tolerance, endometrial cancer, and cardiovascular disease [3, 4]. The pathogenesis of PCOS remains largely unclear, although a growing number of studies have implicated several genes [5].

The androgen receptor (AR) gene, located at Xq11-12, comprises eight exons and seven introns, spans 90 kb, and carries polymorphic CAG and GGC trinucleotide repeats in exon 1. The AR protein, a member of the steroid hormone receptor family, is composed of four functional domains: the N-terminal domain (NTD), the DNA-binding domain (DBD), the flexible hinge region, and the ligand-binding domain (LBD). Structural analysis has demonstrated that the NTD, encoded by exon 1 of the AR gene, is related to AR transactivational activity [6]. This domain contains two kinds of homopolymer tract: polyglutamine stretches encoded by the CAG trinucleotide repeat segment and polyglycine stretches encoded by the GGC trinucleotide repeat segment [7, 8]. The normal CAG repeat number varies from 11 to 36 [9]. Previous in vitro studies have demonstrated that CAG repeat number is inversely correlated with AR activity [9–14]. Accordingly, shorter CAG repeats, which increase AR activity, are associated with prostate cancer [15, 16], hirsutism [17], and ovarian hyperandrogenism [18, 19], while longer CAG repeats, which decrease AR activity, are related to hypoandrogenicity [20] and male infertility due to impaired sperm production [10, 21]. CAG tracts beyond the normal range (>40 CAG repeats) lead to Kennedy’s syndrome, a fatal neuromuscular disease [22].

These findings suggest that the CAG repeat polymorphism of the AR gene may participate in the development of PCOS by altering the AR, which interacts with androgens and transmits specific biochemical signals. Many studies have therefore investigated the relationship between AR gene CAG repeats and PCOS, but with inconsistent results [19, 23–33]. Given these inconsistencies, we performed the current case–control study and a meta-analysis to examine this association.

Materials and methods

Subjects

We recruited 224 PCOS patients (mean age, 28.75 ± 3.4 years) and 223 infertile women as controls (mean age, 29.35 ± 4.0 years) at the Reproductive and Genetic Hospital of CITIC-XIANGYA in China. All women were Han Chinese and were undergoing in vitro fertilization (IVF) for infertility. PCOS was diagnosed according to the revised 2003 Rotterdam criteria [34], which require two of the following three features: oligo- and/or anovulation; clinical and/or biochemical signs of hyperandrogenism; and polycystic ovaries. Other causes of hyperandrogenism, such as congenital adrenal hyperplasia, androgen-secreting tumors, or Cushing’s syndrome, were excluded. The PCOS patients included in the current study showed polycystic ovaries (>12 follicles in one or both ovaries) together with elevated serum total testosterone (TT ≥ 0.7 ng/ml), or polycystic ovaries together with a menstrual cycle longer than 37 days.

The 223 women selected as the control group were infertile because of tubal obstruction or male-factor infertility, and had no structural abnormalities of the uterus or ovaries, irregular menstrual cycles, family history of hypertension, cardiovascular disease, or family history of diabetes. The study was approved by the Institutional Review Board of the Reproductive and Genetic Hospital of CITIC-XIANGYA, and informed consent was obtained from all patients and controls.

Analysis of CAG repeat length

Genomic DNA was extracted from peripheral blood using QIAsymphony DNA Kits (QIAGEN, Suzhou, China) according to the manufacturer’s instructions, then stored in TE buffer (Tris-HCl and EDTA). The CAG repeat tracts were amplified using the forward primer (5′-CCTGTTGAACTCTTCTGAG-3′) and the reverse primer (5′-TCTGGGACGCAACCTCTCTC-3′). The 10 μl reaction mix contained 50 ng DNA, 10 μM of each primer, 5 mM MgCl2, 2.5 mM of each dNTP, 1.0 μl 10× buffer, and 1 unit of Taq DNA polymerase (MBI Inc., Burlington, Canada). Polymerase chain reaction (PCR) was carried out as follows: activation of Taq DNA polymerase at 92 °C for 2 min, then 30 cycles (denaturation at 94 °C for 40 s, annealing at 61 °C for 40 s, extension at 72 °C for 40 s), and a final extension at 72 °C for 5 min. Then, 4 μl of PCR product was subjected to agarose gel electrophoresis to identify the amplified (CAG)n repeat fragments, and 5 μl of PCR product was separated by 8 % vertical polyacrylamide gel electrophoresis, which yielded two bands for heterozygous samples. After excision and purification, a further agarose gel electrophoresis was performed to separate the two bands of heterozygous samples, yielding a single band for each. Finally, the CAG repeat numbers of the homozygous bands and of each single band from heterozygous samples were determined directly after purification.

Clinical and biochemical assay

Age, body weight (BW), and height (H) were obtained from clinical case notes. Body mass index (BMI) was calculated as BW/H² (kg/m²). Follicle-stimulating hormone (FSH), luteinising hormone (LH), oestradiol (E2), and testosterone (T) were measured on days 2–4 of the menstrual cycle, or during amenorrhea in amenorrheic PCOS patients.

Statistical analysis

The AR gene CAG repeat length and the clinical and biochemical characteristics of PCOS patients and controls were compared by independent two-sample t-test. The bi-allelic mean CAG length was divided into two categories (≤22, >22), and the chi-square test was used to compare the distribution of AR-CAG alleles. PCOS patients were categorized into hyperandrogenic (TT ≥ 0.7 ng/ml) and nonhyperandrogenic (TT < 0.7 ng/ml) subgroups. Differences in serum T levels between genotypes were tested by analysis of variance (ANOVA), and Pearson’s correlation coefficient was used to analyze the correlation between CAG repeats and T.
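
The categorical comparison and the correlation analysis described above can be sketched in a few lines of Python. This is an illustration only, not the SPSS procedure used in the study: the function names and all input values are hypothetical, and computing the chi-square statistic without continuity correction is an assumption.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for a 2x2 table.

    Rows are groups (e.g. cases, controls); columns are categories
    (e.g. bi-allelic mean <= 22, > 22). Returns (chi2, p) for 1 degree
    of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical counts (NOT the values of Table 2): short / long in each group.
chi2, p = chi_square_2x2(60, 40, 40, 60)

# Hypothetical CAG bi-allelic means and total T values (ng/ml).
r = pearson_r([20, 21, 22, 23, 24], [0.4, 0.5, 0.5, 0.6, 0.7])
```

The closed-form p-value holds only for a 2×2 table (one degree of freedom); larger tables would need a general chi-square distribution function.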

All data were analyzed with SPSS software (version 11.5; SPSS Inc., Chicago, IL, USA). All continuous data are presented as mean ± SD. P < 0.05 was considered statistically significant.

Meta-analysis

Although many studies have been performed, there is still no consistent conclusion about the relationship between CAG repeat length and PCOS. We therefore carried out a meta-analysis, including our current case–control study, to obtain a comprehensive estimate.

Search strategy

A comprehensive literature search was performed by two independent authors to find all eligible articles exploring the association between the CAG repeat length polymorphism and PCOS. PubMed and Science Direct were searched (up to October 31, 2013) using the following key words: “Polycystic Ovary Syndrome”, “PCOS”, “PCO”, “androgen receptor”, “AR”, “CAG”, “trinucleotide”. The search was conducted without language limitation. We reviewed the titles and abstracts of all potential papers and then considered whether these studies fulfilled the inclusion criteria. When eligibility was unclear, we screened the full text for more information. Additional studies were identified by manually searching the reference lists of retrieved articles.

Study selection

The inclusion criteria were: (1) case–control studies concerned with the association between CAG repeat length polymorphism and PCOS; (2) sufficient data available to calculate the standardized mean difference (SMD) or odds ratio (OR) with the corresponding 95 % confidence intervals (95 % CIs); (3) PCOS defined by the National Institutes of Health (NIH) criteria, the 2003 Rotterdam criteria, or other suitable criteria.

The exclusion criteria were: (1) studies without sufficient raw data; (2) studies without a control group, or studies focused on non-human genes; (3) family-based studies.

Data extraction and analysis

Two researchers independently extracted data from eligible studies. The extracted data were as follows: the first author’s name, publication year, geography, location, sample size, bi-allelic mean of CAG repeat length, and method.

All data were managed and analyzed with Review Manager 5.2 and Stata 11.0. For continuous data, the standardized mean difference (SMD) was calculated as the effect size (ES), while for dichotomous data, the odds ratio (OR) was computed as the ES. P < 0.05 was considered statistically significant.
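
The two effect sizes named above can be illustrated with a minimal sketch. It is not the Review Manager or Stata implementation: the function names are hypothetical, the SMD is computed as Cohen's d with a pooled SD (software packages may additionally apply a small-sample correction), the OR interval uses the standard log-scale approximation, and all inputs are made-up values.

```python
import math


def smd_with_ci(mean1, sd1, n1, mean2, sd2, n2, z=1.96):
    """Standardized mean difference (Cohen's d) with an approximate 95 % CI."""
    pooled_sd = math.sqrt(
        ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    )
    d = (mean1 - mean2) / pooled_sd
    # Large-sample standard error of d.
    se = math.sqrt((n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2)))
    return d, d - z * se, d + z * se


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio for a 2x2 table [[a, b], [c, d]] with a log-scale 95 % CI."""
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    return odds_ratio, math.exp(log_or - z * se_log), math.exp(log_or + z * se_log)


# Hypothetical inputs, not taken from any included study.
smd, smd_lo, smd_hi = smd_with_ci(22.0, 2.0, 100, 23.0, 2.0, 100)
or_, or_lo, or_hi = odds_ratio_with_ci(60, 40, 40, 60)
```

An interval that excludes 0 (for the SMD) or 1 (for the OR) corresponds to P < 0.05 under these approximations.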

Heterogeneity between studies was assessed with the Q statistic and the I² statistic. For the Q statistic, the significance level was P < 0.1; for I², values of <25 %, 25–75 %, and >75 % were taken to indicate low, moderate, and high heterogeneity, respectively. If I² < 50 %, a fixed effect model was applied; otherwise, a random effect model was applied. Publication bias was checked by Begg’s test and Egger’s test. For sensitivity analysis, we removed one study at a time and recalculated the overall ES.
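
The model-selection rule above can be made concrete with a short sketch. It assumes inverse-variance weighting and the DerSimonian–Laird estimate of between-study variance for the random effect model; the text does not state which estimator the software used, so this choice, the function name, and the example inputs are assumptions.

```python
import math


def pool_effects(effects, variances, i2_threshold=50.0):
    """Pool study effect sizes, choosing the model from I² as described above."""
    weights = [1 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * e for w, e in zip(weights, effects)) / sum_w

    # Cochran's Q and the I² statistic (percentage, floored at 0).
    q = sum(w * (e - fixed) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    if i2 < i2_threshold:
        model, pooled, se = "fixed", fixed, math.sqrt(1 / sum_w)
    else:
        # DerSimonian-Laird between-study variance (assumed estimator).
        c = sum_w - sum(w ** 2 for w in weights) / sum_w
        tau2 = max(0.0, (q - df) / c)
        re_weights = [1 / (v + tau2) for v in variances]
        pooled = sum(w * e for w, e in zip(re_weights, effects)) / sum(re_weights)
        se = math.sqrt(1 / sum(re_weights))
        model = "random"

    return {"model": model, "pooled": pooled, "se": se, "q": q, "i2": i2}


# Hypothetical SMDs and variances for three studies.
homogeneous = pool_effects([0.1, 0.1, 0.1], [0.01, 0.02, 0.03])
heterogeneous = pool_effects([-0.5, 0.0, 0.5], [0.01, 0.01, 0.01])
```

With identical study effects Q is zero and the fixed effect model is selected; with widely scattered effects I² exceeds 50 % and the random effect model, with its wider standard error, is used.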

Results

Clinical and biochemical data analysis

The clinical and biochemical features of the PCOS patients and controls are described in Table 1. There was no significant difference in age between patients and controls (28.75 ± 3.4 vs. 29.35 ± 4.0, P > 0.05), whereas BMI differed significantly (22.25 ± 3.4 vs. 21.16 ± 2.5, P = 0.001). Compared with controls, PCOS patients had significantly higher levels of LH, E2, and total T, and significantly lower levels of FSH.

Table 1 Clinical and biochemical characteristics in the PCOS patients and controls

Comparison of the CAG repeat length polymorphism

The CAG repeat length ranged from 12 to 28 in PCOS women and from 18 to 32 in controls (Fig. 1). No significant difference was observed in the mean CAG repeat number between the case group and the control group (22.65 ± 2.5 vs. 23.09 ± 2.1, P = 0.116). When the CAG bi-allelic mean was categorized as short (≤22 repeats) or long (>22 repeats), the distribution differed significantly between cases and controls (P < 0.001; Table 2). In PCOS patients, the short category was more frequent than the long category, whereas the opposite was true in controls (Fig. 2).

Fig. 1 Distribution of CAG biallelic mean in PCOS patients and controls

Table 2 Distribution of CAG biallelic mean, according to the qualitative categories of CAG repeats

Fig. 2 Frequency distribution of CAG bi-allelic mean, divided into 2 categories (≤22, >22)

Association between the CAG repeat length polymorphism and testosterone in PCOS patients

Given that many previous studies have reported on the relationship between AR-CAG repeats and T without consistent results, we investigated this association again. PCOS patients were divided into two groups: hyperandrogenic (total T ≥ 0.7 ng/ml) and nonhyperandrogenic (total T < 0.7 ng/ml). Compared with the hyperandrogenic group, the nonhyperandrogenic group had a lower CAG repeat length for the bi-allelic average (22.3 ± 2.5 vs. 23.9 ± 2.2, P = 0.008), the short bi-allelic category (21.0 ± 2.0 vs. 21.5 ± 0.8, P = 0.567), and the long bi-allelic category (24.3 ± 1.4 vs. 25.9 ± 1.6, P = 0.005), as shown in Table 3; the difference was not significant for the short category. These results suggest that CAG repeat length may be positively correlated with T. Pearson’s correlation analysis also showed a significant correlation between CAG repeat length and T (r = 0.237, P = 0.008). In addition, the level of T showed an upward trend across three subgroups of patients (mean CAG repeat <22, 22, >22), being 0.45 ± 0.15, 0.53 ± 0.16, and 0.57 ± 0.23, respectively, which was confirmed by ANOVA (F = 3.121, P = 0.048) and regression analysis (P = 0.015), as depicted in Fig. 3.

Table 3 AR-CAG repeat lengths in PCOS women with below or above normal total testosterone levels

Fig. 3 Testosterone differences among the three groups of PCOS patients

Meta-analysis

Eligible studies

A total of 712 articles were found in PubMed and Science Direct. After careful review, ten studies [23–32] and the current study, comprising 1,536 PCOS patients and 1,807 controls, were included in this meta-analysis. The flow diagram of study selection is presented in Fig. 4. All eligible studies examined the relationship between the AR gene (CAG)n polymorphism and PCOS, and four of them assessed X-chromosome inactivation. To assess the relationship between CAG repeat length and T concentration, six studies divided PCOS patients into two subgroups according to T concentration and compared mean CAG repeat numbers, and four studies calculated Pearson’s correlation coefficient. Of the eleven studies, only ours was unpublished; the other ten were published between 2000 and 2013. The characteristics of the included studies are listed in Table 4.

Fig. 4 Flow diagram of study selection

Table 4 Characteristics of the 11 studies included in the present meta-analysis

Quantitative data synthesis

Eleven studies compared the CAG bi-allelic mean repeats between patients and controls, one of which observed no statistical difference [31]. Heterogeneity was significant (I² = 56 %, P = 0.001), so a random effect model was used; the meta-analysis did not show a significant association between CAG bi-allelic mean repeats and PCOS (SMD = −0.05, 95 % CI = −0.24 to 0.09, P = 0.38) (Fig. 5).

Fig. 5 Meta-analysis of the association between CAG bi-allelic mean and PCOS

Subgroup analysis by ethnicity was conducted to explore the source of heterogeneity (Fig. 5). In the Eastern subgroup, heterogeneity was significant (I² = 70 %, P = 0.005), so we used a random effect model; the overall ES was not significant (SMD = −0.07, 95 % CI = −0.24 to 0.09). In the Western subgroup, heterogeneity was not significant (I² = 27 %, P = 0.24), and the overall ES was likewise not significant (SMD = −0.01, 95 % CI = −0.15 to 0.13). Neither subgroup showed a significant association between CAG bi-allelic mean repeats and PCOS, and heterogeneity did not decrease, suggesting that ethnicity may not be the source of heterogeneity.

As shown in Fig. 6, six studies including 363 hyperandrogenic and 477 nonhyperandrogenic PCOS patients were used to analyze the association between CAG repeat length and the level of total T. The meta-analysis revealed a significant difference (SMD = −0.59, 95 % CI = −1.17 to −0.01, P = 0.05), with high statistical heterogeneity (I² = 94 %, P < 0.00001). In addition, a meta-analysis was performed on the correlation coefficient r between CAG repeat length and the level of total T in cases (Fig. 7). Heterogeneity between studies was below the 50 % threshold (I² = 47 %, P = 0.13), so a fixed effect model was used. The overall effect was significant (r = 0.21, 95 % CI = 0.13–0.28, P < 0.00001).
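
Pooling correlation coefficients through Fisher's Z transformation under a fixed effect model can be sketched as follows. The function name and the example values are hypothetical, and weighting each study by n − 3 (the inverse variance of Fisher's z) is the textbook approach, assumed here rather than taken from the software settings.

```python
import math


def pool_correlations(rs, ns, z_crit=1.96):
    """Fixed-effect pooling of Pearson correlations via Fisher's z."""
    zs = [math.atanh(r) for r in rs]   # Fisher transformation of each r
    weights = [n - 3 for n in ns]      # Var(z) = 1 / (n - 3)
    z_bar = sum(w * z for w, z in zip(weights, zs)) / sum(weights)
    se = 1 / math.sqrt(sum(weights))
    # Back-transform the pooled estimate and its confidence limits to r.
    return (
        math.tanh(z_bar),
        math.tanh(z_bar - z_crit * se),
        math.tanh(z_bar + z_crit * se),
    )


# Hypothetical correlations and sample sizes for three studies.
pooled_r, r_lo, r_hi = pool_correlations([0.15, 0.20, 0.30], [80, 120, 150])
```

Because the transformation is monotonic, the pooled r always lies between the smallest and largest study-level correlations.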

Fig. 6 Meta-analysis of the association between CAG bi-allelic mean and hyperandrogenism in PCOS women

Fig. 7 Fisher’s Z comparison of the association between CAG biallelic mean and testosterone levels

To obtain more reliable results, sensitivity analysis was conducted by removing one study at a time. The pooled SMD did not change significantly, suggesting that the results were reliable. However, meta-regression including three independent variables (sample size, PCOS definition, and ethnicity) failed to identify the potential source of heterogeneity because of limited available information (P = 0.07, 0.56, 0.37). Egger’s test (P > 0.119) and Begg’s test (P > 0.379) showed no significant publication bias.
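
The leave-one-out procedure can be sketched as below. For brevity the sketch re-pools with a fixed effect inverse-variance model, which is an assumption (the main analysis used a random effect model); the function names and data are hypothetical.

```python
def fixed_effect(effects, variances):
    """Inverse-variance weighted mean of the study effect sizes."""
    weights = [1 / v for v in variances]
    return sum(w * e for w, e in zip(weights, effects)) / sum(weights)


def leave_one_out(effects, variances):
    """Pooled effect recomputed with each study removed in turn."""
    pooled = []
    for i in range(len(effects)):
        remaining_effects = effects[:i] + effects[i + 1:]
        remaining_variances = variances[:i] + variances[i + 1:]
        pooled.append(fixed_effect(remaining_effects, remaining_variances))
    return pooled


# Hypothetical SMDs with equal variances for three studies.
loo = leave_one_out([0.0, 0.2, 0.4], [0.1, 0.1, 0.1])
```

If none of the recomputed estimates departs markedly from the overall pooled value, no single study dominates the result.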

Discussion

PCOS is a common endocrine disorder in women of reproductive age [2]. The AR-CAG repeat length polymorphism has been a research focus in the etiology of PCOS; however, the many studies that have investigated the association between the CAG repeat polymorphism in the AR gene and PCOS have not produced consistent results. The purpose of our study was therefore to assess this association in Han Chinese women to provide more convincing evidence.

According to our results, PCOS patients tended to have a lower mean CAG repeat number than controls, although the difference was not significant, whereas a significant difference was found when the CAG repeat number was divided into two categories. These results differ from those of one previous study [30]. Skrgatic et al. [30] observed that the mean CAG repeat length was higher in PCOS patients than in controls, although not significantly so, and that the distribution of the CAG repeats likewise showed no significant difference. Both our study and that of Skrgatic et al. used a cut-off value of 22 to separate short and long mean repeats. Hence, we infer that the discrepancy may result from differences in CAG repeat length variation between populations. Similar to our results, Xia et al. [35] found that the CAG bi-allelic distribution differed significantly between cases and controls, using 20 as the cut-off value.

As for the association between CAG repeat length and serum testosterone, there is no consistent conclusion. Liu et al. studied 148 Han Chinese women with PCOS and found no difference in mean CAG repeat length between the low and high total T groups. Jääskeläinen et al. [24] reported no relationship between the mean CAG bi-allelic repeat and serum T concentration. However, Kim et al. [27] found a significant correlation between the CAG bi-allelic average and free T concentration, and Skrgatic et al. [30] demonstrated a positive linear association between the CAG bi-allelic repeat length and T level. Our findings support the results of Kim et al. and Skrgatic et al. Unlike other studies, we compared hyperandrogenic and nonhyperandrogenic PCOS patients not only on the mean CAG repeat (P = 0.008), but also on the short CAG bi-allelic repeat (≤22, P = 0.567) and the long CAG bi-allelic repeat (>22, P = 0.005). Interestingly, the short CAG bi-allelic repeat did not differ significantly between the hyperandrogenic and nonhyperandrogenic groups, in contrast to the mean CAG repeat and the long CAG bi-allelic repeat. This may be attributable to the longer allele narrowing the difference in the short CAG bi-allelic repeat between the two groups. To obtain a more reliable conclusion, Pearson’s correlation coefficient was calculated to quantify the correlation between CAG repeat length and serum total T (r = 0.237, P = 0.008); the result is consistent with those of Kim et al. and Skrgatic et al. However, more studies are needed to further establish the relationship between CAG repeat length and serum T.

Given the variation among studies, we performed a meta-analysis to quantify the association between the CAG repeat polymorphism and PCOS. The summary SMD for mean CAG repeat length showed no difference between PCOS patients and controls, and only a difference of marginal significance (P = 0.05), with high heterogeneity, between the hyperandrogenic and nonhyperandrogenic subgroups of PCOS patients. The pooled correlation coefficient, however, confirmed a significant correlation between mean CAG repeat and T. The meta-analysis results are consistent with our case–control study. In the subgroup analysis, heterogeneity was higher in the Eastern subgroup (I² = 70 %) than in the Western subgroup (I² = 27 %), and neither subgroup showed a significant association, indicating that ethnicity might not be the potential source of heterogeneity. However, we did not explore the ethnicity factor further because of insufficient information. Three previous meta-analyses [32, 36, 37] also revealed no relationship between the CAG bi-allelic mean repeat and PCOS. Zhang et al. [36] used the short CAG allele and the X-weighted bi-allelic mean repeat, in addition to the CAG bi-allelic mean length, to assess the association of CAG length with PCOS risk, but the sample size was too small. Wang et al. [37] divided CAG alleles into long and short alleles and then compared the genotype distribution. Both studies suggest that the association of CAG length with PCOS risk should be assessed not only with the mean CAG repeat, but also by considering the longer alleles, X-chromosome inactivation, and the CAG repeat distribution pattern, among other factors, before drawing a comprehensive conclusion.

We did not consider X-chromosome inactivation (XCI) in the present study, which is one limitation of our work. XCI in women occurs early in embryonic development. The AR-CAG alleles are located on the X chromosome, so when XCI occurs, expression of the long and short alleles coding for the AR is affected [38]. Shah et al. [33] found no difference in XCI between cases and controls, but the shorter allele was preferentially active in PCOS women with nonrandom XCI. However, because XCI occurs in many tissues, XCI in blood may not reflect XCI in ovarian tissue [27].

In summary, the current study in Han Chinese women suggests that the distribution of the AR gene CAG bi-allelic mean length differs between PCOS patients and controls. Our study and meta-analysis showed that CAG repeat length is associated with T concentration in PCOS patients. Further study is needed to explore the mechanism by which CAG repeats contribute to the etiology of PCOS.