Introduction

The clinical management of poor ovarian responders (POR) remains a major challenge for fertility care providers. When POR is diagnosed (especially in patients of advanced age), the risk of a low oocyte retrieval rate and in vitro fertilization (IVF) failure is considerable [1,2,3].

According to the recent literature, POR account for a significant proportion of women referred for IVF treatment, ranging from 9 to 24% [2,3,4,5]. Different pharmacological approaches have been proposed to improve the reproductive prognosis in POR, but an effective and internationally accepted strategy is still lacking [6,7,8,10]. In recent years, several studies have evaluated the role of testosterone supplementation before or during controlled ovarian stimulation (COS) in POR, with variable results [10,11,12,13]. The biological rationale of testosterone supplementation for POR is to facilitate the transition of follicles from the quiescent to the growing pool during the early and intermediate stages of follicular maturation [14]. Notably, testosterone may increase the number of pre-antral and antral follicles [15, 16] and augment the expression of FSH receptors in granulosa cells, potentially enhancing ovarian responsiveness to gonadotropins [17, 18].

Nevertheless, although a certain concentration of androgens is needed for proper folliculogenesis, their absolute or relative excess may even be detrimental to follicle development [19, 20]. Therefore, the discussion on the benefits and risks of testosterone supplementation for POR is still open.

In 2014, Luo et al. conducted a meta-analysis evaluating the impact of testosterone pre-treatment on POR undergoing IVF cycles. Although the authors found that transdermal testosterone was effective in improving the success of IVF, the reliability of their results was significantly limited by the small number of studies (n = 3) and patients included, as well as by the various sources of heterogeneity across the studies [21].

From 2014 to 2018, four additional randomized controlled trials (RCTs) explored the effects of testosterone supplementation for POR [10, 22,23,24]. Therefore, we aimed to provide a new summary of evidence on the effectiveness of testosterone supplementation for POR on IVF outcomes.

Methods

Study design

This is a systematic review and meta-analysis of RCTs evaluating the effectiveness of pre-treatment with testosterone on IVF outcomes in POR. The study protocol was registered in PROSPERO before the start of the literature search (CRD42017067270). The review was written following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [25].

Search strategy

Electronic databases (Medline, Scopus, Embase, ScienceDirect, the Cochrane Library, Clinicaltrials.gov, Cochrane Central Register of Controlled Trials, EU Clinical Trials Register, and World Health Organization International Clinical Trials Registry Platform) were searched from their inception until March 2018. The key search terms included the following: “testosterone” [Mesh] AND “poor ovarian responder” OR “diminished ovarian reserve” OR “controlled ovarian stimulation” OR “in-vitro fertilization” OR “assisted reproductive technology.”
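
The search terms above can be assembled into a single Boolean string. The following is a minimal sketch only: the parenthesized grouping (testosterone AND any one of the context terms) is an assumption, because the text lists the operators without stating their precedence, and the helper name build_query is hypothetical.

```python
# Terms are copied from the search strategy; the grouping is an ASSUMPTION.
INTERVENTION = '"testosterone"[Mesh]'
CONTEXT_TERMS = [
    "poor ovarian responder",
    "diminished ovarian reserve",
    "controlled ovarian stimulation",
    "in-vitro fertilization",
    "assisted reproductive technology",
]


def build_query(intervention, context_terms):
    """Hypothetical helper: OR the context terms, then AND with the intervention."""
    grouped = " OR ".join(f'"{term}"' for term in context_terms)
    return f"{intervention} AND ({grouped})"


query = build_query(INTERVENTION, CONTEXT_TERMS)
print(query)
```

Writing the grouping explicitly avoids relying on each database's default operator precedence, which may differ between search interfaces.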

Inclusion criteria

  • Language: studies reported in English

  • Study designs: randomized controlled trials

  • Population: infertile women with poor ovarian response undergoing IVF

  • Intervention: testosterone therapy

  • Timing of intervention: before and/or during the course of ovarian stimulation

  • Comparator: infertile women with poor ovarian response undergoing IVF receiving no intervention or placebo

  • Outcomes

  • Primary outcomes: live birth rate (LBR)

  • Secondary outcomes: clinical pregnancy rate (CPR), miscarriage rate (MR), total oocytes, MII oocytes, total embryos

  • Outcomes definitions

  • Live birth (per woman [LBR]): “live birth” defined as the delivery of one or more living and viable infants

  • Clinical pregnancy rate (per woman [CPR]): defined as the presence of a gestational sac on transvaginal ultrasound or other definitive clinical signs

  • Miscarriage rate (per clinical pregnancy [MR]): defined as fetal loss prior to the 20th week of gestation

  • Total oocytes (per cycle): defined as the number of oocytes (mean ± SD) retrieved at pick-up

  • MII oocytes (per cycle): defined as the number (mean ± SD) of MII oocytes obtained

  • Total embryos (per cycle): defined as the number (mean ± SD) of embryos obtained

Study selection and data extraction

Titles and abstracts were independently screened by two authors (A.V., M.N.). The same authors independently assessed studies for inclusion and extracted data about study features (design and time of the study), populations (participants’ number and characteristics), definition of POR, type of intervention, ovarian stimulation cycles, and IVF outcomes. A manual search of the reference lists of the included studies was also performed in order to avoid missing relevant data. We searched the aforementioned electronic databases for published studies (full-text studies and meeting abstracts) and unpublished studies (i.e., those for which only a registered protocol was available). The results were compared, and any disagreement was resolved by consensus.

Risk of bias

Two authors (A.V., M.N.) independently judged the methodological quality of the included studies using the criteria reported in the Cochrane Handbook for Systematic Reviews of Interventions [26]. Seven domains related to risk of bias were assessed: random sequence generation, allocation concealment, blinding of participants and personnel, blinding of outcome assessment, incomplete outcome data, selective data reporting, and other bias. Authors’ judgements were expressed as “low risk,” “high risk,” or “unclear risk” of bias. For the estimation of “selective data reporting,” we evaluated study protocols, when available. When a protocol was not available, the study was judged at unclear risk of bias for this domain. Results were compared, and disagreements were resolved by consensus.

Statistical analysis

The data analysis was performed independently by the two authors (A.V., M.N.) using Review Manager (version 5.3). All analyses were performed with the random-effects model (DerSimonian and Laird), with an intention-to-treat approach. Variables were compared using the risk ratio (RR) or mean difference (MD), with a 95% confidence interval (95% CI). A p value lower than 0.05 was considered statistically significant. Heterogeneity was measured with the Higgins I². Sources of heterogeneity were explored with sensitivity analysis (by excluding each study and different study subgroups [according to the risk of bias scores] from the aggregate analysis) and subgroup analysis (in order to evaluate the specific influence of different routes and timing of testosterone administration on the aggregate analyses) when at least four studies were included in the pooled analysis. Risk of bias across studies was not measured due to the low number of studies included (according to the Cochrane Handbook recommendation).
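
As an illustration of the pooling approach named above, the following minimal sketch computes a DerSimonian and Laird random-effects risk ratio with its 95% CI and the Higgins I². It is not the Review Manager implementation. The trial counts are hypothetical and are not taken from the included studies, and the use of inverse-variance weights on the log scale is an assumption, since the weighting method is not stated.

```python
import math

# HYPOTHETICAL 2x2 data, not from the included trials:
# (events_testosterone, n_testosterone, events_control, n_control)
TRIALS = [
    (8, 40, 4, 40),
    (6, 30, 2, 30),
    (10, 50, 5, 50),
]


def log_risk_ratio(e_t, n_t, e_c, n_c):
    """Return log(RR) and its variance (no zero-cell correction is applied)."""
    log_rr = math.log((e_t / n_t) / (e_c / n_c))
    variance = 1 / e_t - 1 / n_t + 1 / e_c - 1 / n_c
    return log_rr, variance


def dersimonian_laird(effects, variances, z_crit=1.959964):
    """Pool effects with DerSimonian-Laird random-effects weights."""
    weights = [1 / v for v in variances]  # inverse-variance (fixed-effect) weights
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c)  # between-study variance, truncated at zero
    re_weights = [1 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1 / sum(re_weights))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0  # Higgins I², in percent
    return pooled, se, tau2, i2, (pooled - z_crit * se, pooled + z_crit * se)


pairs = [log_risk_ratio(*trial) for trial in TRIALS]
log_pooled, se, tau2, i2, (log_low, log_high) = dersimonian_laird(
    [y for y, _ in pairs], [v for _, v in pairs]
)
pooled_rr, ci_low, ci_high = math.exp(log_pooled), math.exp(log_low), math.exp(log_high)
print(f"RR {pooled_rr:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f}), I2 = {i2:.0f}%")
```

When Q does not exceed its degrees of freedom, the between-study variance is truncated at zero and the random-effects estimate coincides with the fixed-effect one, which is the situation an I² of 0% describes.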

Results

Study selection

The electronic searches provided a total of 1582 citations. After the removal of 628 duplicate records, 954 citations remained. Of these, 936 records were excluded after title/abstract screening (not relevant to the review). We examined the full text of the 18 remaining manuscripts and excluded 11 of them: one paper due to the lack of data concerning CPR and/or LBR [27], one trial because its data were not analyzable [28], three papers because the design was observational [29,30,31], and six papers because they were reviews/meta-analyses [11,12,13, 21, 32, 33]. Finally, 7 manuscripts were included in the meta-analysis [10, 22,23,24, 34,35,36]. See Fig. S1.
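
The selection flow can be checked with simple arithmetic. The sketch below uses only the counts reported in this paragraph; the variable names and the short labels for the exclusion reasons are illustrative. The itemized reasons sum to eleven full-text exclusions, which is what leaves seven included manuscripts.

```python
# Counts as reported in the study selection paragraph.
identified = 1582
duplicates = 628
excluded_on_screening = 936

# Itemized full-text exclusions (labels are paraphrased, counts are from the text).
full_text_exclusions = {
    "no CPR/LBR data": 1,
    "data not analyzable": 1,
    "observational design": 3,
    "review or meta-analysis": 6,
}

screened = identified - duplicates               # records after de-duplication
full_text = screened - excluded_on_screening     # manuscripts read in full
excluded_full_text = sum(full_text_exclusions.values())
included = full_text - excluded_full_text        # manuscripts in the meta-analysis

print(screened, full_text, excluded_full_text, included)
```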

Full details about the included studies are available in Table 1.

Table 1 Data about general features of the manuscripts evaluated in this systematic review and meta-analysis including the dose and the days of transdermal testosterone treatment

Included studies

The seven RCTs included a total of 573 participants. One study was double-blinded [36], and two were single-blinded [10, 24]. The remaining ones were open-label studies [22, 23, 34, 35]. Two studies were placebo-controlled [24, 36]; the remaining compared treatment versus no treatment [10, 22, 23, 34, 35].

Patients

All women had a prior diagnosis of POR. In four studies, POR was defined according to the Bologna criteria [10, 22,23,24]. In the remaining studies [34, 35, 36], specific definitions for POR were used (details reported in Table 1). Detailed information about patients’ features (i.e., mean age, BMI, duration of infertility, AMH [anti-Müllerian hormone], AFC [antral follicle count], serum basal FSH, LH, and testosterone levels) is reported in Table S1.

COS cycles

Three studies employed a long GnRH-agonist protocol with high starting dose of gonadotropins (300–450 IU) [10, 34, 36], while in three additional trials, a short-antagonist protocol with 300 IU starting dose of gonadotropins was employed [22, 24, 35]. Mskhalaya et al. did not specify the type of protocol [23].

Recombinant FSH (rFSH) was used in all trials [10, 22, 24, 34] except one, which did not specify the type of gonadotropin [23]. For details about the COS protocol, type of gonadotropin, and dosage, see Table 1 and Table S2.

In all studies except two (in which these data were not specified [23, 24]), embryo transfer was performed after two to three days of in vitro culture [10, 22, 34,35,36]. The number of embryos transferred in each study is reported in Table S3.

Concerning luteal phase support (LPS), two trials employed 600 mg of micronized progesterone daily [10, 34]; one trial employed 400 mg of micronized progesterone daily plus 2500 IU of hCG three, six, and nine days after ovulation induction [36]; and two trials employed progesterone gel 90 mg daily [22, 35]. Finally, the remaining two studies did not specify the type of LPS [23, 24].

Type, dose, and duration of intervention

In all trials except one [23] (which used oral testosterone undecanoate), testosterone was administered transdermally. The formulation of the drug was a gel in five trials [10, 22, 24, 35, 36] and a patch in the study by Fabregues et al. [34].

The daily dose of testosterone varied among studies: two trials administered 1% gel at 12.5 mg/day [22, 35]; two trials administered 1% and 2% gel, respectively, at 10 mg/day [10, 36]; one trial used a patch at 2.5 mg/day [34]; one trial used 1% gel at 25 mg/day [24]; and one trial used the oral route at 40 mg/day [23].

Testosterone was administered before starting COS for two to three weeks in four trials [10, 22, 35, 36] and for at least 40 days in one trial [23]. Fabregues et al. administered testosterone for only five days preceding COS [34], whereas Saharkhiz et al. administered testosterone during COS until ovulation induction [24].

Risk of bias

  • Random sequence generation: In the majority of studies, an adequate method of random sequence generation (computer randomization) was employed [10, 22, 34, 35, 36]. Two studies were judged at unclear risk of bias (no information reported) [23, 24].

  • Allocation concealment: No information was reported in the majority of the studies [10, 22,23,24, 34, 35]. Only one study was judged at low risk of bias [36].

  • Blinding of participants and personnel: The study design was blinded in three trials, so they were judged at low risk of bias [10, 24, 36]. The remaining studies were at unclear risk [22, 23, 34, 35].

  • Blinding of outcome assessment: In all studies, the blinding of the assessor was not specified; so, they were judged at unclear risk.

  • Incomplete outcome data: Four studies were judged at low risk of bias because they reported data about our primary outcome and clearly described the included patients and the losses during the intervention [10, 22, 35, 36]. Two studies were judged at high risk of bias because they did not report data about the primary outcome [24, 34]. One trial was judged at unclear risk because no data were reported about patient losses after recruitment and intervention [23].

  • Selective data reporting: Five studies were judged at unclear risk of selective data reporting due to the absence of a registered study protocol [22,23,24, 35, 36]. The remaining two studies were judged at low risk [10, 34]. See a descriptive synthesis of risk of bias in Fig. S2.

  • Other bias: In the study by Fabregues et al., the two randomized arms underwent different types of ovarian stimulation: a high rFSH starting dose followed by step-wise reduction in the intervention arm and a high rFSH starting dose plus 150 hMG in the control group [34]. This may represent a further source of bias in the assessment of the effects related to testosterone administration.

Effects of intervention

Primary outcome

The analysis included a total of 457 participants from five studies [10, 22, 23, 35, 36]. Women receiving testosterone therapy showed a higher LBR in comparison to controls (RR 2.29, 95% CI 1.31–4.01, p = 0.004), with low statistical heterogeneity (I² = 0%) (Fig. 1a).

Fig. 1 (a) Forest plot of comparison: live birth rate (LBR) according to group allocation, intervention (testosterone) versus control. (b) Forest plot of comparison: clinical pregnancy rate (CPR) according to group allocation, intervention (testosterone) versus control

Secondary outcomes

We found a significant advantage in the intervention group in terms of CPR (RR 2.32, 95% CI 1.47–3.64, I² = 0%, p = 0.0003, Fig. 1b), with no difference in MR (RR 0.85, 95% CI 0.28–2.55, I² = 0%, p = 0.77). Moreover, patients receiving testosterone had a higher number of total oocytes (MD = 1.28 [95% CI 0.83, 1.73], I² = 6%, p < 0.00001), MII oocytes (MD = 0.96 [95% CI 0.28, 1.65], I² = 40%, p = 0.006), and total embryos (MD = 1.17 [95% CI 0.67, 1.67], I² = 1%, p < 0.00001).
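
The reported p values can be cross-checked against the reported confidence intervals. The sketch below assumes a normal approximation on the log scale, which is the usual construction of a 95% CI for a risk ratio; the function name is illustrative. It recovers the standard error from the CI width and then the z statistic and two-sided p value for the pooled LBR and CPR estimates.

```python
import math


def z_and_p_from_ci(rr, ci_low, ci_high, z_crit=1.959964):
    """Recover z and the two-sided p value from an RR and its 95% CI."""
    # On the log scale the CI is log(RR) +/- z_crit * SE, so its width gives SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(rr) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p under the standard normal
    return z, p


# Pooled estimates as reported in the text.
lbr_z, lbr_p = z_and_p_from_ci(2.29, 1.31, 4.01)
cpr_z, cpr_p = z_and_p_from_ci(2.32, 1.47, 3.64)

print(f"LBR: z = {lbr_z:.2f}, p = {lbr_p:.4f}")
print(f"CPR: z = {cpr_z:.2f}, p = {cpr_p:.5f}")
```

Both recovered values round to the reported figures (p = 0.004 for LBR and p = 0.0003 for CPR), so the intervals and p values are mutually consistent.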

For details about COS and IVF outcomes for each of the trials included in the meta-analysis, see Tables S2 and S3.

Subgroup analysis and sensitivity analysis

Subgroup analysis according to the route of testosterone administration (transdermal versus oral) showed no statistically significant difference in either LBR or CPR (see Fig. S3a and Fig. S3b, respectively). Likewise, subgroup analyses by duration (≥ 21 days versus < 21 days) and timing of testosterone administration (before the beginning of COS versus during COS), when feasible, showed no statistically significant difference among the subgroups (p = ns). See Fig. S4a and Fig. S4b (LBR and CPR) and Fig. S5 (CPR), respectively. The serial exclusion of each of the studies, or of specific study subgroups according to the risk of bias judgment (studies at low risk of bias in at least three domains), did not significantly change the aggregate results.
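
A comparison between two subgroups of this kind is commonly summarized with a chi-square test for subgroup differences. The following is a minimal sketch for the two-subgroup case, in which Cochran's Q has one degree of freedom. The subgroup estimates are hypothetical and are not taken from the supplementary figures, and the function name is illustrative.

```python
import math


def two_subgroup_test(log_a, se_a, log_b, se_b):
    """Q statistic and p value for a difference between two subgroup estimates."""
    # With two subgroups, Cochran's Q reduces to the squared difference
    # divided by the sum of the two variances.
    q = (log_a - log_b) ** 2 / (se_a ** 2 + se_b ** 2)
    # Chi-square survival function with 1 degree of freedom.
    p = math.erfc(math.sqrt(q / 2))
    return q, p


# HYPOTHETICAL pooled log risk ratios and standard errors for two subgroups.
q, p = two_subgroup_test(math.log(2.4), 0.30, math.log(1.8), 0.60)
print(f"Q = {q:.3f}, p = {p:.3f}")
```

A small subgroup with a wide standard error, as in this hypothetical example, gives the test little power, so a non-significant result does not establish that the subgroups respond equally.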

Adverse effects

No trial reported adverse effects resulting from the intervention.

Discussion

The present meta-analysis summarizes the best available evidence regarding the use of testosterone in several formulations before and/or during COS in POR patients. The real effectiveness of this treatment remains a matter of debate. In 2011, Sunkara et al. published a meta-analysis of studies comparing COS outcomes in patients pre-treated with androgens (DHEA or testosterone) or an androgen-modulating agent (letrozole) versus controls and did not find any significant differences in the number of oocytes retrieved or LBR. Thus, the authors concluded that there was insufficient evidence to support the use of androgen supplementation. However, probably owing to the lack of high-quality studies available at that time, the indiscriminate inclusion of trials using different types/dosages of androgens and different timing/duration of therapy may have affected the overall quality of the results [13].

In 2014, Luo et al. conducted a meta-analysis comparing the effects of pre-treatment with transdermal testosterone on POR undergoing IVF/ICSI. Although the authors reported that pre-treatment with transdermal testosterone may improve the clinical outcomes for POR, the potential impact of the limited evidence available at that time (3 studies), the small sample size, and the heterogeneity of the included studies should not be underestimated [21].

From 2014 to 2016, evidence from in vitro studies underlined the importance of a proper androgen balance for accomplishing a correct follicular development [14, 19, 20]. Moreover, four additional RCTs on testosterone therapy in POR were recently published, necessitating a new summary of evidence [10, 22,23,24].

Main findings

The present meta-analysis included a total of 573 poor responders from seven RCTs who underwent IVF cycles. We tested the effects of testosterone supplementation on LBR, CPR, and other COS parameters. We found, with good consistency (I² = 0%), that testosterone supplementation significantly increased LBR (RR = 2.29; p = 0.004). Similarly, we found (with low inconsistency; I² = 0%) that the intervention was associated with higher CPR (RR = 2.32; p = 0.0003). Finally, we found that testosterone supplementation also significantly increased the number of total oocytes and MII oocytes retrieved, as well as the number of total embryos obtained, in comparison to placebo or no treatment. No difference was found in MR between groups. Sensitivity and subgroup analyses did not significantly change the pooled results, supporting their robustness. No adverse effects associated with the intervention were reported.

Biological rationale

A large amount of experimental evidence supports the critical role of androgens in early follicular development and granulosa cell proliferation [21, 37, 38]. Available data suggest that androgens may physiologically stimulate the early stages of follicular growth and increase the number of pre-antral and antral follicles [14, 15, 17,18,19,20]. Data from in vitro murine models confirmed these findings, showing that androgens significantly increased the diameter of immature follicles and enhanced the development of pre-antral follicles [37, 39]. Furthermore, animal models showed a role for testosterone in stimulating the transition of follicles from primary to secondary stages, increasing the resulting number of ovulatory follicles [40, 41]. Accordingly, some older studies on human granulosa cells showed a positive correlation between mRNA levels of androgen receptors, FSH receptor expression, and androgen concentrations. Taken together, these findings suggest that androgens may play a key role in folliculogenesis by increasing ovarian responsiveness to gonadotropins [15,16,17,18].

Although the exact molecular mechanisms through which androgens may exert these functions are still unclear, clinical data are also in line with these observations. For example, it has been observed that the administration of high doses of testosterone induced the appearance of polycystic ovaries (PCOs) [42, 43], with a significant increase in the number of small antral follicles [44]. Barbieri et al., evaluating baseline serum testosterone in 425 women undergoing IVF, observed that testosterone levels decreased significantly with advancing age and correlated positively with the number of oocytes retrieved [45]. Consistently, a reduction of the “androgen milieu” in the ovary is a common characteristic of older patients [46]. Finally, Frattarelli et al. reported that women with higher baseline levels of testosterone required a lower FSH dose and a shorter duration of ovarian stimulation and were more likely to achieve a pregnancy than women with lower testosterone levels [38].

Strengths and limitations

To our knowledge, the present meta-analysis is the largest and most comprehensive on this issue. We included only RCTs in order to minimize the bias related to the study design. Strict inclusion criteria and rigorous methodology represent further points of strength of our study. In addition, sensitivity and subgroup analyses did not significantly change our results, supporting their consistency.

Nevertheless, some limitations should be considered when interpreting our findings. Firstly, different outcomes were calculated by pooling the results of a small number of studies, patients, and events. Additionally, some heterogeneity across studies was present in terms of POR definition (the Bologna criteria were adopted as a reference standard only in the most recent trials) and testosterone therapy (dose, timing, and duration of administration). In particular, considering the number of oocytes retrieved (which is one of the three points for POR definition), we noticed differences between the older papers and the newer papers, which adopted the Bologna criteria. In the older papers [34,35,36], the number of obtained oocytes ranged from 3.8 to 5 in the control groups versus 5.1 to 5.4 in the treatment groups; in the newer papers [10, 22,23,24], it ranged from 1.17 to 3.9 in the control groups versus 2.48 to 5.8 in the treatment groups. This difference, as previously reported, is certainly a source of heterogeneity, so caution is necessary in interpreting the data.

The identification of the right target population, in this case, POR, is of crucial importance for testing the efficacy of new therapies. Our paper confirmed the difficulty of comparing the results of trials that applied different definitions of POR [47]. Analyzing the inclusion criteria of the older papers [34,35,36], we noticed that their POR definitions are not as selective as the Bologna criteria. However, the Bologna criteria have also been questioned, and recently, the POSEIDON group suggested a more detailed novel stratification intended to describe the different types of poor response more accurately for both scientific and clinical purposes [48].

We are aware that, considering all these aspects, our data seem too weak to support the large-scale use of testosterone. However, as already stated, our manuscript has collected the best literature on this topic, highlighting both the positive and negative aspects of the published articles. Testosterone adjuvant treatment seems promising, but its efficacy has to be confirmed by new large trials with rigorous methodology and inclusion criteria.

Finally, it is important to emphasize the problem of publication bias. Although the number of included manuscripts is too low for a comprehensive and effective analysis of this aspect, we cannot underestimate the possible bias related to “small-study effects.” Therefore, based on the available data, an impact of publication bias on the final estimates cannot be excluded. Nonetheless, our rigorous methodology and the comprehensive search of published and unpublished data are a point in favor of the consistency of our data.

Interpretation

Despite recent technological advances, the management and treatment of POR is still a debated issue in IVF [48]. Even if the Bologna criteria try to simplify the concept of poor response, especially for scientific purposes, we must remember that POR are not a homogeneous population and their prognosis may vary greatly depending on parameters such as age and number of oocytes retrieved [47, 48].

In particular, the patient’s age is the strongest predictor of the clinical outcome, in terms of oocyte genetic and reproductive competence [48, 49]. The possibility of obtaining at least one euploid blastocyst is strictly related to age and the number of MII oocytes obtained; as age increases, the number of oocytes required to obtain at least one euploid blastocyst increases drastically. This fact is of crucial importance for a correct stratification of POR, because ovarian stimulation and related strategies must aim to maximize oocyte yield according to age-related chances [48].

The current impossibility of overcoming oocyte aging has led scientific research to focus on methods to increase oocyte yield in this group of patients [50]. Two strategies have been extensively studied: first, the application of different stimulation protocols with different types, timing, and dosages of exogenous gonadotropins in order to recruit as many follicles as possible and, second, the administration of different types of adjuvant therapy (before or during ovarian stimulation) in order to increase the pool of antral follicles and their response to gonadotropins (by increasing FSH receptor expression) [50]. Concerning the first strategy, a recent meta-analysis found no evidence of a difference in CPR and ongoing pregnancy rate (OPR) between patients treated with an antagonist versus an agonist protocol [51]; moreover, no difference was found between mild and conventional stimulation [52]. In contrast, double stimulation (DuoStim) seems very promising, but no randomized trial has been published [53]. Concerning the second strategy, the most studied adjuvant therapies were growth hormone (GH), DHEA, and testosterone pre-treatment [50]. Different pre-clinical and clinical studies (including randomized trials) suggested a positive effect of these approaches; however, discrepancies and heterogeneity in the published literature limited their clinical applicability [50].

In particular, transdermal testosterone seems a promising strategy with a good biological rationale and an effectiveness supported by seven RCTs. With our meta-analysis, we tried to examine not only the biological rationale and clinical applicability of testosterone pre-treatment, but especially the future research steps needed to confirm or refute its real effectiveness.

Considering our data, testosterone supplementation before COS might significantly increase the chances of pregnancy (CPR and LBR) in POR patients. Even if no strong data are available concerning the optimal way of administration, the duration, and the timing of therapy (subgroup analysis failed to demonstrate any difference), we can extrapolate some considerations analyzing the available literature.

Transdermal testosterone (gel in 5 trials, patch in one trial) represents the most common route of administration; only one trial administered oral testosterone. So, in daily practice, the transdermal route appears preferable because it is supported by stronger evidence.

Concerning the timing of administration (before or during COS) and the duration of therapy (≥ 21 or < 21 days), we found no difference in the subgroup analysis; however, stronger evidence supports the use of testosterone before COS (four RCTs). In the same way, stronger evidence supports the use of testosterone for more than 21 days.

In particular, the study by Kim et al. was the first to explore the effects of testosterone pre-treatment according to different durations of application [22]. Interestingly, they showed significantly better results in terms of CPR and LBR only in patients treated with transdermal testosterone for a longer period (four weeks), suggesting that the effect depends on the duration of treatment more than on the dose.

The duration of androgen supplementation may be critical in stimulating follicular growth. As suggested by the data reported by Casson et al., a longer duration of androgen application before and during FSH stimulation might be required to effectively improve follicular growth [54].

This concept seems of crucial importance because, as suggested in the recent position paper by Polyzos et al., testosterone mainly acts during the earlier stages of folliculogenesis by playing a role in follicle activation and growth [33]. Considering that the transition from the pre-antral to the antral follicular stage in humans lasts approximately 70 days, extending testosterone treatment beyond four weeks may further increase the number of recruited follicles. This aspect may explain the negative results of Fabregues et al. (only 5 days of testosterone pre-treatment) and Massin et al. (only 15 days of testosterone pre-treatment) regarding oocytes retrieved and pregnancy rate [34, 36].

Conclusions

Pre-treatment with testosterone seems promising for improving the success of IVF in POR patients. Specifically, available data support a positive impact of transdermal testosterone on LBR, CPR, and other COS parameters (total number of oocytes and MII oocytes retrieved, total embryos obtained). Due to the limitations of the available studies, further RCTs on larger populations, with rigorous methodology and inclusion criteria, are still needed to confirm or refute its real clinical effectiveness and to establish the best timing, dose, and duration of testosterone administration before IVF.