Introduction

Ovarian follicle counts decrease as women age until they enter menopause [1, 2]. With every passing menstrual cycle, the ovaries gradually exhaust their supply of viable oocytes. This loss of oocytes occurs at different speeds in different women [1, 2]. Decreased ovarian reserve (DOR) is used to describe “women of reproductive age having regular menses whose response to ovarian stimulation or fecundity is reduced compared with women of comparable age” [3].

Many different methods of evaluating ovarian reserve have been studied, including basal follicle stimulating hormone (FSH) level, anti-müllerian hormone (AMH) level, and antral follicle count (AFC) with varying abilities to predict responsiveness to controlled ovarian hyperstimulation [4]. An elevated early follicular FSH, signifying decreased negative feedback from low ovarian inhibin production, has been proven to predict significantly lower in vitro fertilization (IVF) pregnancy rates [5, 6]. AMH, a glycoprotein produced by the granulosa cells of primary, preantral, and antral follicles, has been positively associated with outcomes such as oocyte yield, pregnancy rate, and live birth rate [3, 7]. Classically, AFC has been considered the gold standard in predicting ovarian response, although debates on its relative value compared to AMH continue. Ultimately, the true test of ovarian reserve, by virtue of the definition, is observing the actual ovarian response to exogenous stimulation.

There is much debate concerning DOR and whether it is primarily related to a diminishing pool of viable oocytes or if the quality is also poorer. In this study, we sought to establish whether DOR (determined in one of three ways described below) affected outcomes (pregnancy rates and live births) in women aged 39 years or less undergoing single ideal blastocyst transfer.

Materials and methods

Study population

This is a retrospective cohort study including 507 women who underwent their first embryo transfer between August 2010 and March 2014 at the McGill University Health Centre Reproductive Centre. Ethics approval was obtained from our institution in October 2014 (study code 14-226-SDR).

Inclusion criteria were as follows: transfer of a single fresh, ideal quality, autologous blastocyst in patients 39 years and 11 months of age or less. Subjects were excluded on the basis of endocrine disorders (hyperprolactinemia, hypothyroidism, hyperthyroidism, or congenital adrenal hyperplasia) or uterine or tubal pathology (congenital uterine anomalies, endometrial polyps, intrauterine synechiae, adenomyosis, or hydrosalpinx). Patients had normal serum TSH levels (between 0.4 and 3.5 IU/L) and prolactin levels (between 2 and 26 ng/ml) on at least one of two assays within several weeks of each other and were non-diabetic (Type I or II). All subjects were included only once in the study.

None of the studied couples had a male with azoospermia or severe oligo-astheno-terato-spermia.

Within the same cohort of patients, three separate analyses were performed using three different criteria to establish DOR—decreased AFC (5 or less), increased basal serum FSH levels (13 IU/L or more), and finally, amount of exogenous stimulation required to bring about folliculogenesis (divided in quartiles of gonadotropin dose needed: FSH 200–1050/LH 0–1050, FSH 1075–1575/LH 0–1050, FSH 1600–2400/LH 0–2100, FSH 2425–7200/LH 0–6600). Outcomes of interest were pregnancy rate, clinical pregnancy rate, and live birth rate.
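The three DOR criteria described above can be expressed as simple grouping rules. The following is a minimal sketch, not the code used in the study; the function and variable names are hypothetical, and treating a dose that falls outside the reported quartile ranges as unclassified is an assumption made here for illustration.

```python
# Illustrative grouping rules based on the cut-offs reported in the text.
# All names are hypothetical; this is not the study's analysis code.

# Reported quartile ranges of total exogenous FSH dose (IU).
FSH_DOSE_QUARTILES = [(200, 1050), (1075, 1575), (1600, 2400), (2425, 7200)]


def has_low_afc(antral_follicle_count):
    """DOR by antral follicle count: 5 or less."""
    return antral_follicle_count <= 5


def has_high_basal_fsh(max_basal_fsh_iu_per_l):
    """DOR by maximum basal serum FSH: 13 IU/L or more."""
    return max_basal_fsh_iu_per_l >= 13.0


def fsh_dose_quartile(total_fsh_dose_iu):
    """Return the quartile (1 to 4) for a total FSH dose, or None if the
    dose lies outside every reported range (an assumption of this sketch)."""
    for index, (low, high) in enumerate(FSH_DOSE_QUARTILES, start=1):
        if low <= total_fsh_dose_iu <= high:
            return index
    return None
```

For example, a patient with an AFC of 4, a maximum basal FSH of 9.5 IU/L, and a total FSH dose of 2700 IU would be classified as DOR by AFC and by dose (quartile 4) but not by basal FSH, which illustrates why the three analyses can disagree for the same patient.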

Baseline ultrasound uterine measurements

All women included in this study had a baseline transvaginal ultrasound performed in the early follicular phase of a spontaneous menstrual cycle (between cycle days 2 and 5) prior to IVF treatment. The ultrasounds were performed in a uniform manner by ultrasound technicians using a Voluson E8 (General Electric Corporation, USA), with women in the dorsal lithotomy position and empty bladders. Measurements and descriptions of the uterus and ovaries were recorded, as well as antral follicle count and any pathology observed such as uterine leiomyomata, endometrial polyps, hydrosalpinx, and adnexal masses or cysts. If anomalies were present, these patients were excluded from analysis. No women had taken hormonal contraception for at least 12 months. The antral follicle count was determined in ovaries without any functional cysts, by counting all follicles 2–10 mm in diameter while scanning from the lateral to the medial side of each ovary.

In vitro fertilization procedure

Three protocols of ovarian hyperstimulation were used: a microdose flare protocol (gonadotropin releasing hormone (GnRH) agonist started on days 2–3 of the cycle after oral contraceptive-induced withdrawal bleeding, and gonadotropin initiated on the third day of the GnRH agonist), a fixed antagonist protocol (gonadotropin initiated on day 2–3 of the natural or provoked menstrual cycle and GnRH antagonist on the sixth day of stimulation, fixed start), and a long agonist protocol (GnRH agonist started after 15 days of a 35 µg oral contraceptive pill cycle and gonadotropin after 10 days of down-regulation). Women were treated with a combination of recombinant follicle stimulating hormone (rFSH) (Follitropin alpha, Merck Serono, MA, USA) and recombinant luteinizing hormone (rLH) (Lutropin alpha, Merck Serono, MA, USA) or rFSH (Follitropin beta, Merck, NJ, USA) and human menopausal gonadotropins (hMG) (Repronex, Ferring, QC, Canada). Decisions regarding doses and protocol choice were at the discretion of the treating physician. Injection of human chorionic gonadotropin (hCG) (10,000 IU, Ferring QC, Canada or 250 mcg recombinant hCG, Merck Serono, MA, USA) was administered when two follicles were ≥ 18 mm in diameter, if possible. Transvaginal ultrasound–guided oocyte retrieval was performed 36 h after hCG administration. A more in-depth explanation of the stimulation protocols and medications used is available in previous publications from this group [8].

Oocyte collection was performed 36 h after hCG triggering, using a 17 gauge single lumen collection needle (Cook Medical, Sydney, Australia). Aspiration pressure was maintained at 145 mmHg by a Cook Vacuum Pump (K-Mar 8200) (Cook, Australia). Insemination of retrieved oocytes was done by conventional IVF or intracytoplasmic sperm injection (ICSI). Fertilization was assessed 16–18 h after insemination for the appearance of two distinct pro-nuclei and two polar bodies. Culture to blastocyst was performed using sequential media (Cook Medical, Sydney, Australia). All embryos were cultured in cleavage medium (Cook Medical, Sydney, Australia) until day 3, and subsequently transferred to blastocyst medium (Cook Medical, Sydney, Australia) for culture to the blastocyst stage.

Only those cycles where at least one excellent quality blastocyst was available for transfer were included in this study, and only one blastocyst was transferred. Transcervical embryo transfer was performed using a Wallace embryo replacement catheter (Smiths Medical, USA) under transabdominal ultrasound guidance, with a full urinary bladder. The embryos were placed 1.5–2.0 cm from the uterine fundus. Estradiol (Estrace, Actavis pharma USA) 2 mg orally three times daily and progestin supplements (Prometrium 200 mg vaginally, three times daily, Merck Germany; Endometrin 200 mg vaginally, twice daily, Ferring USA; Crinone 8% vaginally twice daily, Actavis USA or intramuscular progesterone 100 mg daily, Actavis USA) were started on the day after oocyte collection, and continued until 12 weeks of pregnancy. Ideal or excellent quality embryos for transfer included blastocysts with Gardner grade AA and BA. For details regarding embryo grading, please refer to Gardner and Schoolcraft 1999 [9]. The best available embryo was chosen 30 min before transfer. Any additional embryos were vitrified.

Outcome measures

A quantitative serum beta-hCG test was done 16 days after oocyte retrieval or at 15 days of embryo age; a level greater than 10 IU/L indicated pregnancy. Women with a positive pregnancy test underwent a transvaginal ultrasound to confirm viability 4 weeks following embryo transfer. Clinical pregnancy was defined as the presence of an intrauterine pregnancy with a fetal heartbeat identified on transvaginal ultrasound. Live birth was defined as an infant born with signs of life at greater than 24 weeks of gestational age. The primary outcome was live birth rates, and secondary outcomes included pregnancy and clinical pregnancy rates.
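The outcome definitions above translate directly into a small classification routine. The sketch below is illustrative only: the record fields and function name are hypothetical, and the nesting of outcomes (a clinical pregnancy requires a positive test, and a live birth requires a clinical pregnancy) is an assumption of this example rather than something stated in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CycleFollowUp:
    """Hypothetical follow-up record for one embryo transfer."""
    beta_hcg_iu_per_l: float
    intrauterine_fetal_heartbeat: bool
    gestational_weeks_at_birth: Optional[float] = None
    born_with_signs_of_life: bool = False


def classify_outcomes(record):
    """Apply the thresholds given in the text to one follow-up record."""
    # Pregnancy: serum beta-hCG greater than 10 IU/L.
    pregnancy = record.beta_hcg_iu_per_l > 10
    # Clinical pregnancy: intrauterine pregnancy with a fetal heartbeat.
    clinical_pregnancy = pregnancy and record.intrauterine_fetal_heartbeat
    # Live birth: signs of life at greater than 24 weeks of gestation.
    live_birth = (
        clinical_pregnancy
        and record.born_with_signs_of_life
        and record.gestational_weeks_at_birth is not None
        and record.gestational_weeks_at_birth > 24
    )
    return {
        "pregnancy": pregnancy,
        "clinical_pregnancy": clinical_pregnancy,
        "live_birth": live_birth,
    }
```

A cycle with a beta-hCG of 85 IU/L but no fetal heartbeat at ultrasound would count toward the pregnancy rate only, which is why the three rates reported in the Results decrease from pregnancy to live birth.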

The FSH assay used was the Access assay (Beckman Coulter, Canada). It is a competitive ELISA. The lower limit of detection was 0.2 IU/L, and the upper limit was 200 IU/L. Intra- and interassay coefficients of variation were less than 6% in all cases and were assessed in house at 6.8, 23.5, and 45.0 IU/L.
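For readers unfamiliar with the metric, the coefficient of variation is the standard deviation of repeated measurements expressed as a percentage of their mean. The sketch below shows the calculation; the replicate values are invented for illustration at a nominal level of 6.8 IU/L and are not data from the study.

```python
import statistics


def coefficient_of_variation(values):
    """Return the sample standard deviation as a percentage of the mean."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


# Hypothetical replicate FSH measurements (IU/L) of one control sample.
replicates = [6.5, 6.8, 7.1]
cv_percent = coefficient_of_variation(replicates)
```

With these invented replicates the coefficient of variation is roughly 4.4%, which would fall within the 6% bound reported for the assay.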

Statistical analysis

Statistical analysis was performed using the Statistical Package for Social Sciences (SPSS 23.0, SPSS Chicago, USA) with logistic regression analysis to control for confounding effects and multiplicity. The confounding effects controlled for included patient age, duration of infertility, gravidity, parity, body mass index (BMI), and smoking status. All continuous data were checked for normal distribution using the Kolmogorov–Smirnov test. All variables in the logistic regression analysis were checked for collinearity by correlations; collinearity was further limited by the exclusion of older women and of couples without an ideal blastocyst to transfer from the database. Data are presented as mean ± standard deviation, percentages, and confidence intervals (CI). Two-sided p values ≤ 0.05 were accepted as significant.
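The study's analysis was run in SPSS. As a rough illustration of what a logistic regression estimates, the following self-contained sketch fits the model by Newton-Raphson using only the Python standard library. It is not the authors' procedure: the function names and the toy data are hypothetical, the example includes a single binary exposure (low AFC) with no confounders, and it omits the standard errors, p values, and confidence intervals that a statistics package would report.

```python
import math


def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def fit_logistic(rows, outcomes, iterations=25):
    """Fit coefficients by Newton-Raphson; each row starts with 1 (intercept)."""
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(iterations):
        gradient = [0.0] * k
        hessian = [[0.0] * k for _ in range(k)]
        for row, y in zip(rows, outcomes):
            linear = sum(b * x for b, x in zip(beta, row))
            p = 1.0 / (1.0 + math.exp(-linear))
            for i in range(k):
                gradient[i] += (y - p) * row[i]
                for j in range(k):
                    hessian[i][j] += p * (1.0 - p) * row[i] * row[j]
        step = solve_linear_system(hessian, gradient)
        beta = [b + s for b, s in zip(beta, step)]
    return beta


# Toy data (invented): 10 low-AFC women with 2 live births,
# 10 normal-AFC women with 6 live births. Row = [intercept, low_afc].
rows = [[1.0, 1.0]] * 10 + [[1.0, 0.0]] * 10
outcomes = [1] * 2 + [0] * 8 + [1] * 6 + [0] * 4
coefficients = fit_logistic(rows, outcomes)
odds_ratio = math.exp(coefficients[1])
```

With a single binary predictor, the fitted odds ratio equals the cross-product ratio of the two-by-two table, here (2 × 4) / (8 × 6). Adjusting for confounders such as age or BMI would correspond to appending those values as additional columns in each row; the coefficient for low AFC would then be interpreted with the other covariates held fixed.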

Results

Due to the strict inclusion criteria in embryo grade at transfer (Gardner’s grade AA or BA), there was no difference noted in embryo quality for any of the groups compared.

Patients were divided into two groups: those with an antral follicle count of 5 or less and those with a count of more than 5. Fifty-three women had follicle counts of 5 or less. The baseline characteristics of these groups are presented in Table 1 and were similar except for maternal age. As anticipated, doses of gonadotropins required to stimulate folliculogenesis were also higher among women with lower antral follicle counts.

Table 1 Baseline characteristics of patients stratified by antral follicle count

After controlling for confounders, the pregnancy rate (40 vs. 53%, p = 0.04, CI 0.20–0.98), the clinical pregnancy rate (29 vs. 46%, p = 0.02, CI 0.43–0.89), and the live birth rate (13 vs. 43%, p = 0.001, CI 0.36–0.78) were all lower in the low AFC group (median AFC 4, range 0–5) than in the normal AFC group (median 19, range 6–75). Figure 1 illustrates these findings. Of note, 71 women in the cohort had polycystic ovary syndrome by the Rotterdam criteria. Among these women, the mean AFC was 40 ± 9. Among women without PCOS and AFC of at least 6, the mean AFC was 13 ± 5.

Fig. 1 Pregnancy, clinical pregnancy, and live birth rate as a function of antral follicle count

The analysis was subsequently repeated after dividing the cohort on the basis of maximum serum basal FSH level ≥ 13.0 or ≤ 12.9 IU/L. 64 women had maximum basal FSH levels of 13 or greater. The baseline characteristics were similar in both groups (Table 2).

Table 2 Baseline characteristics of patients stratified by basal serum FSH levels

After controlling for confounders, the pregnancy rate when comparing high vs. normal FSH (31 vs. 50%, p = 0.27, CI 0.02–1.9), the clinical pregnancy rate (13 vs. 40%, p = 0.45, CI 0.25–3.2) and the live birth rate (13 vs. 38%, p = 0.48, CI 0.20–27.2) did not differ significantly between these groups. These findings are illustrated in Fig. 2. Of note, these are some of the most extreme differences in pregnancy outcomes between groups among any of the three analyses performed, yet were not statistically significant after controlling for confounding effects. To further test whether this was due to confounders, the analysis was repeated without controlling for any confounding effects. This resulted in no difference in pregnancy rate (p = 0.15); however, the clinical pregnancy rate (p = 0.016) and the live birth rate (p = 0.04) did differ.

Fig. 2 Pregnancy, clinical pregnancy, and live birth rate as a function of basal FSH levels

A third analysis was performed using quartiles of FSH doses needed to stimulate folliculogenesis. The baseline characteristics for these groups are presented in Table 3. The pregnancy rate (p = 0.13, CI 0.22–3.4) did not differ between these groups. However, the clinical pregnancy rate (p = 0.003, CI 0.14–0.74) and the live birth rate (p = 0.005, CI 0.13–0.76) were superior in the three groups requiring lower FSH doses than in the group requiring the highest quartile of FSH dose. The pregnancy rates in each quartile from lowest to highest were 45, 52, 54, and 41%. The clinical pregnancy rates were 36, 43, 47, and 25%. The live birth rates were 32, 38, 44, and 20%. These findings are illustrated in Fig. 3. There was no statistical difference between the pregnancy, clinical pregnancy, and live birth rates among the lowest three quartiles. The highest quartile (FSH dose 2425–7200) had statistically lower clinical pregnancy and live birth rates than each of the first three quartiles (p < 0.05, in each case).

Table 3 Baseline characteristics of patients as a function of quartile of dosage of exogenous FSH used for ovarian stimulation

Fig. 3 Pregnancy, clinical pregnancy, and live birth rates as a function of increasing exogenous stimulation to produce folliculogenesis. Dose range for each group was group I (200–1050 IU), group II (1075–1575 IU), group III (1600–2400 IU), and group IV (2425–7200 IU)

To determine whether the effect of DOR as determined by antral follicle count was due to an effect of FSH or LH dose on the endometrium and not ovarian reserve per se, the analysis was repeated after controlling for the confounding effects noted above as well as the doses of FSH and LH used. If the AFC analysis remained significant after controlling for dose, it would suggest that ovarian reserve, and not endometrial changes, is associated with outcomes when an ideal blastocyst is transferred. An AFC of 5 or less versus more than 5 did not significantly affect pregnancy rate (p = 0.16, CI 0.20–4.7); however, the differences in clinical pregnancy rate (p = 0.045, CI 0.10–0.96) and live birth rate (p = 0.04, CI 0.11–0.92) remained statistically significant.

Discussion

In this study, we sought to elucidate the role of ovarian reserve, based on one of three methods of assessment, on ART outcomes in women less than 40 years of age undergoing a single ideal autologous blastocyst transfer. In the first analysis comparing those with low and normal AFC, the pregnancy rate was similar between the two groups; however, the clinical pregnancy and live birth rate were both significantly greater in the group with a higher AFC. In repeating the analysis using disparate FSH levels, no significant difference was observed in any of the three aforementioned outcomes. Finally, those patients in the highest quartile of exogenous FSH required for stimulation had a similar pregnancy rate, but a significantly lower clinical pregnancy and live birth rate as compared to the other three quartiles.

The design of this analysis allows for the elimination of an important confounder in women with varying antral follicle counts (a measure of potential oocyte yield) and demonstrates that women with low AFC who attain an ideal blastocyst fare worse than their counterparts with a higher baseline follicle count. While it is known that an elevated AFC is correlated with a better ovarian reserve (i.e., response to stimulation), these findings suggest that it is also predictive of improved outcomes (clinical pregnancy rate and live birth rate). As a function of oocyte quality, the AFC has already been evaluated as a potential marker of ovarian aging and aneuploidy. In a prospective study by Grande et al., 47 pregnancies complicated by a variety of aneuploidies (such as trisomy 21, 18, 13, monosomies, etc.) were compared to 812 spontaneous euploid pregnancies; the latter were used to construct age-adjusted median values for AFC [10]. In comparing the two, nearly 70% of women with aneuploid pregnancies had an antral follicle count below the 50th percentile for their age group, implying that low AFC correlated with poorer quality oocytes as determined by genetic errors [11].

Maximum basal serum FSH levels poorly predicted any significant outcome in our study, despite some existing literature which indicates higher levels resulting in worse IVF outcomes [12]. Perhaps, as a test, it is ideally used in conjunction with other assessments in painting an accurate clinical picture. What is known is that FSH has been outperformed by AFC and AMH in recent studies as a test of ovarian reserve and predictor of live birth [11, 13, 14]. Nevertheless, it remains one of the most commonly ordered investigations in the course of a work-up for infertility.

Finally, the use of high-dose gonadotropins was associated with fewer clinical pregnancies and live births than lower levels of stimulation. Poor response to stimulation defines decreased ovarian reserve; however, the poorer fertility outcomes described even after successfully producing an ideal blastocyst speak to a possible qualitative difference in oocytes in patients with DOR above and beyond the reduced number of available oocytes. Alternatively, the differences seen in outcomes may be related to endometrial changes as a result of supraphysiological hormone levels. A previous study has demonstrated that in normo-responsive patients, pregnancy rates were lower with higher doses of gonadotropin used [15]. These authors hypothesized that this may be due to alterations in endometrial receptivity [15]. An in vitro study demonstrated that prolonged or high-dose exposure to human chorionic gonadotropin (an LH analogue) resulted in down-regulation and internalization of the luteinizing hormone and hCG receptor in endometrial epithelial cells [16]. Women in these situations were also noted to have lower pregnancy rates [16].

While these studies seemed to suggest that a different hormonal milieu was the cause of the different implantation rates, several studies have questioned this explanation. These studies have demonstrated that peri-implantation hormonal extremes (i.e., serum estradiol (E2) levels > 90th percentile [17], or serum E2 > 5000 pg/mL [18]) negatively impact oocyte fertilization rates but not implantation rates, clinical pregnancy rate, or rate of miscarriage. Comparisons between subjects undergoing controlled ovarian hyperstimulation-embryo transfer (COH-ET) versus recipients of donor oocytes (with significantly different areas under the curve (AUC) for E2: 3059 vs. 2445 pg/mL, p < 0.001) also failed to reveal a difference in implantation rates and clinical pregnancy rates [19]. The evidence in this study suggests that the total dose of gonadotropin required to stimulate folliculogenesis is a marker of ovarian reserve and affects outcomes in that manner only at the highest quartile. Recently, the debate was reopened when a study demonstrated that pregnancy rates were lower with fresh but not frozen embryo transfers when higher doses of gonadotropins were used for ovarian stimulation [20]. That study concluded that the endometrium may be adversely affected, probably indirectly, by high-dose gonadotropin use in the fresh IVF cycle only, and not in the frozen cycles. It is possible that both ovarian reserve and the endometrium affect outcomes. Further studies should be directed at teasing out the respective contributions of each factor. In this study, we did control for gonadotropin dose and found that basal AFC (< 6) remained an important predictor of pregnancy outcomes.

Strengths of this current study include a large sample size (520 transfers) and the available data required to account for a host of potential confounders such as maternal age, duration of infertility, BMI, and smoking status. Abnormal uterine cavity status was excluded or controlled for (i.e., endometrial thickness, and fibroids, whereas polyps were removed prior to initiation of IVF). All blood tests, ultrasounds, and embryological assessments were performed in one centre, allowing for excellent internal validity. It is difficult to study the effects of higher dose FSH/LH on oocyte quality in patients with DOR when performing a fresh blastocyst transfer given that the endometrium is also subject to change, thus affecting two variables at the same time. A frozen cycle would circumvent this particular potential confounder, but would also introduce a host of possible new ones related to cryopreservation and thawing. Using only single ideal blastocysts enabled us to control for the variations in embryo quality seen in other studies. Including serum AMH levels would have helped strengthen this study; however, due to the cost of the assay as of May 2017, it is not routinely done at our fertility centre and was therefore unavailable.

Weaknesses of this study include its retrospective nature and the inherent hidden biases. Randomization is never possible in such a study, however, because women cannot be randomized to their ovarian reserve; it is a parameter with which they present. Some women with polycystic ovary syndrome (PCOS) were included in the normal ovarian reserve groups. If anything, this should have blunted the difference between the low reserve and normal groups, since PCOS patients are suspected to have lower oocyte potential for pregnancy. However, the groups remained different for pregnancy outcomes when AFC was considered, which suggests that this finding is real.

In conclusion, stratifying by basal FSH level did not yield any significant findings, which was likely due to false normal results. However, women with a higher AFC and women requiring lower doses of exogenous FSH had higher clinical pregnancy rates and live birth rates than their counterparts following a fresh autologous ideal quality single blastocyst transfer. The ideal combination of pre-stimulation investigations and predictive factors is yet to be determined and likely varies by clinical scenario. It should be noted that many additional factors apart from AFC and FSH dose influence IVF outcome and were not considered in this analysis. Ideally, future randomized studies will be conducted to clarify the significance of total exogenous stimulation on reproductive outcomes and its effect on the endometrium.