Introduction

The success of in vitro fertilization (IVF) depends on the number and quality of mature oocytes collected at oocyte retrieval (OR), among other factors. The ovarian response is determined by several factors including ovarian reserve [1]. Several baseline tests exist to detect a decreased ovarian reserve, including basal serum follicle-stimulating hormone (FSH) level, basal serum estradiol (E2), basal and stimulated serum inhibin B, anti-Mullerian hormone (AMH), and antral follicular count (AFC). In some centers, a considerable proportion of all patients are low responders with a low number of oocytes retrieved at IVF. These patients may have lower pregnancy rates compared with normal responders [2,3,4,5] and certainly have lower cumulative pregnancy rates [6]. A previous study from our group found that in patients between the ages of 40 and 43 years, the live birth rate per cycle was 3.3 times higher (11.61% vs. 3.54%, p < 0.001) in responders with more than three oocytes collected compared with women having fewer than three oocytes [7]. In these patients, age is one of the most important factors to consider [8]. Declining ovarian reserve, when combined with advanced age, is also associated with a low number of antral follicles, poor oocyte quality, low chance of implantation, and higher rates of early pregnancy loss [9].

The predictive role of the number of oocytes retrieved in poor responders has been previously studied [3, 10, 11]. Women with one oocyte retrieved showed very low pregnancy rates in these studies (0%, 0%, and 2%, respectively), whereas higher pregnancy rates were observed in cases with two oocytes retrieved (15%, 11%, and 4%, respectively).

In another study [8], the authors showed that in patients at least 40 years old who developed three or fewer follicles, a significant decrease in implantation and pregnancy rates was observed when compared to those who developed more follicles. However, patients younger than 40 years of age who developed more than three follicles maintained good implantation and pregnancy rates despite a poor response to ovarian stimulation. Most reproductive centers have adopted criteria to cancel IVF cycles that do not produce an adequate number of mature follicles, which in many centers is a minimum of four. Reichman et al. [12] compared patients with three or fewer follicles who continued with IVF versus those who converted to intrauterine insemination (IUI), concluding that IVF yields superior pregnancy rates compared with IUI in the setting of two or more follicles. Nevertheless, it remains to be elucidated whether maternal age and AFC affect pregnancy and live birth rate (LBR) in patients having one or two follicles.

When IVF stimulation results in two or fewer mature follicles, the decision to continue the cycle still poses a clinical challenge, particularly in women younger than 40 years of age, with severe male factor and tubal factor infertility. There are several reasons why women may develop few follicles, including decreased ovarian reserve or selection of a gonadotropin dose that is too low. At our center, the treatment cycle is continued regardless of the number of developing follicles. This was done for several reasons, including that the government was covering the cost of multiple IVF cycles at the time of data collection for this study, in combination with significant wait times for care. This makes our center an ideal location to study the outcomes of IVF cycles with two or fewer stimulated follicles, since selection bias would be minimized. It should be noted that, given the lack of cost to patients at that time, even the worst-prognosis patients continued care. We did not cancel cycles to improve the center’s success statistics.

The objective of this study is to evaluate the outcome of IVF cycles in patients with one or two mature follicles ≥ 14 mm on the day of hCG administration, to determine the efficacy of continuing IVF.

Materials and Methods

We conducted a retrospective cohort study of patients who underwent IVF treatment at the McGill University Health Center (MUHC) between January 2011 and December 2014. Inclusion criteria were as follows: patients undergoing stimulated IVF or intra-cytoplasmic sperm injection (ICSI) treatment and having only one or two follicles ≥ 14 mm with at least one mature oocyte. Exclusion criteria were the presence of untreated uterine or adnexal pathologies such as uterine leiomyomata, endometrial polyps, hydrosalpinx, and adnexal masses or cysts, oocyte donation cycles, thyroid or prolactin abnormalities, severe male factor infertility (less than 5 million total motile sperm count), endometriosis, and natural cycle or modified natural cycle IVF. The main outcomes studied were the clinical pregnancy rate (clinical pregnancy being defined as the presence of a gestational sac in utero on ultrasound scan) and the live birth rate. Women who underwent IVF cycles at a later time period were excluded because by then wait times to start IVF had decreased from 12 months to less than 3 months, and patients would be canceled and re-stimulated.

All women included in this study had a baseline transvaginal ultrasound performed in the early follicular phase of a spontaneous or progesterone-induced menstrual cycle (between cycle days 2 and 5) prior to IVF treatment. The ultrasounds were performed in a uniform manner by one of two ultrasound technicians using a Voluson E8 (General Electric Corporation, USA), with women in the dorsal lithotomy position and with an empty bladder. Measurements and descriptions of the uterus and ovaries were recorded, as well as the AFC. The scan was performed within 6 months of initiating care. For analysis, AFC in this paper was based on groupings of 0–5, 6–10, and 11 or more. The additional cutoff of 11 was selected because we hypothesized that it may correlate with stimulation of 10 or more oocytes, which maximizes outcomes as per the POSEIDON criteria [2].
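The AFC grouping described above can be sketched as a small helper. This is only an illustration: the function name `afc_group` is hypothetical, and the labels "low", "medium", and "high" are taken from the statistical analysis description rather than from any code used in the study.

```python
def afc_group(afc: int) -> str:
    """Map an antral follicle count to the analysis group (0-5, 6-10, 11 or more)."""
    if afc < 0:
        raise ValueError("AFC cannot be negative")
    if afc <= 5:
        return "low"      # 0-5
    if afc <= 10:
        return "medium"   # 6-10
    return "high"         # 11 or more


if __name__ == "__main__":
    # Print the group for counts on either side of each cutoff.
    for afc in (0, 5, 6, 10, 11, 20):
        print(afc, afc_group(afc))
```

The boundaries are inclusive on the lower group, so an AFC of exactly 11 falls in the high group.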

Treatment protocol, type, and doses of gonadotropins were prescribed on a case-by-case basis according to patient characteristics and clinician preferences. The initial dose of gonadotropin was individualized for each patient according to age, basal FSH levels, AFC, body mass index (BMI), and previous response to ovarian stimulation. Dose adjustments were performed according to ovarian response, which was monitored by vaginal scans and estradiol (E2) determinations.

Ovarian stimulation was performed by means of one of the following protocols: a fixed GnRH antagonist protocol or an oral contraceptive overlapped long GnRH agonist protocol. Briefly, ovarian stimulation was performed with gonadotropins (Follitropin-alpha, EMD-Serono Inc., Canada; Follitropin-beta, Merck Inc., Canada; human menopausal gonadotropins, Ferring Pharmaceuticals Inc., Canada). Ovulation was suppressed with GnRH antagonist (EMD-Serono Inc. or Merck Inc.) or GnRH agonist (Sanofi-Aventis Inc., Canada).

All treatments were conducted as previously described [13]. Estradiol levels were routinely measured during the study at every follow-up visit. Once the primary follicle had an average diameter of > 17 mm, hCG was administered.

Oocyte collection was performed 36 h after hCG triggering, using a 17-gauge single-lumen collection needle (Cook Medical, Sydney, Australia). All visible follicles, regardless of diameter, were aspirated. Aspiration pressure was maintained at 140 mmHg by a Cook Vacuum Pump (K-Mar 8200) (Cook, Australia). Fertilization of retrieved oocytes was carried out by conventional IVF or ICSI, depending on sperm quality and previous history, 3–4 h after oocyte retrieval. Fertilization was assessed 16–18 h after insemination for the appearance of two distinct pro-nuclei and two polar bodies. Culture to blastocyst was performed using sequential media (Cook Medical, Sydney, Australia), in the event that two high-quality day 3 embryos were present. Embryos were graded as previously described [14,15,16].

Embryo transfer was performed on day 2, day 3, or day 5, based on the number of embryos and their quality. The best quality embryo or embryos were transferred based on the patient’s age, her wishes, previous failures, and the grade of the embryos, in accordance with Quebec mandates for embryo transfer. Trans-cervical embryo transfer was performed using a Wallace embryo replacement catheter (Smiths Medical, USA) under transabdominal ultrasound guidance, with a full urinary bladder. The embryos were placed 1.5 to 2.0 cm from the uterine fundus. Estradiol (Estrace, Actavis Pharma, USA) 2 mg orally three times daily and progestin supplements (Prometrium 200 mg vaginally, three times daily, Merck, Germany; Endometrin 200 mg vaginally, twice daily, Ferring, USA; Crinone 8% vaginally twice daily, Actavis, USA, or intramuscular progesterone 100 mg daily, Actavis, USA) were started on the day after oocyte collection and continued until 12 weeks of pregnancy for luteal phase support. None of the patients had embryos to vitrify.

Statistical analysis was performed with one-way ANOVA to compare continuous baseline variables and chi-square (χ²) tests to compare categorical variables (Table 1). For count data, we used Poisson regression followed by the analysis of deviance (Table 1). Multivariate binomial logistic regression was used to model the capacity of the number of follicles and age (both treated as categorical predictors) to predict the number of collected oocytes, MII oocytes, 2PN, cleavage-stage embryos, pregnancy, and live birth outcomes (Table 2). To test whether the relationship between the number of large follicles and pregnancy outcome depends on age, we repeated the regression models with the interaction term included. A p-value lower than 0.05 was considered significant (Table 2). To model the contribution of AFC to pregnancy and live birth outcomes, we ran the logistic model using both age and AFC as continuous variables (Table 3). To detect the change in the odds of pregnancy and live birth (LB) in medium and low AFC as compared with high AFC, we re-ran the model with AFC as a categorical predictor (low = 0–5, medium = 6–10, high ≥ 11) (Fig. 2). All analyses were done in R 3.5.1. Interaction effect and smooth regression estimate plots were generated using the effects [17] and Hmisc [18] R packages, respectively. Mediation analysis was performed based on the counter-factual framework and the interventional effect [19]. The analysis was conducted in R using the intmed package [20] with 1000 simulations. The exposure was AFC (continuous), the mediator was the number of large follicles (categorical), the outcome was pregnancy or LB, and age was included as a covariate in the model.
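For readers less familiar with interaction models, the logistic model with age group and number of large follicles as categorical predictors can be written out as a sketch. The notation is ours and assumes the standard logit link with dummy coding; the choice of reference categories (one large follicle, and one of the age groups) is an assumption and is not stated in the text.

```latex
\[
\operatorname{logit}(p_i) = \log\frac{p_i}{1-p_i}
  = \beta_0
  + \sum_{a} \beta_a \,\mathbf{1}[\mathrm{age}_i = a]
  + \beta_F \,\mathbf{1}[\mathrm{follicles}_i = 2]
  + \sum_{a} \gamma_a \,\mathbf{1}[\mathrm{age}_i = a]\,\mathbf{1}[\mathrm{follicles}_i = 2]
\]
```

Here $p_i$ is the probability of the outcome (pregnancy or live birth) in cycle $i$, and $a$ ranges over the non-reference age groups. In the model without interaction, all $\gamma_a$ are set to zero and $\exp(\beta_F)$ is the age-adjusted odds ratio for two large follicles versus one. Testing whether the effect of follicle number depends on age amounts to testing whether the $\gamma_a$ differ from zero.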

Table 1 Patient characteristics and treatment outcomes per age group
Table 2 Age and follicles as (categorical) predictors for the number of collected oocytes, MII oocytes, and 2PN; cleavage-stage embryos; pregnancies; and live births
Table 3 Age and AFC as (continuous) predictors for pregnancy and live birth

Ethics approval was obtained through the Institutional Review Board (IRB) and the Institutional Ethics Committee of the MUHC (number 13-240-SDR).

Results

The total number of IVF cycles included in the study was 459. No significant difference was observed in the stimulation protocol used among different age groups (p = 0.32). The mean (SD) female age was 39.3 (3.48) years and the average number of oocytes collected was 2.4 (1.6). The number of MII oocytes was 1.7 ± 0.9 per cycle. The average number of fertilized oocytes, as well as cleavage-stage embryos, was 1.1 ± 0.9. Of the 459 cycles, 360 cycles (78.4%) ended in embryo transfer, leading to a pregnancy rate and live birth rate (LBR) per cycle of 13.3% and 5.2%, respectively.

For our stratified analysis, we divided the cycles into three groups based on female age: ≤ 34 years, 35–39 years, and ≥ 40 years (9.8%, 33.3%, and 56.9% of the total number of cycles, respectively). The mean AFC, early follicular phase FSH level, and the mean number of follicles with an average diameter of ≥ 14 mm on the day of hCG administration, per age group, are presented in Table 1.

The youngest age group (≤ 34 years) exhibited a 35.5% and a 15.6% pregnancy rate and live birth rate, respectively, per cycle that decreased to 14.7% and 6.5% in the age group 35–39 years, and to 8.4% and 2.7% in women of age ≥ 40 years (Table 1).

A total of 1102 oocytes were collected: 129 oocytes from the youngest group (≤ 34 years old), 370 oocytes from the middle age group (35–39 years), and 603 oocytes from the oldest group (40 years and above). There were no significant differences between the groups in terms of the number of oocytes collected, the number of MII oocytes, the number of fertilized oocytes, and the number of cleavage-stage embryos. Significantly more embryos were transferred on day 2 than on day 3 or at the blastocyst stage, and as patient age increased, the percentage of day 2 transfers increased as well (Table 1). There was no embryo transfer (ET) in approximately 15% of the cycles in the younger two age groups, while this number increased to 26% in patients aged 40 years and above (p < 0.05). The majority (85%) of embryo transfers occurred on day 2 (Table 1).
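As a rough cross-check of the figures quoted above, the oocyte yield per cycle in each age group can be approximated from the reported totals. This is a sketch only: the per-group cycle counts are reconstructed here from the rounded percentages given in the text and are an assumption, since the exact counts are reported in Table 1 rather than in the prose. All function and variable names are illustrative.

```python
# Figures quoted in the Results; the group labels are ours.
TOTAL_CYCLES = 459
CYCLE_SHARE = {"<=34": 0.098, "35-39": 0.333, ">=40": 0.569}
OOCYTES = {"<=34": 129, "35-39": 370, ">=40": 603}


def approx_cycles(total: int, shares: dict) -> dict:
    """Reconstruct approximate cycle counts from rounded percentages (assumption)."""
    return {group: round(total * share) for group, share in shares.items()}


def oocytes_per_cycle(oocytes: dict, cycles: dict) -> dict:
    """Crude mean number of oocytes collected per cycle in each group."""
    return {group: oocytes[group] / cycles[group] for group in oocytes}


if __name__ == "__main__":
    cycles = approx_cycles(TOTAL_CYCLES, CYCLE_SHARE)
    means = oocytes_per_cycle(OOCYTES, cycles)
    for group in OOCYTES:
        print(f"{group}: ~{cycles[group]} cycles, {means[group]:.2f} oocytes per cycle")
    print(f"overall: {sum(OOCYTES.values()) / TOTAL_CYCLES:.2f} oocytes per cycle")
```

Under these assumptions the reconstructed counts are 45, 153, and 261 cycles, which sum to 459. The three oocyte totals sum to the 1102 reported.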

The pregnancy rate is presented in Table 1. The clinical pregnancy rate and LBR per cycle and per ET were significantly reduced in older patients, particularly those 40 years of age and older (p < 0.01 for all measurements). This age group had approximately half the clinical pregnancy rate of the 35–39-year-old group and one-fourth the clinical pregnancy rate of the ≤ 34-year-old group.

To estimate the effect of age and the number of follicles ≥ 14 mm on IVF outcomes, regression models were fitted. Odds ratios and 95% confidence intervals (CIs) are presented in Table 2. We found that having two large follicles versus a single large follicle significantly increases the number of oocytes collected (p < 0.001), the number of mature oocytes (p < 0.001), and the odds of pregnancy (p < 0.05) by 30%, 27%, and 100%, respectively (Table 2).

Both age and the number of mature follicles on the day of hCG triggering significantly contribute to pregnancy outcomes, while for LB, age was the only significant predictor.

The effect of the number of mature follicles (one vs. two) on pregnancy outcome and LB was not dependent on age (the interaction between them was not significant). The odds of pregnancy in a woman having two mature follicles were twice as high as the odds of a woman developing a single mature follicle (OR = 2.03; p < 0.05). Figure 1 demonstrates the effect of the two predictors on pregnancy and LB outcomes.
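Because an odds ratio acts on odds rather than on probabilities, a doubling of the odds does not double the pregnancy rate. The sketch below converts a baseline probability and an odds ratio into the resulting probability. The function name is illustrative, and the 10% baseline used in the example is hypothetical and not a figure from this study.

```python
def apply_odds_ratio(baseline_prob: float, odds_ratio: float) -> float:
    """Return the probability obtained by multiplying the baseline odds by an odds ratio."""
    if not 0.0 < baseline_prob < 1.0:
        raise ValueError("baseline_prob must be strictly between 0 and 1")
    if odds_ratio <= 0.0:
        raise ValueError("odds_ratio must be positive")
    baseline_odds = baseline_prob / (1.0 - baseline_prob)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


if __name__ == "__main__":
    # Hypothetical 10% baseline pregnancy probability with one large follicle,
    # combined with the reported OR of 2.03 for two large follicles.
    print(f"{apply_odds_ratio(0.10, 2.03):.3f}")
```

With this hypothetical baseline, the probability rises to roughly 18%, somewhat less than a doubling.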

Fig. 1 An interaction model for age and number of large follicles as predictors for pregnancy outcome (A) and live birth outcome (B)

We also sought to examine whether the relationship between a woman’s age and pregnancy rates or LBR depends on AFC. To this end, we generated locally weighted smoother regression scatter plots (Loess, Fig. 2). These plots demonstrate that the change in pregnancy rate or LB as a function of age is dependent on AFC, suggesting that AFC is an important independent predictor whose effect is more pronounced as age decreases. High AFC (≥ 11) increases live birth probability in younger women as compared to low AFC (< 11). This effect vanishes in women ≥ 35 years of age. Next, we quantified the differences between the two subgroups of the young age group: those with high AFC (≥ 11) and those with low AFC (less than 11). Among women with AFC less than 11, pregnancy and live birth rates per cycle were 19% and 4%, respectively, whereas among women with AFC ≥ 11 in this young age group, pregnancy and live birth rates per cycle were 72% and 56%, respectively (chi-square p < 0.005 for LBR, and p < 0.01 for pregnancy rate).

Fig. 2 Nonparametric regression (loess) estimates of the relationship between age and probability of pregnancy (A) and live birth (B), stratified by AFC value (low = 0–5, medium = 6–10, high ≥ 11). Tick marks are drawn at actual age values for each stratum

To test whether the effect of AFC on pregnancy rate or LBR (as seen in Fig. 2) is mediated by the number of large follicles, we performed mediation analysis. The results indicate that the effect of AFC on LBR is not mediated by the number of large follicles (p = 0.712). The effect of AFC on pregnancy outcome is partially mediated by the number of large follicles (p < 0.06; 14.8% of the total effect was mediated through large follicles). To further estimate the effect of women’s age and AFC on pregnancy and LB outcomes, we fitted logistic regression models including age and AFC as continuous independent predictors. Both age and AFC significantly and independently contribute to pregnancy and LB outcomes (Table 3). The interaction between them was significant only for pregnancy outcome (p < 0.05, Fig. 2) and not LB. A 1-year increase in age reduces the odds of pregnancy and LB by 11%, and a one-unit increase in AFC leads to a 9% increase in the odds of both outcomes. By treating AFC as a categorical predictor (Table 4), we found the odds of pregnancy in the medium and the low AFC groups to be 26% and 64% lower than those of the high AFC group, respectively. The odds of LB in the medium and low AFC groups were even lower compared with the high AFC group (54% and 83% less, respectively). The interaction between age and categorical AFC was not significant for either outcome.
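In a logistic model with continuous predictors, per-unit effects multiply rather than add. The sketch below shows how the per-year and per-follicle changes quoted above would compound over several units. It assumes per-unit odds ratios of 0.89 for age and 1.09 for AFC, inferred from the stated 11% and 9% changes (the exact estimates are in Table 3), and it ignores the age-by-AFC interaction term. Function names are illustrative.

```python
def odds_ratio_over(per_unit_or: float, units: float) -> float:
    """Odds ratio for a change of `units`, given the odds ratio for one unit."""
    if per_unit_or <= 0.0:
        raise ValueError("per_unit_or must be positive")
    return per_unit_or ** units


def percent_change_in_odds(odds_ratio: float) -> float:
    """Express an odds ratio as a percent change in odds (negative means a reduction)."""
    return (odds_ratio - 1.0) * 100.0


if __name__ == "__main__":
    AGE_OR = 0.89   # assumed from "11% lower odds per year"
    AFC_OR = 1.09   # assumed from "9% higher odds per unit of AFC"
    for label, per_unit in (("age +5 years", AGE_OR), ("AFC +5 follicles", AFC_OR)):
        change = percent_change_in_odds(odds_ratio_over(per_unit, 5))
        print(f"{label}: {change:+.1f}% change in odds")
```

Under these assumptions, five additional years of age correspond to roughly 44% lower odds, not 55%, because the yearly reductions compound.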

Table 4 Age as continuous predictor and AFC as categorical predictor for pregnancy and live birth

Discussion

The criteria for poor ovarian response (POR), known as the Bologna criteria, were defined by the ESHRE in 2011 [21]. POR to ovarian stimulation usually indicates a reduction in follicular response, resulting in a reduced number of retrieved oocytes. As per the Bologna criteria, two episodes of POR after maximal stimulation are sufficient to define a patient as a poor responder. One episode, in the absence of other risk factors (such as a low ovarian reserve test result or advanced maternal age, among others), raises the clinical suspicion for the diagnosis. It has been believed that high doses of gonadotropins do not improve outcomes in POR [22], but other factors, such as age and normal ovarian reserve tests, seem to be associated with improved outcomes [23, 24]. Herein, we attempted to analyze whether stimulation cycles yielding one or two follicles were worth progressing to OR. The importance of this clinical question cannot be overstated. Indeed, taking into consideration the costs, physical and emotional stress, and high probability of disappointment, the elucidation of the success rates in these clinical scenarios might have an enormous effect on the patient’s decision whether to proceed with the treatment cycle.

In the current study, we divided the studied population into three groups based on age. Since we analyzed cycles with one or two large follicles, it is not surprising that the majority of our cohort consists of cycles in older women (57.3% of subjects were ≥ 40 years old). The pregnancy and LB outcomes were significantly better in the younger age group. In women aged 40 years and above, the LBR per cycle was 2.7%. This means that, on average, among every 37 patients aged 40 years or older who yielded 1–2 follicles in the stimulation process, only one will deliver.
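The "one in 37" figure follows directly from the reciprocal of the live birth rate. A minimal sketch of that arithmetic, applied to the per-cycle LBRs reported for the three age groups, is given below. The function name is illustrative, and the calculation treats each cycle as one patient, which is a simplification.

```python
def cycles_per_live_birth(lbr_percent: float) -> float:
    """Average number of cycles needed for one live birth, given the LBR per cycle in percent."""
    if not 0.0 < lbr_percent <= 100.0:
        raise ValueError("lbr_percent must be in (0, 100]")
    return 100.0 / lbr_percent


if __name__ == "__main__":
    # Per-cycle live birth rates reported in the Results, by age group.
    reported_lbr = {"<=34": 15.6, "35-39": 6.5, ">=40": 2.7}
    for group, lbr in reported_lbr.items():
        print(f"{group}: about {cycles_per_live_birth(lbr):.1f} cycles per live birth")
```

For the oldest group this gives 100 / 2.7 ≈ 37 cycles per live birth, compared with roughly 6 and 15 cycles in the two younger groups.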

In contrast, in patients under 35 years of age, a low number of follicles, even as low as one or two, still yielded a high rate of pregnancy (36%) and live birth (16%) per cycle. However, the role of ovarian reserve as measured by AFC in this age group cannot be overstated, as discussed subsequently. It is important to note that the percentage of cycles that did not end in ET approached 15% when the patient’s age was less than 40 years and increased to approximately 26% in patients 40 years and above. One may consider performing OR in women at least 40 years of age while keeping in mind the low LBR.

Although fewer than three follicles pose a dilemma as to whether to cancel or to proceed to OR, we found that having two large follicles versus a single large follicle significantly increases the number of oocytes collected, the number of mature oocytes, and the odds of pregnancy, while the odds of LB were not affected. Specifically, women ≥ 40 years of age should be informed of the low live birth rate (2.7%). This rate was similar in both patient groups, whether having a single large follicle (2.58%) or two large follicles (2.72%). It is difficult to explain why this occurs. It is possible that the collection of an additional oocyte does not improve outcomes at IVF in some age groups. This makes sense if a large number (9 or 10) of oocytes are collected. The same rule seems to hold when the number of oocytes doubles from one to two.

In a sub-group analysis, we divided the studied population into three groups based on AFC: low (0–5), moderate (6–10), and high (≥ 11). However, when evaluating the pregnancy rate and LBR in women ≤ 34 years of age (Fig. 2), it seems this group is composed mainly of two populations, one with relatively low AFC (10 or less) and the other with normal AFC (11 and above). The former group had idiopathic low AFC, since we excluded known causes that may damage the ovarian reserve, and each individual in this group should be further investigated. This group achieved LBR results similar to their older counterparts. However, those with an AFC of at least 11 achieved results similar to those of good-prognosis ART patients despite having only 1–2 follicles. Importantly, these patients were stimulated with a considerably high dosage of gonadotropins (six out of nine patients were stimulated with 2200–10,200 IU of gonadotropins, two were stimulated with a total dose of 1800 IU, and one was stimulated with 1650 IU). Using mediation analysis, we showed that the effect of AFC on LBR, which is the primary outcome, is not mediated by the number of large follicles. Taken together, we conclude that AFC should be taken into consideration, especially when counseling a patient under 35 years of age with fewer than three follicles.

Reichman et al. [12] found that there was no statistically significant difference in clinical pregnancy or live birth rates in patients who had one or three follicles and continued to IVF compared with those who converted to IUI. However, patients with two follicles were approximately three times more likely to experience a live birth when proceeding with OR as opposed to undergoing IUI. Apart from the fact that Reichman et al. compared proceeding to IVF versus converting to IUI in poor responders, the main difference between their study and ours is the independent role of ovarian reserve, presented as AFC. This concept of AFC affecting LBR in poor responders independently of age, which to the best of our knowledge has not been addressed previously, makes our article novel. Although it is currently controversial whether ovarian reserve reflects quality or quantity, with many experts feeling it is just a reflection of quantity and not likelihood of pregnancy, the statistical analysis done for this study allowed unique quantifications. For the first time, we demonstrated the decrease in live birth rates per unit of AFC when holding age constant. This was done in combination with quantification of the decrease in outcomes as women age, at any fixed AFC.

The strength of this study lies in the high number of cycles included and the lack of selection bias, with the worst-prognosis patients still having a collection. Subjects had an elevated early follicular phase serum FSH and similar low AFCs in the three age groups. Another strength of the study was the inclusion of a variety of different IVF protocols and all doses of stimulation medication. The inclusion of these two groups enabled us to detect the dual role of ovarian reserve and age on outcomes, by taking into consideration women who received a gonadotropin dose that was too low. The inclusion of the most common IVF protocols allowed us to extrapolate the conclusions to most IVF cycles and not just those with a specific protocol. This study has some limitations worth mentioning. First, this is a retrospective study; hence, it has inherent biases that cannot be avoided. Second, the groups were unequal in size, though this is understandable since the population studied is the one that achieved only 1–2 follicles, a finding that is more common as age increases. An analysis of the relationship between pregnancy outcomes and AMH values would have further strengthened our understanding of poor responders and pregnancy outcomes. However, AMH levels were rarely available for patients in our cohort.

It may be noted that some subjects had more than two MII oocytes collected in spite of having stimulated only two mature follicles. This is not surprising since occasionally mature oocytes can be collected from smaller follicles after hCG triggering [25].

Conclusion

Our data suggest that a shift in reasoning should occur, from age being the sole predictor of outcomes in women with a low response at IVF to both age and ovarian reserve needing to be taken into consideration, particularly in younger women. Up until this point, the thinking was that young patients with poor stimulation have good outcomes. Our data suggest that this is only true if the AFC is high as well. Patients younger than 35 years with two or fewer mature follicles have excellent success rates if the AFC is greater than 10. Young women with an AFC of 10 or less, or women 35 years of age or older, particularly those at least 40 years old, should be advised of the low chance of success and the sizable chance of not having an embryo to transfer when one or two mature follicles have developed with IVF.