Introduction

Antimüllerian hormone (AMH) is a 140-kDa homodimeric glycoprotein from the transforming growth factor β (TGF-β) superfamily [1], which is essential for correct sexual differentiation. AMH seems to modulate ovarian follicular development by the inhibition of the initial recruitment of primordial follicles [2] and to antagonize the FSH-dependent follicular growth [3]. In women, after puberty, AMH is detected in the granulosa cells of primordial follicles and reaches peak concentration in small antral follicles [4, 5]. Serum AMH levels correlate with the number of developing ovarian follicles, hence with the ovarian reserve [4].

The accurate determination of ovarian reserve remains a challenge for the infertility specialist [6]. Achieving this goal would make it possible to predict the chances of success of assisted reproductive therapy for certain patients. AMH and antral follicle count (AFC) by transvaginal ultrasound appear to be the most informative markers of ovarian reserve currently available [7–10]. These two markers nearly overlap because they essentially represent the same physiological phenomenon, namely the development of small antral follicles, but their reproducibility and reliability may differ because the two methods use very distinct technologies. Thus, it is important to further evaluate the accuracy of AMH and AFC tests in predicting clinical outcomes of assisted reproduction technology before they are more consistently incorporated into daily practice [11].

Previous studies have shown that AMH measurement at day 3 of the menstrual cycle is useful for the prediction of the risk of poor ovarian response [10, 12, 13]. However, the practical utility of this marker to predict pregnancy is currently uncertain, and probably limited by the lack of association of AMH with embryo quality [12]. Only recently have studies evaluated the ability of day 3 AMH to predict the likelihood of a live birth, and the balanced conclusion would be that women at the extremes of basal AMH have different live birth rates [8]; however, no single AMH cutoff has achieved a satisfactory combination of sensitivity, specificity, and predictive power for this outcome [14–19].

Longitudinal studies in healthy women have reported that serum AMH concentration is extremely stable during the menstrual cycle [20, 21]; thus, this feature has been considered to be indicative that the test would be equally informative regardless of the day of blood sampling. However, only a few studies have focused on this hypothesis. In a prospective evaluation of 48 subjects tested at a random day of menstrual cycle shortly before in vitro fertilization (IVF), serum AMH levels correlated with the number of follicles grown and oocytes retrieved; therefore, these values were good predictors of poor ovarian response [22]. Another study sought to compare the performance of AMH measurements at three different menstrual cycle phases in 33 women undergoing IVF; it reported that early follicular, ovulatory and midluteal AMH levels were good predictors of poor ovarian response, while only a borderline significance was reached for prediction of pregnancy [23].

In order to address the prognostic utility of AMH measurements in a cohort of unselected women representing the general IVF population, the present study was designed to test serum AMH measurements at two distinct phases of the menstrual cycle immediately before IVF and compare both tests with AFC for the prediction of cycle cancellation, clinical pregnancy, and live birth. These hard clinical outcomes were chosen rather than intermediate surrogates, such as the number of developing follicles, oocytes, or embryos, in order to address the primary concerns of the patients and clinicians demanding ovarian reserve tests in the context of medically assisted reproduction.

Materials and methods

This study prospectively evaluated 135 consecutive patients referred for conventional in vitro fertilization (IVF) or intracytoplasmic sperm injection (ICSI) at the Division of Human Reproduction of the Academic Hospital of Federal University of Minas Gerais, Belo Horizonte, Brazil. Indications for IVF included male and/or female infertility, previous surgical sterilization, and male partner infected with human immunodeficiency virus (HIV). No woman was using a hormonal contraceptive at the time of enrollment. Informed consent was obtained from all subjects prior to inclusion in the study, which was approved by the local Human Investigation Committee.

Controlled ovarian hyperstimulation and in vitro fertilization

All women underwent pituitary blockade with intranasal nafarelin (400 μg daily) beginning on day 21 of the previous menstrual cycle. Following menstruation, transvaginal ultrasound was performed to confirm blockade of ovarian function as indicated by endometrial thickness <4 mm and absence of follicular growth. Following confirmation of the blockade of ovarian function, controlled ovarian hyperstimulation was induced by recombinant FSH and/or highly purified human chorionic gonadotropin at a dose of 300 IU for 3 days and 150 IU on the following days.

To monitor follicular growth, transvaginal ultrasound was performed every other day beginning on the fifth day of FSH therapy, and daily once the mean follicular diameter reached 14 mm. The cycle was cancelled if there was a poor ovarian response, defined as the development of fewer than three growing follicles. When at least one dominant follicle reached a diameter of 18 mm, FSH and nafarelin were discontinued, and recombinant hCG was administered at a dose of 250 μg. The follicles were aspirated approximately 36 h after hCG injection using a 17-gauge follicular aspiration needle connected to a transvaginal probe and a Craft suction unit with a negative pressure of 100 mmHg.

The oocytes were identified in a culture dish using a stereomicroscope. Semen was prepared using the swim-up technique and insemination was performed by conventional IVF or ICSI according to the sperm analysis. Oocyte fertilization was assessed 18–20 h after insemination by confirmation of the presence and location of two pronuclei, using an inverted microscope. In compliance with Brazilian law, up to four embryos were selected for transfer; the remaining embryos were left in the culture medium until the blastocyst stage (5–6 days) and subsequently cryopreserved. For luteal phase support, 600 mg daily micronized progesterone was administered vaginally for 2 weeks, from the day of embryo transfer until the pregnancy assessment with serum β-hCG; if a pregnancy occurred, this therapy was extended until 8 weeks of gestation. A clinical pregnancy was defined by the presence of a fetal heartbeat on transvaginal ultrasonography around the sixth week of gestation. The pregnant women were followed until delivery.

The study sample was divided into four groups according to the following outcomes: cancellation for poor ovarian response (n = 23), complete treatment without clinical pregnancy (n = 83), pregnancy ending in miscarriage (n = 10), and pregnancy ending in live birth (n = 19). The characteristics of the four outcome groups are summarized in Table 1.

Table 1 Characteristics of the patient groups

Blood sampling

All blood samples were collected from a peripheral vein on day 3 and again between days 18 and 20 of the menstrual cycle immediately preceding the IVF cycle. Day 3 was chosen for consistency with previous studies [10, 12, 13], while days 18–20 were chosen to represent the mid-luteal phase. Blood was allowed to clot for 30 min, then centrifuged at 400 × g for 10 min at room temperature; the serum was then separated with a disposable pipette, transferred to a cryopreservation tube, and stored at −80 °C.

AMH assay

AMH concentrations were measured using a commercial quantitative sandwich enzyme immunoassay kit (AMH Generation II ELISA; Beckman Coulter, Brea, CA, USA). All samples were handled in a blinded fashion by one person. Briefly, calibrators, controls, samples, and the assay buffer were added to microtitration plates coated with the capture anti-AMH antibody. After incubation and washing, the biotin-labeled detection antibody was added to each well, incubated, and washed out. The reaction was then developed with streptavidin-horseradish peroxidase and tetramethylbenzidine, and the plates were read at a wavelength of 450 nm. The assay has a linear detection range of 0.08–22.5 ng/ml. The intra-assay coefficient of variation (CV) was 5.7 %, and the inter-assay CV was 7.7 %.

Antral follicle counting

Transvaginal ultrasound was performed on day 3 of the menstrual cycle by a single examiner (C.P.R.), using Aloka equipment with a 3–7.5 MHz endocavitary transducer. Antral follicle counting included all follicles with a mean diameter of 2–10 mm, calculated from two dimensions [24].

Statistical analysis

The sample size calculation indicated that 17 patients per group would allow detection of differences of at least 0.5 ng/ml in AMH levels between the different outcome groups with 80 % statistical power and 95 % confidence. Data were tested for normal distribution using the D’Agostino-Pearson test. Normally distributed variables were summarized as mean ± standard error of the mean and submitted to one-way ANOVA followed by the Newman-Keuls test for multiple comparisons. Non-normal variables were expressed as median and interquartile range, and the group medians were compared by Kruskal-Wallis ANOVA followed by Dunn’s test. Correlations were calculated using Pearson’s coefficient for normally distributed variables and Spearman’s coefficient for non-normal variables. The critical significance level was set at P < 0.05.
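The sample size statement above can be illustrated with the usual normal-approximation formula for comparing two means, n = 2[(z₁₋α/₂ + z_power)·σ/Δ]². This is only a sketch: the text does not report the standard deviation or the exact method used, so the value `ASSUMED_SD` below is hypothetical, chosen to show how a figure of 17 per group can arise for Δ = 0.5 ng/ml, and the two-group formula is a simplification of the four-group design.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Patients per group for a two-sided comparison of two means
    (normal approximation, equal group sizes and equal variances)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_power = NormalDist().inv_cdf(power)          # about 0.84 for 80 % power
    return ceil(2 * ((z_alpha + z_power) * sd / delta) ** 2)


# delta = 0.5 ng/ml comes from the text; the SD is NOT reported there.
ASSUMED_SD = 0.52  # hypothetical between-patient SD of AMH, in ng/ml

print(n_per_group(delta=0.5, sd=ASSUMED_SD))
```

The required sample size grows with the square of the SD, so a more dispersed AMH distribution than the one assumed here would call for considerably larger groups.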

Receiver operating characteristic (ROC) curves were calculated for prediction of cycle cancellation, pregnancy and live birth with serum AMH levels at day 3 and day 18–20 of the menstrual cycle as well as AFC. The optimal cutoff points obtained by analysis of the ROC curves were used to calculate the sensitivity, specificity, risk ratio and positive predictive value with their respective 95 % confidence intervals. In addition, the relationship between AMH test results and outcomes was adjusted by forward stepwise logistic regression, with age and number of oocytes retrieved as covariates.
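As a minimal sketch of the ROC analysis described above, the area under the curve can be computed as the probability that a randomly chosen patient with the event has a lower marker value than a randomly chosen patient without it. The text does not state how the optimal cutoff was selected; maximising the Youden index (sensitivity + specificity − 1) is assumed here as one common choice. The AMH values are made up for illustration and are not study data.

```python
def auc_low_is_positive(values_pos, values_neg):
    """AUC when LOW marker values predict the event (Mann-Whitney form).
    Ties between a positive and a negative case count as half a win."""
    wins = 0.0
    for p in values_pos:
        for n in values_neg:
            if p < n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(values_pos) * len(values_neg))


def youden_cutoff(values_pos, values_neg):
    """Return (J, cutoff, sensitivity, specificity) for the rule
    'test positive if value <= cutoff' that maximises the Youden index J."""
    best = None
    for c in sorted(set(values_pos) | set(values_neg)):
        sens = sum(v <= c for v in values_pos) / len(values_pos)
        spec = sum(v > c for v in values_neg) / len(values_neg)
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, c, sens, spec)
    return best


# HYPOTHETICAL serum AMH values in ng/ml (illustration only)
cancelled = [0.1, 0.2, 0.3, 0.9]            # cycles cancelled for poor response
completed = [0.3, 0.8, 1.2, 1.6, 2.4, 3.1]  # cycles that reached oocyte retrieval

auc = auc_low_is_positive(cancelled, completed)
j, cutoff, sens, spec = youden_cutoff(cancelled, completed)
print(f"AUC = {auc:.2f}; cutoff <= {cutoff} ng/ml "
      f"(sensitivity {sens:.2f}, specificity {spec:.2f})")
```

For outcomes predicted by high marker values, such as pregnancy, the same functions apply after swapping the direction of the comparisons.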

Results

There was a strong correlation between AMH levels measured at day 3 and day 18–20 of the menstrual cycle; this was valid for the whole study sample (r = 0.837; P < 0.0001) and also for each of the outcome groups (Fig. 1-a). There was a weaker, although significant, correlation between AMH and AFC in the entire study population (r = 0.501; P < 0.001) but not in the cancellation group; in this group, both markers were low and AFC appeared to be less discriminatory (Fig. 1-b). As shown in Table 2, both serum AMH measurements and AFC inversely correlated with age and duration of infertility. AMH and AFC also correlated positively with the number of oocytes retrieved and with the number of embryos obtained, but not with the dose of gonadotropins needed for ovarian stimulation.

Fig. 1 Correlation plots between serum antimüllerian hormone (AMH) levels at day 3 and day 18–20 of the menstrual cycle and antral follicle count (AFC) in a cohort of 135 women who underwent controlled ovarian hyperstimulation for in vitro fertilization and classified according to the outcome as not pregnant (cancellation or IVF) or pregnant (spontaneous abortion or live birth)

Table 2 Correlation matrix between serum antimüllerian hormone (AMH), antral follicle count (AFC), and reproductive characteristics in the entire cohort (n = 135)

The median serum levels of AMH at different menstrual cycle phases in the four outcome groups are presented in Fig. 2. The only significant distinction observed between groups was the lower AMH levels in the women who had their ovarian stimulation cancelled due to poor response. On day 3 as well as on day 18–20, this group had significantly lower AMH levels, compared to the other groups (P < 0.0001). In addition, the cancellation group had smaller median AFC, compared to the women who completed the IVF cycle (P < 0.0001).

Fig. 2 Serum antimüllerian hormone (AMH) levels at day 3 and day 18–20 of the menstrual cycle and antral follicle count (AFC) in a cohort of 135 women who underwent controlled ovarian hyperstimulation for in vitro fertilization and classified according to the outcome as not pregnant (cancellation or IVF) or pregnant (spontaneous abortion or live birth). The box plots represent medians and quartiles and the error bars correspond to the 10th and 90th centiles. * P < 0.0001 vs. cancellation group (Kruskal-Wallis ANOVA and Dunn’s test)

The ROC curves shown in Fig. 3 revealed that day 18–20 serum AMH was comparable to day 3 serum AMH and AFC for the prediction of cycle cancellation. The areas under the ROC curves for prediction of cancellation were 0.84 for day 3 AMH, 0.89 for day 18–20 AMH, and 0.80 for AFC (Table 3). As shown in Fig. 3, the probability of cycle cancellation was significantly higher (P < 0.05) if the patient had day 3 AMH ≤ 0.3 ng/ml (59 %), day 18–20 AMH ≤ 0.3 ng/ml (44 %), or AFC ≤ 6 (37 %) than the pretest probability (23 cancellations out of 135 started cycles, or 17 %).
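The pretest probabilities used as the reference for each outcome follow directly from the group sizes reported in the Materials and methods. The short calculation below reproduces them per started cycle; the dictionary keys are labels chosen for this illustration only.

```python
# Group sizes as reported for the four outcome groups
groups = {
    "cancellation": 23,
    "no clinical pregnancy": 83,
    "miscarriage": 10,
    "live birth": 19,
}
started = sum(groups.values())  # total number of started cycles

pretest = {
    "cancellation": groups["cancellation"] / started,
    # clinical pregnancies are those ending in miscarriage or live birth
    "clinical pregnancy": (groups["miscarriage"] + groups["live birth"]) / started,
    "live birth": groups["live birth"] / started,
}

for outcome, p in pretest.items():
    print(f"{outcome}: {100 * p:.1f} % of {started} started cycles")
```

This gives about 17 % for cancellation and 14 % for live birth, and about 21.5 % for clinical pregnancy, which the text rounds to 22 %.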

Fig. 3 Receiver operating characteristic (ROC) curves for prediction of cycle cancellation, pregnancy and live birth with serum antimüllerian hormone (AMH) levels at day 3 (blue lines) and day 18–20 (brown lines) of the menstrual cycle, and antral follicle count (green lines). The right panels show the overall prevalence (pretest probability) and the positive predictive value (post-test probability) of each outcome according to the test results. All probabilities are given with their respective 95 % confidence intervals. The cutoff points were obtained by analysis of the ROC curves

Table 3 Sensitivity, specificity, and area under the receiver operating characteristic (ROC) curve for prediction of cycle cancellation, pregnancy, and live birth with serum antimüllerian hormone (AMH) levels at day 3 and day 18–20 of the menstrual cycle and antral follicle count

AMH measurements at both day 3 and day 18–20 of the menstrual cycle were found to be modest predictors of pregnancy or live birth, with similar performance of the two measurements (Table 3 and Fig. 3). Day 18–20 serum AMH concentrations above the cutoff of 1.6 ng/ml predicted a higher probability of clinical pregnancy than AMH levels at or below this cutoff point (35 % vs. 17 %, respectively; risk ratio 2.10 [1.12–3.93], P < 0.05; adjusted odds ratio 2.69, P = 0.026), but this was not significantly different from the overall pregnancy rate in the study, 22 % (Fig. 3). Accordingly, the probabilities of live birth with day 18–20 serum AMH concentrations >1.6 ng/ml and ≤ 1.6 ng/ml were, respectively, 27 % and 10 % (risk ratio 2.67 [1.19–6.02], P < 0.05; adjusted odds ratio 3.28, P = 0.020), which was not significantly different from the overall live birth rate in the study, 14 % (Fig. 3). AFC had no predictive value for pregnancy or live birth (Fig. 3 and Table 3).
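For readers who wish to see how a risk ratio and its confidence interval of the kind reported above are obtained from a 2 × 2 table, the sketch below uses the log-transformed (Katz) interval. The method used in the study is not stated, so this choice is an assumption, and the counts are hypothetical because the underlying 2 × 2 tables are not given in the text.

```python
from math import exp, log, sqrt


def risk_ratio(events_high, n_high, events_low, n_low, z=1.96):
    """Risk ratio (high-marker vs. low-marker group) with an approximate
    95 % confidence interval computed on the log scale (Katz method)."""
    rr = (events_high / n_high) / (events_low / n_low)
    se_log_rr = sqrt(1 / events_high - 1 / n_high + 1 / events_low - 1 / n_low)
    lower = exp(log(rr) - z * se_log_rr)
    upper = exp(log(rr) + z * se_log_rr)
    return rr, lower, upper


# HYPOTHETICAL counts (not study data): 12 pregnancies among 40 women above
# the cutoff and 15 pregnancies among 100 women at or below it.
rr, lower, upper = risk_ratio(12, 40, 15, 100)
print(f"risk ratio {rr:.2f} [{lower:.2f}-{upper:.2f}]")
```

An interval that excludes 1, as in this example, corresponds to a difference between the two groups that is significant at roughly the 5 % level.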

Discussion

This study was conducted to investigate the value of serum AMH measurements in the month before IVF to predict cycle cancellation, clinical pregnancy, and live birth. We also compared the test performance at two opposite phases of the menstrual cycle. Serum AMH at both cycle phases was lower in women whose treatment was interrupted for poor ovarian response; however, it did not vary significantly between the women who completed the IVF treatment and did not become pregnant and those who achieved a pregnancy and a live birth.

Our data indicate a strong correlation between the two AMH measurements, both of which correlated directly with the number of oocytes and embryos and inversely with the woman’s age. We also confirmed that AMH is useful to predict the risk of cycle cancellation for poor ovarian response [10, 12, 13]. These findings reinforce the relationship of AMH with ovarian follicular reserve and ovarian response to gonadotropin stimulation, which is coherent with the physiological mechanisms underlying the secretion of this hormone [4]. Moreover, compared to other ovarian reserve markers, AMH has the additional advantage of being stable throughout the normal menstrual cycle [20, 21] and, as two previous studies [22, 23] and the present data suggest, AMH may be equally effective to predict ovarian response at any menstrual cycle phase. AMH still has the disadvantages of high cost and scarcity of laboratories that routinely perform the test. However, this situation may change in the future, making the measurement of AMH more accessible and therefore more widespread in clinical practice.

The use of ovarian reserve markers to predict pregnancy and live birth is obviously limited by the fact that ovulation is only a partial requisite for a successful gestation. As expected, the ROC curves for prediction of clinical pregnancy and live birth with AMH did not reach any point of optimal sensitivity/specificity combination. The best cutoff points identified in the ROC curves yielded significant odds ratios, allowing us to reassure patients with normal AMH levels that they have better odds of success than if their AMH levels were subnormal. However, the absolute probability of pregnancy and live birth depends on their overall prevalence, or pre-test probability. In our study population, with a relatively advanced age and long duration of infertility, the probability of live birth was only 14 % per started cycle, and rose to 27 % with serum AMH level above the chosen cutoff. To extract the best information from the test, possible strategies include its use for specific groups of patients and/or in combination with other relevant prognostic variables. Other studies in more selected IVF populations reported that the greater the proportion of women with diminished ovarian response, the greater the utility of screening them with serum AMH [11, 14, 25, 26]. Because maternal age remains the strongest prognostic factor in IVF, the models based on age and serum AMH appear to be more informative than each marker alone [15, 19].
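The dependence of the post-test probability on the pre-test probability can be made explicit with Bayes’ rule: PPV = sensitivity × prevalence / [sensitivity × prevalence + (1 − specificity) × (1 − prevalence)]. In the sketch below, the prevalence of 19/135 is the live birth rate per started cycle from this study, whereas the sensitivity and specificity are hypothetical placeholders and not the values in Table 3.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Post-test probability of the outcome after a positive test (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# HYPOTHETICAL test characteristics, for illustration only
ASSUMED_SENSITIVITY = 0.60
ASSUMED_SPECIFICITY = 0.70

# Pre-test probability of live birth per started cycle in this study
study_prevalence = 19 / 135
# A hypothetical population with a better baseline prognosis
higher_prevalence = 0.40

for prev in (study_prevalence, higher_prevalence):
    ppv = positive_predictive_value(ASSUMED_SENSITIVITY, ASSUMED_SPECIFICITY, prev)
    print(f"pre-test {100 * prev:.0f} % -> post-test {100 * ppv:.0f} %")
```

With identical test characteristics, the same positive result translates into a much higher absolute probability of live birth where the baseline prognosis is better, which is why the same cutoff can carry different clinical meaning in different IVF populations.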

The cutoffs identified in this study for serum AMH should not be extrapolated to other settings. We used the kit supplied by Beckman Coulter, which is a second generation commercial assay that merges components of the two previously available AMH assays, the Immunotech and the DSL. After we completed the assays, it was reported that the absolute concentrations calculated for the research samples through the standard curve may be underestimated [27]. Because this new assay is the only one remaining on the market, the adoption of threshold values to interpret serum AMH testing in clinical practice should await an appropriate international standard upon which the age-related normal ranges will be re-established [28].

In this study, AFC had a moderate positive correlation with serum AMH, number of oocytes, and number of embryos; it had a negative correlation with age. However, it had acceptable sensitivity and specificity only to predict cycle cancellation, not pregnancy or live birth. This finding agrees with previous observations in IVF studies [7, 10, 17] and suggests that AFC is only associated with the number of recruitable follicles and not with any aspect of oocyte quality that will have an impact on embryo integrity, implantation or development.

In conclusion, day 18–20 AMH was comparable to day 3 AMH to predict cycle cancellation, clinical pregnancy, and live birth after IVF. Both AMH measurements were accurate to predict cancellation but were much less useful to predict pregnancy and live birth. Thus, the finding of median or high AMH levels prior to IVF may be regarded as a good prognosis marker; however, once the patient has completed IVF treatment with sufficient ovarian response, the AMH level provides no additional contribution to predict the likelihood of complete treatment success.