Introduction: Prevalence of infertility and diminished ovarian reserve

In the human female, oocyte development begins and reaches its maximum in utero. By 20 weeks of gestation, there are approximately 6–7 million germ cells. At birth, only about 1–2 million oocytes remain; thus, the oocyte pool has begun to decline before birth. By puberty, approximately 300,000 oocytes remain.

The number of oocytes varies widely from woman to woman [1]. Unfortunately, scientists have limited knowledge of what controls this number and even of how to measure it in the clinic. Turner’s syndrome patients prenatally have an adequate number of oocytes, but they undergo accelerated atresia, usually leaving the patient in menopause prior to puberty. How does one compare or assess a typical patient’s oocyte number? In the spectrum of normal aging, what is normal ovarian reserve and what is premature aging? There is no uniform agreement on cut-offs for each commonly used measure of ovarian reserve, and thus, studies are often conflicting.

Reductions in oocyte quantity and quality with advanced age (typically in the mid-40s) are a normal physiologic occurrence termed diminished ovarian reserve (DOR) [2]. Some women experience DOR much earlier and become prematurely infertile (pathologic DOR). Ten percent of women in an infertility clinic, totaling 275,000 women in the USA, are diagnosed with DOR [3, 4]. Recent estimates from the US-based national Society for Assisted Reproductive Technology (SART) system show that 32% of in vitro fertilization (IVF) cycles (approximately 66,000 cycles) carry a diagnosis of DOR [5]; however, the SART definition of DOR is neither standardized nor specific. The mandated data reported from individual clinics to the SART system are inadequate for prevalence estimates or for research on DOR. This shortcoming arises because the SART reporting definition of DOR, based on the National ART Surveillance System guidelines, is “Reduced fecundity related to diminished ovarian function; includes high FSH or high estradiol measured in the early follicular phase or during a clomiphene challenge test; reduced ovarian volume related to congenital, medical, surgical or other causes; or advanced maternal age (>40).” The definition of DOR in the Federal Register Notice is “A condition of reduced fecundity related to diminished ovarian function based on clinical assessment; often indicated by FSH>10 mIU/mL or AMH<1.0 ng/mL” [6]. These are clearly very broad guidelines that are subject to variability between clinicians and variability between testing labs [7, 8]. Such broad definitions would also include women whose reduced ovarian reserve is normal for their age, such as those aged 45 and older. Research from 2015 using 2 years of SART data concluded that DOR is likely to be over-diagnosed through the SART reporting system [9].

Diagnosis of diminished ovarian reserve

There is currently “no uniformly accepted definition of DOR” as stated by the Practice Committee of the American Society for Reproductive Medicine (ASRM) in 2012 (p 1412) [10]. In clinical practice, however, DOR is diagnosed by abnormal ovarian reserve testing using a variety of methods (such as elevated-but-not-menopausal basal follicle-stimulating hormone [FSH] levels, low anti-Müllerian hormone [AMH], low antral follicle count [AFC], or, less frequently, a failed clomiphene citrate challenge test) among women who are still having regular periods [11,12,13].

We acknowledge that the diagnosis of DOR requires clinical judgment [10], as is true throughout Western medicine for many diagnoses. There is no ideal test to evaluate ovarian reserve, and it is not uncommon to have conflicting results among ovarian reserve tests. Providers therefore turn to a variety of tests to assess ovarian reserve. A survey of 796 fertility centers worldwide showed that 51% considered AMH the best test for measuring ovarian reserve, while 40% reported that AFC was the best test and only 6% selected basal FSH. However, when asked which test or factor best predicted pregnancy, the majority of centers, 80%, selected age, with a small percentage selecting AMH (4%) or AFC (3%) [14]. It is imperative for providers to understand that when ordering ovarian reserve screening tests, the predictive value of these tests can be low in a younger low-risk population [15]. A common clinical challenge is counseling patients who may have conflicting ovarian reserve testing results. Directly pertinent to DOR diagnoses, 20% of samples from a large national testing center had discordant FSH and AMH results. Specifically, concerning AMH values paired with reassuring FSH values were found in an age-dependent fashion, affecting 1 in 11 women under age 35 up to 1 in 3 women over age 40 [16]. Because such a high percentage of women may have discordant ovarian reserve testing, it is important for clinicians not to rely on just one ovarian reserve testing modality. In terms of testing, it appears that decreased AMH levels will present earlier than the rise in FSH [17]. Therefore, if baseline FSH and estradiol levels alone are used for ovarian reserve testing and those levels are within normal ranges, a woman may be given false reassurance unless her AMH result is also known. Very few studies thus far have evaluated discordant FSH and AMH levels, but the data do show that the two values together are more useful than only one in terms of patient counseling [18, 19].
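
The notion of discordant FSH and AMH results can be made concrete with a short sketch. It is illustrative only: the cutoffs (FSH > 10 mIU/mL, AMH < 1.0 ng/mL) are borrowed from the Federal Register wording quoted in the Introduction and are assumptions here, since the thresholds used in [16] are not stated in this article, and the function name and labels are hypothetical.

```python
def classify_reserve_tests(fsh_miu_ml, amh_ng_ml, fsh_cutoff=10.0, amh_cutoff=1.0):
    """Label a pair of FSH/AMH results as concordant or discordant.

    The default cutoffs are assumptions for illustration; there is no
    uniformly accepted threshold for either test.
    """
    fsh_concerning = fsh_miu_ml > fsh_cutoff   # elevated basal FSH
    amh_concerning = amh_ng_ml < amh_cutoff    # low AMH

    if fsh_concerning and amh_concerning:
        return "concordant: both concerning"
    if not fsh_concerning and not amh_concerning:
        return "concordant: both reassuring"
    if amh_concerning:
        return "discordant: concerning AMH, reassuring FSH"
    return "discordant: concerning FSH, reassuring AMH"
```

Under these assumed cutoffs, a woman with an FSH of 7 mIU/mL and an AMH of 0.6 ng/mL would fall into the "concerning AMH, reassuring FSH" group, which is the pattern that FSH-only testing would miss.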

It is important to understand that ovarian reserve testing results such as AMH and AFC are predictive of the response to ovarian stimulation regimens, but in general are poor predictors of pregnancy [20]. Therefore, they have a role in patient counseling and choosing medication doses in assisted reproductive technology (ART) cycles, but should not be used to predict inability to conceive, especially in younger patients [21]. In fact, research has shown that the quality of oocytes/embryos in younger (generally < 35 years old) women with DOR is unaffected, even though the quantity of oocytes is diminished [22, 23]. This means that younger women with DOR have a much greater chance of pregnancy with their own eggs if they seek conception earlier rather than later.

The clinical diagnosis of DOR and the interpretation of ovarian reserve testing are complicated by the changes in AMH labs and processing since 2010. From the 1990s until 2009, the main options for AMH processing were kits from Diagnostic Systems Lab and Immunotech (also branded as Immunotech Beckman Coulter). However, those assays utilized two different primary antibodies against AMH and different standards; consequently, the crude values from Immunotech were higher than those from Diagnostic Systems Lab [24]. Those companies consolidated and produced the Beckman Coulter AMH Gen II assay starting in 2009. More recently, other companies have introduced their own AMH kits, some requiring manual testing while others transitioned to automated platforms. Several papers have compared the various alternatives [8, 24,25,26,27,28]. In general, correlations between the current assays are typically reported to be very good; however, the absolute values from one assay run higher or lower than those from another across the measurement range. For example, the Ansh Labs values were reported to be significantly higher, and the Roche assay values were found to be significantly lower, compared with the results from the Gen II and Beckman-Coulter automated assays (P < 0.05) [8]. The Ansh Labs picoAMH assay has been reported to have an ultralow detectable range, and therefore, it is especially suitable for women with very low AMH concentrations [26, 28]. Because there is no international standard for AMH processing, it is challenging to make a clinical diagnosis when patients present with AMH results from various labs, and it is challenging for researchers to compare findings across studies that used different immunoassays. The bottom line for clinicians and researchers is that the interpretation of AMH test results for the diagnosis of DOR is now clouded by the multiple AMH assay options and the lack of calibration between the assays. Additionally, there has never been an accepted AMH value for DOR diagnostic purposes, although various authors [29,30,31,32,33,34] have proposed AMH-by-age criteria, nomograms, and regression equations.

Differing definitions and nomenclature #1: POI/POF vs. DOR

There are several diagnoses and terms related to DOR, which is a source of confusion for clinicians and others reviewing the scientific literature on this topic. In this and the following sections, we define premature ovarian failure (POF), primary ovarian insufficiency (POI), poor ovarian response (POR), and functional ovarian reserve (FOR). We also discuss the similarities and differences between the definition of DOR and each of these four related concepts.

Premature ovarian failure (POF) is diagnosed by three characteristics: postmenopausal levels of FSH (> 40 IU/L), four or more months of secondary amenorrhea, and age < 40 years [13]. Around 2007–2008, the term primary ovarian insufficiency (POI) was suggested to represent this dysfunction related to very early aging of the ovaries. Readers who peruse the literature are likely to see both POI and POF used, sometimes with the same or slightly different definitions. The terminology of POI is considered to better represent this premature-ovarian-aging condition, considering that women with this condition sometimes spontaneously have follicular development, resume menses, and/or conceive after the diagnosis is made [35, 36]. In 2008, Welt [37] suggested that POI represents a continuum of ovarian conditions that encompass an “occult” clinical state (reduced fecundity but normal FSH levels and regular menses), a “biochemical” state (reduced fecundity, elevated FSH and regular periods), and an “overt” state (approximately corresponding to POF though perhaps with irregular menses). Using her categories, the “biochemical” state most closely corresponds to DOR. In general, this four-state nomenclature is not widely used, and it has not been documented that such a continuum even exists. For further history on the various terms for POF, the reader is referred to Cooper et al. [38].

DOR differs as a clinical diagnosis from POF/POI [10, 39]. As discussed previously, DOR is diagnosed by abnormal but not postmenopausal ovarian reserve testing and regular periods. In contrast, women diagnosed with POI/POF have postmenopausal FSH levels and 4 months without any menses. DOR is a normal physiologic process when it occurs in the mid-40s and is pathologic at younger ages. Women in their early 40s can be diagnosed with DOR but would not be diagnosed with POI/POF. There is no evidence that DOR is a precursor to POF/POI. Note that the Practice Committee of ASRM has stated that DOR is distinct from POF (p 1407) [10].
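
The contrast between the two diagnoses can be sketched as a pair of simplified rules. This is only an illustration of the criteria stated in the text (FSH > 40 IU/L, four or more months of amenorrhea, and age < 40 years for POI/POF; abnormal but not postmenopausal testing with regular periods for DOR). The class, field, and function names are hypothetical, treating zero months of amenorrhea as "regular periods" is an assumption, and "abnormal ovarian reserve test" is left as a clinician-supplied flag because no uniform cutoff exists.

```python
from dataclasses import dataclass

POSTMENOPAUSAL_FSH_IU_L = 40.0  # threshold quoted in the text for POF


@dataclass
class Presentation:
    age_years: float
    fsh_iu_l: float              # basal FSH
    months_amenorrhea: int       # 0 is taken to mean regular periods (assumption)
    abnormal_reserve_test: bool  # clinician's judgment; no uniform cutoff exists


def label_presentation(p: Presentation) -> str:
    """Apply the simplified POI/POF and DOR rules described in the text."""
    postmenopausal_fsh = p.fsh_iu_l > POSTMENOPAUSAL_FSH_IU_L

    # POI/POF: all three characteristics must be present.
    if p.age_years < 40 and postmenopausal_fsh and p.months_amenorrhea >= 4:
        return "POI/POF"

    # DOR: abnormal but not postmenopausal testing, with regular periods.
    if p.abnormal_reserve_test and not postmenopausal_fsh and p.months_amenorrhea == 0:
        return "DOR"

    return "neither"
```

For example, `label_presentation(Presentation(42, 14.0, 0, True))` returns `"DOR"`, while the same woman could not be labeled POI/POF under these rules at any FSH level because she is over 40.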

Differing definitions and nomenclature #2: POR vs. DOR

Many fertility centers use the 2011 European Society of Human Reproduction and Embryology (ESHRE) Bologna criteria for poor ovarian response (POR) [12] to diagnose DOR. POR refers to when a woman has a poor response to IVF stimulation of her ovaries, defined as having at least two of the following three characteristics: (i) maternal age ≥ 40 or any other risk factors for POR; (ii) a previous POR such as a history of cycle cancelation or fewer than four oocytes retrieved after gonadotropin stimulation; and (iii) an abnormal ovarian reserve test [i.e., AFC less than five to seven follicles or AMH below 0.5–1.1 ng/ml]. Among clinics that use the POR criteria in their diagnosis of DOR, anecdotally, it appears there is variation by clinic as to whether they require that more than one criterion is met. Reading the detailed ESHRE report, clinicians will recognize that POR is not the same as DOR, and some researchers have voiced concern about the heterogeneous patients that qualify as POR [40, 41]. For example, pelvic infection is associated with poor ovarian response to stimulation regimens [42, 43], and women with ovarian endometriomas and patients who have undergone ovarian surgery for ovarian cysts potentially have POR [44, 45]. Neither of these examples is considered a cause of DOR [10, 46]. Despite those differences, there are clear overlaps in the diagnoses and the corresponding measures of ovarian reserve.
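
The "at least two of three" structure of the Bologna criteria, as summarized above, can be written as a short sketch. It is not a clinical tool. The function names are hypothetical, and because the criteria give ranges for the ovarian reserve cutoffs (AFC below five to seven follicles, AMH below 0.5–1.1 ng/ml), the default thresholds below are assumptions that pick the lower end of each range.

```python
def had_previous_por(cycle_cancelled, oocytes_retrieved=None):
    """Criterion (ii): prior cycle cancellation or fewer than four oocytes retrieved."""
    few_oocytes = oocytes_retrieved is not None and oocytes_retrieved < 4
    return cycle_cancelled or few_oocytes


def meets_bologna_por(age_years, other_risk_factor, previous_por,
                      afc=None, amh_ng_ml=None,
                      afc_cutoff=5, amh_cutoff=0.5):
    """Return True when at least two of the three Bologna characteristics hold.

    afc_cutoff and amh_cutoff are assumed defaults; the criteria allow
    5-7 follicles and 0.5-1.1 ng/ml respectively.
    """
    # (i) maternal age >= 40 or any other risk factor for POR
    risk = age_years >= 40 or other_risk_factor

    # (iii) abnormal ovarian reserve test (low AFC or low AMH)
    low_afc = afc is not None and afc < afc_cutoff
    low_amh = amh_ng_ml is not None and amh_ng_ml < amh_cutoff
    abnormal_test = low_afc or low_amh

    return sum([risk, previous_por, abnormal_test]) >= 2
```

Writing the rule this way makes the source of clinic-to-clinic variation visible: a 32-year-old with a low AFC and no prior cycle meets only one characteristic, so she would not qualify as POR here even though some clinics might still label her as DOR.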

It was noted earlier that the US SART system has a generous definition of DOR and that an analysis of 2 years of SART data concluded that DOR is likely to be over-diagnosed through that reporting system [9]. Additionally, they reported that 69% of the IVF cycles in 2011 classified as DOR in SART did not meet the Bologna criteria for POR; however, the SART system does not collect AMH or AFC values, thus precluding further investigation into the diagnosis details.

Differing definitions and nomenclature #3: FOR vs. DOR

Functional ovarian reserve (FOR) is another related term that has been primarily used by one group of researchers. In fact, a PubMed search of “functional ovarian reserve” conducted on July 25, 2017 yielded 43 articles in English published since the year 2000, more than 75% of which were authored by that research group and published since 2011 [49]. Based on our review, the article that most clearly introduced and defined FOR as distinct from DOR was published in 2011 [47]: “Ovarian reserve (OR) is a widely used term that has largely remained undefined, and, to some degree, even misused. What is generally referred to as OR, really represents only small components of total ovarian reserve (TOR). A woman’s cumulative hypothetical pregnancy chance is mathematically reflected in her complete follicle pool, her TOR. … TOR mostly consists of NGFs [non-growing follicles] (largely primordial follicles) and to a lesser degree of maturing growing follicles after recruitment. But only the latter reflect the so-called functional OR (FOR), referred to in the literature, when the acronym OR is used. Concomitantly, when the acronym DOR is used, the meaning is to refer to diminished FOR.” Most recently (2016 and 2017 PubMed articles), the term FOR appears to be used interchangeably with ovarian reserve and, in articles from three different research groups, was measured by hormones or AFC of 2–10 mm diameter follicles [48,49,50]. Thus, it appears that FOR is a term reflecting the biological measure of ovarian reserve, as distinct from the diagnosis of DOR based on a low antral follicle count and/or hormonal measures of low ovarian reserve.

Conclusion

Infertility/subfertility is a common occurrence. The estimated worldwide prevalence of primary infertility is 1.9%, and the corresponding percentage for secondary infertility (inability to have a second child) is 10.5% [51]. Forty-one percent of women with infertility/subfertility seek assistance from fertility clinics [52], with more than 60,000 US women attempting an IVF cycle annually [53]. Worldwide, the use of ART in 2010 varied widely by country with a range of 8–4775 cycles/million population [54].

One of the common reasons for infertility in women is DOR. The true evaluation of the quality and quantity of oocytes in human females is not possible, and the methods by which clinicians can estimate ovarian reserve are evolving. Although only a single egg and a single sperm are theoretically needed to result in a child, in practical terms multiple mature oocytes and millions of sperm are often needed for a natural conception, so prediction of the future fertility of any given couple is not possible. When a couple presents in the clinic after unsuccessfully trying to conceive naturally, health care providers have a variety of assessment methods for determining the most likely cause(s) of the fertility delay. In the setting of reduced ovarian reserve for a woman, the provision of medical diagnoses and patient advice is challenging, as reviewed throughout this article.

As clinicians and researchers, we would like to see greater cohesion, clarity, and consistent use of ovarian reserve terminology. When the POI term was first introduced in 2007–2008, the impetus was patients’ psychological reaction to the term POF, presumably because of the permanent and negative connotation of the word “failure” [55]. But having an independent introduction of a new term to represent an existing diagnosis created confusion for practitioners, inconsistencies in the literature, and unpredictability for medical coding staff regarding official diagnoses that were otherwise titled in ICD medical coding manuals. We suggest in the future that when a group of practitioners or researchers want to propose new terminology for a medical concept or diagnosis, they introduce their idea through a recognized scientific society or organization such as ASRM, ESHRE, Canadian Fertility & Andrology Society, or Society for Reproductive Investigation. In that way, the transition into practice might be more seamless.

We recommend that DOR be added as a medical subject heading (MeSH) by the National Library of Medicine for searching the scientific literature. Within Index Medicus, there are MeSH terms for “ovarian reserve,” “ovarian diseases,” and “ovarian function tests,” but not for DOR. In contrast, “primary ovarian insufficiency(ies),” “premature menopause,” and “premature ovarian failure(s)” are MeSH terms. A quick review of the publication count through PubMed in Sept. 2017 found approximately 75 articles on DOR that were contained within the MeSH terms of ovarian reserve, ovarian diseases, or ovarian function tests, while another approximately 225 human articles on DOR would be missed. [The second set of publications was identified through this search strategy: “diminished ovarian reserve” NOT ((“ovarian reserve”[MeSH Terms]) OR (“ovarian disease”[MeSH Terms]) OR (“ovarian function tests”[MeSH Terms]))]. Thus, roughly 75% of the published articles on DOR among humans cannot be located with MeSH terms.
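
The search strategy and the resulting percentage can be reproduced with a short sketch. The helper names are hypothetical, the counts are the approximate figures reported above rather than fresh PubMed results, and the code only assembles the query string; it does not contact PubMed.

```python
def build_unindexed_query(phrase, mesh_terms):
    """Build a PubMed query for articles on a phrase that fall outside the given MeSH terms."""
    mesh_clauses = " OR ".join(f'("{term}"[MeSH Terms])' for term in mesh_terms)
    return f'"{phrase}" NOT ({mesh_clauses})'


def share_missed(indexed_count, missed_count):
    """Fraction of all identified articles that the MeSH terms fail to capture."""
    return missed_count / (indexed_count + missed_count)


# Same phrase and MeSH terms as the search strategy quoted in the text.
query = build_unindexed_query(
    "diminished ovarian reserve",
    ["ovarian reserve", "ovarian disease", "ovarian function tests"],
)

# Approximate counts from the Sept. 2017 review: 75 indexed, 225 missed.
missed_fraction = share_missed(75, 225)  # 225 / 300 = 0.75
```

With roughly 75 articles captured and 225 missed, the missed share is 225 of 300, which is the 75% figure stated above.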

The diagnosis of DOR is important beyond the patient’s immediate desire for pregnancy and the clinician’s need for evidence upon which to recommend intervention. Given the potential genetic connections underlying DOR [56,57,58], POI/POF [59,60,61], or low AMH [62] (with BRCA in particular [63,64,65,66,67]), clinicians and counselors will need to be knowledgeable about other health implications for the patient or her offspring. Research has found that women with DOR have low bone mineral density (adjusted OR = 22.3, 95% CI 2.0–255.2), increased bone turnover (p = 0.001), and disturbed sleep (adjusted OR = 20.4, 95% CI 2.9–144.1), after controlling for potential confounders [68]. This indicates that a diagnosis of DOR can have medical consequences that may require intervention after the reproductive goals have been addressed.

The use of ovarian tests for diagnoses is important, whether for patient counseling or for research purposes. The variety of terms related to reduced ovarian function adds to the confusion for patients, researchers, and medical teams. This manuscript has reviewed the nomenclature related to diminished ovarian reserve, and the related issues regarding ovarian reserve testing, in hopes of clarifying these concepts. We offer recommendations for greater medical community involvement in terminology decisions and improved classification options within the National Library of Medicine through a MeSH term specifically for diminished ovarian reserve.