Introduction

Premature ovarian insufficiency (POI) is defined as loss of ovarian function before 40 years of age. This challenging disorder affects approximately 1% of reproductive-age women [1]. The diagnostic criteria proposed by the European Society of Human Reproduction and Embryology (ESHRE) are age < 40 years, oligo/amenorrhea for at least 4 months, and FSH > 25 IU/L on two occasions [2]. Currently, the only therapeutic option for POI patients who wish to conceive is in vitro fertilization with donor eggs.

It is estimated that three out of four patients with POI still have a dormant pool of primordial follicles [3]. Recent research has offered insight into the molecular pathways that drive the development of primordial follicles into primary and secondary follicles, which respond to gonadotropins, thus offering a potential therapeutic target for POI [4, 5].

In 2013, Kawamura et al. published a promising technique called in vitro activation (IVA) [6]. This technique involves a two-step procedure: ovarian cortex cryopreservation, followed by fragmentation, activation, and autotransplantation. In short, a laparoscopic oophorectomy is performed, and the ovarian medulla is removed ex vivo, leaving only the ovarian cortex. The cortex is then cut into strips and vitrified. After the patient recovers from the first surgery, the ovarian cortex strips are warmed and further fragmented into 1–2 mm³ cubes. The cubes are then treated in vitro with a PTEN inhibitor and a PI3K activator for 24 h. Finally, they are transplanted back into the patient [6]. The authors suggested that disruption of the ovarian cortex could deactivate the Hippo pathway and that the Akt pathway could be stimulated in vitro. Clinical application of these concepts led to two live births.

Since then, several case series have reported the application of the original technique [7, 8]. Additionally, other authors have published a more straightforward drug-free IVA technique called ovarian fragmentation for follicular activation (OFFA) [9, 10]. In OFFA, after the ovarian cortex strips are dissected, they are cut into 1–2 mm³ cubes and immediately transplanted back into the patient.

Although the technique has shown promising results, it is still considered experimental [11]. There are many questions regarding OFFA and IVA, including which patients are appropriate candidates.

This systematic review and meta-analysis aims to identify the basal characteristics of patients who responded to OFFA or IVA and to determine whether a subset of patients with POI would benefit more than others. The current cumulative results of OFFA and IVA in patients with POI are also presented.

Materials and methods

Search strategy

The protocol for this study was registered in the International Prospective Register of Systematic Reviews (PROSPERO; CRD42023435664). Following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA), a systematic database search was performed in May 2023 (Fig. 1) [12]. We searched the PubMed, Scopus, Google Scholar, Cochrane Library, and ClinicalTrials.gov databases. Studies reporting OFFA or IVA performed in patients with POI were identified using the following key terms included in the title/abstract: “follicle,” “follicular,” and “Activation” or “IVA”; and “Premature ovarian failure,” “premature ovarian insufficiency,” or “POI.” All publication years were included. Results were exported to Rayyan [13]. The automatic duplicate finder was applied, and the remaining studies were screened. Additional studies were identified by reviewing the references of the included studies.

Fig. 1 PRISMA flow diagram

Study selection criteria

Studies that evaluated OFFA or IVA performed in patients with POI were included in the systematic review. Only studies using ESHRE criteria for POI were included. Because published data are limited, no restrictions were placed on the type of OFFA or IVA performed, etiology of POI, language, study design, length of follow-up, or year of publication. To be included in the meta-analysis, studies were required to compare basal characteristics of patients who developed antral follicles and those who did not. Studies that reported only one case were excluded. For duplicated cohorts, the latest report was included. The primary outcome was resumption of ovarian function, defined by any of the following: development of follicles on ultrasound with an increase in estradiol, FSH lower than 20 IU/L, or at least three consecutive menstrual cycles. The secondary outcomes were basal characteristics and the number of patients with oocyte retrieval, embryo transfer, and pregnancy, as well as pregnancy outcome.

Study selection and data extraction

Two authors (JAEB, MRD) independently reviewed titles, abstracts, and citations. After screening for studies that met the inclusion criteria, both authors evaluated the full reports for data extraction. Population characteristics, such as age, etiology of POI, baseline FSH, baseline AMH, and duration of amenorrhea or diagnosis, were extracted. The type of OFFA or IVA (traditional IVA, drug-free IVA, OFFA, only scratch) was also collected, as were the number of patients with follicles remaining in the biopsy, follicle growth, oocyte retrieval, embryo transfer, pregnancy, and pregnancy outcome. Any discrepancies were discussed.

Statistical analysis

The basal characteristics were analyzed using Review Manager ver 5.3. Heterogeneity was assessed using the Higgins I² statistic [14]. Analyses with I² over 40% were considered heterogeneous and were performed using random-effects models. Analyses with I² below 40% were not regarded as heterogeneous and were performed using a fixed-effect model [15]. Continuous data were analyzed using the inverse-variance method, with results expressed as mean difference and 95% confidence intervals (CI). Dichotomous data were analyzed using the Mantel-Haenszel method, and outcomes were reported as odds ratio (OR) and 95% CI. If not reported, mean and standard deviation were estimated from sample size, median, range, and/or interquartile range [16]. Results with associated p values < 0.05 were considered significant. Furthermore, each variable was analyzed in subgroups according to procedure (OFFA, traditional IVA, and cortex scratch).
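The model-selection rule described above can be sketched in a few lines. This is an illustration only, not the Review Manager implementation: the function name, the input format (one mean difference and standard error per study), and the use of the DerSimonian-Laird estimator for the random-effects model are assumptions, and the numbers in the usage example are hypothetical rather than taken from the included studies.

```python
import math

def pool_mean_difference(studies, i2_threshold=40.0):
    """Pool (mean_difference, standard_error) pairs by inverse variance.

    A fixed-effect model is used unless I^2 exceeds the threshold, in which
    case a DerSimonian-Laird random-effects model is used (an assumption).
    """
    weights = [1.0 / se ** 2 for _, se in studies]
    w_sum = sum(weights)
    fixed = sum(w * md for w, (md, _) in zip(weights, studies)) / w_sum

    # Cochran's Q and the Higgins I^2 statistic (as a percentage).
    q = sum(w * (md - fixed) ** 2 for w, (md, _) in zip(weights, studies))
    df = len(studies) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    if i2 > i2_threshold:
        # Between-study variance (tau^2), DerSimonian-Laird estimator.
        c = w_sum - sum(w ** 2 for w in weights) / w_sum
        tau2 = max(0.0, (q - df) / c)
        weights = [1.0 / (se ** 2 + tau2) for _, se in studies]
        w_sum = sum(weights)
        pooled = sum(w * md for w, (md, _) in zip(weights, studies)) / w_sum
        model = "random"
    else:
        pooled, model = fixed, "fixed"

    se_pooled = math.sqrt(1.0 / w_sum)
    ci = (pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled)
    return {"model": model, "i2": i2, "md": pooled, "ci": ci}

# Hypothetical inputs: two studies with very different mean differences.
example = pool_mean_difference([(0.0, 0.1), (2.0, 0.1)])
print(example["model"], round(example["i2"], 1), round(example["md"], 2))
```

With strongly discordant studies, as in the hypothetical example, I² exceeds the threshold and the sketch switches to the random-effects model, which widens the confidence interval around the pooled estimate.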

The risk of bias in individual studies was assessed using the Cochrane Risk Of Bias In Non-randomized Studies of Interventions (ROBINS-I) tool [17]. The overall certainty of the evidence was graded according to the Grading of Recommendations Assessment, Development, and Evaluation (GRADE) Working Group guidelines [18].

Results

Systematic review

As shown in Fig. 1, our search through databases and registers resulted in a total of 390 articles, 375 from databases (PubMed, Scopus, Google Scholar, Cochrane Library) and fifteen from registers (ClinicalTrials.gov). The records were exported to Rayyan, and ten were excluded by the automatic duplicate finder [13]. A total of 380 records were screened by title and abstract. Fourteen articles were sought for retrieval and closely reviewed. Four were excluded because they corresponded to duplicated cohorts, for which the most recent publications were included [6, 19,20,21]. One study reported only one case [22]. Two studies were still recruiting [23, 24]. Two studies used different POI diagnostic criteria [25, 26]. One study reported only three patients in the group without follicle growth [27]. One additional record was obtained through the study reference lists [8]. Thus, five studies were included in the review of cumulative results and in the meta-analysis [7,8,9,10, 28].
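The study flow described above can be checked with simple arithmetic. The following sketch only restates the counts given in the text; the variable names are illustrative.

```python
# Counts as reported in the text for the study selection flow.
identified = {"databases": 375, "registers": 15}
duplicates_removed = 10
sought_for_retrieval = 14
full_text_exclusions = {
    "duplicated cohort": 4,
    "single case report": 1,
    "still recruiting": 2,
    "different POI diagnostic criteria": 2,
    "only three patients without follicle growth": 1,
}
added_from_references = 1

total_identified = sum(identified.values())
screened = total_identified - duplicates_removed
included = (
    sought_for_retrieval
    - sum(full_text_exclusions.values())
    + added_from_references
)

print(total_identified, screened, included)
```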

The risk of bias in all studies was determined to be serious due to unmeasured confounders according to the ROBINS-I assessment (Fig. 2). The GRADE analysis indicated a low certainty of evidence (Table 1).

Fig. 2 ROBINS-I of individual studies

Table 1 GRADE assessment

Meta-analysis

Five studies, reporting 164 patients, were included to review the current cumulative results [7,8,9,10, 28]. Patients’ characteristics are shown in Table 2, and the type of procedure and complications can be found in Table 3.

Table 2 Patient characteristics
Table 3 Type of procedure and complications

Of the 164 patients analyzed, 26.22% showed follicle growth on ultrasound (43/164). Among patients with follicle growth, the pregnancy rate was 25.58% (11/43), and the live birth rate was 20.93% (9/43) (Table 4).

Table 4 Current cumulative results
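The cumulative rates follow directly from the reported counts. A minimal sketch of the calculation is given below; the helper name `percent` is illustrative, and values are rounded to two decimals.

```python
def percent(numerator, denominator):
    """Return a proportion as a percentage rounded to two decimals."""
    return round(100.0 * numerator / denominator, 2)

# Counts as reported in the cumulative results.
treated = 164
with_follicle_growth = 43
pregnancies = 11
live_births = 9

follicle_growth_rate = percent(with_follicle_growth, treated)
pregnancy_rate = percent(pregnancies, with_follicle_growth)
live_birth_rate = percent(live_births, with_follicle_growth)

print(follicle_growth_rate, pregnancy_rate, live_birth_rate)
```

Note that the pregnancy and live birth rates use patients with follicle growth as the denominator, not all treated patients.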

Basal characteristics of patients who showed follicle growth and those who did not were analyzed. The With Follicle Development group included 43 patients, and the Without Follicle Development group 121.

Age

Age was described in five studies with a total of 43 patients in the with follicle development group and 121 patients in the without follicle development group. The weighted average for age was 32.33 years in the with follicle development group and 31.67 years in the without follicle development group. A meta-analysis of these data revealed a mean difference of 0.95 [−0.51, 2.40], p = 0.20, showing similar ages in the included cohorts allocated to each group.

Regarding the subgroups of each of the procedures included (OFFA, traditional IVA, and cortex scratch), traditional IVA showed a significant difference, but not OFFA and cortex scratch. These findings are displayed in Fig. 3a.

Fig. 3 Forest plots. Age (a) and baseline FSH (b)

Baseline FSH

Baseline FSH was described in four studies, with a total of 34 patients in the with follicle development group and 93 patients in the without follicle development group. The weighted average for baseline FSH was 52.87 IU/L in the with follicle development group and 68.82 IU/L in the without follicle development group. A meta-analysis of these data revealed a mean difference of −16.41 IU/L [−30.77, −2.05], p = 0.03, suggesting a lower baseline FSH in the with follicle development group. None of the subgroups showed a statistically significant difference. These findings are displayed in Fig. 3b.

Duration of diagnosis or amenorrhea

The duration of amenorrhea or diagnosis was described in five studies, with a total of 43 patients in the with follicle development group and 121 patients in the without follicle development group. The weighted average for duration of diagnosis or amenorrhea was 4.39 years in the with follicle development group and 6.33 years in the without follicle development group. A meta-analysis of these data revealed a mean difference of −1.44 years [−2.63, −0.26], p = 0.02, suggesting a shorter amenorrhea or diagnosis duration in the with follicle development group. None of the subgroups showed a statistically significant difference. These findings are displayed in Fig. 4a.

Fig. 4 Forest plots. Duration of amenorrhea/diagnosis (a) and patients with follicles remaining in biopsy (b)

Patients with follicles remaining in biopsy

The number of patients with follicles remaining in biopsy was described in five studies, with 43 patients in the with follicle development group and 121 patients in the without follicle development group. The weighted average proportion of patients with follicles remaining in biopsy was 52.86% in the with follicle development group and 21.86% in the without follicle development group. A meta-analysis of these data revealed an odds ratio of 5.52 [2.14, 14.19], p = 0.0004, suggesting that residual follicles were more likely in patients in the with follicle development group. Among the subgroups, the OFFA subgroup did not show a significant difference, whereas the traditional IVA and cortex scratch subgroups did. These findings are displayed in Fig. 4b.
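For a dichotomous variable such as this one, the Mantel-Haenszel method named in the statistical analysis pools the per-study 2×2 tables into a single odds ratio. The sketch below shows the point estimate only (no confidence interval), and the two tables in the usage example are hypothetical, not data from the included studies.

```python
def mantel_haenszel_or(tables):
    """Pooled Mantel-Haenszel odds ratio.

    Each table is (a, b, c, d):
      a = with follicle development, follicles in biopsy
      b = with follicle development, no follicles in biopsy
      c = without follicle development, follicles in biopsy
      d = without follicle development, no follicles in biopsy
    """
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator

# Hypothetical 2x2 tables for two studies (illustration only).
hypothetical_tables = [(5, 5, 2, 8), (6, 4, 3, 7)]
pooled_or = mantel_haenszel_or(hypothetical_tables)
print(round(pooled_or, 2))
```

Because each study is weighted by its own table, the pooled odds ratio generally differs from the odds ratio computed from the two overall percentages.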

Discussion

POI does not indicate early menopause but a different condition altogether. This is illustrated by studies showing spontaneous resumption of ovarian function in patients with POI. Bachelot et al. reported a resumption of ovarian function in 23% of patients who were followed for up to 29 years [29]. A similar study in patients treated with ovarian stimulation, followed for 7 years, reported a follicle growth rate of 48.3% but only a 5.8% live birth rate [30]. These data demonstrate the potential of the remaining follicle pool and thus a therapeutic target in selected patients, as discussed below.

As mentioned before, of the 164 patients treated, 43 (26.22%) showed follicle growth on ultrasound. Whereas the cohorts discussed above were followed for up to 29 years and 7 years, respectively, patients in the included studies were followed for only 6 to 12 months [7,8,9,10, 28]. Because patients with POI may have intermittent ovarian function, follicular growth is more likely to be observed the longer patients are followed.

After the procedure, each study used a different ovarian stimulation (OS) protocol. However, most studies initiated OS when patients showed a resumption of ovarian function, indicated either by an FSH value lower than 20 IU/L or by follicle development on ultrasound with increased estradiol [7, 8, 28]. Two studies initiated OS 2 days after surgery, followed by a second OS 2 months later, and then patients were followed monthly with serum estradiol and ultrasound [9, 10]. The percentage of patients with oocytes retrieved ranged from 27.77 to 71.42%, with a weighted average of 54.54%. The reason for the difference between protocols is currently unknown.

During the follow-up, patients with follicle growth showed one to five waves of follicular development. These patients’ pregnancy rate was 25.58% (11/43), and the live birth rate was 20.93% (9/43). All pregnancies were achieved by OS.

Our meta-analysis showed that age was not associated with follicle growth; this was expected as the population was homogeneous. However, lower baseline FSH, shorter duration of amenorrhea or diagnosis, and follicles remaining in biopsy were significantly associated with follicle growth. The weighted averages of baseline FSH and duration of amenorrhea or diagnosis in patients who responded were 52.87 IU/L and 4.39 years, respectively. In contrast, patients who did not respond had weighted averages of 68.82 IU/L and 6.07 years, respectively. Regarding the inconsistencies between the overall effect and the subgroup effects, we hypothesize that the low number of studies per subgroup reduced statistical power, so that the subgroup comparisons did not reach statistical significance.

These findings highlight a subset of patients who have severely decreased ovarian function, as they were diagnosed with POI, but who retain enough dormant follicles to respond to OFFA or IVA. These findings may be explained by the current understanding of these procedures. Lower baseline FSH, shorter duration of amenorrhea or diagnosis, and presence of follicles in biopsy are consistent with a larger reserve of primordial follicles, which may be activated with the IVA protocol, and a higher number of secondary follicles, which may be rescued by inhibition of Hippo signaling through fragmentation of the ovarian cortex in OFFA [5].

Baseline FSH and duration of amenorrhea or diagnosis can be used without delay as possible clinical criteria to identify patients who are likely to respond to OFFA or IVA. Nevertheless, FSH has the disadvantage of varying with each cycle. Although the presence of follicles in biopsy was the strongest predictor, it has the disadvantages of requiring a prior biopsy and of depending on the sampled area, which may not contain follicles. Therefore, with the information available, a shorter duration of amenorrhea or diagnosis, especially less than 6 years, may be the most important factor for predicting a follicular response.

Antral follicle count (AFC) and anti-Mullerian hormone (AMH) are normally used as ovarian reserve markers. However, they are of no utility in patients with POI. First, they are not considered diagnostic in POI. Second, most patients do not present antral follicles on ultrasound, making AFC impossible. Third, a great proportion of patients had undetectable AMH levels and when AMH was detectable, it was not associated with follicle development.

This therapeutic option is still considered experimental, and to be used in clinical practice, the benefit must outweigh any risk. As with any other medical procedure, there are risks associated with laparoscopic surgery, such as bleeding, infection, and trauma to adjacent organs, to name a few [31,32,33]. However, based on the cumulative results, none of the trials reported any short-term or long-term complications (Table 3).

This systematic review and meta-analysis included patients subjected to either OFFA or IVA. IVA is a more complete procedure in which, in addition to inhibiting Hippo signaling by ovarian cortex fragmentation, the follicles are activated in vitro using a PTEN inhibitor and other Akt activators. However, it has the disadvantage of requiring two surgeries. Moreover, in a recent systematic review and meta-analysis, Wang et al. concluded that the OFFA approach may be more efficient [34]. Future studies are needed to determine which protocol yields better results.

Strengths of our study include adherence to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA). This is the first study to perform a systematic review and meta-analysis comparing basal characteristics of patients treated with OFFA or IVA. This procedure has been described in patients with POI and in poor responders or patients with low ovarian reserve; however, we only included patients who met ESHRE criteria for POI. Having a more homogeneous population gives strength to our findings. In addition, we have shown the cumulative results of this procedure.

As with any meta-analysis, this article is inherently limited by the quality and quantity of the available data. No randomized trials were found in our systematic review. All studies were single-arm studies, which are at risk of serious bias. Thus, the certainty of evidence using the GRADE assessment was graded as low. Although statistically significant differences were found between the basal characteristics of patients who responded to OFFA/IVA and those who did not, it is not currently possible to determine a threshold for baseline FSH or duration of amenorrhea as clinical criteria.

In conclusion, patients with lower baseline FSH, shorter duration of amenorrhea or diagnosis, and follicles remaining in biopsy may be more likely to show follicle growth after OFFA or IVA. Considering that approximately 20% of patients with follicle growth had a live birth, these results are promising. Given the low overall certainty of evidence, future studies are needed to confirm these results.