Introduction

The universal adoption of single embryo transfer (SET) continues to be an elusive goal in human IVF, with implantation rates continuing to convince many to perform multiple embryo transfers to achieve acceptable live birth (LB) rates. Multiple gestation morbidity, therefore, continues to be an iatrogenic complication associated with IVF. This practice has continued notwithstanding the significant advances made in the assisted reproductive technologies and methodologies used (e.g., in vitro culture systems, cryopreservation, preimplantation genetic testing), which have seen implantation rates in both fresh and frozen embryo transfers (FET) improve significantly. Multiple embryo transfers continue to be performed mainly because of the poor ability of embryo selection methods to select embryos with implantation potential. Conventional morphology-based methods have endured as the most widely used embryo selection methods in IVF, notwithstanding their only low-to-moderate predictive value [1, 2].

Recent advances in analytical technologies have seen a number of different embryo selection methodologies identified as potential alternatives, each using a different marker of embryo competence (i.e., genomics, transcriptomics, proteomics, metabolomics, and time-lapse imaging) to select embryos. Since the very earliest reports of its promise [3], much of the research has focused on developing preimplantation genetic testing (PGT) as an adjunct selection method, with the suitability of PGT for aneuploidy (PGT-A) based on the reported associations between embryo aneuploidy and female fecundity and age [4, 5]. Opinions on the suitability of PGT-A as an embryo selection method, however, have varied widely, with issues related to reproductive competence protection and predictive accuracy the most controversial [6, 7]. PGT has also evolved over time, from the initial use of limited-chromosome analyzing technologies (fluorescence in situ hybridization (FISH)) to the current use of 24-chromosome analyzing technologies (comprehensive chromosome testing (CCT)). The use of these different technologies has been found to be associated with harm [8], decreased efficacy [9], and increased efficacy [10, 11] in IVF.

Now, 20 years after PGT was first introduced in IVF, IVF with PGT-A has still not become integral to routine IVF practice and, therefore, has not fulfilled its promise of increasing the use of elective SET by maximizing the probability of embryo implantation through the elimination of embryos with low-to-no implantation potential. This is despite the significant improvements in the diagnostic sensitivity and specificity of CCT technologies and despite the experience gained by laboratories in performing in vitro blastocyst culture, blastocyst vitrification, and blastocyst trophectoderm (TE) biopsy [12, 13]. The lack of high-quality evidence may be one reason why PGT-A has not become integral to routine IVF practice, with evidence of increased efficacy based mainly on three randomized controlled trials (RCT) in which now outdated technologies and methodologies were used [14,15,16]. The objective of the present study, therefore, was to reassess the efficacy of IVF with PGT-A using contemporary technologies and methodologies (i.e., blastocyst culture, freeze-all, TE biopsy, next-generation sequencing (NGS), SET, and FET) in the treatment of young infertile patients (≤ 35 years). In an RCT, selected patient-couples were to be randomized either to an arm in which single best-scoring euploid blastocysts were transferred or to an arm in which single best-scoring unknown-ploidy blastocysts were transferred in the first FET cycle following blastocyst freeze-all.

Materials and methods

The present RCT was conducted at a single IVF center, at which blastocyst in vitro development has been the preferred method of embryo selection since 2012 and blastocyst-freeze-all (> 90% of patients) the preferred IVF treatment since 2014. The decision to perform IVF with blastocyst freeze-all was based on the experience that the reproductive outcomes of FET were superior to those of fresh ET [17, 18]. Infertile patient-couples eligible to participate were informed of the RCT in consultation with treating clinicians, with those wishing to participate completing a modified IVF consent. Ethics approval for conducting the RCT was obtained from the Akdeniz University, Faculty of Medicine, Clinical Research Ethics Committee (ref. number 2015/399). The RCT was registered and posted on www.ClinicalTrials.gov (NCT03095053) on the 29 March 2017.

Patients

Patient-couples presenting with the following female characteristics were eligible to participate in the trial: age ≤ 35 years, antral follicle count (AFC) ≥ 5 (changed from ≥ 10), body mass index (BMI) ≥ 18 to ≤ 35 kg/m², and no intrauterine and/or endometrial abnormalities. All participating patient-couples underwent autologous ICSI blastocyst freeze-all cycles, with patient-couples commencing treatment on 29 March 2017. Patient-couples eligible for randomization had to have at least two blastocysts with morphological scores of ≥ 2BB on day 5 of in vitro embryo development. Patient-couples who satisfied the inclusion criteria were randomized 1:1 by computer-generated number to the PGT-A or morphology group arm (Fig. 1). The FET drug prescription, which indicated the arm to which they were randomized, was provided to participants on confirmation of their inclusion. Clinicians (n = 3) performing the treatment procedures were unaware of the arm to which patient-couples were randomized. The reproductive outcomes from only the first FET following patients’ blastocyst-freeze-all cycles were analyzed. All patient-couples were only eligible for SET, according to Turkish law (date of official gazette 30 September 2014, official gazette number 29135), without special compensation.
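The 1:1 allocation by computer-generated number described above can be illustrated with a minimal sketch. The trial's actual randomization software and scheme are not reported here, so the simple (unrestricted) randomization, the seed, the function name `randomize`, and the couple identifiers below are all hypothetical choices made for illustration only.

```python
import random

ARMS = ("PGT-A", "morphology")

def randomize(couple_ids, seed=2017):
    """Assign each eligible couple to one of two arms with equal probability.

    Simple (unrestricted) 1:1 randomization: arm sizes are equal only on
    average, so the two arms can differ slightly in size.
    The seed and identifiers are illustrative assumptions.
    """
    rng = random.Random(seed)  # seeded so the allocation list is reproducible
    return {cid: rng.choice(ARMS) for cid in couple_ids}

# Hypothetical identifiers for 220 eligible patient-couples.
allocation = randomize([f"couple-{i:03d}" for i in range(1, 221)])
sizes = {arm: sum(1 for a in allocation.values() if a == arm) for arm in ARMS}
print(sizes)
```

Unrestricted randomization of this kind does not force equal arm sizes, which is compatible with, but not proof of, the slightly unequal arms reported for the trial; block randomization would be the usual alternative when exact balance is required.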

Fig. 1

Patient randomized controlled trial flowchart. BMI, body mass index; AFC, antral follicle count; PGT-A, preimplantation genetic testing for aneuploidy. ¹The blastocyst with herniating trophectoderm cells and the highest morphological score on day 5 of embryo development

Controlled ovarian stimulation, oocyte pickup, and embryo culture

All patients underwent flexible start gonadotropin-releasing hormone (GnRH) antagonist (0.25 mg; Cetrotide, Merck Serono, Istanbul, Turkey) co-treatment protocol ovarian stimulations (OS), using combinations of recombinant follicle-stimulating hormone (150–375 IU rFSH, Gonal-F, Merck Serono, Istanbul, Turkey) and human menopausal gonadotropin (75 IU hMG, Menopur, Ferring Pharmaceuticals, Mumbai, India). Starting doses of gonadotropin were based on female age, BMI, AFC, and IVF history. All patients underwent final oocyte maturation triggered with GnRH agonist (0.2 mg, Gonapeptyl®, Ferring Pharmaceuticals, India), human chorionic gonadotropin (250μg/0.05 ml hCG, Ovidrel, Merck Serono, Turkey), or a combination of the two when three or more follicles reached ≥ 17 mm.

All patients underwent transvaginal ultrasound (TVS) guided oocyte retrievals 36 h after trigger (follicular aspiration needle, 461230LF, Rheinbach, Germany). Oocyte collections and manipulations were performed using Cook Medical media (Sydney IVF, Brisbane, Australia) and in vitro embryo cultures were performed using SAGE 1-Step medium (67010010A, SAGE, Origio, Malov, Denmark), with no change of media after fertilization check. Incubation conditions were set at 6% CO2, 5% O2, and 37.0 °C (G185 Long Term Flat Bed Incubators, K-Systems, Kivex Biotec ltd, Birkerod, Denmark). All oocyte inseminations were performed using ICSI and embryo assessments performed daily.

Blastocyst scoring and cryopreservation

All blastocysts were scored according to the three-part Gardner scoring system [19], with a blastocyst score including a morphological assessment of blastocyst expansion, inner cell mass (ICM), and TE (i.e., ≥ 2BB = expansion:ICM:TE). For all patient-couples, the blastocysts that developed were ranked according to their morphological score, with only the highest-ranking blastocysts (i.e., best-scoring) selected for use in the RCT. In the PGT-A group, PGT-A prediction of the selected blastocysts resulted in two subgroups, the euploid and the unknown ploidy subgroup. In the euploid subgroup, euploid predicted blastocysts were transferred, and in the unknown-ploidy subgroup, next best-scoring unknown-ploidy blastocysts were transferred. In the morphology group, the best-scoring (unknown-ploidy) blastocysts were transferred. All blastocyst vitrifications were performed on day 5 of in vitro culture, using ultra-rapid technologies (Cryotop, Kitazato BioPharma Co., Ltd., Fuji City, Japan). Blastocysts were warmed and transferred on the same day, with blastocysts transferred 2 h after warming.

Artificial cycle frozen embryo transfers

All warmed blastocysts were transferred in artificial FET cycles according to a previously published protocol [20]. On the 14th day of estrogen (Estrofem, Novo Nordisk, Istanbul, Turkey) administration, the endometrial thickness and serum progesterone of patients were measured. All patients included in the trial had endometrial thicknesses of > 7 mm and serum progesterone levels of ≤ 2 ng/mL. Progesterone administration (90 mg, twice a day, Crinone® 8%, Merck Serono, Turkey) was started on the 15th day of estrogen administration, and transfers were performed on the 6th day of progesterone administration. All blastocyst transfers were performed with trans-abdominal ultrasound guidance.

Biopsy and comprehensive chromosome testing

All TE biopsies were performed on day 5 of in vitro embryo development, with prior laser zona pellucida opening (ZILOS-tk, Hamilton Thorne Inc., Beverly, MA, USA). In the PGT-A group, best-scoring blastocysts according to morphological-score rank underwent TE biopsy, with biopsied cells processed and transported on the day of the biopsy to an independent genetics laboratory and analyzed for PGT-A prediction within a week of the TE biopsy being performed.

NGS

The whole-genome amplification (WGA) procedure was performed using the SurePlex DNA Amplification System (Illumina, San Diego, CA, USA), according to the manufacturer’s recommendations. The procedure involves sequentially performing cell preparation and cell lysis, followed by DNA extraction, preamplification, and amplification procedures on biopsy and control samples. The amplified samples were then processed with the VeriSeq PGT Kit (Illumina, San Diego, CA, USA), involving NGS on a MiSeq, again according to the manufacturer’s recommendations. Prediction of ploidy was performed using BlueFuse Multi software algorithms (Illumina, San Diego, CA, USA), with all blastocysts predicted to be aneuploid or mosaic excluded from transfer.

Outcomes and statistics

The primary outcome measure was LB (changed from ongoing pregnancy). The primary outcome measure registered in www.clinicaltrials.gov was changed to LB after the last patient randomized had undergone FET, because LB outcome was reported to be of greater evidential value in RCT [21]. A LB was defined as a pregnancy cycle with a live infant delivered at > 20 weeks of gestation. A clinical pregnancy was defined as a pregnancy cycle with a fetal sac observed on ultrasound at > 5 weeks of gestation. A miscarriage was defined as a pregnancy cycle in which a clinical pregnancy was lost at < 20 weeks of gestation. All patient variable and treatment outcome data were collected from the IVF database of Antalya IVF, with data collection started > 20 weeks after the last patient randomized had undergone FET. Data analysis of the patient and treatment data collected was commenced on the 25 April 2018.

Statistical Package for Social Science 11.5 (SPSS version 11.5) was used for the statistical analysis of patient data (i.e., p values, odds ratios, and 95% confidence intervals). Continuous data were analyzed with either the Student’s t test or the Mann-Whitney U test, depending on data normality (Shapiro-Wilk). Categorical data were analyzed using either the chi-squared test or Fisher’s exact test, depending on sample size. A multiple logistic regression, with LB as the dependent variable and female age, infertility duration, AFC, blastocyst rate, and ploidy (reference variable) as independent variables, was performed to adjust for any possible differences. The sample size for the RCT was estimated at 80% power and an alpha significance of 0.05 (type I error rate) for an expected absolute increase in LB of 20% from a reference rate of 55% to be 89 patients in each arm; to accommodate for aneuploidy prediction, the aim was to randomize approximately 110 patients to each arm. The absolute difference was based on the difference in favor of IVF with PGT-A reported in the three previously published RCT [14,15,16] and the expected aneuploidy rate (≈20%) of the patient group [22].
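The sample size statement above can be checked with a short calculation. The formula used in the trial is not stated, so the sketch below assumes the standard pooled-variance normal approximation for a two-sided comparison of two independent proportions; the function name `n_per_arm` is illustrative. Under that assumption, the inputs given in the text (55% reference LB rate, 75% expected rate, alpha 0.05, 80% power) reproduce 89 patients per arm.

```python
from math import ceil, sqrt
from statistics import NormalDist

def n_per_arm(p_ref, p_exp, alpha=0.05, power=0.80):
    """Patients per arm for a two-sided test of two independent proportions.

    Assumed formula (pooled-variance normal approximation):
      n = [z_(1-alpha/2) * sqrt(2*p_bar*(1-p_bar))
           + z_power * sqrt(p_ref*(1-p_ref) + p_exp*(1-p_exp))]^2 / (p_exp - p_ref)^2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p_ref + p_exp) / 2
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p_ref * (1 - p_ref) + p_exp * (1 - p_exp))
    ) ** 2
    return ceil(numerator / (p_exp - p_ref) ** 2)

n = n_per_arm(0.55, 0.75)  # 55% reference LB rate, 20% absolute increase
print(n)  # 89
```

Dividing 89 by 0.8, to allow for roughly 20% of tested blastocysts being predicted aneuploid, gives about 111 patients per arm, in line with the stated aim of randomizing approximately 110 patients to each arm.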

Results

Three hundred and two patient-couples consenting to participate in the RCT underwent autologous ICSI blastocyst freeze-all cycles and were followed up for inclusion. Of the 302 patient-couples who underwent treatment, 82 were excluded (Fig. 1). Two hundred and twenty patient-couples satisfied the inclusion criteria (i.e., female age ≤ 35 years, two day-5 blastocysts scoring ≥ 2BB) and were randomized: 109 patient-couples to the PGT-A arm and 111 to the morphology group arm. The patient and in vitro embryo development variables of the patient-couples included in the RCT are presented in Table 1, with no significant differences observed between any of the patient and treatment variables.

Table 1 Patient characteristics and in vitro culture outcomes

All 255 blastocysts (n = 109, PGT-A group; n = 35, unknown-ploidy subgroup; and n = 111, morphology group) which underwent warming for transfer survived the vitrification-warming process. The 109 blastocysts that underwent TE biopsy and PGT-A were all diagnosed, with 80 (73.4%) blastocysts diagnosed as euploid, 25 as aneuploid, and 4 as mosaic. The patient-couples randomized to the PGT-A group were sub-grouped according to the PGT-A ploidy prediction: 80 to the euploid subgroup and 29 to the unknown-ploidy subgroup. The excluded blastocysts had the following chromosomal errors: nine monosomies (i.e., on chromosomes 9, 8, 18, 20, 21, and 22), ten trisomies (i.e., on chromosomes 1, 9, 13, 14, 15, 16, 20, and 22), one tetrasomy, seven complex aneuploidies, and two 45X karyotypes. In the unknown-ploidy subgroup, 23 patient-couples underwent single blastocyst FET with the next best-scoring unknown-ploidy blastocysts selected for transfer and 6 patient-couples were excluded from the analysis because in consultation they requested the transfer of two blastocysts, which violated the inclusion criteria (Fig. 1). In the euploid subgroup, single best-scoring euploid blastocysts and in the morphology group, single best-scoring unknown-ploidy blastocysts were transferred in the first FET cycles following the freeze-all-IVF cycle.

The outcomes of the first FET cycles performed are presented in Table 2, with the euploid subgroup and morphology group compared statistically and the unknown-ploidy subgroup presented for observational purposes only. The endometrial thickness was statistically different (p = 0.04) between the two groups; however, the difference may not have impacted outcomes as the thicknesses (9.88 ± 1.97 vs 9.29 ± 1.98) reported have previously been reported to have LB rates that were not statistically different [23]. The LB rate of the euploid subgroup was found not to be statistically different to that of the morphology group (56.3% vs 58.6%), with relative odds for LB of 0.91 (95% CI 0.51–1.63, p = 0.750). In a multiple logistic regression, euploid blastocyst transfer was also found not to be a statistically significant predictor of LB when adjusting for the variables of female age, infertility duration, AFC, blastocyst quality, and blastocyst rate (OR = 0.91, 95% CI 0.50–1.66, p = 0.760).
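The unadjusted odds ratio above can be reproduced with a short calculation. The event counts are not given in the text; the counts used below (45 LB among 80 euploid transfers and 65 LB among 111 morphology-group transfers) are back-calculated assumptions from the reported percentages (56.3% and 58.6%) and group sizes, and the Woolf (log-odds) confidence interval is an assumed method, since the interval method is not stated.

```python
from math import exp, log, sqrt
from statistics import NormalDist

def odds_ratio_ci(events_a, n_a, events_b, n_b, level=0.95):
    """Odds ratio of group A versus group B with a Woolf (log-odds) interval."""
    a, b = events_a, n_a - events_a  # group A: events, non-events
    c, d = events_b, n_b - events_b  # group B: events, non-events
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Assumed counts, back-calculated from the reported rates (not taken from Table 2).
or_lb, ci_low, ci_high = odds_ratio_ci(45, 80, 65, 111)
print(f"OR = {or_lb:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```

With these assumed counts the result rounds to 0.91 (0.51 to 1.63), matching the reported values; an interval that spans 1 is what the non-significant p value reflects.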

Table 2 Reproductive outcomes of single blastocyst frozen transfers

Discussion

The single-minded commitment of IVF practitioners to improve pregnancy outcomes may motivate many to introduce new technologies without high-quality supporting evidence [6, 7, 24]. Although a number of retrospective and observational studies have investigated IVF with PGT-A, the evidence of increased efficacy has been based mainly on the outcomes of three RCT [14,15,16]. In a recent opinion paper, the quality of evidence from the three RCT was judged to be low, with all three studies reported to have major methodological and technical limitations [13]. In the present RCT, using contemporary technologies and methodologies, PGT-A blastocyst selection was found not to enhance the LB rate significantly in a patient group with a similar prognosis (i.e., female age ≤ 35 years, two ≥ 2BB blastocysts). This inability to enhance LB was confirmed in a multiple logistic regression analysis adjusting for female age, infertility duration, AFC, blastocyst quality, and blastocyst rate (OR = 0.91, 95% CI 0.50–1.66, p = 0.760). In the RCT, all transfers were performed in FET cycles, avoiding confounding associated with OS [25], and all blastocysts transferred had morphological scores of ≥ 2BB, avoiding confounding associated with the TE biopsy and vitrifying-warming of lesser quality blastocysts [26, 27]. Of interest, although the difference was not significant, the pregnancy outcomes indicated that the miscarriage rate of euploid blastocyst transfers may be lower than that of unknown-ploidy blastocyst transfers. In the present RCT, PGT-A showed that the best-scoring blastocysts of the selected infertile patient group (≤ 35 years) had a 73.4% chance of being euploid.

While the clinical assumptions regarding the reproductive benefits of IVF with PGT-A are compelling, euploid blastocyst transfers do not guarantee LB despite the use of CCT [28,29,30]. IVF with PGT-A may fail as the result of both misdiagnosis and iatrogenesis. Misdiagnosis may occur as the result of the technology limitations of CCT (i.e., sample analysis errors), TE biopsy sampling (i.e., errors related to the number of cells sampled and the location of cells sampled), and embryonic mitotic mosaicism (i.e., errors related to the location of cells sampled) [31, 32]. Adverse iatrogenesis may also result from TE biopsy damage, with the number of cells biopsied and the disruption caused to the TE potentially affecting the implantation potential of blastocysts [27, 33, 34]. NGS as used in the present RCT is regarded as one of the most accurate technologies in terms of sensitivity and specificity, with increased but not absolute sensitivity to detect mosaicism. Mosaicism, therefore, remains a risk for potential misdiagnosis, albeit a low risk [30, 35,36,37,38,39]. Moreover, even though 5–10 TE cells are routinely biopsied from blastocysts, there are those of the opinion that this number may not be enough to accurately predict blastocyst ploidy [7, 40, 41]. Furthermore, a euploid diagnosis based on a TE biopsy assumes absolute concordance between the TE and ICM [40]. In the present RCT, the euploidy rate (73.4%) was higher than what would normally be expected, possibly because only the best-scoring blastocysts underwent PGT-A.

Morphology-based scoring systems continue to be the most widely used embryo selection methods in IVF, notwithstanding the only low-to-moderate associations found between blastocyst morphology scores and LB, blastocyst euploidy, and euploid blastocyst implantation [2, 42, 43]. In a retrospective study, in which the implantations of 215 euploid blastocysts were investigated, it was found that euploid blastocyst implantation was similar irrespective of morphological score [43]. While all three blastocyst morphology parameters (i.e., blastocyst expansion, TE morphology, and ICM morphology) affect LB, only TE morphology was a significant predictor of LB after adjusting for significant confounders [26]. This significant contribution of TE to LB outcomes may have important implications for IVF with PGT-A, as TE biopsy is invasive and may disrupt the integrity of the TE. In the present RCT, all patients included in the randomization had at least two blastocysts with a score of ≥ 2BB (i.e., > fair blastocysts). These inclusion criteria were chosen to limit confounding associated with the lower potential of early and/or poor-quality blastocysts to survive vitrifying-warming with full competence [26] and to avoid TE biopsy compromising the implantation potential of lesser quality blastocysts [27].

Two systematic reviews have been published [10, 11], in which the clinical effectivity of IVF with PGT-A and indirectly the potential of PGT-A as an adjunct embryo selection method was analyzed. In both reviews the same three RCT formed the core of the analysis [14,15,16], with both reviews concluding that IVF with PGT-A enhanced implantation in good prognosis patients, as compared to standard-IVF with morphology-based blastocyst selection. The quality of the evidence from the three RCT has since been questioned because all three were subject to certain methodological and technological limitations [13]. On the contrary, however, a reanalysis of the data reported to the Centers for Disease Control and Prevention in the USA found that LB rates per fresh ET were significantly better in non-PGT than in PGT cycles (46.2% vs 39.3%) [9]. More recently, two multi-center RCT were published, in which IVF with PGT-A was investigated in patients with advanced maternal age (38–41 years, day 3 embryo biopsy, aCGH, and fresh and frozen blastocyst ET) [44] and across a range of maternal ages (25–40 years, TE biopsy, NGS, and blastocyst FET) [45]. In the first RCT, patients who underwent PGT-A had a higher delivery rate (52.9% vs 24.2%), lower miscarriage rate, and reduced time to pregnancy. In the second RCT, in the overall analysis no difference in ongoing pregnancy rates was found between the PGT-A and the control arm (49.6% vs 45.9%); whereas in an analysis of older female patients (35–40 years), the ongoing pregnancy rate was significantly higher in the PGT-A arm (50.8% vs 37.2%). In the present RCT, using NGS, FET, and best-scoring blastocysts (≥ 2BB) in young patients (≤ 35 years), no significant difference was found between the LB rates of the euploid subgroup and the morphology group (56.3% vs 58.6%). Even though patients in the unknown-ploidy subgroup had lower scoring unknown-ploidy blastocysts transferred, their LB rate was also not too dissimilar (47.8% vs 56.3%). The LB outcomes obtained under the conditions of the present RCT contradicted the reproductive outcomes obtained in the three previously published single-center RCT investigating similar fertility prognosis patients [14,15,16].

In the present RCT, the expectation was that LB rates would be enhanced in the euploid subgroup, because of the accuracy of NGS [37, 38], the balanced randomization, and because only the best-scoring blastocysts were transferred (euploidy × ≥ 2BB implantation potential ≫ unknown-ploidy × ≥ 2BB implantation potential; 100% × 60% ≫ 73.4% × 60%). Moreover, an outcome of increased effectivity in terms of LB was expected based upon the evidence that blastocyst ploidy rather than morphological score determines implantation [43]. It could be suggested that the failure to achieve a higher LB rate in the euploid subgroup was the result of poor technical proficiency in the laboratory. However, the clinical pregnancy rate was similar (61.3% vs 63.8%) to that previously reported for PGT-A using NGS [35] and the aneuploidy rate (26.6%; 29/109) was as expected for the selected patient group [22]. An alternate explanation could be that the technical and biological errors related to TE biopsy outweigh the technical errors related to morphology-based blastocyst scoring. If the LB rate of the morphology group was calculated according to the number of potentially euploid blastocysts transferred, the LB rate of the morphology group would be 93.8% (76/81). Moreover, the absence of a difference between the LB rates of the euploid subgroup and the morphology group may suggest that PGT-A results in a 37.5% (theoretical 93.8% minus actual 56.3%) reduction in the LB rate. This corresponds to a theoretical 20–40% embryo loss rate calculated by Paulson, with the rate dependent on the implantation and aneuploidy rates chosen [46]. The outcomes of these theoretical calculations suggest that the overall effectivity of TE biopsy requires further critical investigation.

In all likelihood, increasing numbers of IVF centers worldwide will invest in PGT-A, with PGT-A increasingly becoming part of multidimensional embryo selection strategies at IVF centers. This strategy is designed to maximize the probability of embryo implantation by eliminating embryos that will result in implantation failure. For this reason, it is critical for those wishing to invest to know which variables (i.e., technologies, methodologies, and expertise) are most critical to ensuring an effective PGT-A program. In comparison to the morphology-based embryo selection method, PGT-A selection will always remain an expensive and invasive method, with its true cost-effectivity uncertain because of the uncertain effect of TE biopsy [13, 46]. While the skills and experience of the operators performing TE biopsies at the laboratory involved in the RCT could be regarded as a major limitation of the present RCT and a reason for the unexpectedly reduced outcome, the evidence from the present RCT rather suggests that TE biopsy may be a major weakness in PGT-A blastocyst selection. TE biopsy is inherently an imperfect technique, with technical and biological errors potentially occurring as the result of the cell sampling procedure, and with mitotic mosaicism increasing the potential for sampling errors. These errors may result in implantation failure, through blastocyst misdiagnosis and damage to the developmental competence of blastocysts. In conclusion, in young (≤ 35 years) infertile patients with at least two ≥ 2BB blastocysts, PGT-A blastocyst selection does not result in an enhanced LB rate, with the evidence suggesting that the effectivity of PGT-A may be limited by the effectivity of TE biopsy.