Abstract
Purpose
Our objective was to design an automated deep learning model that extracts the morphokinetic events of embryos that were recorded by time-lapse incubators. Using automated annotation, we set out to characterize the temporal heterogeneity of preimplantation development across a large number of embryos.
Methods
To perform a retrospective study, we used a dataset of video files of 67,707 embryos from four IVF clinics. A convolutional neural network (CNN) model was trained to assess the developmental states that appear in single frames from 20,253 manually-annotated embryos. Probability-weighted superposition of multiple predicted states was permitted, thus accounting for visual uncertainties. Superimposed embryo states were collapsed onto discrete series of morphokinetic events via monotonic regression of whole-embryo profiles. Unsupervised K-means clustering was applied to define subpopulations of embryos of distinctive morphokinetic profiles.
Results
We perform automated assessment of single-frame embryo states with 97% accuracy and demonstrate whole-embryo morphokinetic annotation with R-square 0.994. High quality embryos that had been valid candidates for transfer were clustered into nine subpopulations, as characterized by distinctive developmental dynamics. Retrospective comparative analysis of transfer versus implantation rates reveals differences between embryo clusters as marked by poor synchronization of the third mitotic cell-cleavage cycle.
Conclusions
By demonstrating fully automated, accurate, and standardized morphokinetic annotation of time-lapse embryo recordings from IVF clinics, we provide a practical means to overcome the limitations that currently hinder the implementation of morphokinetic decision-support tools in clinical IVF settings, namely inter-observer and intra-observer variations in manual annotation and workload constraints. Furthermore, our work provides a platform to address embryo heterogeneity using dimensionality-reduced morphokinetic descriptions of preimplantation development.
Introduction
Owing to the inherent biological heterogeneity in the developmental potential of embryos, decreasing the risks that are associated with multiple pregnancy while shortening time to pregnancy relies on transferring the embryo(s) of the highest developmental quality. To address this important need, various assisted reproductive technologies (ARTs) have been developed [1]. Embryos that are generated via in vitro fertilization (IVF) and harbor chromosomal abnormalities that are associated with a decreased potential to implant and generate a live birth can be screened via preimplantation genetic testing for aneuploidy (PGT-A) and for structural chromosomal rearrangements (PGT-SR) [2]. However, PGT is invasive and requires obtaining cellular biopsies from the embryos. In addition, false negative assessments may arise due to insufficient genetic amplification and chromosomal mosaicism [3, 4].
Complementary to PGT, various non-invasive technologies have been developed during the past decade for assessing embryo quality and developmental potential in real time [5]. Some approaches are based on the assumption that embryo quality can be linked with certain metabolic markers, which can be assessed by probing molecular signatures in the spent culture medium [6,7,8,9]. Other methodologies rely on the mechanical changes in the viscoelastic properties of preimplantation embryos that are associated with high developmental quality and can be defined by measuring the stress–strain relationships under applied load [10]. However, predicting embryo quality based on its visualization has made the greatest impact in the clinic. The first protocols scored embryo potential based on certain morphological grading criteria of specific developmental stages [11,12,13,14]. The utilization of time-lapse incubation systems in IVF clinics provided dynamic and continuous visualization of the embryos as they monotonically advance between the states of preimplantation development while maintaining the embryos under optimal culture conditions (Fig. 1). Using these video recordings, the embryos can be characterized by the specific time points from fertilization at which discrete developmental events occur [15]. These so-called morphokinetic events are defined by the transition between embryo states and include the times of pronuclei appearance (tPNa) and fading (tPNf), the cleavage into two to eight blastomeres (tN, N = 2 to 8), the compaction of the morula (tM), and the start of blastulation (tSB) [14, 16]. The generation of large datasets of morphokinetically annotated and clinically labeled embryos facilitated the development of classification algorithms that predict the potential for embryo implantation [17, 18] and live birth [19,20,21].
Parallel efforts took advantage of the size of the available time-lapse datasets to train convolutional neural network (CNN) based classifiers that assess embryo potential using the raw video files in an unbiased, annotation-independent manner [22, 23]. However, training such deep learning models is challenging because each video file is large (hundreds of megabytes), so that a correspondingly large number of samples is required [24, 25].
Morphokinetic evaluation of developmental potential has proved highly efficient in de-selecting poor-quality embryos for transfer. However, annotation by trained embryologists of all the embryos from each oocyte collection cycle is time-consuming. Moreover, manual morphokinetic annotation can introduce inter-observer and intra-observer variations. Multiple retrospective studies report “…generally good, although not optimal…” inter-observer and intra-observer agreement between embryologists [26,27,28,29]. These limitations stimulated the development of automated classifiers that extract the developmental stages of the embryos and the times of the morphokinetic events [30,31,32,33,34]. To support automated and standardized morphokinetic assessment of embryo developmental potential with an accuracy that is sufficient for clinical implementation, we used a large dataset of manually annotated and clinically labeled embryo video files and trained a CNN classifier to infer the developmental states that are recorded in each frame. For each individual frame, we allowed the classification of multiple developmental states, each weighted by the confidence in its prediction (the sum of all probabilities is one). By allowing this superposition of multiple predicted states at different probabilities, we included in the model the uncertainty in the visual assessment of the embryo state, which may improve accuracy. We then imposed chronological development on the frame-wise time-lapse series of developmental states to obtain the morphokinetic profiles of the embryos. Indeed, we predict the developmental states of single frames with 97% accuracy and the morphokinetic profiles of the embryos, from time of pronuclei appearance to start of blastulation, with an R-squared coefficient of determination of 0.994, as validated across 1918 test set embryos.
Using our automated classifier, we provide unparalleled temporal statistics of preimplantation development of 67,707 embryos. Focusing on 14,159 high-quality embryos that would correspond to valid candidates for transfer, we apply unsupervised clustering into nine distinctive cohorts. We define distinctive patterns of early and late morphokinetics and reveal cluster-specific correlations with the rate of embryo transfer and the rate of embryo implantation. Our work thus provides a standardized platform for assessing the developmental potential of pre-implantation embryos. By supporting single embryo transfer policies, our work is expected to decrease the medical risks that are associated with multiple pregnancies and shorten time to pregnancy.
Methods
Dataset
Here we used a previously assembled database [23]. In short, the dataset comprises 67,707 video files of preimplantation embryo development that were recorded on eleven time-lapse incubation systems (EmbryoScope Time-Lapse System, Vitrolife) located in four medical centers. The dataset includes 20,253 embryos that were manually annotated for morphokinetic events by trained embryologists adhering to established protocols, as we reported previously [23]. Time-lapse images were recorded at an average time interval of 18 min for 3 to 6 days. At each time point, seven Z-stack frames were recorded; however, only the central focal plane was used here. The embryos in the dataset were randomly distributed such that ~ 20% of the frames were dedicated to serve as an uncontaminated test set (Table 1). The remaining frames were divided between a train set (~ 85%) and a validation set (~ 15%). Frames belonging to an individual embryo were never shared between the test set and the train/validation sets.
The number of embryos that were transferred to the uterus 3 days (Day-3), 4 days (Day-4), and 5 days (Day-5) from the time of fertilization is provided per medical center in Table 2. We specify the number of embryos with positive and negative known implantation data (KID) as well as unknown KID embryos. The latter corresponds to multiple transfers in which the identity of the implanted embryos is not known.
Embryo-state frame labeling
The morphokinetic events characterize the preimplantation dynamics of the embryos and are not a property of the individual time-lapse frames. Hence, we first converted the manually annotated morphokinetic profiles of the embryos into developmental embryo-state labels for each individual frame. Given the monotonic nature of preimplantation development, this conversion was performed in a straightforward manner as specified in Table 3. Frames that overlap the manually annotated morphokinetic events and the frames that were recorded just after were excluded from the train set. Notably, the \(FO\) and \(PNFZ\) embryo states are distinguished here even though they are morphologically identical in single frames. Hence, \(FO\) and \(PNFZ\) frames were re-labeled as one cell (\(1C\)) for the purpose of network training.
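As an illustration only, the conversion from an annotated profile to per-frame labels can be sketched in Python. The state order in `STATES`, the function name `label_frames`, and the rule that a frame takes the state entered at the latest event at or before its time stamp are assumptions of this sketch and are not taken from Table 3. The exclusion of frames around each event is not modeled.

```python
from bisect import bisect_right

# Hypothetical state order; the authoritative mapping is given in Table 3.
STATES = ["FO", "PN", "PNFZ", "2C", "3C", "4C", "5C", "6C", "7C", "8C", "M", "BL"]
# FO and PNFZ share the single-cell appearance and are trained as one label.
TRAIN_LABEL = {"FO": "1C", "PNFZ": "1C"}


def label_frames(frame_times, event_times):
    """Assign a training label to every frame of one embryo.

    frame_times: hours from fertilization, ascending.
    event_times: ascending times of the transitions into STATES[1:]
                 (tPNa, tPNf, t2, ...); may stop early for arrested embryos.
    """
    labels = []
    for t in frame_times:
        # The number of events at or before t is the index of the current state.
        state = STATES[bisect_right(event_times, t)]
        labels.append(TRAIN_LABEL.get(state, state))
    return labels
```

For example, with events at 6, 23, 26, and 38 h, frames recorded at 1, 8, 24, 27, and 40 h would be labeled 1C, PN, 1C, 2C, and 3C.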
Frame preprocessing
The EmbryoScope time-lapse incubator (Vitrolife) records 8-bit grayscale images of 500 × 500 pixels. To decrease dimensionality, a 256 × 256 pixel region of interest (ROI) around the embryo was cropped from each frame using a U-net segmentation network as we reported previously [23]. In addition, all train-set frames were augmented by applying rigid rotations of \([90^\circ, 180^\circ, 270^\circ]\), horizontal flipping, and vertical flipping.
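The five augmentations can be illustrated on a small image stored as a list of rows. This is a minimal sketch: the helper names and the clockwise direction of rotation are assumptions, and a real pipeline would more likely operate on arrays or tensors.

```python
def rotate90(img):
    """Rotate a 2-D image (list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def augment(img):
    """Return the three rigid rotations and the two flips of one frame."""
    r90 = rotate90(img)
    r180 = rotate90(r90)
    r270 = rotate90(r180)
    hflip = [list(row[::-1]) for row in img]  # mirror left-right
    vflip = [list(row) for row in img[::-1]]  # mirror top-bottom
    return [r90, r180, r270, hflip, vflip]
```

Each train-set frame thus contributes five additional samples that share its embryo-state label.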
Inference of the frame-wise embryo-state probability vector and the embryo probability matrix
To infer the developmental states of the embryo as visualized in each individual frame, we trained a ResNet18 CNN [35], using the train, validation, and test sets of the time-lapse frames labeled by the embryo developmental states as described above. Since these are grayscale image files, we modified the first convolutional layer to input a single channel instead of three. The CNN model was implemented using TorchVision in PyTorch with a categorical cross-entropy loss function, and optimized using Rectified Adam (RAdam). A 0.0005 learning rate was set, reaching convergence within ten epochs.
Let us consider the time-lapse sequence of size \(n\) of a given embryo, where \({t}_{i}\) is the time from fertilization of frame \(i=1,\dots ,n\). The classifier infers the probability to find the embryo at any of the developmental states, from 1C to BL, as obtained by the eleven output neurons (Fig. 2). In this manner, the weighted and superimposed developmental state of frame \(i\) is defined by the embryo state probability vector (ESPV) of the output neuron coordinates:
The ESPVs of five representative snapshots at ascending order are shown in Fig. 3A. To obtain a probability-weighted whole-embryo dynamic representation of preimplantation development, we define the embryo-state probability matrix (ESPM) by concatenating all the ESPVs of the embryo in a chronological order (Fig. 3B(i)).
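The construction of the ESPV and ESPM can be sketched as follows, assuming that the eleven network outputs are converted into probabilities by a softmax (a standard choice with a cross-entropy loss, but an assumption here). The function names are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw network outputs into probabilities that sum to one."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def build_espm(frame_logits):
    """Stack the per-frame ESPVs in chronological order.

    frame_logits: n lists of 11 raw outputs, one list per frame.
    Returns an 11 x n matrix as a list of rows (one row per embryo state).
    """
    espvs = [softmax(logits) for logits in frame_logits]
    return [list(row) for row in zip(*espvs)]  # transpose to states x frames
```

Every column of the resulting matrix is one ESPV, so each column sums to one.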
Automated annotation of the morphokinetic events
To extract the discrete time points of the morphokinetic events, the uncertainty in the developmental states of the embryo, which are formulated by the ESPM, should first be projected onto discrete temporal states. Hence, we project ESPM onto \(\widehat{ESM}\in {\left\{\mathrm{0,1}\right\}}^{11\times n}\) as follows:
Hence, \(\widehat{ESM}\) provides a discrete description of the temporal states of the embryo. However, it does not satisfy the monotonicity of preimplantation development. To this end, we apply a weightless isotonic regression of \(\widehat{ESM}\):
The binary and discrete matrix \(\widehat{y}\in {\left\{\mathrm{0,1}\right\}}^{11\times n}\) is defined by:
We recall that FO and PNFZ are two embryo states that share the 1C morphological label which is used here. Hence, we expect to obtain at least two temporally-separated regions in the ESPM of value 1C (see Fig. 3B(i)) that will be propagated to \(\widehat{ESM}\) but converged in \(\widehat{y}\). Based on trial-and-error, we find that FO and PNFZ are best captured by the earliest and the latest 1C regions in \(\widehat{y}\), respectively. Hence, we replace the FO and PNFZ time regions (first row) in \(\widehat{y}\) and obtain the binary and developmentally-monotonic embryo state matrix \(ESM\in {\left\{\mathrm{0,1}\right\}}^{11\times n}\) (Fig. 3B(ii)).
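One possible reading of the projection and regression steps is sketched below. It assumes that (i) the projection keeps the most probable state of each frame, (ii) the unweighted isotonic regression is applied to the resulting sequence of state indices using the pool-adjacent-violators algorithm, and (iii) the fitted values are rounded back to integer states. These are assumptions of the sketch, and the FO/PNFZ re-labeling of the 1C regions is not reproduced.

```python
def project_argmax(espm):
    """Most probable state index per frame (column-wise argmax of the ESPM)."""
    n_states, n_frames = len(espm), len(espm[0])
    return [max(range(n_states), key=lambda j: espm[j][i]) for i in range(n_frames)]


def isotonic_fit(values):
    """Unweighted least-squares non-decreasing fit (pool adjacent violators)."""
    blocks = []  # each block stores [sum, count]
    for v in values:
        blocks.append([float(v), 1])
        # Merge while the mean of the previous block exceeds that of the last one.
        while len(blocks) > 1 and blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]:
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = []
    for s, c in blocks:
        fitted.extend([s / c] * c)
    return fitted


def monotonic_states(espm):
    """Non-decreasing integer state index per frame."""
    fitted = isotonic_fit(project_argmax(espm))
    return [int(v + 0.5) for v in fitted]  # round half up; ordering is preserved
```

Under these assumptions, an isolated frame that is classified into an earlier state than its neighbors is absorbed into the surrounding block instead of producing a backward transition.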
With increasing \(i\) (\(ESM\) columns), there are up to eleven transitions between the embryo states that correspond to the morphokinetic events of that embryo, which we extract in a straightforward manner. The time of each morphokinetic event is thus defined by the transition from embryo state \(j\) to the consecutive one (most frequently \(j+1\)). In the case of direct equal cleavage from \(m\) cells, \(m>1\), to \(\left(m+2\right)\) cells, the morphokinetic events \(\left(m+1\right)C\) and \(\left(m+2\right)C\) will coincide. The vector of time points of the morphokinetic events is the automatically annotated morphokinetic profile of that embryo (Fig. 3C).
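The extraction of event times from a monotonic state sequence can be sketched as follows. The sketch assumes integer state indices and takes the event time to be the time stamp of the first frame recorded in the new state; both the function name and this convention are assumptions.

```python
def extract_events(frame_times, states):
    """Map each entered state index to the time of its transition.

    states must be non-decreasing. When a state is skipped (for example in
    direct cleavage), its event time coincides with that of the next
    observed state.
    """
    events = {}
    for t, prev, curr in zip(frame_times[1:], states, states[1:]):
        for s in range(prev + 1, curr + 1):
            events[s] = t
    return events
```

For frames at 0, 0.3, 0.6, 0.9, and 1.2 h with states 0, 1, 1, 3, 3, state 1 is entered at 0.3 h while states 2 and 3 share the event time 0.9 h.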
Unsupervised clustering
Unsupervised clustering of the embryos’ morphokinetic profiles was performed via K-means with the Euclidean metric using the scikit-learn Python package. The number of clusters was chosen to balance a small cluster number against a high explained variance, as presented by the elbow plot (Fig. 8A).
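The elbow criterion can be illustrated with a small, dependency-free stand-in for the scikit-learn `KMeans` estimator that was used in this work. The sketch below is plain Lloyd's algorithm with random initialization from the data points; the function names, the seed, and the iteration limit are assumptions, and each point stands for one morphokinetic profile (a vector of event times).

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=100, seed=0):
    """Lloyd's algorithm; returns (centroids, within-cluster sum of squares)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(n_iter):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            clusters[j].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = [
            [sum(col) / len(members) for col in zip(*members)] if members else centroids[j]
            for j, members in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    inertia = sum(min(sq_dist(p, c) for c in centroids) for p in points)
    return centroids, inertia


def elbow_curve(points, k_values):
    """Within-cluster sum of squares for each candidate number of clusters."""
    return {k: kmeans(points, k)[1] for k in k_values}
```

Plotting the returned values against the number of clusters gives an elbow plot; the number of clusters is read off where adding a cluster stops reducing the within-cluster sum of squares appreciably.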
Results
Inference uncertainty of the frame-wise embryo states
Using a probabilistic representation of the embryo states, as given by the ESPM, we propagate the information that is stored in the uncertainty of each frame (Fig. 4A – top panels). We find that the regions of high uncertainty change between embryos; however, noise tends to be high during the second (3C to 4C) and third (4C to 8C) embryo cleavage blocks and during morula compaction (Fig. 4B). In projecting the ESPM onto single embryo states, as represented by the ESM, we take into account the temporal neighborhood constraints imposed by the monotonicity of preimplantation development (Fig. 4A – bottom panels and Fig. 4B).
Classification model performance
To estimate the accuracy in the classification of the embryo states, we calculated the confusion matrix between the ESM and the manually annotated ground truth as averaged across 662,021 frames of 1918 test set embryos (Fig. 5A). Indeed, most embryo states were inferred in agreement with the ground truth, with 93% precision and 93% recall, whereas disagreements were localized to developmentally adjacent states. We next address the accuracy of morphokinetic annotation by plotting the predicted versus the manually annotated morphokinetic events of the test set embryos (Fig. 5B). Consistent with the classification accuracy of the embryo developmental states, we obtain an R-squared goodness of fit of 0.994. Scatter plots and linear regressions for each event are provided in Fig. S2.
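For reference, the coefficient of determination between manual and automated event times can be computed as in the sketch below. It evaluates the automated times directly against the manual ones (the identity line), which is an assumption; an R-squared taken from a fitted linear regression could differ slightly.

```python
def r_squared(manual, automated):
    """Coefficient of determination of automated versus manual event times."""
    mean_m = sum(manual) / len(manual)
    ss_res = sum((m - a) ** 2 for m, a in zip(manual, automated))
    ss_tot = sum((m - mean_m) ** 2 for m in manual)
    return 1.0 - ss_res / ss_tot
```

A value of one corresponds to perfect agreement, and the residual term grows with the squared time differences between the two annotations.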
The average and standard deviation values of the temporal differences between the automatically and manually annotated events are summarized in Table 4 together with the percentile-wise error distributions. The mean error in the prediction of the pronuclei events (tPNa-tPNf) and the cleavage events (t2-t8+) was smaller than or comparable to the time-lapse interval (20 min), which sets the minimal sampling error for morphokinetic annotation owing to the discrete nature of video recordings. However, the tM and tSB mean errors spanned two time intervals or more, which may be indicative of the inherent ambiguity in the visual determination of the start of morula compaction and the start of blastulation, respectively. These disagreements between manually annotated and automatically annotated events are illustrated by four representative embryos in Fig. S1. Consistent with the statistical error analysis that we present in Table 4, automated annotation tends to be either in agreement with manual annotation or separated from it by one or two sequences. These four embryos are presented only to provide a visualization of the embryo states that is complementary to the statistical analysis that we performed.
Above, we presented the accuracy in the prediction of individual embryo states and morphokinetic events as we evaluated across a large dataset of embryos. However, clinical applications would also require quantitative assessment of the annotation accuracy of the morphokinetic profiles from tPNa to tSB of individual embryos, which will allow consideration of classification generality. To this end, we defined the normalized absolute temporal difference (NATD) between the automated and manual annotations as:
where \({A}_{i}^{j}\) and \({M}_{i}^{j}\) are the automatically and manually annotated times of morphokinetic event \(i\) of embryo \(j\). For each embryo, the NATD provides the sum of the time differences between the predicted and ground-truth events, each normalized to the time of the event. Since IVF protocols may vary between medical centers and since maternal age is linked with temporal morphokinetic profiles [36, 37], we calculated the NATD histograms for test set embryos stratified by clinic (Fig. 6A) and by maternal age (Fig. 6B). The distributions of the basal sampling error per embryo, as determined by a difference of one time-lapse interval (~ 18 min) between \({A}_{i}^{j}\) and \({M}_{i}^{j}\), are presented. Satisfyingly, we find that the NATD error distributions were broader than, yet statistically significantly smaller than, the basal sampling error. We complement our error analysis by excluding potential confounding contributions due to systematic differences in the automated and manual annotations and in maternal age between the clinics (Fig. S3).
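Read literally, the verbal description corresponds to \(NATD^{j}=\sum_{i}\left|{A}_{i}^{j}-{M}_{i}^{j}\right|/{M}_{i}^{j}\), where normalizing by the manually annotated time (and not by the automated one) is an assumption. Under the same assumption, the basal sampling error replaces each absolute difference by one time-lapse interval. A minimal sketch with hypothetical function names:

```python
def natd(automated, manual):
    """Sum over events of |A_i - M_i| / M_i for one embryo (times in hours)."""
    return sum(abs(a - m) / m for a, m in zip(automated, manual))


def basal_error(manual, interval_h=0.3):
    """NATD obtained if every event were off by one time-lapse interval.

    interval_h defaults to 0.3 h, i.e. the ~18 min average frame interval.
    """
    return sum(interval_h / m for m in manual)
```

Comparing `natd` with `basal_error` for the same embryo indicates whether the annotation error exceeds what the discrete frame sampling alone would produce.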
In summary, we report unprecedented accuracy statistics of automated morphokinetic annotation ranging from single-frame embryo state prediction, to inference of population-level morphokinetic events and whole-embryo morphokinetic profiles. Automated morphokinetic annotation is demonstrated to be robust to medical center and maternal age, thus satisfying generality.
“Big data” analysis of preimplantation embryo development
Automated morphokinetic annotation provides a means for analyzing the preimplantation development of a large number of embryos and for obtaining a statistical characterization that would have been practically impossible otherwise. To address all preimplantation stages, we annotated 24,644 embryos that had been cultured inside time-lapse incubators for 120 h or more. The generality of our analysis is based on the fact that each time-lapse incubator culture plate includes multiple embryos of different developmental potential with no a priori bias of maternal age, clinic, or number of retrieved oocytes per collection cycle. At each hour from time of fertilization to 120 h, each embryo was labeled by its developmental state from 1 (FO) to 12 (BL), and the 25th, 50th, 75th, and 95th percentiles were calculated (Fig. 7A). In this manner, developmentally arrested embryos were accounted for according to their corresponding embryonic state. The 25th percentile is defined by the embryos with the slowest dynamics, which became arrested prior to morula compaction. In comparison, the 50th, 75th, and 95th percentile dynamics reach tM by end of Day-4 (96 h), late Day-3 (68 h), and early Day-3 (51 h), respectively, thus demonstrating the temporal variation between embryos. Complementary to the embryo state dynamics, we calculated the temporal distributions of the morphokinetic events, the morphokinetic cell-cycle intervals, and the morphokinetic cell-synchronization intervals, as defined by the transitions between states (Fig. 7B(i,ii)) [38]. Here, the number of embryos that reached each morphokinetic event decreased with preimplantation development due to the prior arrest of some of the embryos. The temporal dispersion of the morphokinetic events across the embryos, as evaluated by the coefficient of variation, was significant, indicating the inherent heterogeneity between embryos of similar genetic background (denoted in Fig. 7B(i,ii)).
Importantly, these temporal variations decreased with preimplantation development, thus demonstrating the developmental convergence of non-arrested embryos.
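The coefficient of variation of one morphokinetic event across embryos is the standard deviation of its time divided by its mean. A minimal sketch, assuming the population standard deviation (the choice between population and sample estimators is not specified here):

```python
from statistics import mean, pstdev


def coefficient_of_variation(times):
    """Relative temporal dispersion of one event across embryos."""
    return pstdev(times) / mean(times)
```

Because the measure is relative to the mean event time, it allows the dispersion of early and late events to be compared on the same scale.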
To address embryo-to-embryo variation, we performed unsupervised K-means clustering of the morphokinetic profiles. Since low-quality embryos have poor clinical significance, we included only the high-quality embryos that are generally considered as valid candidates for transfer by excluding the ones that failed to reach 8C by 66 h from fertilization (8C− embryos). We clustered the embryos into K = 9 cohorts based on their tPNa-to-t8 morphokinetic profiles (discarding tM and tSB events), thus allowing high variation while limiting the number of clusters (Fig. 8A). The clusters C0 to C8 are sorted from early to late tPNa (Fig. 8B). The size of clusters C0 to C8 is listed in Table 5 and compared with the cohort of 8C− low-quality embryos that failed to reach 8C by 66 h from fertilization. Importantly, we find that the differences in preimplantation dynamics between the clusters are not associated with differences in maternal age (Fig. 8C) nor in the rate of blastulation (Fig. 8D).
Clustering analysis reveals distinctive morphokinetic patterns that are characteristic of embryo subtypes (Fig. 9A). To characterize preimplantation dynamics, we consider the first, second, and third embryo cleavage blocks (Fig. 9B). All clusters maintain their order from fast to slow during the first blastomere cleavage round (PN to 2C). However, C0 embryos entered the first blastomere cleavage round early but completed the third blastomere cleavage round late. Next, we calculated the second (S2 = t4-t3) and third (S3 = t8-t5) cell synchronization intervals, which quantify the degree of mitotic synchronization between the blastomeres in the corresponding cleavage blocks (Fig. 9C). C3, C6, and C8 embryos are the least synchronized. With respect to the third mitotic cycle, the synchrony of these clusters is as poor as the low quality 8C− embryos.
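The two synchronization intervals follow directly from the annotated profile. The sketch below assumes that the profile is stored as a dictionary of event times in hours; the key names and the function name are hypothetical.

```python
def sync_intervals(profile):
    """Second and third cell-synchronization intervals of one embryo.

    profile: event times in hours keyed by 't3', 't4', 't5', and 't8'.
    """
    return {
        "S2": profile["t4"] - profile["t3"],  # second cleavage block
        "S3": profile["t8"] - profile["t5"],  # third cleavage block
    }
```

Short intervals indicate that the blastomeres divide nearly simultaneously, whereas long intervals indicate poor mitotic synchronization.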
The developmental potential of the embryo clusters
The rate of embryo transfer into the uterus reflects the developmental quality of the embryos as estimated by the clinicians. To assess the relationship to the developmental potential of the embryos, we plot the average implantation-versus-transfer rates of the clusters of Day-3 (Fig. 10A), Day-4 (Fig. 10B), and Day-5 (Fig. 10C) transferred embryos, and include 8C− low-quality embryos. As expected, the implantation rates of Day-5 transferred embryos are highest and those of Day-3 transferred embryos are lowest. The decrease in embryo transfer rates between Day-3 and Day-5 transfers is due to the decrease in the number of transferred blastocysts per cycle. Despite the fact that the cluster distributions of the available embryos per oocyte collection cycle are not specified, the average implantation rates are positively correlated with the transfer rates, thus confirming the capacity of morphokinetic profiling to predict embryo quality. C1 and C2 clusters consistently show the highest implantation rates and high transfer rates. C3, C6, and C8 clusters have the lowest implantation rates and low transfer rates during the embryo cleavage (Day-3) and morula compaction (Day-4) stages, which is consistent with the poor cell cleavage synchronization of these embryos during the third mitotic cycle as shown above (Fig. 9C). Hence, our unsupervised clustering analysis confirms the role of S3 as a marker of embryo developmental potential [17, 39]. We note that blastocyst selection for transfer on Day-5 significantly improves the implantation rates of C3 and C6 clusters (C8 embryos have zero implantation rate).
Discussion
Temporal profiling of the morphokinetic events has been demonstrated to support the evaluation of the developmental potential of embryos and improve implantation rates by allowing the selection of the highest-quality embryos for transfer [40]. Computationally, morphokinetic annotation provides means to effectively reduce the dimensionality of the video representations of the embryos from ~ 100 Mb to ~ 100 bytes, thus preventing overfitting and improving accuracy given a finite dataset and computational power. Despite its now-established efficacy, the utilization of morphokinetically-based embryo evaluation algorithms in clinical settings is hindered by the substantial workload that is required for performing manual annotation, and by potential intra- and inter-observer variations [26,27,28,29]. Computer-based automated algorithms are required in order to lift these hurdles and facilitate the utilization of morphokinetic-based decision support tools in clinical settings.
To date, various machine learning methodologies have been designed that can automatically annotate blastomere cleavage [30,31,32] as well as morula compaction and blastocyst expansion [34]. Recently, automated annotation spanning the entire course of preimplantation development was reported, which can potentially support embryo selection for Day-5 transfers. Feyeux et al. incorporated image processing tools to reach 92% accuracy, whereas Leahy et al. report 87.9% accuracy by training multiple CNNs [33]. In both cases, clinical utilization as a stand-alone automated decision-support tool would likely require improved accuracy [41]. To improve accuracy, we assembled an expansive retrospective dataset to train a CNN for assessing the embryo state in each frame while allowing a probability-weighted superposition of multiple states. The individual states are determined via monotonic regression, thereby integrating morphokinetic dynamics across a finite temporal vicinity around each event under a temporal monotonicity constraint. In this manner, we demonstrate fully automated annotation from tPNa to tSB with unprecedented accuracy (R-square 0.994), whose error distribution is largely generated by the discrete nature of time-lapse imaging.
Multiple combinations of morphokinetic events and intervals have been selected by various classifiers for evaluating the developmental potential of embryos [42]. In the case of Day-5 transferred embryos, tSB was retrospectively shown to discriminate between high quality (tSB < 96 h) and lower quality (tSB > 96 h) embryos that were selected for transfer based on morphological criteria and assessed via implantation outcome [43]. However, other algorithms also include events that follow tSB [44]. Late events include the time of full blastocyst formation (tB) and the time of expanded blastocyst (tEB). tB refers to the state in which the blastocoel fills the embryo with < 10% increase in diameter [45], or alternatively to the frame in which a crescent-shaped area begins to emerge from the morula [38]. tEB refers to the state in which the blastocyst’s diameter increases by > 30% concomitant with the initiation of zona thinning [46]. Evidently, these definitions of tB and tEB are morphologically complex, which should be considered for practical purposes. The latest event that our algorithm automatically annotates is tSB. However, expanding automated annotation to include additional events is likely feasible provided that suitable manually annotated datasets are available for training.
In this study, we investigated embryo-to-embryo variations among morphokinetic profiles using automated annotation. A total of 24,644 embryos were cultured and recorded for over 120 h. To ensure statistical robustness, we excluded low-quality embryos and analyzed only high-quality embryos that were 8C+ at 66 h from fertilization and are considered valid candidates for transfer. Using unsupervised K-means clustering of 14,159 8C+ embryos, we identified nine subtypes with distinctive morphokinetic dynamics, including fast and slow developing embryos, as well as embryos that start fast and finish slow. Notably, a recent report performed self-supervised clustering of the time-lapse video files in order to assess embryo viability, which demonstrates the utility of entire frame-wise clustering approaches [47]. The maternal age distributions of the clusters overlapped and were statistically indistinguishable from those of low-quality 8C− embryos, supporting the notion that maternal age may affect the size of the oocyte collection cycle but not the inherent heterogeneity of embryos within each cycle [48]. We found that the clusters exhibited similar blastulation rates, which is consistent with the high developmental quality of the embryos. However, the implantation rates of the clusters varied, which is indicative of the degree of association between the morphokinetic properties of the embryos and their developmental potential. In particular, three clusters with low implantation rates were distinctively characterized by poor synchronization of the third mitotic cell cleavage cycle, thus defining a potentially potent marker of embryo quality.
Following this retrospective study, clinical implementation of automated annotation will require its use and validation in prospective studies. Below, we outline in conceptual terms a possible design for a clinical study that is expected to demonstrate safety and clinical utility while satisfying regulatory requirements and overcoming ethical limitations. A multicenter prospective embryo transfer study will include a nonselection arm followed by a randomized controlled trial in healthy patients [49]. No age or ethnic restrictions should be applied in preliminary screening. The selection of embryos for transfer will be performed according to an established policy that employs a regulatory-approved, morphokinetics-based decision-support tool. The IVF protocols and the decision-making process for determining the day of transfer and the number of transferred embryos per cycle will not be modified. In the nonselection study, automated annotation will be performed blindly, in parallel with manual annotation. The implantation potential of the embryos will be predicted based only on the manual annotation, and the embryos with the highest predicted developmental potential will be selected for transfer. Once the implantation outcomes of the transferred embryos are determined, the automated annotations will be unblinded, the implantation potential of the embryos will be evaluated from them using the same morphokinetic criteria, and the corresponding predictive value will be calculated and compared with that of the existing policy. Unless the predictive value obtained with automated annotation is significantly inferior to that of the current policy, a randomized controlled trial will be performed next. The embryos in the treatment and control groups will be annotated automatically and manually, respectively, their implantation potential will be predicted, and the embryos for transfer will be selected as described above.
The clinical benefit of automated annotation will be calculated by comparing the implantation rates of the control and treatment groups. To mitigate ethical constraints, a preliminary study will be restricted to a subgroup of high-quality embryos that will be prescreened based on their predicted developmental potential. Despite the evident advantages of automated morphokinetic annotation over manual annotation, prospective studies are required to demonstrate predictive value and clinical benefit that are comparable to or exceed those of manual annotation. In addition to overcoming the limitations of manual annotation in busy laboratory settings and eliminating inter-observer and intra-observer variations, clinical use of automated morphokinetic annotation will standardize embryo selection protocols, improve implantation and live-birth rates, shorten time-to-pregnancy, and support single-embryo transfer policies.
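As an illustration of how the implantation rates of the control and treatment groups might be compared, the following minimal sketch applies a pooled two-proportion z-test. The text does not specify a statistical test, so the choice of test, the function name, and the counts below are hypothetical and serve only to make the comparison concrete.

```python
from math import erf, sqrt

def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Two-sided pooled z-test for the difference between two proportions.

    Returns (rate_a, rate_b, z, p_value), where z > 0 means group b has
    the higher rate.
    """
    rate_a = success_a / n_a
    rate_b = success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("pooled rate is 0 or 1; the test is undefined")
    z = (rate_b - rate_a) / se
    # Standard normal CDF expressed through the error function.
    cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    p_value = 2 * (1 - cdf)
    return rate_a, rate_b, z, p_value

# Hypothetical counts: implanted embryos / transferred embryos per arm.
control_rate, treatment_rate, z, p_value = two_proportion_z_test(
    success_a=120, n_a=400,  # control arm: manual annotation
    success_b=140, n_b=400,  # treatment arm: automated annotation
)
print(f"control={control_rate:.3f} treatment={treatment_rate:.3f} "
      f"z={z:.2f} p={p_value:.3f}")
```

A real trial would fix the primary endpoint, sample size, and any non-inferiority margin in advance, and would account for multiple embryos transferred to the same patient; none of that is modeled here.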
Data availability
The copyrights of the code are owned by Yissum, the technology transfer company of The Hebrew University of Jerusalem. Requests can be sent to A.B. The clinical data are owned by Hadassah Medical Center and by Clalit Health Services. Restrictions apply to the availability of these data, which were used anonymously for this study under separate ethical agreements with each clinic, and so they are not publicly available. Access requests can be directed to A.B.M. (Hadassah Medical Center), Y.O. (Kaplan Medical Center), I.H.V. (Soroka University Medical Center), and Y.S. (Women's Hospital, Rabin Medical Center).
References
Gardner DK, Meseguer M, Rubio C, Treff NR. Diagnosis of human preimplantation embryo viability. Hum Reprod Update. 2015;21(6):727–47.
ESHRE PGT Consortium and SIG-Embryology Biopsy Working Group, Kokkali G, Coticchio G, Bronet F, Celebi C, et al. ESHRE PGT consortium and SIG embryology good practice recommendations for polar body and embryo biopsy for PGT. Hum Reprod Open. 2020;2020(3):hoaa020.
Practice Committees of the American Society for Reproductive Medicine and the Society for Assisted Reproductive Technology. The use of preimplantation genetic testing for aneuploidy (PGT-A): a committee opinion. Fertil Steril. 2018;109(3):429–36.
Taylor TH, Gitlin SA, Patrick JL, Crain JL, Wilson JM, Griffin DK. The origin, mechanisms, incidence and clinical consequences of chromosomal mosaicism in humans. Hum Reprod Update. 2014;20(4):571–81.
Sanchez T, Seidler EA, Gardner DK, Needleman D, Sakkas D. Will noninvasive methods surpass invasive for assessing gametes and embryos? Fertil Steril. 2017;108(5):730–7.
Katz-Jaffe MG, Gardner DK. Symposium: innovative techniques in human embryo viability assessment. Can proteomics help to shape the future of human assisted conception? Reprod Biomed Online. 2008;17(4):497–501.
Brison DR, Houghton FD, Falconer D, Roberts SA, Hawkhead J, Humpherson PG, et al. Identification of viable embryos in IVF by non-invasive measurement of amino acid turnover. Hum Reprod. 2004;19(10):2319–24.
Sturmey RG, Bermejo-Alvarez P, Gutierrez-Adan A, Rizos D, Leese HJ, Lonergan P. Amino acid metabolism of bovine blastocysts: a biomarker of sex and viability. Mol Reprod Dev. 2010;77(3):285–96.
Gardner DK, Lane M, Stevens J, Schoolcraft WB. Noninvasive assessment of human embryo nutrient consumption as a measure of developmental potential. Fertil Steril. 2001;76(6):1175–80.
Yanez LZ, Han J, Behr BB, Pera RAR, Camarillo DB. Human oocyte developmental potential is predicted by mechanical properties within hours after fertilization. Nat Commun. 2016;7:10809.
Gardner DK, Balaban B. Assessment of human embryo development using morphological criteria in an era of time-lapse, algorithms and ‘OMICS’: is looking good still important? Mol Hum Reprod. 2016;22(10):704–18.
Nasiri N, Eftekhari-Yazdi P. An overview of the available methods for morphological scoring of pre-implantation embryos in in vitro fertilization. Cell J. 2015;16(4):392–405.
Desai NN, Goldstein J, Rowland DY, Goldfarb JM. Morphological evaluation of human embryos and derivation of an embryo quality scoring system specific for day 3 embryos: a preliminary study. Hum Reprod. 2000;15(10):2190–6.
Alpha Scientists in Reproductive Medicine and ESHRE Special Interest Group of Embryology. The Istanbul consensus workshop on embryo assessment: proceedings of an expert meeting. Hum Reprod. 2011;26(6):1270–83.
Wong CC, Loewke KE, Bossert NL, Behr B, De Jonge CJ, Baer TM, et al. Non-invasive imaging of human embryos before embryonic genome activation predicts development to the blastocyst stage. Nat Biotechnol. 2010;28(10):1115–21.
Rubio I, Galan A, Larreategui Z, Ayerdi F, Bellver J, Herrero J, et al. Clinical validation of embryo culture and selection by morphokinetic analysis: a randomized, controlled trial of the EmbryoScope. Fertil Steril. 2014;102(5):1287–94.e5.
Motato Y, de los Santos MJ, Escriba MJ, Ruiz BA, Remohi J, Meseguer M. Morphokinetic analysis and embryonic prediction for blastocyst formation through an integrated time-lapse system. Fertil Steril. 2016;105(2):376–84.e9.
Milewski R, Kuc P, Kuczynska A, Stankiewicz B, Lukaszuk K, Kuczynski W. A predictive model for blastocyst formation based on morphokinetic parameters in time-lapse monitoring of embryo development. J Assist Reprod Genet. 2015;32(4):571–9.
Meseguer M, Herrero J, Tejera A, Hilligsoe KM, Ramsing NB, Remohi J. The use of morphokinetics as a predictor of embryo implantation. Hum Reprod. 2011;26(10):2658–71.
Petersen BM, Boel M, Montag M, Gardner DK. Development of a generally applicable morphokinetic algorithm capable of predicting the implantation potential of embryos transferred on Day 3. Hum Reprod. 2016;31(10):2231–44.
Blank C, Wildeboer RR, DeCroo I, Tilleman K, Weyers B, de Sutter P, et al. Prediction of implantation after blastocyst transfer in in vitro fertilization: a machine-learning perspective. Fertil Steril. 2019;111(2):318–26.
Huang B, Zheng S, Ma B, Yang Y, Zhang S, Jin L. Using deep learning to predict the outcome of live birth from more than 10,000 embryo data. BMC Pregnancy Childbirth. 2022;22(1):36.
Kan-Tor Y, Zabari N, Erlich I, Szeskin A, Amitai T, Richter D, et al. Automated Evaluation of Human Embryo Blastulation and Implantation Potential using Deep-Learning. Adv Intell Syst. 2020;2(10):2000080.
Verleysen M. Learning high-dimensional data. In: Advanced research workshop on limitations and future trends in neural computing, 22–24 October 2001. Siena (Italy), 2001.
Oseledets IV, Tyrtyshnikov EE. Breaking the Curse of Dimensionality, or How to Use SVD in Many Dimensions. SIAM J Sci Comput. 2009;31(5):3744–59.
Rienzi L, Capalbo A, Stoppa M, Romano S, Maggiulli R, Albricci L, et al. No evidence of association between blastocyst aneuploidy and morphokinetic assessment in a selected population of poor-prognosis patients: a longitudinal cohort study. Reprod Biomed Online. 2015;30(1):57–66.
Sundvall L, Ingerslev HJ, Breth Knudsen U, Kirkegaard K. Inter- and intra-observer variability of time-lapse annotations. Hum Reprod. 2013;28(12):3215–21.
Storr A, Venetis CA, Cooke S, Kilani S, Ledger W. Inter-observer and intra-observer agreement between embryologists during selection of a single Day 5 embryo for transfer: a multicenter study. Hum Reprod. 2017;32(2):307–14.
Adolfsson E, Andershed AN. Morphology vs morphokinetics: a retrospective comparison of inter-observer and intra-observer agreement between embryologists on blastocysts with known implantation outcome. JBRA Assist Reprod. 2018;22(3):228–37.
Liu ZH, Huang B, Cui YQ, Xu YF, Zhang B, Zhu LX, et al. Multi-task deep learning with dynamic programming for embryo early development stage classification from time-lapse videos. IEEE Access. 2019;7:122153–63.
Malmsten J, Zaninovic N, Zhan QS, Rosenwaks Z, Shan J. Automated cell division classification in early mouse and human embryos using convolutional neural networks. Neural Comput Appl. 2021;33(7):2217–28.
Raudonis V, Paulauskaite-Taraseviciene A, Sutiene K, Jonaitis D. Towards the automation of early-stage human embryo development detection. Biomed Eng Online. 2019;18(1):120.
Leahy BD, Jang WD, Yang HY, Struyven R, Wei D, Sun Z, et al. Automated measurements of key morphological features of human embryos for IVF. Med Image Comput Comput Assist Interv. 2020;12265:25–35.
Lukyanenko S, Jang WD, Wei D, Struyven R, Kim Y, Leahy B, et al. Developmental stage classification of embryos using two-stream neural network with linear-chain conditional random field. Med Image Comput Comput Assist Interv. 2021;12908:363–72.
He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2016. p. 770–8.
Akhter N, Shahab M. Morphokinetic analysis of human embryo development and its relationship to the female age: a retrospective time-lapse imaging study. Cell Mol Biol (Noisy-le-grand). 2017;63(8):84–92.
Lebovitz O, Michaeli M, Aslih N, Poltov D, Estrada D, Atzmon Y, et al. Embryonic Development in Relation to Maternal Age and Conception Probability. Reprod Sci. 2021;28(8):2292–300.
Chamayou S, Patrizio P, Storaci G, Tomaselli V, Alecci C, Ragolia C, et al. The use of morphokinetic parameters to select all embryos with full capacity to implant. J Assist Reprod Genet. 2013;30(5):703–10.
Cetinkaya M, Pirkevi C, Yelke H, Colakoglu YK, Atayurt Z, Kahraman S. Relative kinetic expressions defining cleavage synchronicity are better predictors of blastocyst formation and quality than absolute time points. J Assist Reprod Genet. 2015;32(1):27–35.
Xi Q, Yang Q, Wang M, Huang B, Zhang B, Li Z, et al. Individualized embryo selection strategy developed by stacking machine learning model for better in vitro fertilization outcomes: an application study. Reprod Biol Endocrinol. 2021;19(1):53.
Feyeux M, Reignier A, Mocaer M, Lammers J, Meistermann D, Barriere P, et al. Development of automated annotation software for human embryo morphokinetics. Hum Reprod. 2020;35(3):557–64.
Ciray HN, Campbell A, Agerholm IE, Aguilar J, Chamayou S, Esbert M, et al. Proposed guidelines on the nomenclature and annotation of dynamic human embryo monitoring by a time-lapse user group. Hum Reprod. 2014;29(12):2650–60.
Soukhov E, Karavani G, Szaingurten-Solodkin I, Alfayumi-Zeadna S, Elharar G, Richter D, et al. Prediction of embryo implantation rate using a sole parameter of timing of starting blastulation. Zygote. 2022;30(4):501–8.
Reignier A, Girard JM, Lammers J, Chtourou S, Lefebvre T, Barriere P, et al. Performance of Day 5 KIDScore morphokinetic prediction models of implantation and live birth after single blastocyst transfer. J Assist Reprod Genet. 2019;36(11):2279–85.
Campbell A, Fishel S, Bowman N, Duffy S, Sedler M, Hickman CF. Modelling a risk classification of aneuploidy in human embryos using non-invasive morphokinetics. Reprod Biomed Online. 2013;26(5):477–85.
Campbell A, Fishel S, Bowman N, Duffy S, Sedler M, Thornton S. Retrospective analysis of outcomes after IVF using an aneuploidy risk model derived from time-lapse imaging without PGS. Reprod Biomed Online. 2013;27(2):140–6.
Kragh MF, Rimestad J, Lassen JT, Berntsen J, Karstoft H. Predicting embryo viability based on self-supervised alignment of time-lapse videos. IEEE Trans Med Imaging. 2022;41(2):465–75.
Aizer A, Haas J, Shimon C, Konopnicki S, Barzilay E, Orvieto R. Is There Any Association Between the Number of Oocytes Retrieved, Women Age, and Embryo Development? Reprod Sci. 2021;28(7):1890–900.
Tiegs AW, Tao X, Zhan Y, Whitehead C, Kim J, Hanson B, et al. A multicenter, prospective, blinded, nonselection study evaluating the predictive value of an aneuploid diagnosis using a targeted next-generation sequencing-based preimplantation genetic testing for aneuploidy assay and impact of biopsy. Fertil Steril. 2021;115(3):627–37.
Funding
A.B. greatly appreciates support from the European Research Council – Proof of Concept Grant (PoC 966830) and the European Research Council Grant (ERC-StG 678977). The funders did not play any role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Ethics declarations
Ethical approval
This research was approved by the Institutional Review Boards of the data-providing medical centers: Hadassah Hebrew University Medical Center, IRB HMO 558–14; Kaplan Medical Center, IRB 0040–16-KMC; Soroka Medical Center, IRB 0328–17-SOR; Rabin Medical Center, IRB 0767–15-RMC.
Competing interests
N.Z., N.S., Y.O., Z.S., Y.S., D.R., and A.B. declare no financial or non-financial competing or other conflict of interest.
I.H.V. and A.B.M. declare having no conflict of interest during the period of scientific collaboration and data collection relevant to this work. Since January 2020, A.B.M. has served as CTO and Chief Medical Officer, and since March 2020, I.H.V. has served as the Scientific Director of Fairtility LTD, a company that incorporates AI into different stages of fertility treatment.
Y.K.T. declares no financial or non-financial competing or other conflict of interest. Y.K.T. discloses employment by IBM-Research since September 2021.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Zabari, N., Kan-Tor, Y., Or, Y. et al. Delineating the heterogeneity of embryo preimplantation development using automated and accurate morphokinetic annotation. J Assist Reprod Genet 40, 1391–1406 (2023). https://doi.org/10.1007/s10815-023-02806-y
DOI: https://doi.org/10.1007/s10815-023-02806-y