Introduction

Conventional methods have not been very successful in endometriosis-specific biomarker discovery, and to date there are no reliable non-invasive or minimally invasive diagnostic markers for endometriosis. The need for such biomarkers is considerable: because the symptoms are non-specific, the average delay between the onset of symptoms and the surgical diagnosis is almost 7 years [1]. The delayed diagnosis may in turn lead to more severe complications and is associated with considerable healthcare costs [2]. Objective and reliable non-invasive diagnostic biomarkers would not only avoid unnecessary laparoscopy in suspected cases but would also allow endometriosis to be diagnosed earlier and provide an easy strategy for monitoring treatment efficacy and disease recurrence [3]. However, despite extensive research in this field during the past 10 years, no such markers have been established [4, 5], and numerous women with non-specific complaints, such as infertility and pelvic pain, undergo diagnostic laparoscopy. Thus, ‘omics’-level studies using easily accessible materials like blood, urine, and menstrual blood, as well as endometrium and endometriotic lesions, are among the top research priorities in the field.

High-throughput techniques provide massive data from the genome (variability in DNA sequence in the genome, i.e. genomics), epigenome (epigenetic modifications of DNA, i.e. epigenomics), transcriptome (variability in composition and abundance of mRNAs and miRNAs, i.e. transcriptomics), proteome (variability in composition and abundance of the proteins, i.e. proteomics), and metabolome (variability in composition and abundance of metabolites, i.e. metabolomics/metabonomics). The major advantage of ‘omics’ studies is that the data can be collected without existing hypotheses, and a primary research question is not always needed (first experiment-then-hypothesise approach) [6]. This could be particularly useful when studying complex diseases with unknown pathogenesis, such as endometriosis. There are still many missing pieces in the puzzle of endometriosis, and the new ‘omics’ studies promise to add new biological knowledge transferrable into the development of disease-specific biomarkers. The considerable increase (15 publications in 1999–2006, 104 publications in 2007–2016, altogether 118 studies) in ‘omics’ research is a definite sign that the ‘omics’ revolution in endometriosis is actively ongoing.

In this chapter, we take a closer look at the high-throughput studies applied in endometriosis research, namely, genomics, transcriptomics, epigenomics, proteomics, and metabolomics (Fig. 1). We summarise the existing information concerning endometrium, endometriotic lesions, blood, and body fluids in order to describe the potential disease-specific biomarkers. We also outline future perspectives of single-cell ‘omics’ in endometriosis biomarker research and, finally, discuss the importance of sample collection and proper study design in high-throughput studies.

Fig. 1 ‘Omics’ publications in endometriosis studies. A literature search was performed in PubMed up to December 2016. Only publications in English were considered. The keyword ‘endometriosis’ was searched in combination with each of the following terms: ‘endometrium + microarray’, ‘miRNA + microarray’, ‘sequencing’, ‘microarray’, ‘gene expression + microarray’, ‘exome sequencing’, ‘GWAS’, ‘CNV’, ‘genomics’, ‘proteomics’, ‘metabolomics’, ‘DNA methylation + microarray’, ‘DNA alterations + microarray’, and ‘proteome’. Some of the eligible studies were identified from the reference lists of relevant review articles. In total, 118 ‘omics’ studies were included in this review chapter.

Search for Endometriosis Biomarkers: ‘Omics’ Studies and Endometrium

Endometrium is not just a uniform tissue that undergoes cyclical changes under the influence of endogenous hormones, cytokines, and chemokines but an assortment of different cells, each with their own special functions responsible for tissue differentiation, desquamation, and regeneration. It is evident that eutopic endometrium of women with endometriosis functions normally and has almost comparable responsiveness to steroid hormones; however, there is evidence from epigenomic, transcriptomic, and proteomic studies that endometrial tissue from patients with endometriosis and healthy women is differently regulated at the molecular level. Therefore, understanding the complex mechanisms controlling the changes within the endometrium is crucial to find endometrial biomarkers for endometriosis.

Genomic studies focusing only on eutopic endometria of endometriosis patients have been rare, and to date, only two studies have investigated somatic DNA mutations in endometrium (Table 1). Guo et al. found a number of individual chromosomal losses and gains in laser capture microdissection (LCM)-harvested endometrial epithelial cells and hypothesised that these genomic alterations could be the proximate cause of endometriosis [7]. Li et al. conducted whole-exome sequencing of blood DNA and LCM-harvested endometrial cells from eutopic and ectopic endometria of 16 endometriosis patients and eutopic endometria of 5 healthy women [8]. They found that DNA originating from healthy endometria contains thousands of somatic mutations that are absent in blood DNA. Furthermore, the general somatic mutation spectrum in endometria of women with and without endometriosis was very similar, and the authors proposed that most of the mutations are probably benign and irrelevant to endometriosis pathogenesis [8].

Table 1 ‘Omicsʼ studies in endometriosis

Aberrant DNA methylation has been shown to contribute to many human diseases, and there is accumulating data from DNA methylation studies that methylation alterations in certain genes could contribute to the pathogenesis of endometriosis (reviewed in [9]). So far, three studies have applied genome-wide microarray-based DNA methylation analysis to eutopic endometria of endometriosis patients [10,11,12] (Table 1). The study by Naqvi et al. described several aberrantly methylated and expressed genes, among which MGMT, DUSP22, CDCA2, ID2, TNFRSF1B, ZNF681, and IGSF21 had not previously been associated with endometriosis [10]. Although several formerly known genes with altered methylation levels, including MAFB, HOXD10, and HOXD11, were highlighted, alterations in DNA methylation levels in other genes (PR-B, CYP19A1, SF1, COX2, and ER-β) previously associated with endometriosis were not confirmed. The study by Saare et al. showed that the endometrial DNA methylation profiles were highly similar between endometria of patients and controls but largely influenced by the menstrual cycle phases [11]. The authors suggested that DNA methylation differences are likely not the main reason for endometriosis development, but that it is crucially important to take into account the normal epigenetic changes across the menstrual cycle when looking for disease-specific methylation differences in endometrium. A subsequent study by Houshdaran et al. compared endometrial DNA methylation patterns and associated gene expression levels in endometriosis patients and healthy controls across the menstrual cycle and found a small number of differentially methylated loci between the patients and controls [12]. The differences in the endometrial DNA methylome between patients and controls were most pronounced in the mid-secretory phase (137 CpG sites, corresponding to 125 loci), followed by the proliferative (58 CpG sites, corresponding to 58 loci) and early-secretory phases (39 CpG sites, corresponding to 36 loci). Interestingly, no differentially methylated genes were shared by all three genome-wide studies [10,11,12]. Based on these results, it can be proposed that the normal physiological fluctuations during the menstrual cycle may have a larger impact on the endometrial DNA methylation signature than disease/non-disease status, and thus, the DNA methylation changes in endometria of patients are probably not the primary cause of endometriosis development.

Several transcriptome studies have used mRNA microarray technology to resolve the question of whether there are any differences between endometria of patients with endometriosis and healthy women [13,14,15,16,17,18,19,20,21,22,23,24,25] (Table 1). While a majority of these studies have yielded numerous candidate genes, the number of genes that have consistently been shown as up- or downregulated has remained small. Aghajanova and Giudice provided evidence that the endometria from patients with different endometriosis stages also differ at the molecular level [18]. Further, the authors proposed that the influence of menstrual cycle phase on the endometrial transcriptome could be larger than the presence or absence of endometriosis. Still, dysregulation of progesterone and/or cyclic adenosine monophosphate (cAMP)-regulated genes and genes related to thyroid hormone action and metabolism between endometria of patients with different endometriosis stages and menstrual cycle phases was found. Also, upregulation of epidermal growth factor receptor (EGFR) and extracellular matrix proteoglycan versican (VCAN) during the early secretory phase was found in severe versus mild disease [18]. The pathway analysis of differently expressed genes in endometria of patients with severe endometriosis exhibited dysregulation of PI3K/AKT, JAK/STAT, SAPK/JNK, and MAPK pathways that have been associated with endometriosis pathogenesis in several studies [26]. The study conducted by Tamaresis et al., comparing endometria of patients and controls, found 18 upregulated and 11 downregulated genes in all three studied menstrual phases, and also a number of genes were dysregulated in patients with different stages of the disease [19]. They used gene expression data of 148 women to develop a molecular classifier that distinguishes endometria of women with and without endometriosis and found that the best performing classifiers, enabling identification of endometriosis with 90–100% accuracy, were mostly menstrual phase specific and utilised relatively few genes to determine the presence and severity of the disease. Multiple pathways were found to be activated in the proliferative and early secretory phase endometrium (JAK/STAT, EGF/PGF/DGF, PI3K-AKT signalling, p53 signalling, integrin-mediated cell adhesion) of women with moderate-severe endometriosis compared to minimal-mild endometriosis [19], and this was in good concordance with the previous results [18]. Dysregulation of the RAS/RAF/MAPK and PI3 kinase signalling pathway genes, which participate in a wide variety of cellular functions and cell survival, has been identified in several studies [18,19,20, 25], pointing to a link between these pathways and disease pathogenesis. Ahn et al. noticed that based on the unsupervised hierarchic clustering analysis, the overall gene expression signature of endometria from patients and controls was similar [23]. Still, 91 differentially expressed genes involved in regulation of decidualization, cellular adhesion, cytokine-cytokine receptor interaction, apoptosis, and complement pathway were found. In the latest study by Zhou et al., mid-secretory endometria from patients and controls were analysed and 357 differentially expressed mRNAs were found to be involved in signalling pathways such as the JNK/MAPK, PI3K-AKT, p53, adherens junction, and calcium signalling pathway [20]. In addition to studies reporting distinct endometrial molecular signatures of endometriosis patients, there is also evidence that the endometrial receptivity gene signature during the implantation window is similar in patients with endometriosis and healthy women [22, 27].

Several microarray-based microRNA (miRNA) studies concentrating on eutopic endometria have been performed [28,29,30,31,32]. Burney et al. studied eutopic endometria from patients and controls to reveal a disease-specific endometrial miRNA signature [28]. They found six downregulated miRNAs from miR-9 and miR-34 families in eutopic endometria of endometriosis patients and suggested that downregulation of miR-34 family could be involved in maintaining the molecular fingerprint in proliferative endometrium and mediate the delayed proliferative to secretory transition observed in women with moderate-severe endometriosis [28]. A following study by Laudanski et al. reported a lower expression of miR-483-5p and miR-629* in the eutopic endometrium of women with advanced ovarian endometriosis compared to controls [29]. They suggested that expression changes of these miRNAs are a consequence of an early defect in the physiological activity of the proliferative endometrium, ultimately resulting in the overgrowth of this tissue outside the uterus [29]. Subsequently, Laudanski et al. utilised a more comprehensive array and reported the presence of 136 upregulated miRNAs in the eutopic endometrium of patients with endometriosis compared with the healthy women [30]. However, after validation, only three out of 11 validated miRNAs revealed borderline significance. In the study by Braza-Boils et al., both eutopic endometria from patients and controls and endometriotic tissues were studied, and only five miRNAs were found to be differentially expressed in eutopic endometria of endometriosis patients compared to healthy endometrium [31]. Thirty-six downregulated miRNAs in endometria of patients were also reported by Shi et al. [32]. However, the comparison of all results from aforementioned miRNA studies showed a minute overlap, and only two miRNAs (miR-9* [28, 32] and miR-636 [31, 32]) were reported in at least two studies. 
Therefore, as different miRNA studies have reported different candidate miRNAs, the potential application of endometrial miRNAs as endometriosis biomarkers is still limited. Clearly, our knowledge about the endometrial miRNome and its physiological and pathophysiological significance in association with endometriosis is scarce and remains to be unravelled.
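The cross-study comparison described above amounts to counting, for each miRNA, the number of independent studies that reported it. A minimal sketch of that bookkeeping follows. The lists are deliberately partial and contain only the miRNAs named in this section (keyed by citation number); the function name `recurrent` and the `min_studies` parameter are illustrative assumptions, not part of any cited analysis.

```python
from collections import Counter

# Partial lists: only the miRNAs named in the text, keyed by citation number.
# The full candidate lists of each study are much longer and are not reproduced here.
reported = {
    "[28]": {"miR-9*"},
    "[29]": {"miR-483-5p", "miR-629*"},
    "[31]": {"miR-636"},
    "[32]": {"miR-9*", "miR-636"},
}


def recurrent(reported_by_study, min_studies=2):
    """Return the miRNAs reported by at least `min_studies` studies, sorted by name."""
    counts = Counter(
        mirna for mirnas in reported_by_study.values() for mirna in mirnas
    )
    return sorted(m for m, n in counts.items() if n >= min_studies)


print(recurrent(reported))
```

With these partial lists, the two recurrent candidates are miR-9* and miR-636, matching the overlap stated in the text. Raising `min_studies` to 3 returns an empty list.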

The functional interpretation and understanding of the proteome is one of the current challenges in biology due to the presence of sequence variations, alternative splicing, and epigenetic and post-translational modifications [33, 34]. The complexity of the proteome is illustrated by the fact that there is a poor correlation between the transcript levels and the abundance of the corresponding proteins [35, 36]. Proteomic research in endometriosis is currently a ‘hot topic’, and a number of endometrial proteome studies have been performed in endometriosis patients [13, 35, 37,38,39,40,41,42,43,44] (Table 1). A long list of potential disease-related proteins has been proposed, but only a few of them, like vimentin, peroxiredoxin, HSP70, HSP90, annexins, actins, and 14-3-3 family proteins (phosphoserine- or phosphothreonine-binding proteins), have been consistently identified as differentially expressed in patients in at least three different studies. A recent excellent review by Siva et al. summarises the current situation in the proteomic research field: so far, no clear biomarkers or therapeutic targets have been discovered [45]. Nevertheless, as protein synthesis is the final result of gene expression and is directly linked to the phenotype, endometrial proteome studies hold great promise for future biomarker discovery.

Taken together, the large-scale ‘omics’ studies have provided clear evidence that the endometrial genome, epigenome, transcriptome, and proteome are differently regulated in endometriosis. Although the concordance between different ‘omics’ studies has been moderate, some potential biomarkers have been proposed: miR-9* and miR-636 from miRNA studies; the disease-related pathways PI3K/AKT, JAK/STAT, SAPK/JNK, and MAPK from transcriptome studies; and proteins like vimentin, peroxiredoxin, HSP70, HSP90, annexins, actins, and 14-3-3 family members from proteome studies.

‘Omicsʼ Studies of Endometriotic Lesions and Possible Biomarkers

When it comes to biomarker research, endometriotic lesions are a less-favoured study object than endometrial biopsies, as lesions do not provide direct non-invasive or minimally invasive biomarkers for clinical use. Nevertheless, the studies using lesions are crucial for detecting molecular alterations involved in the disease development and pathogenesis and thereby provide valuable information for biomarker research.

Microarrays and single nucleotide polymorphism (SNP) genotyping technologies together with recent advances in high-throughput sequencing have led to rapid progress in genomic studies in endometriosis and provided new evidence about the genetic background of the disease. However, genome-wide studies have provided no clear consensus about the somatic DNA alterations in either endometriotic lesions or eutopic endometria. A number of studies have reported chromosomal alterations, most frequently gains in chromosomes 1p, 3p, 6q, 17q, and Xq and losses in chromosomes 1p, 5p, and 6q [46,47,48,49,50], while other studies have found no chromosomal aberrations in ectopic endometrial tissue or eutopic endometrium [51, 52], thus raising a question about the relevance of genomic DNA imbalance in the pathogenesis of endometriosis (Table 1). Saare et al. used SNP microarrays instead of traditional array comparative genomic hybridization (aCGH) to compare the same patients’ blood, endometria, and LCM-harvested cells of endometriotic lesions and found no evidence of disease-specific somatic DNA copy number alterations (SCNAs) [52]. The authors suggested that some SCNAs identified in previous studies may be related to the detection methodology (CGH or array-CGH), as it has been shown that some G-C-rich chromosomal regions (1p and 16p and chromosomes 19 and 22) tend to give false-positive results [50, 53].

The identification of epigenetic biomarkers for endometriosis diagnostics is an emerging, challenging, and still largely unexplored field of investigation. In recent years, researchers have turned their major interest from the transcriptome to the epigenome, and a few large-scale epigenome studies, describing the lesion-specific DNA methylation profiles, have been performed (Table 1). The genome-wide DNA methylation profiles of endometriotic lesions or stromal cells originating from lesions have been described in three studies [54,55,56]. Borghese et al. published the first study describing the global DNA methylation profile of different endometriotic lesions, including ovarian endometriomas, deep infiltrating endometriosis, and superficial endometriosis, and showed that the global methylation pattern was similar in different lesion types and eutopic endometria [55]. When global methylation data of eutopic endometria was compared to ovarian endometriomas in combination with gene expression data, 35 genes were found to share alterations both in methylation and expression patterns [55]. Specific regions were consistently hypermethylated (or hypomethylated) in all subtypes of the disease, other regions were altered in one endometriosis type only, and variation in methylation was more likely to occur at discrete loci across the genome. The later study by Dyson et al. found more than four thousand differentially methylated CpGs when stromal cells from eutopic endometria were compared to stromal cells from endometriomas, and the authors concluded that endometriotic cells possess a unique epigenetic fingerprint [54]. The analysis of differentially methylated and expressed genes identified 403 genes that were aberrantly methylated and differentially expressed in endometriosis, among them genes from the HOXA cluster, ESR1, NR5A1, and GATA family transcription factors [54]. Although different study designs (entire lesions vs. cultivated stromal cells) and platforms were applied to interrogate DNA methylation, both investigations [54, 55] reported differential methylation of the ADAP1, HPCAL1, PRKAG2, PRKCZ, RIPK1, SEC61A1, ZNF22, and HOXD10 genes. The most recent study by Yamagata et al. analysed stromal cell cultures from endometriomas and eutopic endometria of patients and controls and found that methylation profiles of eutopic endometria were very similar but significantly different from stromal cells originating from endometriomas [56]. The genes with altered methylation in endometriomas were related to signal transduction, molecular functions of receptors and signalling molecules, and cytokine-cytokine receptor interactions and development. Comparison of datasets from Yamagata et al. and Dyson et al. revealed four overlapping genes: HOXD10, BST2, GATA4, and TCF21, but when all three DNA methylation studies were compared to each other, only HOXD10 was seen to be differentially methylated.

The first transcriptome studies in endometriosis applying microarray technology were conducted as early as 2002, and since then, many studies have been carried out to reveal the specific gene expression profile of endometriotic lesions. Transcriptome studies in lesions can be divided into two groups: studies performed in 2002–2007 [57,58,59,60,61,62,63,64,65] that used less comprehensive microarrays [up to 23 thousand (23K) probes] and studies conducted in recent years using advanced large-scale microarrays covering 44K or 60K probes [23, 30, 51, 66,67,68,69,70,71] (Table 1). Although the list of candidates in each study contains a remarkable number of dysregulated genes in ectopic endometrial tissue compared to eutopic endometria, there is little concordance in the reported genes between studies. However, altered expression of genes belonging to RAS, MAPK, and PI3K signalling pathways was proposed in several studies (reviewed in [26]). Khan et al. found 50 differently expressed genes associated with immunological, neurocrine, and endocrine functions and gynaecological cancers (CHEK1, ERBB family, laminin gamma, and Ki-67), but there was no overt oncogenic potential in endometriotic tissue [66]. Also, they reported a list of 28 novel genes that were not previously associated with endometriosis, representing potential markers for ovarian endometriosis. The following studies by Monsivais et al., Crispi et al., and Suryawanshi et al. found many dysregulated genes that belong mostly to tissue and organ development pathways [68], pathways regulating metabolism and action of prostaglandins and glucocorticoids [72], and complement pathway [24]. Sun et al. used a microarray comprising probes for long non-coding RNAs (lncRNAs) and mRNAs and found hundreds of dysregulated lncRNAs and thousands of mRNA transcripts in ectopic endometrial tissues compared to paired eutopic endometrial tissues [67]. The authors proposed that many dysregulated lncRNAs may participate in biological pathways related to endometriosis through cis- and trans-regulation of target protein-coding genes. In the latest study by Ahn et al., a large number of differentially expressed genes involved in cytokine-cytokine receptor interaction, cellular adhesion, immune cell recruitment, apoptosis, cell signalling, T-cell cytotoxicity, and regulation of inflammatory responses were found [23].

To date, eight high-throughput miRNA studies describing the miRNome of whole endometriotic lesion biopsies or cultured stromal cells from lesions have been performed [31, 32, 73,74,75,76,77,78] (Table 1). Each study has identified a subset of miRNAs that were differently expressed in ectopic lesions compared to eutopic endometria. As there is a large variability between studies in terms of design, analysis methods, and selection of controls, the concordance has been moderate, and only 22% of reported miRNAs are consistent between studies [79]. In addition to microarrays, next-generation miRNA sequencing technology has also been applied in endometriosis studies [73, 76]. Hawkins et al. compared specimens from endometrioma and normal endometrium and found several miRNAs that were upregulated (miR-29c, miR-100, miR-193a-5p, miR-202, miR-485-3p, miR-509-3-5p, miR-708, and miR-720) or downregulated (miR-10a, miR-34c-5p, miR-141, miR-200a/b/c, miR-203, miR-375, miR-429, miR-449b, miR-504, and miR-873) in endometriomas [73]. The following study by Saare et al. investigated paired samples of peritoneal endometriotic lesions and matched healthy surrounding tissues together with eutopic endometria of the same patients and found five miRNAs (miR-34c-5p, miR-449a, miR-200a, miR-200b, and miR-141) that were significantly overexpressed in lesions compared to healthy surrounding tissues [76]. Although the majority of these miRNAs were reported to be associated with endometriosis pathogenesis in the studies by Hawkins et al. [73] and Ohlsson Teague et al. [75], Saare et al. [76] concluded that these miRNAs reflect the presence of endometrial cells in the peritoneal tissue rather than pathologic events. Furthermore, the authors suggested that the miRNA profile of peritoneal endometriotic lesions is largely masked by the surrounding peritoneal tissue present in biopsy samples, challenging the discovery of an accurate lesion-specific miRNA profile [76]. Still, it should be pointed out that according to the results of all these miRNome studies, only a single miRNA (miR-200b) was differentially expressed in all six studies of whole lesions [32, 73,74,75,76,77] but not in endometriotic stromal cells [78]. miR-200b, a member of the miR-200 family, could be an attractive molecular marker that can be easily linked to disease pathogenesis because of its important role in cell migration and epithelial-mesenchymal transition (EMT). It could be proposed that altered expression of miR-200b in lesions changes the well-balanced network of EMT and leads the endometrial epithelial cells to acquire a mesenchymal phenotype with higher migratory capacity.

To summarise, transcriptome studies of endometriotic lesions have already provided some clues about the pathogenesis of endometriosis, though the concern still remains about using whole-tissue biopsies instead of pure populations of endometrial epithelial and stromal cells from lesions to reveal transcriptome changes inside the lesion.

Only a few studies on endometriotic tissue proteomics have been conducted so far [80,81,82,83,84] (Table 1). However, it could be hypothesised that if disease-specific proteins with high concentration in affected tissues exist and are secreted from lesions into the bloodstream, they could also be monitored in the body fluids [81] and thereby could offer potential for discovery of non-invasive markers. The results of proteome studies have shown that some proteins are differentially expressed (e.g. SM-22α and Rab37), modified (e.g. haptoglobin and Rho-GDIα), or localised (e.g. haptoglobin) in endometriotic lesions compared to eutopic endometria of patients or healthy women [80]. A significant increase in transforming growth factor β-1, calponin-1, and emilin-1 in ovarian endometriomas has also been reported [81]. In the recent study by Kasvandik et al., metabolic reprogramming of ectopic endometrial stromal cells with extensive upregulation of glycolysis and downregulation of oxidative respiration was noticed [82].

The above-discussed examples from genomic, epigenomic, transcriptomic, and proteomic levels have provided inconsistent evidence about the possible molecular changes occurring in the endometriotic lesions. The genomic studies in endometriotic lesions have reported chromosomal alterations more frequently in chromosomes 1p, 3p, 5p, 6q, 17q, and Xq; epigenome studies have provided evidence on altered DNA methylation in HOXD10; transcriptome studies have found altered expression of genes belonging to RAS, MAPK, and PI3K signalling pathways, and proteome studies have found upregulation of glycolysis and downregulation of oxidative respiration and differently expressed proteins like SM-22α, Rab37, haptoglobin, and Rho-GDIα. Although the ‘omicsʼ studies in endometriotic lesions require invasive procedures and will not provide biomarkers directly translatable into the clinical practice, the knowledge obtained from these studies enables more complex insight into the possible mechanisms of endometriosis pathogenesis.

‘Omicsʼ Studies on Blood and Body Fluids and Novel Endometriosis Biomarkers

The ultimate goal of ‘omics’ studies in endometriosis is to find robust non-invasive biomarkers with acceptable sensitivity and specificity, preferably from easily accessible sources like blood and body fluids. Endometriosis biomarkers have been sought from peripheral blood (whole blood, plasma, serum), menstrual blood, endometrial fluid, peritoneal fluid, and urine samples. More than 100 markers, among them annexin V, VEGF, CA-125 and sICAM-1 and/or glycodelin, glycoproteins, inflammatory and non-inflammatory cytokines, and angiogenic and growth factors, have been reported but with inconsistent and contradictory results [4, 5, 85, 86].
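For reference, sensitivity and specificity are simple proportions derived from the counts of correctly and incorrectly classified patients and controls. The following minimal sketch shows the calculation. The counts are hypothetical and are not taken from any of the cited studies, and the function names are chosen here for illustration only.

```python
def sensitivity(true_pos, false_neg):
    """Proportion of patients (disease present) that the test classifies as positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Proportion of controls (disease absent) that the test classifies as negative."""
    return true_neg / (true_neg + false_pos)


# Hypothetical example: 50 surgically confirmed patients, of whom 45 test positive,
# and 50 controls, of whom 40 test negative.
print(f"sensitivity = {sensitivity(45, 5):.1%}")
print(f"specificity = {specificity(40, 10):.1%}")
```

In this hypothetical example the marker detects 90% of patients while correctly excluding 80% of controls. Both values depend on how the reference diagnosis was established, which is one reason why figures reported by different studies are difficult to compare directly.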

The first large-scale genomic study from blood was published in 2010, and since then, a number of SNP microarray-based genome-wide association studies (GWAS) of genomic DNA, together with subsequent replication studies, have been conducted to reveal associations between common SNPs and endometriosis [87,88,89,90,91,92,93,94,95]. A previous meta-analysis included more than 11,506 patients and 32,678 controls and found six loci (rs12700667 on 7p15.2, rs7521902 near WNT4, rs10859871 near VEZT, rs1537377 near CDKN2B-AS1, rs7739264 near ID4, and rs13394619 in GREB1) that had consistent effects across different populations [96]. In a recent GWAS by Steinthorsdottir et al., which included 1840 women with endometriosis and 129,016 controls, two new loci were identified in addition to the previously reported susceptibility loci [96]: rs17773813 on 4q12, upstream of KDR encoding vascular endothelial growth factor receptor 2 (VEGFR2), and rs519664 in the TTC39B gene on 9p22 [95].

Although GWAS have provided valuable information about novel candidate genes and genome regions, the effect sizes for the associated variants are modest (odds ratios between 1.0 and 1.2), and these common variants therefore do not provide molecular markers for direct diagnostic or prognostic tests. However, it is very likely that in the case of a common disease, such as endometriosis, rare variants (minor allele frequency <0.05) could contribute to the risk of the disease [97].
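To illustrate why an odds ratio of this size has little diagnostic value, the sketch below converts an odds ratio into the allele frequency expected among patients. The control allele frequency of 0.30 is a hypothetical value chosen only for illustration, and the function name `case_frequency` is an assumption of this example.

```python
def case_frequency(control_freq, odds_ratio):
    """Allele frequency expected in patients, given the control frequency and the odds ratio."""
    control_odds = control_freq / (1 - control_freq)
    case_odds = control_odds * odds_ratio
    return case_odds / (1 + case_odds)


control_freq = 0.30  # hypothetical risk-allele frequency in controls
for odds_ratio in (1.0, 1.1, 1.2):
    freq = case_frequency(control_freq, odds_ratio)
    print(f"OR {odds_ratio:.1f}: {control_freq:.1%} in controls vs {freq:.1%} in patients")
```

Under this assumption, an odds ratio of 1.2 shifts the allele frequency only from 30% in controls to about 34% in patients. The two groups overlap almost completely, so carrying the allele says very little about the disease status of an individual woman.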

In addition to GWAS, high-resolution SNP arrays provide the opportunity to assess inherited copy number variations (CNVs) that normally exist in all tissues and may potentially contribute to genetic predisposition to common diseases. Although many disease-related CNVs have been described, large population-based CNV studies have also found substantial variability in CNV distribution in healthy individuals [98,99,100], which makes it difficult to identify CNVs responsible for the development of complex diseases. Genomic CNVs in endometriosis have been investigated so far in only two large-scale studies [101, 102]. Chettier et al. conducted a case-control study of 2126 surgically confirmed endometriosis cases and 17,974 controls of European ancestry and found no significant differences in CNV counts, excess of large CNVs, and gene-based CNVs between controls and patients [101]. However, the locus-specific analysis revealed 22 rare CNVs that were detected in 6.9% of the affected women compared to 2.1% of the general population. Three out of 22 CNVs passed a genome-wide P-value threshold, namely, a deletion at SGCZ on 8p22, a deletion in MALRD1 on 10p12.31, and a deletion at 11q14.1 [101]. Recently, six sub-telomeric putative novel CNV loci in regions 1p36.33, 16p13.3, 19p13.3, 20p13, 17q25.3, and 20q13.33 from pooled DNA samples of 100 patients and 50 controls were reported by Mafra et al. [102]. Though the genomic studies have not been very successful in uncovering the pathogenesis of endometriosis or finding disease-specific biomarkers, it is very likely that the availability of more advanced methodologies (exome sequencing and whole-genome sequencing) will provide more detailed information about the genomic background of endometriosis in the near future.

Transcriptome studies have focused on determining circulating miRNAs in blood serum or plasma. miRNAs are considered good candidates for biomarkers because cell-free miRNAs have been shown to be stable in different body fluids [103]. In healthy individuals, the miRNA profile in serum is similar to that of circulating blood cells, but in the case of physiological or pathological changes, the levels of miRNAs in serum may differ [103]. Thus, alterations in the miRNA levels may possibly be used as biomarkers for endometriosis [104]. Although more than 200 potential non-invasive miRNA biomarkers have been proposed for endometriosis [105,106,107,108,109,110], the results are still inconsistent between studies, and only 12 miRNAs (miR-548b-5p, miR-92a, miR-320d, miR-139-3p, miR-122, miR-145*, miR-15b, miR-21, miR-572, miR-9*, miR-199a-5p, miR-342-3p) have been found to be differentially expressed in at least two studies. Not a single miRNA alteration has been confirmed in all studies. As only six global miRNA studies with moderate numbers of participants have been published to date, the real diagnostic potential of miRNAs in endometriosis has not been fully explored, and further studies, which should also use more advanced sequencing techniques to describe the full spectrum of miRNAs, are needed.
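The "reported in at least two studies" criterion used above amounts to a simple vote count across study hit lists. A minimal sketch follows; the study labels and marker names are placeholders invented for illustration and do not correspond to the cited publications.

```python
from collections import Counter


def replicated_markers(study_hits: dict[str, set[str]], min_studies: int = 2) -> dict[str, int]:
    """Return markers reported by at least `min_studies` studies, with their counts."""
    counts = Counter(marker for hits in study_hits.values() for marker in hits)
    return {marker: n for marker, n in counts.items() if n >= min_studies}


# Hypothetical hit lists (placeholder names, not real study results).
example_hits = {
    "study_1": {"miR-A", "miR-B", "miR-C"},
    "study_2": {"miR-B", "miR-D"},
    "study_3": {"miR-A", "miR-B", "miR-E"},
}
replicated = replicated_markers(example_hits)
print(sorted(replicated.items()))
```

Vote counting of this kind ignores the direction of change, sample type, and platform, so it gives only a first impression of concordance between studies.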

Proteomics analyses have been extensively conducted to identify endometriosis-specific biomarkers from blood plasma [111, 112], serum [37, 44, 113,114,115,116,117,118,119,120,121,122,123,124], urine [125,126,127,128], endometrial fluid aspirate [129], menstrual blood [130], and peritoneal fluid [128, 131,132,133,134,135,136,137]. Although the number of serum-based studies is quite remarkable, the agreement between the different studies is relatively poor. Nevertheless, some of the studies have identified a signature of peptides/proteins that could discriminate patients from controls with relatively high sensitivity and specificity. Jing et al. found two proteins (5830 m/z and 8865 m/z) in serum samples that were significantly more abundant in patients than in healthy women, and the signature of these two proteins offered diagnostic potential with a sensitivity of 86.7% and specificity of 96.7% [115]. A subsequent study by Zheng et al. found three peptide peaks (5988.7, 7185.3, and 8929.8 m/z) in serum that distinguished endometriosis patients in test and training sets with a sensitivity of 89.3–91.4% and a specificity of 90–95% [113]. A study conducted by Long et al. reported 13 serum proteins with significantly different levels between controls and patients and proposed one promising peptide (4180 Da) with 100% specificity and 100% sensitivity [114]. In the most recent study by Dutta et al., two proteins, HP and A1BG, were found to be effective for the diagnosis of stage II, III, and IV endometriosis, with a sensitivity of 68–92% and a specificity of 84–96% [124]. In addition to the large number of serum proteome studies, a few plasma studies have been published [111, 112]. Fassbender et al. found that a model based on five protein/peptide peaks (2058; 2456; 3883; 14,694 and 42,065 m/z) discriminated ultrasonography-negative endometriosis with a sensitivity of 88% and a specificity of 84% [111]. A study by Liu et al. found 20 protein peaks that were up- or downregulated in the plasma of endometriosis patients [112]. Overall, the blood-based proteomic studies have brought out several potential biomarkers with relatively high sensitivity and specificity. However, it should be pointed out that the methods most commonly used in biomarker identification, SELDI-TOF-MS and MALDI-TOF-MS, have several shortcomings, such as a limited mass range, and these methods do not provide direct protein identities. Therefore, for further biomarker discovery, other methods that enable identification of proteins (such as tandem mass spectrometry) are needed [138].
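The sensitivity and specificity figures quoted for these signatures follow from a two-by-two table of test results against the surgical diagnosis. A minimal sketch follows; the counts are invented for illustration and are not taken from any of the cited studies.

```python
def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one diseased and one disease-free subject")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical counts: 50 patients (45 test-positive) and 50 controls (40 test-negative).
sens, spec = sensitivity_specificity(tp=45, fn=5, tn=40, fp=10)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```

With groups of this size, a single reclassified subject shifts either estimate by two percentage points, which is one reason why figures from small discovery cohorts need confirmation in independent samples.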

From the clinical perspective, urine, which is simple to collect, would be the most preferable source for disease-specific biomarkers. Similarly to other fluids, urine contains peptides and proteins that may reflect disease status and can be easily measured by proteomic methods. Indeed, proteomic studies from urine [125,126,127,128] have provided some promising results. The study by Tokushige et al. found that cytokeratin-19 levels were not influenced by the menstrual cycle phase or disease severity and were elevated in patients with endometriosis [126]. The following study by El-Kasti et al. identified six peptides influenced by disease severity and menstrual cycle phase when controls were compared with moderate-severe endometriosis patients, and seven peptides when patients with minimal-mild disease were compared to moderate-severe endometriosis patients [125]. The study by Cho et al. found 22 protein spots with differential expression in patients, among them urinary vitamin D-binding protein (VDBP). However, the diagnostic potential of VDBP in endometriosis alone or combined with serum CA-125 remained moderate [127]. No specific urine proteomic biomarkers that could discriminate patients and controls were found in a study by Williams et al. [128].

The full potential of menstrual blood as a source of biomarkers is yet to be explored. Although the collection of menstrual blood is fairly complicated, it could reflect the physiological and molecular environment of the pathologically altered endometrial cells of endometriosis patients more precisely than peripheral blood (reviewed in [138]). However, only one study using menstrual blood for biomarker research has been published so far [130]. This study identified three differentially expressed proteins as endometriosis-specific markers: collapsin response mediator protein 2, ubiquitin carboxyl-terminal hydrolase isozyme L1, and myosin regulatory light polypeptide 9 [130]. Additionally, the study reported higher expression of stem cell marker gene transcripts (Oct-4, CXCR4, SOX2, and c-MET) in the menstrual blood of patients with endometriosis [130]. The higher expression of stem cell markers in the menstrual blood of women with endometriosis may indicate the importance of these markers in the implantation process of endometriotic lesions.

The microenvironment of the peritoneal cavity is thought to be one important factor influencing the capability of endometrial cells to implant into the peritoneal cavity. Further, there is strong evidence that dysregulation of peritoneal immunological and proinflammatory systems, and also alterations in angiogenesis processes, may play a crucial role in the progression of the disease. Therefore, a number of proteome studies of peritoneal fluid have been conducted, and several potential biomarkers have been proposed [128, 131,132,133,134,135,136]. Silicano et al. studied peritoneal fluid in patients with different disease stages and found a pattern of peptides corresponding to fibrinogen alpha chain that were more frequently present in women with moderate-severe endometriosis [131]. A study by Wölfler et al. identified 11 differentially regulated proteins that might have an impact on the development and establishment of endometriotic lesions [136]. Ferrero et al. found nine proteins mostly involved in the immune response (e.g. serotransferrin, complement C3, serum amyloid P-component, alpha-1-antitrypsin, and clusterin) with significantly higher expression in infertile endometriosis patients than in infertile controls [135]. Williams et al. detected a number of proteins with metabolic functions, such as proteins involved in glucose metabolism (phosphoglycerate kinase-1, fructose-bisphosphate aldolase A, transaldolase, triosephosphate isomerase, malate dehydrogenase) and glutathione S-transferase P, which is involved in detoxification [128]. Summing up, although studies in peritoneal fluid have provided some insight into the pathogenesis of endometriosis, the concordance between the results of different studies is non-existent, and to date, there are no specific reliable biomarkers for diagnostic purposes. Also, the diagnostic potential of peritoneal fluid biomarkers in clinical practice is debatable due to the invasive nature of the peritoneal fluid collection.

Besides other ‘omics’ technologies, metabolomics has great potential to become a new frontier in endometriosis biomarker research, as global changes in measurable, low-molecular-weight products of metabolism are thought to be good indicators of health status. Although the concentration of circulating metabolites provides integrative information about the tissue function within the larger context of the organism, the global metabolic profile is influenced by a number of variables, such as environmental factors, altered activities or levels of enzymes, genetic factors, and lifestyle factors including diet, drugs, exercise, microbiota, hormonal homeostasis, and age (see review [139]), which makes it difficult to find disease-specific metabolites. To date, only a few global metabolome studies using either blood serum or plasma of patients and controls have been conducted [140,141,142,143]. These studies have proposed several potential metabolites with a good diagnostic potential for endometriosis. Vouk et al. proposed a model including hydroxyl sphingomyelin (SMOH C16:1) and phosphatidylcholine/ether-phospholipid ratio (PCaa C36:2/PCae C34:2) that discriminates the ovarian endometriosis patients from controls with a sensitivity of 90% and a specificity of 84% [142]. Dutta et al. found 13 metabolites that discriminated minimal-mild endometriosis patients from healthy women with a sensitivity of 82% and specificity of 91% [141]. The study by Jana et al. found 15 metabolites showing a sensitivity and specificity of 92.83% and 100%, respectively [140]. In the most recent study by Ghazi et al., several metabolites such as 2-methoxyestradiol, 2-methoxyestrone, dehydroepiandrosterone, androstenedione, and cholesterol showed a significant increase in the endometriosis group compared to the control group [143]. As all four of these studies reported different metabolites that discriminate diseased and healthy women with high sensitivity and specificity, future studies including larger numbers of participants and different types of endometriosis are needed.

Research in the field of non-invasive biomarkers has been comprehensive and is continuously ongoing, and valuable knowledge will be obtained piece by piece. The results from transcriptome, proteome, and metabolome studies are encouraging and hold great promise for endometriosis biomarker discovery.

Perspectives in Single-Cell ‘Omics’

The remarkable progress in ‘omics’ technologies using DNA or RNA from small numbers of cells or even single cells has revolutionised our conceptual understanding of the biological diversity of human cells and has allowed a closer look into the single-cell genome, transcriptome, epigenome, and proteome in health and disease [144].

So far, all ‘omics’ studies in endometriosis have operated at the level of systemic or multicellular analysis, and the results of these studies inevitably show any changes in a greatly diluted fashion. The potential of single-cell technologies in endometriosis research is therefore yet to be realised. The traditional approaches in endometriotic lesion research have not provided any clear consensus about the genetic, transcriptomic, and epigenetic changes inside the lesions, and therefore, ‘omics’ information from single cells or from homogeneous cell populations from lesions (e.g. endometrial epithelial and stromal cells, stem cells, endothelial cells, monocytes, NK cells, lymphocytes, and dendritic cells) could offer new prospects to reveal the true disease-specific molecular changes (Fig. 2).

Fig. 2

Single-cell ‘omics’ in the search for endometriosis-specific biomarkers

The major challenge of using single-cell technologies is not related to the research methodology or data analysis per se but is rather associated with obtaining specific single cells or cell populations from a lesion biopsy. There are already some good examples of using fresh tissue biopsies for single-cell RNA sequencing [145,146,147], and the methodology can be transferred from these studies to endometriosis research. Also, there is great potential to use combinations of fluorescently labelled antibodies (e.g. CD10, CD9, and CD13 have previously been shown to be markers of endometrial stromal and epithelial cells [148]) and fluorescence-activated cell sorting (FACS) for isolating single cells or cell populations from lesions. Furthermore, cell populations could be isolated from lesions using the laser capture microdissection (LCM) technique, but as the DNA and RNA quality obtained by this methodology varies considerably [149, 150], this method would probably not be the first choice. In addition to the lesion and endometrial single-cell studies, specific cells originating from blood or body fluids (like monocytes, NK cells, lymphocytes, granulocytes, endothelial cells, and progenitor cells) could offer new interesting perspectives to uncover pathologic changes related to the disease.

Although the advantages of single-cell ‘omics’ research have not yet been utilised in endometriosis, it can be assumed that traditional transcriptome and epigenome studies will progress from the multicell level to the single-cell level in the near future.

Systems Biology and Integrative ‘Omics’ Studies in Endometriosis

The number of ‘omicsʼ studies in endometriosis is rapidly increasing, and the massive amount of ‘omicsʼ data is becoming an immense challenge to the researchers. However, no single ‘omicsʼ analysis can fully resolve the complexity of the biology behind the disease development. Therefore, one of the main tasks for systems biology is the integration of large amounts of different types of data in order to understand the functional principles and dynamics of cellular systems and unravel the complexity of molecular networks in health and disease. Integration of multiple layers of ‘omicsʼ data requires network analysis and annotation of the involved pathways to capture meaningful information from genomics, epigenomics, transcriptomics, proteomics, metabolomics, and other ‘omicsʼ studies. After integration of single ‘omicsʼ data with computational engineering, modelling, and simulation and visualisation tools, new theoretical models can be created that provide comprehensive interpretation of integrated data (Fig. 3).

Fig. 3

Integrated systems biology approach for potential endometriosis biomarker discovery. The idea of systems biology and the concept of integrating ‘omics’ data in the search for endometriosis-specific biomarkers are illustrated

In endometriosis research, the first attempts to use systems biology and integrated data and knowledge from different ‘omics’ studies have been made [12, 13, 54,55,56, 67, 73, 151]. Most of these studies have combined epigenome data with gene expression data in order to reveal correlations between DNA methylation and gene expression. Borghese et al. found 35 genes that shared alterations both in methylation and gene expression levels [55]. The following study by Yamagata et al. observed a relationship between altered methylation and mRNA expression in 75 genes from cultured stromal cells from eutopic endometria and endometrioma, including steroidogenesis-related genes NR5A1, STAR, STRA6, and HSD17B2 [56]. The results of this integrative analysis suggested that aberrant DNA methylation of the key steroidogenesis-related genes causes aberrant gene expression leading to the development of endometriosis. The transcriptome and epigenome data interaction modelling of stromal cells originating from eutopic and ectopic endometria of patients together with stromal cells from healthy endometria was performed by Dyson et al. [54]. They identified hundreds of genes, corresponding to thousands of differentially methylated CpGs, that were aberrantly methylated and differentially expressed, among them the HOXA cluster, ESR2, NR5A1, and PGR genes. Furthermore, the authors classified the list of these genes by protein function and found that the only proteins that reached statistical significance were transcription factors, such as GATA transcription factors and transcriptional coregulators of the GATA family. They proposed that GATA2 could regulate genes important for the decidualization process, whereas GATA6 promotes an endometriotic phenotype via regulation of steroidogenesis in endometriotic cells [54]. Sun et al. combined transcriptomic data of eutopic and ectopic endometria from mRNA and lncRNA microarrays and found hundreds of lncRNAs that were co-expressed with thousands of mRNAs [67]. Hawkins et al. performed the first transcriptome-miRNome analysis of endometriomas and eutopic endometrium in order to narrow down genes that were functionally targeted by miRNAs [73]. The combined analysis revealed several potential biologically important pathways involved in cellular development, connective tissue development and function, and cellular growth and proliferation including TGFβ and mitogen-activated kinase 1 [73]. A completely novel approach to uncover regulatory changes in endometriosis was applied by Yang et al. [151]. They used integrative analysis of gene expression data from two different datasets combined with data from a transcription factor (TF) gene regulatory interactions database (identified from ChIP-seq or ChIP-chip experiments available from the ChEA database) to find endometriosis-associated TFs. These data were further combined with data from protein-protein interaction databases (Interologous Interaction Database (I2D) and data from Ravasi et al. [152]) to create an integrated regulatory network for understanding molecular mechanisms involved in endometriosis [151]. The authors identified a network of known TFs, such as androgen receptor and estrogen receptor α and β, participating in endometriosis pathogenesis. Also, several new TFs, such as FOXA2 and TFAP2C, were identified and validated at the mRNA and protein levels [151].
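The simplest form of the methylation-expression integration described above is to intersect two gene-level result tables and keep the genes altered in both layers. The sketch below assumes per-gene methylation differences and log2 fold changes are already available; the function name, cut-offs, gene labels, and values are all hypothetical and do not reproduce the pipelines of the cited studies.

```python
def concordant_genes(
    methylation_delta: dict[str, float],
    expression_log2fc: dict[str, float],
    meth_cutoff: float = 0.2,   # assumed minimum absolute methylation difference
    expr_cutoff: float = 1.0,   # assumed minimum absolute log2 fold change
) -> dict[str, str]:
    """Genes altered in both layers, labelled by the direction of the relationship."""
    result = {}
    # Only genes measured in both datasets can be integrated.
    for gene in methylation_delta.keys() & expression_log2fc.keys():
        d_meth = methylation_delta[gene]
        lfc = expression_log2fc[gene]
        if abs(d_meth) < meth_cutoff or abs(lfc) < expr_cutoff:
            continue
        result[gene] = "inverse" if d_meth * lfc < 0 else "same-direction"
    return result


# Hypothetical lesion-versus-endometrium results (placeholder gene labels).
methylation = {"GENE_A": -0.35, "GENE_B": 0.40, "GENE_C": 0.05, "GENE_D": 0.30}
expression = {"GENE_A": 2.1, "GENE_B": 1.5, "GENE_C": -3.0, "GENE_E": 1.2}
integrated = concordant_genes(methylation, expression)
print(sorted(integrated.items()))
```

An inverse relationship (lower methylation with higher expression, or the reverse) is the pattern usually expected for promoter methylation, so separating it from same-direction changes can help prioritise candidates for follow-up.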

In conclusion, the ‘omics’ studies in endometriosis have provided researchers with a substantial amount of data, but clearly, the full potential of these data has not yet been utilised. Although the number of ‘omics’ datasets in publicly available repositories (Gene Expression Omnibus (GEO), etc.) is considerable, researchers still prefer using their own small datasets [153]. Because the ‘omics’ studies in endometriosis are relatively small in size, future studies should take advantage of the publicly available pre-existing data to raise the power, credibility, and reliability of the findings and also to find new biomarkers without the wet-lab costs.

Study Design in ‘Omics’ Studies in Endometriosis

The outcomes of ‘omicsʼ studies have also highlighted the shortcomings related to study design that may be the reason for the poor overlap between the results from different studies.

Large variability in tissue collection, processing, and storage methodology, poor description of patient phenotype data, small size of study groups, and differences between ‘omics’ platforms, together with data analysis and interpretation differences, are likely to result in bias and measurement error between studies. Standardised sample collection is crucial in biomarker discovery, as small differences in the processing and handling of biological samples can lead to pre-analytical bias and might have a huge effect on analytical reliability and reproducibility [154]. The importance of harmonising standard operating procedures for collecting phenotypical data, and for sample collection, processing and storage is discussed in detail in the publications of the World Endometriosis Research Foundation Endometriosis Phenome and Biobanking Harmonisation Project (WERF EPHect) [155,156,157]. The main goals of the harmonised data and sample collection are not only to facilitate large-scale international collaborations and decrease cross-centre variability but also to collect information for future studies addressing specific research questions on different patient subtypes. The minimum and standard recommendations for surgical phenotype data collection include information about menstrual cycle, hormone treatment, history of previous endometriosis surgery, as well as any imaging findings before the procedure and the type and duration of the procedure; the extent, exact location, and colour of endometriotic lesions; and video/photo documentation of surgery [157]. In addition, the WERF EPHect provides precise recommendations for collecting, processing, and storing fluid biospecimens (plasma, serum, saliva, urine, endometrial/peritoneal fluid, and menstrual effluent) and tissue specimens (ectopic and eutopic endometrium, peritoneum, and myometrium) and for collecting biospecimen data, including information about the menstrual phase on the day of the sample collection, menstrual cycle regularity, timing of the next menstrual cycle, administration of any premedication or anaesthetics, and also information about the weight, height, and waist and hip circumference [155, 156]. It is strongly advised to follow the WERF EPHect published guidelines for designing new studies, as the well-characterised phenotypic datasets could be used to answer various current and future research questions in endometriosis.

One of the concerns related to the design of ‘omics’ studies is the definition of the endometriotic lesion. The majority of the studies have compared biopsied lesion samples and endometria; however, endometriotic lesions often contain only a small proportion of endometrial glands and stromal cells and a large proportion of surrounding tissue. Histological evaluation of biopsies is routinely used in everyday practice, and several studies have demonstrated that approximately 30–50% of surgical specimens removed during laparoscopy are not confirmed by histological assessment [158,159,160,161]. Even in the case of histologically confirmed biopsied endometriotic lesions, the proportion of endometrial tissue is variable. As different tissue types have their own molecular signature, the comparison between endometrial tissue and lesions that contain a mixed population of endometrial and peritoneal or ovarian tissue will not reveal disease-specific alterations but may rather reflect the molecular signature of the surrounding tissue [76]. Therefore, future ‘omics’ studies of lesions should focus on pure cell populations or single-cell analyses instead of studying the entire endometriotic lesion to reveal the true molecular signature of endometriotic cells.

Selecting proper controls for biomarker discovery studies is one of the key issues that need to be resolved. Currently, endometriosis studies have applied different strategies for choosing controls and have included fertile women undergoing laparoscopic sterilisation, women undergoing laparoscopy because of infertility, hospital-based controls (women with various indications for a laparoscopic procedure), or self-reported disease-free population-based controls [162]. Selecting controls for biomarker searches in blood and body fluids is problematic, and there is no single ideal control group, as all the above-mentioned options have their pros and cons. For example, laparoscopically confirmed endometriosis-free controls usually suffer from other diseases, whereas a self-reported disease-free population-based control group could contain a number of undiagnosed cases and thereby dilute the disease risk factor effects.
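The dilution effect of undiagnosed cases among self-reported controls can be illustrated with a small calculation. The sketch below is a simplified model under stated assumptions: a binary marker, undiagnosed cases carrying the marker at the same frequency as diagnosed cases, and invented frequencies; the function names are hypothetical.

```python
def odds(p: float) -> float:
    """Convert a proportion into odds."""
    if not 0 < p < 1:
        raise ValueError("proportion must lie strictly between 0 and 1")
    return p / (1 - p)


def observed_odds_ratio(p_cases: float, p_true_controls: float, undiagnosed_fraction: float) -> float:
    """Odds ratio seen when a fraction of the control group are undiagnosed cases."""
    # Marker frequency in the contaminated control group is a weighted mixture.
    p_mixed = (1 - undiagnosed_fraction) * p_true_controls + undiagnosed_fraction * p_cases
    return odds(p_cases) / odds(p_mixed)


# Hypothetical marker present in 30% of cases and 20% of truly disease-free women.
true_or = observed_odds_ratio(0.30, 0.20, undiagnosed_fraction=0.0)
diluted_or = observed_odds_ratio(0.30, 0.20, undiagnosed_fraction=0.10)  # assumed 10% undiagnosed
print(f"clean controls: {true_or:.2f}, contaminated controls: {diluted_or:.2f}")
```

In this toy setting the observed odds ratio moves towards 1 as the undiagnosed fraction grows, which is the attenuation the paragraph above refers to.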

Because the cost of high-throughput technologies is relatively high, the sample sizes in ‘omics’ studies are rather small, and the studies are therefore underpowered to take into account the variance of individual measurements. Thus, it is strongly encouraged to maximise the sample sizes in ‘omics’ studies by collaborating with other workgroups and sharing either phenotypically well-described samples or data. In addition, there are other study type-specific standards for minimum information, including MIAME (Minimum Information about a Microarray Experiment), MIAPE (Minimum Information about a Proteomics Experiment), MIGS-MIMS (Minimum Information about a Genome/Metagenome Sequence), MIMIx (Minimum Information about a Molecular Interaction eXperiment), MINSEQE (Minimum Information about a high-throughput Nucleotide Sequencing Experiment), and CIMR (Core Information for Metabolomics Reporting). These guidelines must be followed to ensure the interpretability of the experimental results generated using ‘omics’ technologies [163, 164].

In conclusion, a good study design in endometriosis ‘omicsʼ studies includes setting an innovative study hypothesis, defining phenotypically well-described controls/cases, calculating study power that takes into account individual measurement variance, acceptable false-positive rate, and desired power of the used platform, collecting phenotypical data and biospecimens according to the guidelines of WERF EPHect, identifying risk factors and confounders, assessing sample quality and quantity, following the protocols for ‘omicsʼ technologies according to the specific guidelines (MIAME, MIAPE etc.), considering technical duplicates and statistical methods, describing databases for data analysis, validating results using alternative technologies, presenting data (e.g. GEO database), and addressing limitations/strengths of the study.
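The power calculation listed among the design elements above can be sketched for the simplest case, a two-group comparison of one continuous measurement. The sketch uses the normal approximation and a Bonferroni correction for the number of features tested; the standardised effect size, the number of features, and the function name are assumptions chosen for illustration, and dedicated ‘omics’ power tools may give different answers.

```python
import math
from statistics import NormalDist


def samples_per_group(effect_size: float, alpha: float = 0.05,
                      power: float = 0.80, n_tests: int = 1) -> int:
    """Approximate per-group sample size for a two-sided, two-sample comparison."""
    if effect_size <= 0:
        raise ValueError("effect_size (standardised mean difference) must be positive")
    adjusted_alpha = alpha / n_tests                    # Bonferroni-adjusted false-positive rate
    z = NormalDist()                                    # standard normal distribution
    z_alpha = z.inv_cdf(1 - adjusted_alpha / 2)         # critical value, two-sided test
    z_power = z.inv_cdf(power)                          # quantile for the desired power
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# Assumed large effect (0.8 standard deviations) for one marker versus 20,000 features.
n_single = samples_per_group(0.8)
n_array = samples_per_group(0.8, n_tests=20_000)
print(n_single, n_array)
```

Even under the optimistic assumption of a large effect, correcting for thousands of simultaneously tested features raises the required group size several-fold, which supports the recommendation to pool samples or data across centres.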

Conclusions and Future Perspectives

The ‘omics’ revolution in endometriosis research is ongoing, and around 120 studies applying high-throughput ‘omics’ technologies have been published to date. Significant advances in ‘omics’ technologies have been made in the search for potential biomarkers for endometriosis; however, most of the results have not been replicated in other studies, and the practical value of the proposed biomarkers is still limited. Though the genomic studies in endometriosis have not been very successful in finding potential biomarkers, the transcriptomic studies of endometriosis have provided some clues about the potential disease-related pathways. Also, the results from proteome studies have been encouraging and hold great promise for non-invasive biomarker discovery. Furthermore, metabolomics and epigenomics offer great prospects for future endometriosis biomarker discovery, as these fields are still poorly covered and harbour immense opportunities. Also, the advantages of single-cell transcriptome and epigenome studies should be carefully considered when planning future research.