Introduction

Preeclampsia (PE) is a severe pregnancy-specific cardiovascular disorder that affects 2 to 8% of pregnant women and is a significant cause of maternal and fetal mortality (Aneman et al. 2020; Burton et al. 2019). The condition is characterized by new-onset hypertension after 20 weeks of gestation, along with proteinuria and multiple organ injuries (Cirkovic et al. 2020). Depending on the onset of clinical symptoms, it is categorized as early-onset preeclampsia (EOPE) or late-onset preeclampsia (LOPE). EOPE, which occurs before 34 weeks of gestation, is more severe than LOPE and accounts for only 5–20% of PE cases. The causes of early-onset and late-onset PE differ (Roberts and Hubel 2009), and EOPE often requires immediate pregnancy termination (Kelemu et al. 2020). Therefore, it is essential to identify new potential biomarkers to aid in the early identification, precise diagnosis, and successful treatment of EOPE, which will ultimately lower the risks of poor maternal and fetal outcomes.

Although several studies have used microRNA sequencing and other analytical tools to examine placental development, the molecular mechanisms behind poor placentation remain unclear. It is therefore critical to identify candidate genes associated with EOPE and to elucidate their interaction mechanisms for effective treatment and management of this life-threatening condition. In this study, we used bioinformatics to analyze differential mRNA expression in EOPE using datasets from the Gene Expression Omnibus (GEO) database. We also examined immune infiltration and confirmed the diagnostic potency of key genes in clinical specimens (Fig. 1). Our findings may help identify potential biomarkers for predicting and preventing EOPE, shed light on the molecular mechanisms behind EOPE, and ultimately help medical professionals diagnose and treat EOPE more effectively.

Fig. 1 Study design and the flow chart of materials and methods

Materials and Methods

Acquisition and Processing of Gene Expression Data

To retrieve microarray gene expression data for our study, we used the Gene Expression Omnibus (GEO) repository, which contains high-throughput sequencing and microarray-based datasets. Specifically, we retrieved two datasets that compared placental tissue samples from EOPE and gestational age-matched controls: GSE44711 (Blair et al. 2013), which used the Illumina HumanHT-12 V4.0 expression beadchip (GPL10558) and contained 16 samples (8 EOPE and 8 control samples), and GSE74341 (Liang et al. 2016), which used the Agilent-039494 SurePrint G3 Human GE v2 8 × 60 K Microarray (GPL16699) and contained 12 samples (7 EOPE and 5 control samples). We then downloaded the data needed for the analysis of these studies (Table S1). We used the "limma" R package for background correction, probe summarization, and quantile normalization.

Identification of the Differentially Expressed Genes

To identify differentially expressed genes (DEGs) in EOPE samples, we used the “limma” R package (version 3.6.3) (Ritchie et al. 2015). DEGs were defined as genes with an adjusted p-value < 0.05 and |log2 fold change (FC)| > 1. Among them, genes with an adjusted p-value < 0.05 and log2FC > 1 were considered up-regulated, and genes with an adjusted p-value < 0.05 and log2FC < -1 were considered down-regulated. We used the "heatmap" (Gu et al. 2016) and “ggplot2” (version 3.3.3) R packages to generate heat maps and volcano plots, respectively. The DEGs shared between the datasets were identified using the Venn online tool (http://bioinformatics.psb.ugent.be/webtools/Venn/).
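The thresholds above can be illustrated with a minimal Python sketch. It is not the limma workflow used in the study; the function names and the gene values are hypothetical and serve only to show how the cut-offs and the overlap between two datasets might be applied.

```python
from typing import Dict, Set, Tuple


def classify_deg(log2_fc: float, adj_p: float,
                 fc_cutoff: float = 1.0, p_cutoff: float = 0.05) -> str:
    """Label a gene 'up', 'down', or 'ns' (not significant) using the stated cut-offs."""
    if adj_p < p_cutoff and log2_fc > fc_cutoff:
        return "up"
    if adj_p < p_cutoff and log2_fc < -fc_cutoff:
        return "down"
    return "ns"


def shared_degs(labels_a: Dict[str, str], labels_b: Dict[str, str],
                direction: str) -> Set[str]:
    """Genes with the same direction label in both datasets (the Venn overlap)."""
    return {g for g, d in labels_a.items()
            if d == direction and labels_b.get(g) == direction}


# Hypothetical (log2FC, adjusted p-value) pairs; not taken from GSE44711 or GSE74341.
dataset_a: Dict[str, Tuple[float, float]] = {
    "GENE_A": (1.8, 0.001), "GENE_B": (-1.4, 0.02), "GENE_C": (0.3, 0.01)}
dataset_b: Dict[str, Tuple[float, float]] = {
    "GENE_A": (1.2, 0.03), "GENE_B": (-0.9, 0.01), "GENE_C": (2.0, 0.2)}

labels_a = {g: classify_deg(fc, p) for g, (fc, p) in dataset_a.items()}
labels_b = {g: classify_deg(fc, p) for g, (fc, p) in dataset_b.items()}
print(shared_degs(labels_a, labels_b, "up"))
```

In this toy input only GENE_A passes both cut-offs in the same direction in both datasets, so it is the only shared up-regulated gene.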

Analysis of Functional Enrichment and Pathway

Gene Ontology (GO) (Mi et al. 2019) is a widely used resource for functional enrichment studies and covers cellular components (CCs), biological processes (BPs), and molecular functions (MFs). Molecular functions are the molecular activities of gene products. Biological processes are the broader cellular or physiological roles that genes play in coordination with other genes. Cellular components are the locations where gene products perform their functions. The Kyoto Encyclopedia of Genes and Genomes (KEGG) (Kanehisa and Goto 2000) is a database that contains information about genomes, biological pathways, diseases, and drugs. The human genome was used as the background reference, with an adjusted p-value < 0.05 and a gene count ≥ 2 as the cut-off. Gene Set Enrichment Analysis (GSEA) is a computational technique that assesses whether a specified set of genes differs between biological states. The following parameters were used in the GSEA: seed 2020, 10,000 computations, a minimum of 10 and a maximum of 1000 genes in each gene set, and the Benjamini-Hochberg (BH) method for p-value adjustment. The MSigDB gene set “c2.cp.all.v2022.1.hs.symbols.gmt [all canonical pathways](3050)” was used for the GSEA enrichment analysis. Enrichment analyses of the identified DEGs were performed with the R package “clusterProfiler” (version 3.14.3) (Yu et al. 2012) and visualized with the R packages “ggplot2” and “GOplot”. Statistically significant enrichment was defined as a function or pathway term with a false discovery rate (FDR) < 0.25 and an adjusted p-value < 0.05.
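For readers unfamiliar with the Benjamini-Hochberg adjustment named above, the following Python sketch shows the standard step-up procedure. It is an illustration only, not the clusterProfiler implementation; the function name and the example p-values are hypothetical.

```python
from typing import List


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for four enrichment terms.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
significant = [p < 0.05 for p in adj]
print(adj, significant)
```

Each raw p-value is multiplied by the number of tests divided by its rank, and the running minimum keeps a less significant term from receiving a smaller adjusted value than a more significant one.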

Protein-Protein Interaction Network and Analysis

A protein–protein interaction (PPI) network is composed of interacting proteins. The STRING database (V11.0, https://string-db.org/) is a resource of known and predicted protein interactions. In this study, we constructed a PPI network of the DEGs using the STRING database (minimum required interaction score: 0.150, low confidence) and visualized it with Cytoscape software (version 3.9.1, UC, San Diego, CA, USA). The genes in this network were ranked by their degree centrality values using the CytoHubba plugin. In addition, the Comparative Toxicogenomics Database (CTD, http://ctdbase.org/) was used to explore the connection between hub genes and the risk of EOPE. The CTD synthesizes data on chemical–gene/protein interactions, chemical–disease relationships, and gene–disease relationships to generate hypotheses about disease mechanisms.
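Ranking by degree centrality can be sketched in a few lines of Python. This is a simplified stand-in for the CytoHubba degree method, assuming an undirected network given as an edge list; the function name and node names are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def rank_by_degree(edges: Iterable[Tuple[str, str]],
                   top_n: int = 10) -> List[Tuple[str, int]]:
    """Rank nodes of an undirected network by number of distinct neighbours."""
    degree: Counter = Counter()
    seen = set()
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        key = frozenset((a, b))
        if key in seen:
            continue  # count each undirected edge once
        seen.add(key)
        degree[a] += 1
        degree[b] += 1
    # Highest degree first; ties broken alphabetically for a stable order.
    return sorted(degree.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Hypothetical interaction list; ("P3", "P2") repeats the edge ("P2", "P3").
edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"),
         ("P2", "P3"), ("P3", "P2"), ("P4", "P5")]
print(rank_by_degree(edges, top_n=3))
```

Degree is only one of several topological measures; the sketch does not reproduce the other scoring methods offered by the plugin.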

We assessed the diagnostic value of the expression of key module genes in PE using receiver operating characteristic (ROC) curve analysis (Park et al. 2004). ROC analysis can be used to choose the best model, discard suboptimal models, or establish the best threshold within the same model. The ROC curve is a composite measure of sensitivity and specificity for a continuous variable and displays the trade-off between the two across thresholds. The ROC curves of key module genes in EOPE were plotted using the R package “pROC”, and the area under the curve (AUC) was computed. The AUC typically ranges from 0.5 to 1, and diagnostic performance is better the closer the AUC is to 1. Accuracy was considered low when the AUC was between 0.5 and 0.7, moderate when the AUC was between 0.7 and 0.9, and high when the AUC was above 0.9.
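The AUC and the accuracy bands described above can be illustrated with a short Python sketch. It is not the pROC implementation: it uses the rank-based (Mann-Whitney) definition of the AUC, assumes that higher scores indicate disease, and assigns the boundary values 0.7 and 0.9 to the moderate band as an assumption, since the text does not specify them. All names and scores are hypothetical.

```python
from typing import Sequence


def roc_auc(scores_pos: Sequence[float], scores_neg: Sequence[float]) -> float:
    """AUC as the probability that a case scores higher than a control (ties count 0.5)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def accuracy_band(auc: float) -> str:
    """Map an AUC to the low / moderate / high bands used in the text."""
    if auc > 0.9:
        return "high"
    if auc >= 0.7:  # assumption: boundary values fall in the moderate band
        return "moderate"
    return "low"


# Hypothetical scores. For a marker that is lower in cases, negate the
# expression values first so that higher scores indicate disease.
cases = [0.9, 0.8, 0.7, 0.4]
controls = [0.6, 0.5, 0.3, 0.2]
auc = roc_auc(cases, controls)
print(auc, accuracy_band(auc))
```

In this toy example 14 of the 16 case-control pairs are ordered correctly, which gives an AUC of 0.875 and places the marker in the moderate band.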

Immune Infiltration Analysis and Immune Scores

CIBERSORT (https://cibersortx.stanford.edu/) (Newman et al. 2015) uses linear support vector regression to deconvolute a transcriptome expression matrix and estimate the composition and abundance of immune cells in a mixed cell population. We calculated the proportions of 22 immune cell types in the GSE44711 dataset using the CIBERSORT algorithm to verify the relationship between EOPE and the immune microenvironment. The immune infiltration data were visualized using bar plots and heatmaps, the latter generated with the R package “pheatmap”. The enrichment score was set to > 0.

Patient Recruitment and Tissue Collection

We collected placental samples from eighteen participants, nine women with EOPE and nine preterm controls, who gave birth at the First Affiliated Hospital of Fujian Medical University in China between January and June 2022. Placental tissues were collected from the maternal side and immediately stored at −80 °C for future use. EOPE was defined as high blood pressure (systolic blood pressure ≥ 140 mmHg and/or diastolic blood pressure ≥ 90 mmHg) at 20–34 weeks of pregnancy, along with proteinuria > 300 mg/24 h. Control samples were obtained from women with preterm delivery (between 28 and 34 completed gestational weeks) or late abortion (between 20 and 28 completed gestational weeks) due to cervical insufficiency or uterine malformations. Our study was approved by the institutional human experiment and ethics committee of the First Affiliated Hospital of Fujian Medical University (Approval No. MRCTA, ECFAH of (2020) 329), and all participants provided written consent.

Reverse Transcription Quantitative Real-Time PCR

We extracted total RNA using the TRIzol reagent and reverse transcribed it into cDNA using the SuperScript III reverse transcription kit (ABI-Invitrogen, Thermo Fisher). The SYBR Premix Ex Taq (Invitrogen, Thermo Fisher) was used according to the manufacturer's instructions, with pre-denaturation at 95 °C for 2 min followed by 40 cycles of 95 °C for 20 s, 60 °C for 20 s, and 72 °C for 30 s. We performed the procedure three times to confirm the accuracy and calculated the relative mRNA expression levels using the 2^−ΔΔCt method. The primer sequences used were as follows: β-actin primer F: 5'-GACAGGATGCAGAAGGAGATTACT-3', primer R: 5'-TGATCCACATCTGCTGGAAGGT-3'; COL1A1 primer F: 5'-GTTGTGCGATGACGTGATCTGT-3', primer R: 5'-TTGGTCGGTGGGTGACTCTGA-3'; SPP1 primer F: 5'-TTTCACTCCAGTTGTCCCC-3', primer R: 5'-GTGATGTCCTCGTCTGTAGC-3'; and THY1 primer F: 5'-CAGGACGGAGCTATTGGCACCAT-3', primer R: 5'-ACGGCAGTCCAGTCGAAGGTTCT-3'.
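The 2^−ΔΔCt calculation can be written out as a short Python sketch. It assumes that replicate Ct values are averaged before normalization and that β-actin serves as the reference gene, as the primer list suggests; the function name and Ct values are hypothetical.

```python
from statistics import mean
from typing import Sequence


def relative_expression(target_ct: Sequence[float],
                        reference_ct: Sequence[float],
                        control_target_ct: Sequence[float],
                        control_reference_ct: Sequence[float]) -> float:
    """Fold change of a target gene relative to the control group (2^-ddCt)."""
    # dCt: target Ct normalized to the reference gene within each group.
    delta_sample = mean(target_ct) - mean(reference_ct)
    delta_control = mean(control_target_ct) - mean(control_reference_ct)
    # ddCt: difference between the sample and control dCt values.
    delta_delta = delta_sample - delta_control
    return 2 ** (-delta_delta)


# Hypothetical triplicate Ct values, not measurements from this study.
fold = relative_expression(
    target_ct=[26.0, 26.2, 25.8],
    reference_ct=[18.0, 18.1, 17.9],
    control_target_ct=[24.0, 24.1, 23.9],
    control_reference_ct=[18.0, 18.0, 18.0],
)
print(round(fold, 3))
```

Here the sample ΔCt is 8 and the control ΔCt is 6, so ΔΔCt is 2 and the target is expressed at roughly a quarter of the control level.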

Statistical Analysis

All statistical analyses and charts were produced using R (version 4.2.0). Continuous variables are presented as the mean ± standard deviation (SD). Continuous variables and count data were analyzed using t-tests and chi-squared tests, respectively. Spearman correlation analysis was used to calculate correlation coefficients between variables. The pROC package was used to perform ROC analysis for the evaluation of binary classification. A p-value < 0.05 was regarded as statistically significant in each test.
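As an illustration of the Spearman correlation mentioned above, the following Python sketch computes the coefficient as the Pearson correlation of average ranks. It is a minimal example rather than the R routine used in the study, it does not compute a p-value, and the function names and input values are hypothetical.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho; raises ZeroDivisionError if either input is constant."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical cell-type fractions across four samples.
fraction_a = [0.10, 0.20, 0.30, 0.40]
fraction_b = [0.50, 0.30, 0.35, 0.05]
rho = spearman(fraction_a, fraction_b)
print(round(rho, 3))
```

Because the coefficient depends only on ranks, it is insensitive to monotone transformations of either variable, which suits proportion data.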

Results

DEG Identification

We downloaded the GSE44711 and GSE74341 gene expression datasets from the GEO database. The GSE44711 dataset comprised 8 placenta samples from EOPE and 8 placenta samples from gestational age-matched preterm controls, while the GSE74341 dataset included 7 EOPE placenta samples and 5 samples from gestational age-matched controls. Our analysis identified 807 DEGs in the GSE44711 dataset (451 upregulated and 356 downregulated) and 787 DEGs in the GSE74341 dataset (446 upregulated and 341 downregulated). The results are shown in Fig. 2A–E for GSE44711 and Fig. 2F–J for GSE74341, including heat maps of the top 40 DEGs ranked by p-value (Fig. 2E, J). A Venn diagram analysis of both datasets revealed 64 upregulated DEGs and 49 downregulated DEGs common to both, as shown in Fig. 2K–L.

Fig. 2 Identification of DEGs in the GSE44711 and GSE74341 datasets. a–e GSE44711 dataset; f–j GSE74341 dataset. a, f Boxplot of gene probe expression levels among samples. b, g Volcano plot of all DEGs by the limma R package. c, h Principal-component analysis (PCA). d, i Uniform manifold approximation and projection (UMAP). e, j An expression heat plot of the top 40 DEGs. k DEGs upregulated in the EOPE group in both the GSE44711 and GSE74341 datasets. l DEGs downregulated in the EOPE group in both the GSE44711 and GSE74341 datasets

Analysis of Functional Enrichment and Pathway

We conducted GO, KEGG, and GSEA analyses to investigate the functional enrichment and pathways associated with the DEGs. In the GO analysis, the DEGs were enriched in biological processes such as extracellular structure organization, extracellular matrix (ECM) organization, and multi-multicellular organism process. The enriched molecular functions included ECM structural constituent, platelet-derived growth factor binding, cytokine activity, and receptor-ligand activity. The enriched cellular components included collagen-containing ECM, endoplasmic reticulum lumen, collagen trimer, and the anchored component of the external side of the plasma membrane.

In the KEGG pathway analysis, the DEGs were enriched in pathways such as ECM-receptor interaction, focal adhesion, and the PI3K-Akt signaling pathway. The results of all these analyses are presented in Table S2. In addition, GSEA revealed that DEGs were significantly enriched in cell cycle mitosis, asparagine N-linked glycosylation, signaling by NTRKs, cell surface interactions at the vascular wall, and signaling by TGF-β family members. These findings are illustrated for the GO analysis (Fig. 3A–B, Fig. 3E–F), the KEGG pathways (Fig. 3C–D, Fig. 3E–F), and the GSEA (Fig. 3G–H).

Fig. 3 Enrichment analysis results. a, b GO term enrichment analysis results. a Bar graph, Z-score results for the top 12 GO terms, including the top 4 BPs, CCs, and MFs. b Loop graph, enrichment results for DEGs and the top 12 GO terms. Z-scores were defined as (upregulated genes − downregulated genes)/total genes. c, d KEGG pathway enrichment results. c Chord diagram, relationships between DEGs and the top 6 enriched KEGG pathways. d Cluster plot, the top 6 enriched KEGG pathways. e, f Visualization of GO-KEGG pathway enrichment. e Bar graph. f Network chart. The top 3 terms of BPs, CCs, MFs and KEGG were selected based on p-value rankings. g, h Enrichment plots by GSEA. g GSEA visual analysis. h GSEA ridge plot

PPI Network Construction and Hub Gene Identification

To construct a PPI network of the DEGs, we used the STRING database, and the resulting network is depicted in Fig. 4A–C. The top 10 hub genes were selected using nine topological methods available in the CytoHubba plugin in Cytoscape (v3.9.1), and these genes are listed in Table S3. We also analyzed the interaction of all screened hub genes with PE using the CTD, as shown in Fig. 4D. Genes were retained if they had an inference score > 50% in the CTD and appeared in the results of all nine topological methods. On this basis, COL1A1, SPP1, and THY1 were selected for further analysis.
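The selection rule described above (present in every topological ranking and above the CTD cut-off) amounts to a set intersection followed by a filter. The Python sketch below illustrates this under the assumption that the CTD score is available as a number comparable to 50; the function name, method names, gene labels, and scores are hypothetical and are not the study data.

```python
from typing import Dict, List


def consensus_hubs(rankings: Dict[str, List[str]],
                   ctd_scores: Dict[str, float],
                   min_score: float = 50.0) -> List[str]:
    """Genes listed by every ranking method whose CTD score exceeds min_score."""
    if not rankings:
        return []
    # Keep only genes that appear in the top list of every method.
    shared = set.intersection(*(set(genes) for genes in rankings.values()))
    # Genes without a CTD score are treated as failing the cut-off.
    return sorted(g for g in shared if ctd_scores.get(g, 0.0) > min_score)


# Hypothetical top lists from three methods and hypothetical CTD scores.
rankings = {
    "MCC": ["G1", "G2", "G3"],
    "Degree": ["G1", "G3", "G4"],
    "Closeness": ["G3", "G1", "G5"],
}
ctd_scores = {"G1": 72.0, "G3": 41.0}
print(consensus_hubs(rankings, ctd_scores))
```

Requiring agreement across all methods is a conservative choice: it reduces the influence of any single centrality measure at the cost of discarding genes that rank highly in only some of them.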

Fig. 4 Key DEG interactions and diagnostic power in EOPE patients. a The network of the DEGs was downloaded from the STRING database. b The PPI network of DEGs was constructed using Cytoscape, with upregulated and downregulated genes shown as blue rectangles and pink diamonds, respectively. c The top 10 genes with the highest MCC values were identified using the CytoHubba plug-in. These genes were ranked in descending MCC order from red to orange to yellow. d The CTD analysis between potential hub genes and preeclampsia. e, f The independent and combined diagnostic power of COL1A1, SPP1 and THY1 in EOPE by ROC curve. g The expression of COL1A1, SPP1 and THY1 in the GSE44711 dataset. h–j The relative expression of COL1A1, SPP1 and THY1 in the placental samples. Sample sizes: control, n = 18; EOPE, n = 16. ***p-value < 0.001, **p-value < 0.01, *p-value < 0.05 compared with the control group. The error bars represent the standard deviation of the measurements based on triplicate runs

Diagnostic Efficacy Analysis of Key Genes

We assessed the potential predictive value of the three key genes in EOPE using ROC curves. When considered independently, the AUCs for COL1A1, SPP1, and THY1 were 0.875 (sensitivity: 0.855, specificity: 1), 0.781 (sensitivity: 0.825, specificity: 1), and 0.859 (sensitivity: 0.875, specificity: 0.75), respectively (Fig. 4E). These results indicate that COL1A1, SPP1, and THY1 each showed moderate accuracy (0.7 < AUC < 0.9), and that COL1A1 and THY1 were more accurate predictors than SPP1. When the three genes were combined, the diagnostic efficiency improved further (AUC = 0.922) (Fig. 4F). To further verify our findings, we analyzed data from the GSE44711 dataset to examine the expression of COL1A1, SPP1, and THY1 in EOPE placenta tissues. The expression of COL1A1, THY1, and SPP1 was downregulated in EOPE compared to controls (all p-values < 0.05, Fig. 4G).

Immune Cell Infiltration

To explore the potential relationship between EOPE and the immune response, we analyzed the 22 categories of immune cell phenotypes in the placenta of pregnant women with EOPE in the GSE44711 dataset using the CIBERSORT algorithm. The proportion bar chart revealed that macrophages M2, B cells naive and T cells CD8 showed a higher abundance of infiltration than other immune cells (Fig. 5A).

Fig. 5 Immune cell infiltration analysis. a The relative percentage of immune cells in samples from the GSE44711 dataset. b Heatmap of immune cells. c Correlation analysis of immune cells. d Comparison of immune cell infiltration between preterm controls and patients with EOPE

After excluding immune cells with undetectable abundance, we analyzed 19 immune cell types and visualized their correlations using a heatmap (Fig. 5B–C). Significant negative correlations were observed between B cells naive and T cells CD8, B cells naive and NK cells activated, Plasma cells and Dendritic cells activated, T cells CD8 and T cells CD4 naive, T cells CD8 and T cells follicular helper, T cells CD8 and T cells gamma delta, T cells CD4 naive and NK cells activated, T cells CD4 naive and Mast cells resting, T cells follicular helper and NK cells activated, T cells follicular helper and Mast cells resting, T cells gamma delta and NK cells activated, and T cells gamma delta and Mast cells resting (all p-values < 0.05). Several pairs of immune cell types were positively correlated: B cells naive and T cells CD4 naive, B cells naive and T cells follicular helper, B cells naive and T cells gamma delta, Plasma cells and NK cells activated, T cells CD8 and NK cells activated, T cells CD4 naive and T cells follicular helper, T cells CD4 naive and T cells gamma delta, T cells follicular helper and T cells gamma delta, and T cells CD4 memory activated and Neutrophils (all p-values < 0.05).

Furthermore, we compared the 8 patients with EOPE and the 8 preterm controls in GSE44711 to examine differences in immune cell infiltration between the groups. Two immune cell types displayed the greatest differential infiltration between the EOPE and control groups. Specifically, compared to control samples, M1 macrophages were significantly upregulated (p-value = 0.01), while M2 macrophages were significantly downregulated (p-value = 0.0007) in patients with EOPE (Fig. 5D).

mRNA Expression of COL1A1, SPP1 and THY1 in the EOPE Placenta

To investigate the expression levels of the three key genes in EOPE, we performed RT-qPCR on placental samples from 16 patients diagnosed with EOPE and 18 gestational age-matched controls. The clinical characteristics of the patients are presented in Table 1. The two groups did not differ significantly in maternal age or gestational age at delivery. Systolic and diastolic blood pressures and MA/UCREA were significantly higher in the EOPE group than in the control group, whereas neonatal birth weight and the Apgar score at 1 min were substantially lower. The expression levels of COL1A1, SPP1, and THY1 were significantly downregulated in the placental samples of patients with EOPE compared to the control group (Fig. 4H–J).

Table 1 Clinical information of the patients

Discussion

EOPE is a complex pregnancy complication that poses significant risks to the health of both mothers and fetuses, often leading to preterm birth, fetal growth restriction, and other complications with far-reaching consequences (Roberts and Cooper 2001). Aberrant gene expression has been implicated in the pathology of EOPE, with immune cell infiltration being a crucial factor. However, the critical driver genes responsible for the onset and progression of this condition have yet to be fully explained. This study aimed to identify hub genes associated with EOPE to improve our understanding of its pathogenesis and to develop a diagnostic logistic model. Additionally, we investigated immune cell infiltration in EOPE patients.

Our study identified 807 DEGs (451 upregulated and 356 downregulated) in the GSE44711 dataset and 787 DEGs (446 upregulated and 341 downregulated) in the GSE74341 dataset. We used a Venn diagram to identify 64 upregulated DEGs and 49 downregulated DEGs common to both datasets. GO analysis then revealed enrichment of critical biological processes, including ECM organization; the enriched molecular functions included ECM structural constituent, and the enriched cellular components included collagen-containing ECM. Previous studies have demonstrated that the ECM plays a vital role in regulating trophoblast invasion and contributes to the remodeling of decidua at the maternal–fetal interface (Guillomot et al. 2014). Other studies have investigated the effect of the ECM on EOPE (Yi et al. 2021). For example, it has been discovered that CyPA promotes ECM production by trophoblasts via enhancing the TGF-β/Smad3 pathway in PE (Hu et al. 2022a).

Further KEGG analyses showed that the PI3K-Akt signaling, ECM-receptor interaction, and focal adhesion pathways were significantly enriched. The PI3K-Akt signaling pathway is a pro-survival pathway that regulates cell proliferation, differentiation, and apoptosis (Li et al. 2020). PI3K-Akt signaling is frequently dysregulated in numerous types of cancer and has thus become an important target for anticancer treatment (He et al. 2021). The role of the PI3K-Akt signaling pathway in EOPE has also been reported. Cudmore et al. (2012) proposed that inhibition of the PI3K-Akt signaling pathway increased circulating soluble endoglin release and relieved endothelial dysfunction in PE. Li et al. found that STX2 activates the PI3K-Akt pathway to facilitate trophoblast growth, migration, and invasion in PE (Li et al. 2021a). Similarly, Xu et al. demonstrated that ANXA4 promotes trophoblast invasion via the PI3K/AKT/eNOS pathway in PE (Xu et al. 2019). Research suggests that COL4A1 may function by targeting the TGF-β pathway, decreasing TGF-β, activating the PI3K-Akt pathway, and eventually causing PE (Li et al. 2021b). Additionally, some reviews have reported that PI3K-Akt signaling, which may contribute to the pathology of EOPE, plays vital roles in trophoblast cell metabolism, proliferation, invasion, and the mediation of inflammatory processes. Our enrichment results are consistent with this previous research and suggest that targeting the PI3K-Akt pathway may be a promising therapeutic strategy for EOPE.

After identifying DEGs, we screened three key genes, namely COL1A1, SPP1, and THY1, which not only co-occurred in all nine topological methods of the CytoHubba plugin in Cytoscape but also had an inference score > 50% in the CTD analysis. Moreover, the expression of these three genes differed significantly between EOPE and control pregnancies in the GSE44711 dataset. Using these findings, we constructed a diagnostic logistic model that included COL1A1, SPP1, and THY1 as screening markers. This model distinguished patients with EOPE from women with control pregnancies with an AUC > 0.9. RT-qPCR analysis further confirmed that these three genes were downregulated in patients with EOPE.

Many types of collagen are widely involved in placental development and trophoblast function. The COL1A1 gene belongs to the collagen family and encodes the α1 chain of type I collagen, which plays a vital role in trophoblast adhesion at the maternal–fetal interface. Studies have shown that overexpression of COL1A1 is associated with the progression and prognosis of various cancers, such as esophageal, lung, and breast cancer (Fang et al. 2019; Liu et al. 2018). Notably, in ovarian cancer, COL1A1 expression is significantly up-regulated in metastatic lesions (Li et al. 2017). However, few studies have examined its role in EOPE. In vitro studies have shown that SPARC affects the invasion of human trophoblast cells by regulating COL1A1 (Jiang et al. 2013). Upregulation of COL1A1 in endometrial stromal cell-derived extracellular vesicles is critical for embryo implantation and trophoblast invasion during placentation (Gurung et al. 2021). A large-scale study showed that the expression of COL1A1 was closely associated with PE (p-value = 0.0011) (Goddard et al. 2007). The present study found that COL1A1 expression was markedly lower in the EOPE group than in the control group, indicating its potential as a biomarker for EOPE. Moreover, we identified COL1A1 as a hub gene for EOPE, underscoring the need for further research to understand its impact on the disease.

The SPP1 gene, a member of the SIBLING family, encodes the SPP1 protein, which has been extensively studied for its role in tumors. However, recent research has focused on its specific function at the maternal–fetal interface, particularly in preeclampsia (Hernandez et al. 2013; Herington and Bany 2007). Preeclampsia is characterized by impaired adhesion of trophectoderm cells to uterine luminal epithelial cells, resulting in placental insufficiency and compromised embryonic and maternal circulation (Ma et al. 2014). SPP1, acting as a multifunctional extracellular matrix protein, binds to integrins αvβ6 and αvβ3, thereby facilitating trophoblast cell adhesion and migration (Erikson et al. 2009). It is involved in the invasion of cytotrophoblasts into the maternal vasculature/extracellular matrix during normal placental development (Gabinskaya et al. 1998) and is downregulated in the placenta of patients with PE (Xia et al. 2009). SPP1 triggers the PI3K-Akt signaling pathway when it binds to integrin, which influences the proliferation and adhesion of trophoblast cells. We hypothesize that low expression of SPP1 may be associated with M2 polarization of macrophages in the microenvironment of the maternal–fetal interface, leading to inflammation and indirectly inhibiting trophoblast cell invasion and adhesion.

The THY1 gene encodes CD90, a cell surface protein that ranges from 25 to 37 kDa and is anchored by glycosylphosphatidylinositol (GPI) at its carboxyl terminus. Recent studies have shown that overexpression of THY1 in the tumor microenvironment contributes to tumor proliferation and metastasis (Mancarella et al. 2022). However, despite extensive research on the role of THY1 in cancer, its potential impact on EOPE remains largely unexplored. Further investigation is therefore necessary to understand the influence of THY1 on the development of EOPE.

Overall, our research supports the potential of these three genes as biomarkers for EOPE with promising diagnostic accuracy. However, we acknowledge that placental bed biopsy is not feasible during pregnancy. Our proposed diagnostic tool, based on expression analysis of EOPE-related DEGs, must be further validated and refined before implementing it in clinical practice. This potential diagnostic study can serve as a starting point for developing more practical and non-invasive diagnostic analyses. In addition, this approach may provide insight into the underlying mechanisms of EOPE, which may facilitate the development of new treatment strategies.

In this study, we observed the biological processes related to immune cell infiltration as identified via GO analysis, such as mononuclear cell migration, macrophage migration, and macrophage chemotaxis. Specifically, we investigated the association between immune cells to better understand the link between EOPE and immune cell infiltration. Our research indicated that patients with EOPE exhibited upregulation of M1 macrophages. Previous studies have shown that macrophages play a critical role in regulating immune response, which is strongly implicated in the pathogenesis of EOPE (Hutter et al. 2013). Furthermore, research has been conducted to elucidate the function of immune cells in the context of PE, revealing that macrophages mediate the apoptosis of extravillous trophoblasts and maintain maternal–fetal tolerance (Aneman et al. 2020).

In recent decades, growing evidence of immunological dysregulation in PE has emerged. For example, CD4+ memory T cell activation decreases (Hu et al. 2022b), the dysfunction of regulatory T cells affects systemic immune responses (Ibrahim et al. 2017; Salvany-Celades et al. 2019), and macrophage regulation of the immune response is vital in the pathogenesis of PE (Hutter et al. 2013). However, the immune cell activation pattern in EOPE remains uncertain. We applied CIBERSORT to comprehensively assess immune infiltration and to characterize its role in EOPE. We found that macrophages M2, B cells naive, and T cells CD8 showed a higher abundance of infiltration than other immune cells in EOPE, and that the infiltration of M1 and M2 macrophages differed significantly between the EOPE and control groups. Several studies have suggested that macrophages mediate the apoptosis of extravillous trophoblasts and maintain maternal–fetal tolerance (Aneman et al. 2020; Yang et al. 2019). Macrophages have two phenotypes: M1 (classically activated) and M2 (alternatively activated). M1 macrophages are known for their involvement in phagocytosis, microbicidal actions, and pro-inflammatory responses, while M2 macrophages play a crucial role in immunomodulation, maternal tolerance, inflammation resolution, tissue remodeling, and cell proliferation (Ning et al. 2016; Faas and De Vos 2018). Therefore, in a normal healthy pregnancy, macrophages favor the M2 phenotype, whereas in EOPE there is a noticeable shift towards the M1 phenotype (Faas et al. 2014). M1 macrophages have been observed to secrete soluble fms-like tyrosine kinase-1 (sFlt-1), thus contributing to impaired angiogenesis in EOPE (Schonkeren et al. 2011). These findings are consistent with our analysis, which supports the view that macrophage dysfunction and changes in polarity can potentially induce EOPE. These results further suggest that immune cell infiltration and immunological responses play significant roles in the pathogenesis of EOPE. Overall, it is important to understand the distinct functions of M1 and M2 macrophages in pregnancy, particularly their impact on preeclampsia. Further examination of macrophage phenotypes and their role in immune responses during pregnancy would contribute to our understanding of EOPE.

In this study, through a series of analyses including GO, KEGG, PPI network, and CTD, we identified three key genes associated with EOPE. We constructed a diagnostic logistic model with COL1A1, SPP1, and THY1. These genes exhibited high diagnostic efficiency for EOPE and were related to the immune infiltration of M1 and M2 macrophages. RT-qPCR analysis revealed that these genes were downregulated in EOPE patients. Future studies should further investigate the functional and diagnostic values of these genes in EOPE to better understand the pathogenesis of the disease.

The present study has some limitations that should be acknowledged. The validity of our conclusions rests mainly on the reliability of the original microarray datasets. We used placental tissue data from EOPE (GSE74341 and GSE44711) to explore differentially expressed genes and hub genes. However, the results are limited by the small sample size, and the validation results are similarly limited. Moreover, further experiments are needed on the function of these hub genes and the associated regulatory mechanisms in EOPE.

Conclusions

This study significantly adds to our understanding of the pathogenesis of EOPE, offering insights into potential targets for effective therapy and diagnostic methods for early detection. Further investigation of the functional and diagnostic values of COL1A1, SPP1, and THY1 in EOPE is crucial for future research.