Introduction

Although systemic lupus erythematosus (SLE) is generally recognized as a complex autoimmune disease induced by environmental and genetic factors [1], its detailed etiology remains obscure. SLE can affect almost any organ or tissue, resulting in highly diverse clinical manifestations, most commonly fever, rashes, oral ulcers, arthritis, vasculitis, and life-threatening nephritis [1]. It is more prevalent in African, Asian, Hispanic, and American populations, and the prevalence of SLE in China is about 30 per 100,000 [2]. Because of its high morbidity and mortality, it is critically important to uncover the underlying molecular mechanisms and to discover reliable biomarkers for early diagnosis and effective therapy.

Over the past five decades, considerable effort has gone into understanding the pathogenesis and development of SLE. From a cellular perspective, lymphocytes have been considered to play a central role. Aberrantly activated T cells or an imbalance of T cell subsets (e.g., decreased production of Treg cells and increased production of Th17 cells) [3, 4] mediate inflammatory responses and activate B cells to differentiate and produce more autoantibodies, contributing to the pathogenesis and development of SLE. Recently, the role of innate immune cells has attracted more attention. Plasmacytoid dendritic cells (pDCs) have been found to infiltrate the renal and skin lesions of SLE patients [5]. pDCs produce large amounts of type I IFN, which is often elevated in SLE patients [6]. Type I IFNs, as proinflammatory cytokines, can enhance immune responses by modulating T cell and B cell function [7,8,9]. Thus, pDCs have been suggested to play a key role in the pathogenesis of SLE. In addition, neutrophils in SLE have been reported to show decreased clearance of apoptotic material and increased synthesis and release of various inflammatory cytokines [10]. Neutrophils can also undergo a distinct form of cell death that releases neutrophil extracellular traps (NETs), a network of chromatin filaments coated with histones, proteases, and cytosolic proteins [11]. The release of NETs provides a source of autoantigens and enhances inflammatory and immune responses. These observations provide substantial evidence that neutrophils contribute to the pathogenesis of SLE. However, the underlying molecular mechanisms of SLE pathogenesis remain to be elucidated in depth.

With the wide application of gene expression profiling techniques, a growing number of microarray studies on SLE have been performed and hundreds of differentially expressed genes (DEGs) have been identified. Integrating and re-analyzing these data with bioinformatics methods can provide new and valuable ideas for understanding the molecular mechanisms of SLE and for identifying reliable diagnostic and therapeutic targets. Therefore, in this study, we downloaded three original datasets, GSE72509, GSE20864, and GSE39088, from NCBI-GEO, including 198 SLE patients and 108 healthy controls from three different racial populations. We integrated and re-analyzed the data and identified 321 common DEGs between SLE patients and healthy controls. We performed gene ontology (GO) function and pathway enrichment analyses of the common DEGs with DAVID, Gene Ontology, and KEGG PATHWAY, and constructed a protein-protein interaction (PPI) network of the DEGs. Finally, we validated the expression levels of selected candidate genes by reverse transcription-quantitative polymerase chain reaction (RT-qPCR). The results of this study may improve our understanding of the underlying molecular mechanisms of SLE and provide potential targets for its diagnosis and therapy.

Materials and methods

GEO data information

Whole blood gene expression profiles from SLE patients and healthy controls (GSE72509, GSE20864, and GSE39088) were obtained from NCBI-GEO (available online: https://www.ncbi.nlm.nih.gov/geo). GSE72509 is an RNA-seq dataset sequenced on the Illumina HiSeq 2500 and contains 99 SLE patients and 18 healthy controls [12]. GSE20864 is a microarray dataset generated with the Hitachisoft AceGene Human Oligo Chip 30K 1 Chip Version and includes 21 SLE patients and 45 healthy controls [13]. GSE39088 was generated with the Affymetrix Human Genome U133 Plus 2.0 Array; from it, 78 untreated SLE patients and 46 healthy controls were selected for analysis [14, 15]. The datasets represent three different regions: GSE72509 was generated in the USA, GSE20864 in Japan, and GSE39088 in Belgium.

Identification of differentially expressed genes

The expression data of GSE72509, GSE20864, and GSE39088 were imported directly into Qlucore Omics Explorer 3.2 (QOE) software. DEGs between SLE patients and healthy controls were identified by two-group comparison (two-tailed), with p < 0.01 considered significant. Heat maps were drawn in QOE with each gene normalized to mean = 0 and variance = 1.
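The per-gene normalization used for the heat maps (mean = 0, variance = 1) can be illustrated with a minimal sketch. This is not the QOE implementation: the function name `zscore_rows`, the toy values, and the use of the sample (n − 1) standard deviation are all assumptions made for illustration.

```python
from statistics import mean, stdev


def zscore_rows(expression):
    """Scale each gene (one row per gene) to mean 0 and sample variance 1."""
    scaled = {}
    for gene, values in expression.items():
        mu = mean(values)
        sd = stdev(values)  # sample standard deviation (n - 1); an assumption
        if sd == 0:
            # A gene with constant expression carries no contrast; map it to zeros.
            scaled[gene] = [0.0 for _ in values]
        else:
            scaled[gene] = [(v - mu) / sd for v in values]
    return scaled


# Hypothetical toy expression values, not taken from the GEO datasets.
toy_expression = {
    "GENE_A": [5.1, 6.3, 7.8, 4.9],
    "GENE_B": [2.0, 2.0, 2.0, 2.0],
}
scaled = zscore_rows(toy_expression)
```

After scaling, every non-constant gene contributes on the same color scale, so a heat map reflects the pattern across samples rather than absolute expression level.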

Gene ontology and pathway enrichment analyses

GO term enrichment analysis and network analysis were performed using the Cytoscape ClueGO plugin with default settings. The Kyoto Encyclopedia of Genes and Genomes (KEGG) is a collection of online databases for the systematic analysis of gene functions in terms of networks of genes and molecules. To identify pathways enriched among the DEGs, the data were analyzed using the Database for Annotation, Visualization and Integrated Discovery (DAVID) [16, 17]. The thresholds were Count (minimum count) ≥ 2 and EASE score (modified Fisher exact p value) ≤ 0.1 (default).
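The EASE score mentioned above is a conservative variant of the one-sided Fisher exact test, in which one gene is removed from the list hits before the upper-tail probability is computed. The following is a minimal sketch of that idea, not DAVID's own code; the function names and the example counts (a background of 30,000 genes, 40 of them in a pathway, and a list of 300 genes with 3 hits) are illustrative assumptions rather than values from this study.

```python
from fractions import Fraction
from math import comb


def hypergeom_upper_tail(pop_total, pop_hits, list_total, min_hits):
    """Return P(X >= min_hits) when list_total genes are drawn without
    replacement from pop_total genes, pop_hits of which are annotated."""
    if min_hits <= 0:
        return 1.0
    denominator = comb(pop_total, list_total)
    # Sum the exact probabilities of observing fewer than min_hits annotated genes.
    lower = Fraction(0)
    for k in range(min(min_hits, list_total + 1)):
        ways = comb(pop_hits, k) * comb(pop_total - pop_hits, list_total - k)
        lower += Fraction(ways, denominator)
    return float(1 - lower)


def ease_score(pop_total, pop_hits, list_total, list_hits):
    """Modified Fisher exact p value: penalize the list hits by one gene."""
    return hypergeom_upper_tail(pop_total, pop_hits, list_total, list_hits - 1)


# Illustrative counts only (hypothetical, not from the 321 common DEGs).
fisher_p = hypergeom_upper_tail(30000, 40, 300, 3)
ease_p = ease_score(30000, 40, 300, 3)
# Apply the thresholds stated in the text: Count >= 2 and EASE <= 0.1.
passes_thresholds = 3 >= 2 and ease_p <= 0.1
```

Because the EASE score is always at least as large as the plain Fisher p value, terms supported by only a handful of genes are less likely to pass the 0.1 threshold.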

Analysis of protein-protein interaction network

To further investigate the underlying molecular mechanisms of SLE, a protein-protein interaction (PPI) network for the DEGs was constructed using the STRING database (http://www.string-db.org/) with a confidence score cutoff of 900 and then visualized with NetworkAnalyst (http://www.networkanalyst.ca) [16]. Finally, the nodes with the highest degrees of interaction were considered candidate genes.
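The selection of high-degree nodes can be sketched as follows. This is a hedged illustration, not the STRING or NetworkAnalyst workflow itself: the function `hub_genes`, the placeholder protein names, and the toy scores are hypothetical, and treating the cutoff as "score ≥ 900" is an assumption.

```python
from collections import Counter


def hub_genes(edges, min_score=900, min_degree=28):
    """Return (node, degree) pairs whose degree exceeds min_degree,
    counting only undirected edges at or above the confidence cutoff."""
    degree = Counter()
    seen = set()
    for a, b, score in edges:
        if score < min_score or a == b:
            continue  # drop low-confidence edges and self-loops
        pair = frozenset((a, b))
        if pair in seen:
            continue  # count each undirected interaction once
        seen.add(pair)
        degree[a] += 1
        degree[b] += 1
    hubs = [(node, d) for node, d in degree.items() if d > min_degree]
    # Highest degree first; ties broken alphabetically for stable output.
    return sorted(hubs, key=lambda item: (-item[1], item[0]))


# Hypothetical toy network (names and scores are placeholders).
toy_edges = [
    ("HUB1", "P1", 950),
    ("HUB1", "P2", 990),
    ("HUB1", "P3", 910),
    ("P1", "P2", 400),    # below the cutoff, ignored
    ("P2", "HUB1", 990),  # duplicate of HUB1-P2, ignored
    ("P3", "P4", 920),
]
toy_hubs = hub_genes(toy_edges, min_score=900, min_degree=1)
```

With a real edge list, the same function with `min_degree=28` would mirror the "interaction degree > 28" criterion reported in the Results.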

RT-qPCR to confirm candidate genes

To confirm the expression of candidate genes in SLE patients and healthy controls, whole blood RT-qPCR analysis was performed. Fresh whole blood samples (5 ml) were obtained from patients with SLE (n = 14; all female; age range, 18–54 years; SLEDAI (mean ± SD), 12.93 ± 4.87; new-onset to previously treated patients, 2:12; patients with lupus nephritis, n = 11) and healthy controls (n = 12; 3 male and 9 female; age range, 24–30 years). All 14 patients were recruited from the Third Affiliated Hospital of Southern Medical University (Guangzhou, China), and written informed consent was requested. The protocol was approved by the Ethics Committee of The Third Affiliated Hospital of Southern Medical University (No. 201603003). Total white cell RNA was extracted using TRIzol reagent (Vazyme, Nanjing, China). First-strand complementary DNAs (cDNAs) were synthesized using the PrimeScript RT reagents Kit (Perfect Real Time, Takara) according to the manufacturer’s instructions. RT-qPCR analysis was then performed in triplicate on an ABI 7300 Real-Time PCR System (Applied Biosystems, Foster City, CA, USA). Glyceraldehyde-3-phosphate dehydrogenase (GAPDH) was used as an internal control for quantification of target gene expression. The following PCR primers were used: FBXW11, forward, 5′-CCGACTCGGTGATTGAGGAC-3′, and reverse, 5′-CCGATGTTCCCCACATCCAA-3′; FOXO1, forward, 5′-AACCTGTCCTACGCCGACCTCA-3′, and reverse, 5′-GCTCGGCTTCGGCTCTTAGCAA-3′; SMAD7, forward, 5′-GTGGCATACTGGGAGGAGAA-3′, and reverse, 5′-GATGGAGAAACCAGGGAACA-3′; GAPDH, forward, 5′-TGGACTCCACGACGTACTCAG-3′, and reverse, 5′-CGGGAAGCTTGTCATCAATGGAA-3′; PLCB1, forward, 5′-GTGTCCGACAGCCTCAAGAA-3′, and reverse, 5′-CCGATGTTCCCCACATCCAA-3′; SIN3A, forward, 5′-ACCATGCAGTCAGCTACGG-3′, and reverse, 5′-CACCGCTGTTGGGTGATGA-3′; RPL26L1, forward, 5′-TTCAATCCCTTCGTTACCTCGG-3′, and reverse, 5′-TAGTGTCCTCGAACTACCTGG-3′.

To quantify gene expression, two-step RT-qPCR was performed using hot-start Taq, with denaturation at 95 °C (15 s) and combined annealing and extension at 60 °C (60 s) for 40 cycles, followed by a melting curve analysis. All RT-qPCR data were analyzed using the 2^(−ΔΔCt) method [17]. All statistical analyses were performed with SPSS version 20.0. RT-qPCR data are expressed as the mean ± standard error of the mean (SEM). Statistical significance was determined using an independent-samples t test, and p < 0.05 was considered statistically significant.
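The 2^(−ΔΔCt) calculation can be written out in a few lines. This is a minimal sketch under stated assumptions: the Ct values are invented, the helper names are hypothetical, and the mean ΔCt of the healthy controls is used as the calibrator, which the text does not specify.

```python
from statistics import mean


def delta_ct(target_ct, reference_ct):
    """Normalize the target gene Ct to the reference gene (here GAPDH)."""
    return target_ct - reference_ct


def fold_changes(samples, control_samples):
    """Relative expression 2^(-ddCt) of each sample versus the control mean.

    Each sample is a (target_ct, reference_ct) pair.
    """
    calibrator = mean(delta_ct(t, r) for t, r in control_samples)
    return [2 ** -(delta_ct(t, r) - calibrator) for t, r in samples]


# Hypothetical Ct values (target gene, GAPDH); not measured data.
controls = [(24.0, 18.0), (24.5, 18.5), (23.5, 17.5)]
patients = [(25.0, 18.0), (26.0, 18.0)]

control_fold = fold_changes(controls, controls)
patient_fold = fold_changes(patients, controls)
```

In this toy case each patient sample has a higher ΔCt than the controls, so its relative expression falls below 1, the pattern expected for a downregulated gene.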

Results

Identification of DEGs in SLE

The three gene expression datasets containing SLE patients and healthy controls were analyzed with two-group comparisons (p < 0.01), which identified 4461, 3400, and 4212 DEGs in GSE72509, GSE20864, and GSE39088, respectively. The heat maps of these DEGs are shown in Fig. 1a. After integration of these DEGs, a total of 321 common DEGs were identified in SLE patients compared with healthy controls (Fig. 1b), including 231 upregulated genes and 90 downregulated genes (Supplementary Table 1).
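The integration step amounts to finding the genes shared by all three DEG lists, as in the overlapping region of a Venn diagram. A minimal sketch follows; the gene identifiers are placeholders, the function name is hypothetical, and the sketch intersects gene names only, since the text does not state whether a consistent direction of change was also required.

```python
def common_degs(*deg_lists):
    """Return the sorted genes present in every supplied DEG list."""
    if not deg_lists:
        return []
    gene_sets = [set(genes) for genes in deg_lists]
    return sorted(set.intersection(*gene_sets))


# Placeholder identifiers standing in for the per-dataset DEG lists.
degs_dataset_1 = ["G1", "G2", "G3", "G4"]
degs_dataset_2 = ["G2", "G3", "G5"]
degs_dataset_3 = ["G3", "G2", "G6"]

shared = common_degs(degs_dataset_1, degs_dataset_2, degs_dataset_3)
```

Applied to the three real DEG lists, the length of the returned list would correspond to the number of common DEGs reported above.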

Fig. 1

a Heat maps of the three profile datasets GSE39088, GSE72509, and GSE20864. GSE39088, generated in Belgium, contained 78 SLE patients and 46 healthy controls. GSE72509, generated in the USA, contained 99 SLE patients and 18 healthy controls. GSE20864, generated in Japan, included 21 SLE patients and 45 healthy controls. b Identification of 321 common DEGs from the three datasets GSE72509, GSE20864, and GSE39088. Different colored areas represent different datasets, and the overlapping areas represent the common DEGs. DEGs were identified with a classical t test, and statistically significant DEGs were defined using p < 0.01 as the cutoff criterion (DEGs, differentially expressed genes)

Gene ontology analysis of DEGs in SLE

The DEGs were categorized into three functional groups: cellular component, molecular function, and biological process. GO analysis showed that the 321 common DEGs were significantly enriched in biological processes related to SLE, for example, innate immune response, defense response, cytokine-mediated signaling pathway, response to interferon-alpha, and I-kappaB kinase/NF-kappaB signaling. In the cellular component group, the three most significantly enriched GO terms were oxidoreductase complex, AIM2 inflammasome complex, and ubiquitin ligase complex. In the molecular function group, the most significantly enriched GO term was single-stranded RNA binding (Fig. 2 and Supplementary Table 2).

Fig. 2

Significantly enriched GO terms of DEGs in SLE based on their GO functions. The 321 common DEGs were categorized into three function groups: cellular component group, molecular function group, and biological process group. Each node represents a GO term. The edges reflect the relationships between the terms based on the similarity of their associated genes. The larger node size and deeper color indicate greater significance of the enrichment (GO, gene ontology; DEGs, differentially expressed genes; SLE, systemic lupus erythematosus)

Signaling pathway enrichment analysis of DEGs in SLE

Signaling pathway enrichment analysis of the DEGs was performed with the KEGG PATHWAY knowledge base. The detailed results are shown in Fig. 3a and Supplementary Table 3. The 321 common DEGs were mainly enriched in signaling pathways associated with the immune system and apoptosis, including the RIG-I-like receptor signaling pathway, antigen processing and presentation, and the p53 signaling pathway. The details of these three signaling pathways are presented in Fig. 3b, c, and d. The DEGs were also enriched in infectious disease pathways, including influenza A and herpes simplex infection.

Fig. 3

Significantly enriched KEGG pathway terms of DEGs in SLE. The details of the three important signaling pathways—RIG-I-like receptor signaling pathway, p53 signaling pathway, and antigen processing and presentation—are presented (DEGs, differentially expressed genes; SLE, systemic lupus erythematosus)

A protein-protein interaction network for DEGs

The PPI network for the 321 common DEGs was constructed using the STRING online database and NetworkAnalyst. It contained 1713 nodes and 2512 edges (Fig. 4). Among the nodes, 24 genes with an interaction degree > 28 (more than 28 connections each) were identified as candidate genes (Supplementary Table 4). Among the candidate genes, RPL26L1 had the highest node degree, followed by SIN3A, PLCB1, FOXO1, SMAD7, and FBXW11.

Fig. 4

A protein-protein interaction (PPI) network for DEGs

Validation of candidate genes by RT-qPCR

To validate these results, we performed RT-qPCR analysis of six of the candidate DEGs (RPL26L1, FOXO1, SMAD7, FBXW11, PLCB1, and SIN3A) in 14 SLE patients and 12 healthy controls. The relative mRNA expression levels of FOXO1 and SMAD7 were significantly lower in blood samples from SLE patients than in those from healthy controls. The expression levels of RPL26L1 and FBXW11 were significantly higher in the SLE group than in the healthy control group. However, there were no significant differences in the relative mRNA expression levels of PLCB1 and SIN3A between SLE patients and healthy controls (Fig. 5).

Fig. 5

Relative mRNA expression levels of RPL26L1, FBXW11, FOXO1, SMAD7, SIN3A, and PLCB1 in blood samples from SLE patients and healthy controls, as determined using RT-qPCR. (n = 12 for healthy controls, n = 14 for SLE patients. Mean ± SEM are plotted; *p < 0.05; **p < 0.01; ***p < 0.001)

Discussion

Systemic lupus erythematosus is a common autoimmune disease whose underlying molecular mechanisms remain largely unclear. Recently, a growing number of studies based on microarray data profiles have been carried out to elucidate SLE pathogenesis [18, 19]. However, results generated from a single cohort are often limited or inconsistent. Combining integrated bioinformatics methods with expression profiling techniques may overcome these disadvantages. Thus, in this study, we integrated three cohort profile datasets from the USA, Japan, and Belgium and analyzed them in depth with integrated bioinformatics methods. A total of 321 common DEGs were identified, including 231 upregulated and 90 downregulated genes. The 321 common DEGs were mainly enriched in biological processes related to immune and inflammatory responses, including innate immune response, defense response, cytokine-mediated signaling pathway, response to interferon-alpha, and I-kappaB kinase/NF-kappaB signaling, whereas the three most significant cellular components were oxidoreductase complex, AIM2 inflammasome complex, and ubiquitin ligase complex. Furthermore, the DEGs were primarily involved in several signaling pathways associated with the immune system, including the RIG-I-like receptor signaling pathway and antigen processing and presentation. These results are in accordance with the knowledge that immune abnormalities and inflammatory responses contribute to autoimmune diseases, including SLE. One of the significantly enriched pathways, the p53 signaling pathway, has been reported to be strongly associated with apoptosis, cell cycle arrest, and DNA repair. This result is consistent with a characteristic feature of SLE: increased apoptotic cell production and aberrant apoptotic cell clearance [20].

As a common environmental factor, infection plays an important role in the onset and development of SLE. In this study, KEGG pathway enrichment analysis indicated that the common DEGs were enriched in infectious disease pathways, including influenza A and herpes simplex infection. In addition, the enrichment analyses showed that the common DEGs were enriched in the interferon signaling pathway and the RIG-I-like receptor signaling pathway. Previous studies have confirmed that a type I IFN signature is found in SLE and that IFN signaling pathways are strongly associated with the pathogenesis and development of SLE [21, 22]. RIG-I-like receptors (RLRs) are cytosolic sensors of viral RNA, and RIG-I-like receptor signaling induces the secretion of type I interferon and proinflammatory cytokines, leading to activation of immune responses [23]. RLR mutations have been found in SLE patients, and mice with RLR-related gene mutations spontaneously developed lupus-like symptoms [24]. Recently, Wu’s group analyzed the microarray expression profile GSE65391, including 924 SLE samples and 72 healthy controls, and obtained results consistent with ours [25]. Taken together, the results of this study and of previous studies support a vital role for the interferon signaling pathway and the RIG-I-like receptor signaling pathway in the pathogenesis and progression of SLE.

In this work, GO function analysis showed that the DEGs were enriched in the NF-κB signaling pathway and that FBXW11 was among the genes annotated to it. NF-κB signaling consists of the classical activation pathway and the alternative pathway, and both play an important role in regulating immune and inflammatory responses [26,27,28,29]. Previous studies have found high plasma levels of B cell–activating factor (BAFF), which is known to trigger non-canonical NF-κB signaling, in both SLE patients and murine lupus models [27, 30]. In addition, recent studies have indicated that inhibiting the NF-κB signaling pathway ameliorates lupus nephritis in lupus-prone MRL/lpr mice and pristane-induced lupus mice [31,32,33]. These results suggest that NF-κB signaling pathways play a crucial role in the pathogenesis of SLE. FBXW11, one of the WD-40-containing F-box proteins, is a component of an E3 ubiquitin ligase complex that promotes signal-dependent ubiquitination of IκBα and activates the NF-κB signaling pathway [34, 35]. In this study, FBXW11 was found to be upregulated in SLE patients compared with healthy controls. Although the function of FBXW11 in SLE has not been fully elucidated, we speculate that FBXW11 may be implicated in the development of SLE via the NF-κB signaling pathway.

FOXO1 is a member of the forkhead box class O (FOXO) family, which is characterized by a conserved DNA binding domain [36]. In mammals, FOXO1 is involved in regulating various cellular processes and functions, including cell proliferation, apoptosis, oxidative stress responses, metabolism, tumor suppression, and immune responses [37,38,39,40]. A previous study reported that FOXO1 transcript levels were markedly decreased in the peripheral blood mononuclear cells of SLE patients and inversely correlated with disease activity [41]. Recently, many studies have demonstrated that FOXO1 plays an important role in the proliferation and activation of B lymphocytes [42,43,44]. Moreover, FOXO1 was found to negatively regulate Th17 differentiation [45]. MicroRNA-873 was found to facilitate Th17 differentiation by targeting FOXO1 and to be implicated in the pathogenesis of SLE [46]. In this study, FOXO1 was found to be downregulated in SLE patients. Thus, the results of this study, combined with those of previous studies, indicate that FOXO1 may be involved in the pathogenesis of SLE by regulating the proliferation and differentiation of lymphocytes.

SMAD7 is a member of inhibitory Smad family with conserved carboxy-terminal MH2 domains and inhibits TGF-β and BMP-induced Smad signaling [47]. A microarray study revealed that the signaling of a TGF-β-dependent pathway induced fibrosis in discoid lupus erythematosus [48]. Additionally, a previous study has indicated that blocking TGF-β signaling by Smad7 overexpression ameliorated progressive renal injury in a mouse model of autoimmune crescentic glomerulonephritis [49]. However, the biological roles of Smad7 in the development of SLE remain largely unknown. In our study, the expression levels of Smad7 were decreased in SLE patients compared with healthy controls, suggesting that Smad7 may be involved in the progression of SLE.

RPL26L1, a member of the ribosomal protein family, shares high sequence similarity with ribosomal protein L26 [50] and has a role in protein synthesis. In the present study, RPL26L1 was upregulated in SLE, suggesting that it may be related to SLE development. However, no studies have yet examined the link between RPL26L1 and SLE.

Conclusion

In conclusion, integrated bioinformatics analysis of three profile datasets from SLE patients and healthy controls identified 321 common DEGs. The common DEGs were significantly enriched in various pathways, especially pathways associated with the immune system and inflammatory responses. Furthermore, the differential expression of four hub genes (RPL26L1, FBXW11, FOXO1, and SMAD7) was validated in SLE patients. Collectively, these results suggest a molecular basis for better understanding the pathogenesis of SLE and provide potential novel markers or targets for its diagnosis and treatment.