Introduction

Rheumatoid arthritis (RA) is an autoimmune inflammatory disease characterized by synovial inflammation and hyperplasia, as well as cartilage and bone destruction, affecting approximately 0.5–1.0% of the world’s population. RA occurs more often in women than in men and is mainly diagnosed in people aged 40–60. As RA progresses, joint damage, disability, and cardiovascular and other comorbidities occur [1, 2]. Genetics, epigenetics, smoking, sex, dust inhalation, gut microbiota, and modifiable lifestyle factors are risk factors for the development of RA. Genetic factors account for 50% of the risk and are thought to be specifically associated with either anti-citrullinated protein antibody (ACPA)-positive or ACPA-negative disease [2, 3]. Due to the influence of genetic factors, RA is considered to be a heterogeneous disease. Osteoarthritis (OA) is the most common form of arthritis, characterized by joint pain and dysfunction caused by joint degeneration, and its prevalence increases with age. Due to the complexity of its etiology, OA can be divided into primary and secondary OA [4]. Systemic factors (such as genetics, dietary intake, estrogen, and bone density) and biomechanical factors (such as muscle weakness, obesity, and joint laxity) are risk factors for the development of OA [5]. A growing number of studies have shown that synovial inflammation is also an important pathological change in OA. RA and OA share certain similarities in risk factors, pathological changes, and clinical manifestations; however, further studies are needed to identify their differences.

Bioinformatics is an interdisciplinary field combining molecular biology and information technology. With the development of gene chips and next-generation sequencing technology, bioinformatics is increasingly used in human genetics [6, 7], and bioinformatics analysis has been applied to the study of a variety of diseases [8, 9, 10].

Here, we aimed to use bioinformatics to analyze the gene expression differences between RA and OA synovial tissue. We downloaded the GSE36700, GSE1919, GSE12021, GSE55235, GSE55584, and GSE55457 datasets from the Gene Expression Omnibus (GEO) database. Gene Ontology (GO) functional enrichment analysis, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis, and protein–protein interaction (PPI) network analysis were used to further analyze the differentially expressed genes (DEGs). The DEGs were then verified using quantitative reverse transcription polymerase chain reaction (qRT-PCR) to improve the understanding of the differences between RA and OA synovial tissue.

Materials and methods

Microarray data

Gene expression profile datasets GSE36700 [11], GSE1919 [12], GSE12021 [13], GSE55235 [14], GSE55584 [14], and GSE55457 [14] were downloaded from the GEO database. The GSE55235 dataset contained 30 samples, including 10 RA and 10 OA synovial tissue samples, and was based on the GPL96 [HG-U133A] Affymetrix Human Genome U133A Array. The GSE1919 dataset contained 15 samples, including 5 RA and 5 OA synovial tissue samples, and was based on the GPL91 [HG_U95A] Affymetrix Human Genome U95A Array. The GSE36700 dataset contained 25 samples, including 7 RA and 5 OA synovial tissue samples, and was based on the GPL570 [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array. The GSE12021 dataset contained 31 samples, including 12 RA and 10 OA synovial tissue samples, and was based on the GPL96 [HG-U133A] Affymetrix Human Genome U133A Array. The GSE55584 dataset contained 16 samples, including 10 RA and 6 OA synovial tissue samples, and was based on the GPL96 [HG-U133A] Affymetrix Human Genome U133A Array. The GSE55457 dataset contained 33 samples, including 13 RA and 10 OA synovial tissue samples, and was based on the GPL96 [HG-U133A] Affymetrix Human Genome U133A Array. The Series Matrix Files of GSE36700, GSE1919, GSE12021, GSE55235, GSE55584, and GSE55457 were also downloaded from the GEO database.
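As a quick consistency check on the sample counts listed above, the RA and OA samples can be tallied across the six datasets. The following is a minimal Python sketch; the tuple layout and the function name `tally` are illustrative choices, not part of the original analysis.

```python
# (accession, total samples, RA samples, OA samples), as listed in the text
DATASETS = [
    ("GSE55235", 30, 10, 10),
    ("GSE1919", 15, 5, 5),
    ("GSE36700", 25, 7, 5),
    ("GSE12021", 31, 12, 10),
    ("GSE55584", 16, 10, 6),
    ("GSE55457", 33, 13, 10),
]


def tally(datasets):
    """Return the total number of RA and OA samples across all datasets."""
    ra_total = sum(ra for _, _, ra, _ in datasets)
    oa_total = sum(oa for _, _, _, oa in datasets)
    return ra_total, oa_total


ra_total, oa_total = tally(DATASETS)
print(f"RA: {ra_total}, OA: {oa_total}")  # RA: 57, OA: 46
```

The totals agree with the 57 RA and 46 OA synovial tissue samples reported in the Discussion.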

Identification of DEGs

The DEGs between RA and OA samples were identified using the limma package of R software [15]. A P value < 0.05 and |log FC| ≥ 2 were chosen as the cut-off criteria. The upregulated and downregulated genes were visualized using R 4.0.0 software.
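The cut-off criteria can be illustrated with a short filter. The sketch below is written in Python rather than R and is not the limma workflow itself; it only shows how the two thresholds combine. The gene labels and statistics are made-up placeholders, and it is assumed that a positive log FC means higher expression in RA than in OA.

```python
from typing import NamedTuple


class GeneStat(NamedTuple):
    symbol: str
    log_fc: float   # assumed contrast: RA minus OA
    p_value: float


def classify_degs(stats, p_cutoff=0.05, lfc_cutoff=2.0):
    """Split genes passing both cut-offs into upregulated and downregulated lists."""
    up, down = [], []
    for gene in stats:
        if gene.p_value < p_cutoff and abs(gene.log_fc) >= lfc_cutoff:
            (up if gene.log_fc > 0 else down).append(gene.symbol)
    return up, down


# Placeholder values for illustration only, not results from the study
example = [
    GeneStat("GENE_A", 2.6, 0.001),    # passes both cut-offs, upregulated
    GeneStat("GENE_B", -2.1, 0.010),   # passes both cut-offs, downregulated
    GeneStat("GENE_C", 1.4, 0.0001),   # significant, but |log FC| < 2
    GeneStat("GENE_D", 3.0, 0.20),     # large change, but P >= 0.05
]

up, down = classify_degs(example)
print(up, down)  # ['GENE_A'] ['GENE_B']
```

A gene must satisfy both conditions, so a highly significant gene with a small fold change (GENE_C) is excluded just as a large but non-significant change (GENE_D) is.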

GO functional enrichment and KEGG pathway analysis of DEGs

GO functional enrichment analysis was used to annotate genes in three categories: biological process (BP), molecular function (MF), and cellular component (CC) [16]. GO terms and KEGG [17] pathways enriched among the DEGs were identified and visualized with the clusterProfiler [18] and ggplot2 [19] packages of R. P < 0.05 was chosen as the cut-off criterion, indicating a statistically significant difference.
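Enrichment P values of this kind are typically obtained from an over-representation (hypergeometric) test. The following Python sketch assumes that test and is not a reproduction of the clusterProfiler call used in the study; the function and parameter names are hypothetical, and multiple-testing adjustment is not shown.

```python
from math import comb


def enrichment_p_value(background, annotated, deg_count, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    background: number of genes in the background set
    annotated:  background genes annotated to the term or pathway
    deg_count:  number of DEGs tested
    overlap:    DEGs annotated to the term or pathway
    """
    upper = min(annotated, deg_count)
    favourable = sum(
        comb(annotated, i) * comb(background - annotated, deg_count - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / comb(background, deg_count)


# Toy numbers chosen so the result can be checked by hand:
# (C(4,2)*C(6,1) + C(4,3)*C(6,0)) / C(10,3) = 40 / 120
p = enrichment_p_value(background=10, annotated=4, deg_count=3, overlap=2)
print(round(p, 4))  # 0.3333
```

A term would be called enriched under the stated criterion when this probability falls below 0.05.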

Integration of the PPI network and hub gene analysis

The Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) [20] was used to evaluate and integrate the PPI information of the DEGs. The cytoHubba plugin in Cytoscape version 3.7.2 was used to screen and visualize the top 15 hub genes with the maximal clique centrality (MCC) method [21].
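The MCC score of a node is commonly defined as the sum of (|C| − 1)! over all maximal cliques C that contain the node. The Python sketch below follows that definition on a toy undirected network; it is an illustration under this assumption, not the cytoHubba implementation, and the node labels are placeholders rather than genes from the study.

```python
from math import factorial


def maximal_cliques(adj):
    """Yield all maximal cliques using the basic Bron-Kerbosch recursion."""
    def expand(r, p, x):
        if not p and not x:
            yield frozenset(r)
            return
        for v in list(p):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}
    yield from expand(set(), set(adj), set())


def mcc_scores(edges):
    """Return {node: MCC score} for an undirected edge list without self-loops."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    scores = {node: 0 for node in adj}
    for clique in maximal_cliques(adj):
        for node in clique:
            scores[node] += factorial(len(clique) - 1)
    return scores


def top_hubs(scores, k):
    """Return the k highest-scoring nodes (ties broken alphabetically)."""
    return sorted(scores, key=lambda n: (-scores[n], n))[:k]


# Toy network: a triangle A-B-C plus a pendant edge C-D
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
scores = mcc_scores(edges)
print(scores["C"], top_hubs(scores, 2))  # 3 ['C', 'A']
```

Node C belongs to the triangle (contributing 2! = 2) and to the edge C-D (contributing 1! = 1), so it receives the highest score; ranking all nodes this way and keeping the top 15 corresponds to the hub gene selection described above.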

Total RNA extraction and qRT-PCR validation

Synovial tissue from nine patients with RA and nine patients with OA was harvested for qRT-PCR validation of the genes identified via bioinformatics analysis. The characteristics of the patients are shown in Supplementary Table 1. Total RNA was extracted from synovial tissue using TRIzol reagent (Invitrogen, Thermo Fisher Scientific, Inc.) according to the manufacturer’s protocol. Isolated total RNA was reverse transcribed into cDNA with PrimeScript™ RT Master Mix (Perfect Real Time) (Takara, China), and qRT-PCR was carried out using TB Green® Premix Ex Taq™ (Tli RNaseH Plus) (Takara, China). β-Actin was used as an internal reference. The relative mRNA expression was calculated using the 2^(−ΔΔCt) method. The t-test was used for the statistical analysis, and P < 0.05 indicated a significant difference.
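The relative expression calculation can be written out explicitly. In the Python sketch below, ΔCt is the target gene Ct minus the reference gene (β-actin) Ct, and ΔΔCt is the ΔCt of the sample minus the ΔCt of the calibrator. Treating OA as the calibrator group is an assumption made for illustration, and the Ct values are invented, not measurements from the study.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_calibrator, ct_ref_calibrator):
    """Relative expression by the 2^(-ddCt) method."""
    delta_sample = ct_target_sample - ct_ref_sample              # dCt, sample
    delta_calibrator = ct_target_calibrator - ct_ref_calibrator  # dCt, calibrator
    delta_delta = delta_sample - delta_calibrator                # ddCt
    return 2 ** -delta_delta


# Invented Ct values: the target amplifies two cycles earlier in the RA sample
fold = relative_expression(
    ct_target_sample=24.0, ct_ref_sample=18.0,           # RA: dCt = 6
    ct_target_calibrator=26.0, ct_ref_calibrator=18.0,   # OA: dCt = 8
)
print(fold)  # 4.0
```

Here ΔΔCt = 6 − 8 = −2, so the target is expressed 2^2 = 4 times higher in the sample than in the calibrator; a value below 1 would indicate downregulation.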

Results

Identification of DEGs

A total of 41 DEGs, including 28 upregulated and 13 downregulated genes, were identified in our study after batch correction and standardization using R. The results were visualized with a volcano plot using the R software (Fig. 1).

Fig. 1 Volcano plot of DEGs detected between RA and OA synovial tissue

GO functional enrichment and KEGG pathway analysis of DEGs

For BP, the DEGs were enriched in humoral immune response; antimicrobial humoral response; complement activation, classical pathway; positive regulation of lymphocyte activation; and lymphocyte-mediated immunity. For CC, the DEGs were enriched in immunoglobulin complex, circulating; immunoglobulin complex; blood microparticle; and external side of the plasma membrane. For MF, the DEGs were enriched in antigen binding, CXCR chemokine receptor binding, immunoglobulin receptor binding, chemokine activity, glycosaminoglycan binding, cytokine activity, and chemokine receptor binding (Fig. 2, Table 1). The significantly enriched pathways of the DEGs included cytokine-cytokine receptor interaction, PPAR signaling pathway, chemokine signaling pathway, intestinal immune network for IgA production, and toll-like receptor signaling pathway (Fig. 3, Table 2).

Fig. 2 GO functional enrichment analysis of DEGs

Table 1 GO functional enrichment analysis of DEGs

Fig. 3 KEGG pathway analysis of DEGs

Table 2 KEGG pathway analysis of DEGs

PPI network construction and hub gene selection in the PPI network

The PPI network of the DEGs was established with the STRING database and consisted of 36 nodes and 32 edges, with a PPI enrichment P value of 4.63 × 10^−14. The cytoHubba plugin in Cytoscape was used to identify and visualize 15 hub genes: CXCL10, CXCL9, CXCL11, CXCL13, GZMK, IGJ, POU2AF1, LCK, ADIPOQ, IGLL5, IL7R, TNFRSF17, PLIN1, PCK1, and MMP1 (Fig. 4).

Fig. 4 The protein–protein interaction (PPI) network of the top 15 hub genes

qRT-PCR validation of the hub genes

Based on a review of the literature, we examined the relative expression of CXCL10, CXCL11, CXCL13, GZMK, LCK, ADIPOQ, and PCK1 in 9 OA and 9 RA synovial tissue samples via qRT-PCR. Compared with OA synovial tissue, the expression of ADIPOQ was significantly downregulated and the expression of CXCL10 and CXCL13 was upregulated in RA synovial tissue (Fig. 5). The target gene primers are shown in Table 3.

Fig. 5 qRT-PCR validation of the hub genes between RA and OA synovial tissue. The relative expression level of each gene was calculated using the 2^(−ΔΔCt) method (*P < 0.05, **P < 0.01)

Table 3 The target genes and their primer sets

Discussion

In this study, we analyzed a total of 57 RA and 46 OA synovial tissue samples using bioinformatics analysis. The results of the GO functional enrichment analysis, KEGG pathway analysis, PPI network construction, and hub gene analysis provided new insights into the molecular differences between RA and OA. The occurrence of RA and OA is a complex biological process involving abnormally expressed genes and the activation of disease-associated pathways. Identifying the DEGs and signaling pathways involved in the pathogenesis of RA and OA could therefore help to clarify the differences between the two diseases.

The GO functional enrichment analysis showed that, for BP, the DEGs between RA and OA synovial tissue were mainly enriched in humoral immune response, antimicrobial humoral response, lymphocyte-mediated immunity, and positive regulation of cell activation. For CC, they were enriched in immunoglobulin complex, circulating; external side of plasma membrane; MHC class II protein complex; and MHC protein complex. For MF, they were enriched in antigen binding, CXCR chemokine receptor binding, immunoglobulin receptor binding, chemokine activity, cytokine activity, and chemokine receptor binding. These functional enrichment analyses indicated that the DEGs were enriched in inflammation and immune response, which play an important role in the pathogenesis of RA [22, 23]. KEGG pathway analysis showed that the DEGs were enriched in cytokine-cytokine receptor interaction, PPAR signaling pathway, chemokine signaling pathway, rheumatoid arthritis, and toll-like receptor signaling pathway, all of which are closely related to inflammation and immune diseases [24, 25]. These results point to an important role for the DEGs in distinguishing RA from OA, although further study is needed to confirm this conclusion.

The DEGs were further analyzed by constructing a PPI network and screening the hub genes, and the selected hub genes were then verified via qRT-PCR. The expression of CXCL10 and CXCL13 was upregulated in RA synovial tissue, whereas the expression of ADIPOQ was downregulated. CXCL10 is located on chromosome 4q21.1 and encodes a chemokine of the CXC subfamily that is a ligand for the receptor CXCR3. CXCL10 has been detected in many autoimmune diseases, such as type 1 diabetes, systemic lupus erythematosus, and RA, and it has been found in the sera, synovial fluid, and synovial tissue of patients with RA [26, 27]. In the present study, we found that CXCL10 was upregulated in RA synovial tissue compared with OA synovial tissue. CXCL13 is also located on chromosome 4q21.1, and synovial CXCL13 is a marker of a severe pattern of RA [28]. In our study, the expression of CXCL13 was upregulated in RA synovial tissue compared with OA synovial tissue, which is consistent with previous studies. ADIPOQ encodes an adipokine that is released from adipose tissue and plays an important role in bone formation and bone resorption [29]. ADIPOQ is located on chromosome 3q27.3, and mutations in this gene are associated with adiponectin deficiency. Adiponectin expression levels were reported to be higher in patients with OA than in healthy controls. The expression of this gene is also increased in knee osteoarthritis synovial tissue [30]. In addition, the ADIPOQ SNP rs182052 is associated with susceptibility to knee osteoarthritis in the Chinese population [31], and the SNP rs1501299 may be associated with the development of OA, as previously reported [32]. In the present study, ADIPOQ was downregulated in RA synovial tissue compared with OA synovial tissue. This may be because obesity is more strongly associated with the occurrence of OA than with that of RA.

In conclusion, we integrated bioinformatics analysis and qRT-PCR validation to identify the DEGs between RA and OA synovial tissue. In contrast to previous studies that included a single dataset, our study used multiple datasets to obtain more accurate and reliable results. We identified three hub genes, CXCL10, CXCL13, and ADIPOQ, which were validated using qRT-PCR. These genes may serve as novel biomarkers and potential targets for the differential diagnosis and treatment of RA and OA. Further experiments are needed to verify these candidate biomarkers.