Background

Tuberculosis (TB) is a highly infectious disease caused by Mycobacterium tuberculosis and is one of the top ten causes of death worldwide. It is most prevalent in developing countries and is generally treatable if detected early. Although TB primarily affects the lungs, it can spread to other parts of the body, a condition known as extrapulmonary tuberculosis (EPTB) [1, 2]. One form of EPTB is bone and joint tuberculosis (BJTB), a destructive lesion caused by the invasion of Mycobacterium tuberculosis into bones or joints. After invading a joint, Mycobacterium tuberculosis can cause inflammation and proliferate extensively within cells, restricting and disrupting immune system function [3]. The incidence of BJTB continues to rise, and it is speculated that bone tuberculosis will become the seventh leading cause of disability [4].

Diagnosing musculoskeletal tuberculosis infection remains a significant clinical challenge. Timely diagnosis and treatment of bone tuberculosis are crucial for preventing severe bone and joint damage and severe neurological sequelae [4, 5]. Bone tuberculosis is difficult to recognize before it is advanced; spinal tuberculosis in particular is often asymptomatic in its early stages. By the time bone tuberculosis is ultimately diagnosed, the signs and symptoms are typically severe [6, 7]. With the widespread application of targeted therapy and diagnostics, it is important to explore new methods and molecular targets for the diagnosis and treatment of bone tuberculosis. This study aims to identify molecular markers that can serve as diagnostic markers for BJTB patients, to provide potential molecular targets for early clinical diagnosis of BJTB, and to preliminarily discuss the molecular and cellular immune mechanisms of BJTB development, thus providing a theoretical basis for the treatment of BJTB.

Materials and methods

Sample collection

Patients who visited the Department of Orthopaedics of the Second Hospital & Clinical Medical School, Lanzhou University from January 2021 to May 2023 were selected. Patients diagnosed with spinal tuberculosis based on symptoms, signs, imaging, laboratory, and histopathological examinations, and in whom tuberculosis at other sites, other infectious diseases, tumors, trauma, immunodeficiency diseases, and metabolic diseases had been excluded, were included in the experimental group (test group). Patients diagnosed with lumbar intervertebral disc degeneration based on symptoms, signs, and imaging examinations, with the same exclusions, were included in the control group. Each group consisted of 5 patients. The experimental group comprised 3 males and 2 females aged 28–75 years, including 2 cases of thoracic vertebral tuberculosis, 1 case of lumbar vertebral tuberculosis, and 2 cases with tuberculous lesions in both thoracic and lumbar vertebrae; 1 case had an intraspinal abscess, and 4 cases had multiple abscesses (3 with paravertebral and intraspinal abscesses, 1 with paravertebral and psoas major abscesses). The control group comprised 3 males and 2 females aged 27–82 years, all with lumbar disc degeneration. There was no statistically significant difference in age or gender between the experimental and control groups (P > 0.05). After surgical resection, the diseased intervertebral disc tissue was cleaned with physiological saline on a sterile table within 10 min, cut into small pieces, placed in a specimen storage tube, immediately immersed in liquid nitrogen, and then stored at -80 °C. The study was reviewed by the Ethics Committee of the Second Hospital of Lanzhou University, and informed consent was obtained from all patients.

RNA extraction and sequencing

Total RNA was treated with RQ1 DNase (Promega) to remove DNA. The quality and quantity of the purified RNA were assessed by measuring the absorbance at 260 nm and 280 nm (A260/A280) using a Nanodrop One (Thermo) spectrophotometer. RNA integrity was further confirmed by 1.5% agarose gel electrophoresis.

For each sample, 1 µg of total RNA was used for RNA-seq library preparation using the KAPA Stranded mRNA-Seq Kit for Illumina® Platforms (#KK8544). Polyadenylated mRNAs were purified using VAHTS mRNA Capture Beads (N401-01), fragmented, and converted into double-stranded cDNA. After end repair and A-tailing, the cDNAs were ligated to the Diluted Roche Adaptor (KK8726). The ligated products were purified, size-fractionated to 300–500 bp, amplified, purified, quantified, and stored at -80 °C before sequencing. The strand marked with dUTP (the second cDNA strand) was not amplified, enabling strand-specific sequencing.

For high-throughput sequencing, libraries were prepared according to the manufacturer’s instructions and sequenced on the Illumina Novaseq 6000 system using 150 nt paired-end sequencing.

RNA-seq raw data cleaning and alignment

Raw reads containing more than 2 N bases were first discarded. Adapters and low-quality bases were then trimmed from the raw sequencing reads using the FASTX-Toolkit (Version 0.0.13), and reads shorter than 16 nt were removed. Clean reads were then aligned to the GRCh38 genome using TopHat 2 [8], allowing up to 4 mismatches. Uniquely mapped reads were used for gene read counting and FPKM (fragments per kilobase of transcript per million mapped reads) calculation [9].
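The read filters and the FPKM normalization described above can be illustrated with a minimal Python sketch. This is not the pipeline used in the study (which relied on the FASTX-Toolkit and TopHat 2); the function names and the example numbers are hypothetical, and the two read filters are combined into one check for brevity even though the text applies them at different steps.

```python
def passes_filters(read: str, max_n: int = 2, min_len: int = 16) -> bool:
    """Keep a read with at most `max_n` N bases and a length of at least `min_len` nt."""
    return read.upper().count("N") <= max_n and len(read) >= min_len


def fpkm(fragments: int, transcript_length_bp: int, total_mapped_fragments: int) -> float:
    """Fragments per kilobase of transcript per million mapped fragments."""
    kilobases = transcript_length_bp / 1_000
    millions = total_mapped_fragments / 1_000_000
    return fragments / (kilobases * millions)


# Hypothetical numbers for illustration only:
# 500 fragments on a 2 kb transcript in a library of 20 million mapped fragments.
print(fpkm(fragments=500, transcript_length_bp=2_000, total_mapped_fragments=20_000_000))  # 12.5
```

Dividing by transcript length and by library size makes expression values comparable between genes of different lengths and between samples sequenced to different depths.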

Differentially expressed genes (DEGs) analysis

The R Bioconductor package edgeR [10] was utilized to screen out the DEGs. A false discovery rate < 0.01 and fold change > 2 or < 0.5 were set as the cut-off criteria for identifying DEGs.
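As a minimal sketch of how these cut-offs partition genes, the following Python snippet classifies a gene from its fold change and false discovery rate. It assumes the fold change is expressed on a linear scale as TB over control and that edgeR has already produced the FDR values; the class, function, and gene names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class GeneResult:
    gene: str
    fold_change: float  # assumed linear scale, TB / control
    fdr: float          # assumed to come from the upstream edgeR analysis


def classify(result: GeneResult, fdr_cutoff: float = 0.01, fc_cutoff: float = 2.0) -> str:
    """Label a gene as 'up', 'down', or 'not_significant' using the stated cut-offs."""
    if result.fdr >= fdr_cutoff:
        return "not_significant"
    if result.fold_change > fc_cutoff:
        return "up"
    if result.fold_change < 1.0 / fc_cutoff:
        return "down"
    return "not_significant"


# Hypothetical values for illustration only.
results = [
    GeneResult("GENE_A", fold_change=4.0, fdr=0.001),
    GeneResult("GENE_B", fold_change=0.25, fdr=0.005),
    GeneResult("GENE_C", fold_change=3.0, fdr=0.2),
]
print([classify(r) for r in results])  # ['up', 'down', 'not_significant']
```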

The screening and analysis of long non-coding RNAs (LncRNAs)

Cufflinks was used to assemble and predict transcripts from the RNA-seq data, and predicted transcripts with RPKM values less than 1 were excluded. Cuffcompare was then used to merge these transcripts into a unified transcript set. From the assembly results, transcripts that overlapped with known coding genes, were shorter than 200 base pairs, had coding potential, or lay within 1000 base pairs of the nearest gene were sequentially eliminated, yielding the predicted novel LncRNAs. The predicted LncRNAs were then aligned to the NONCODE database (http://www.noncode.org/) to determine how many of them were already present in that database, and these LncRNAs were annotated according to the reference genome. Comparative analysis between samples was conducted to identify differentially expressed LncRNAs (DE LncRNAs). For DE LncRNAs, gene information within a 10 kilobase (kb) range upstream and downstream was extracted, and the Pearson correlation coefficient between DE LncRNAs and mRNAs was calculated for co-expression analysis. LncRNA-target gene pairs with an absolute correlation coefficient greater than 0.6 and a p value less than 0.05 were selected. The union of these two datasets was taken to determine the cis-acting targets of LncRNAs. KEGG and GO enrichment analyses were then performed on the targets.
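The positional and co-expression criteria above can be sketched in Python as follows. This is an illustrative simplification rather than the study's implementation: the record layout, names, coordinates, and expression values are hypothetical, the correlation cutoff is passed in explicitly, the p value test is omitted because the standard library has no t-distribution, and a gene is reported only when it satisfies both the 10 kb window and the correlation cutoff (an assumption about how the two criteria are combined).

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def within_window(lnc, gene, window_bp=10_000):
    """True if the gene lies on the same chromosome and overlaps the lncRNA
    span extended by `window_bp` upstream and downstream."""
    return (
        lnc["chrom"] == gene["chrom"]
        and gene["start"] <= lnc["end"] + window_bp
        and gene["end"] >= lnc["start"] - window_bp
    )


def coexpressed_neighbors(lnc, genes, min_abs_r, window_bp=10_000):
    """Return (gene name, r) for nearby genes whose |r| with the lncRNA meets the cutoff."""
    hits = []
    for gene in genes:
        if not within_window(lnc, gene, window_bp):
            continue
        r = pearson_r(lnc["expr"], gene["expr"])
        if abs(r) >= min_abs_r:
            hits.append((gene["name"], r))
    return hits


# Hypothetical records for illustration only (coordinates and expression are made up).
lnc = {"name": "lncRNA_X", "chrom": "chr1", "start": 50_000, "end": 52_000,
       "expr": [1.0, 2.0, 3.0, 4.0, 5.0]}
genes = [
    {"name": "GENE_NEAR", "chrom": "chr1", "start": 55_000, "end": 58_000,
     "expr": [2.1, 3.9, 6.2, 8.0, 9.9]},
    {"name": "GENE_FAR", "chrom": "chr1", "start": 90_000, "end": 95_000,
     "expr": [2.0, 4.0, 6.0, 8.0, 10.0]},
]
print([name for name, _ in coexpressed_neighbors(lnc, genes, min_abs_r=0.98)])  # ['GENE_NEAR']
```

With only five samples per group, very high correlation coefficients can arise by chance, so a significance test alongside the coefficient matters in practice.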

Functional enrichment analysis

To categorize the functional classes of DEGs, Gene Ontology (GO) terms and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways were identified using the KOBAS 2.0 server [11]. A hypergeometric test and Benjamini-Hochberg FDR controlling procedure were applied to determine the enrichment of each term.
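For readers unfamiliar with the statistics, the hypergeometric enrichment test and the Benjamini-Hochberg adjustment can be sketched in a few lines of Python. This is not the KOBAS 2.0 implementation; the function names and the example counts are hypothetical, and only the standard library is used.

```python
from math import comb


def hypergeom_enrichment_p(k: int, n: int, K: int, N: int) -> float:
    """Upper-tail probability P(X >= k) of drawing at least k annotated genes
    when n DEGs are drawn from N background genes, K of which carry the term."""
    upper = min(n, K)
    total = comb(N, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical counts for illustration only: 40 of 974 DEGs carry a term that
# annotates 300 of 20,000 background genes.
p = hypergeom_enrichment_p(k=40, n=974, K=300, N=20_000)
print(p < 0.05)
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

The adjustment is needed because thousands of terms are tested at once, so unadjusted p values would yield many false positives.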

Validation of qPCR assay

Total RNA was extracted from the samples of patients with lumbar disc herniation and spinal tuberculosis using the Trizol method and dissolved in DEPC water. Subsequently, the cDNA was amplified using the Takara reaction kit, following the steps of genomic DNA removal, reverse transcription, and real-time quantitative PCR. The amplification protocol was as follows: initial denaturation at 95 °C for 10 min for one cycle, followed by 40 cycles of 95 °C for 15 s, 60 °C for 15 s, and 72 °C for 30 s. The Ct values of each group were collected, and the relative mRNA quantity was calculated as $F = 2^{-\Delta\Delta Ct}$, where $\Delta Ct = Ct_{\text{target gene}} - Ct_{\text{reference gene}}$ and $\Delta\Delta Ct = \Delta Ct_{\text{TB}} - \Delta Ct_{\text{control}}$. The primer sequences are as follows (Table 1):

Table 1 Primer sequence
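The relative quantification formula given in the qPCR validation section can be written out as a short Python sketch. The function name and the Ct values are hypothetical and serve only to illustrate the arithmetic.

```python
def relative_expression(ct_target_tb: float, ct_ref_tb: float,
                        ct_target_ctrl: float, ct_ref_ctrl: float) -> float:
    """Relative expression F = 2^(-ΔΔCt) of a target gene in TB versus control."""
    delta_ct_tb = ct_target_tb - ct_ref_tb        # ΔCt in the TB group
    delta_ct_ctrl = ct_target_ctrl - ct_ref_ctrl  # ΔCt in the control group
    delta_delta_ct = delta_ct_tb - delta_ct_ctrl  # ΔΔCt = ΔCt(TB) - ΔCt(control)
    return 2 ** (-delta_delta_ct)


# Hypothetical Ct values: ΔCt(TB) = 4, ΔCt(control) = 7, so ΔΔCt = -3 and F = 2^3.
print(relative_expression(22.0, 18.0, 25.0, 18.0))  # 8.0
```

A value above 1 indicates higher expression in the TB group than in the control group, and a value below 1 indicates lower expression.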

Results

Identification and enrichment analysis of DEGs in human spinal tuberculosis tissue

RNA-seq analysis was performed on human spinal tuberculosis tissue and control tissue. Principal component analysis based on all DEGs revealed distinct clustering of the two patient groups (Fig. 1D). A total of 2366 DEGs were identified through high-throughput sequencing, and the top 30 DEGs are depicted in Figure S1G. Of these DEGs, 974 genes were significantly upregulated (Fig. 1B). KEGG enrichment analysis indicated that the upregulated genes are associated with pathways such as cytokine-cytokine receptor interaction, tuberculosis, and TNF-α signaling (Figure S1E). These genes are primarily enriched in biological processes including chemotaxis, cell-cell signaling, immune response, interferon-γ-mediated signaling pathways, inflammatory responses, cytokine-mediated signaling pathways, signal transduction, cellular response to lipopolysaccharides, extracellular matrix organization, and T cell co-stimulation (Fig. 1E). Genes of interest include CXCL9, CXCL10, PLA2G2D, and IL1B. CXCL9 and CXCL10 can promote protective immune responses, PLA2G2D facilitates cell metabolism, and IL1B mediates inflammatory responses to promote tissue regeneration.

Analysis of the downregulated DEGs identified 1392 genes with significantly reduced expression levels (Fig. 1B). KEGG enrichment analysis suggested that downregulated genes are linked to pathways such as ECM-receptor interaction, PPAR signaling pathway, and arachidonic acid metabolism (Figure S1F). GO enrichment analysis indicated that downregulated genes are enriched in biological processes related to muscle filament sliding, muscle contraction, striated muscle contraction, regulation of striated muscle contraction, cardiac muscle contraction, defense response to fungus, collagen metabolic process, skeletal system development, and glucose metabolic process (Fig. 1F). Genes associated with muscle contraction, such as TCAP, TNNC2, and PGAM2, deserve particular attention.

Fig. 1

Genome-wide profiling of BJTB-associated lncRNA in human clinical tissues. A: Venn diagram of LncRNAs detected in TB and control samples. An lncRNA was considered detected in a group if at least two samples had FPKM ≥ 0.5; B: The number of DE LncRNAs and all DEGs between TB and control samples. The numbers of upregulated and downregulated DEGs are shown in the bar plot; C-D: Principal component analysis (PCA) of the 10 TB and control samples based on normalized expression levels of DE LncRNAs (C) and all DEGs (D). The samples are grouped by disease state, and the ellipse for each group is the confidence ellipse; E-F: Bar plots showing the top ten enriched GO biological process terms of upregulated DEGs (E) and downregulated DEGs (F); G: Expression heatmap of the top 30 most significant DE LncRNAs between TB and control samples

Genome-wide profiling of BJTB-Associated lncRNA in human clinical bone tissues

Using an RPKM threshold of > 0.5 for filtering, lncRNA analysis of bone tuberculosis RNA-seq data revealed a total of 3652 LncRNAs (Fig. 1A), of which 3293 were known LncRNAs (1589 exclusively in the control group, 138 exclusively in the TB group, and 1566 shared between both groups); 359 were novel LncRNAs (151 exclusively in the control group, 129 exclusively in the TB group, and 79 shared between both groups) (Figure S1A). With a fold change (FC) threshold of ≥ 2 or ≤ 0.5 and an FDR of < 0.05, a total of 540 DE LncRNAs were identified in the TB group compared to the control group, with 356 LncRNAs significantly upregulated and 184 LncRNAs significantly downregulated (Fig. 1B).

Co-expression analysis of lncRNA regulation in trans

Further analysis of DE LncRNAs involved extracting gene information within a 10 kb range upstream and downstream of the LncRNAs and calculating the Pearson correlation coefficient between DE LncRNAs and mRNAs for co-expression analysis. LncRNA-target pairs with an absolute correlation coefficient ≥ 0.98 and a p value < 0.01 were selected, and the intersection of these two datasets was taken to identify the cis-acting targets of LncRNAs. The analysis revealed that 311 DE LncRNAs could cis-regulate 777 target genes (Fig. 2A). These target genes were enriched in biological processes such as muscle contraction, inflammatory response, and immune response, which are closely related to bone tuberculosis (Fig. 2B). Fifty-one cis-regulated target genes were enriched in the immune response pathway (Fig. 2C); they were primarily associated with chemotaxis, immune response, cellular response to lipopolysaccharide, chemokine-mediated signaling pathway, neutrophil chemotaxis, inflammatory response, cytokine-mediated signaling pathway, cellular response to interferon-gamma, positive regulation of T cell proliferation, and antimicrobial humoral immune response mediated by antimicrobial peptide (Fig. 2D). LncRNAs regulating immune response-related genes, such as the upregulated RP11-451G4.2, RP11-701P16.5, AC079767.4, AC017002.1, LINC01094, CTA-384D8.35, and AC092484.1 and the downregulated RP11-2C24.7, merit close attention (Fig. 2E). The co-expression network linking the top 10 biological processes with the DE LncRNAs and target genes involved was also analyzed to visualize these relationships (Fig. 2F).

Fig. 2

Co-expression network between DE LncRNAs and DE mRNAs. A: Scatter plot showing DE LncRNAs and their numbers of co-expressed DE mRNAs. Red points denote upregulated LncRNAs involved in co-expression pairs, and blue points denote downregulated LncRNAs. Cutoffs of p value < 0.01 and Pearson coefficient ≥ 0.98 were applied to identify the co-expression pairs; B: Top 10 most enriched GO terms (biological processes) of the DE mRNAs co-expressed with DE LncRNAs; C: Heat map of the 51 DE lncRNA target genes related to immune response; D: GO enrichment map of the 51 DE lncRNA target genes related to immune response; E: Boxplots showing the expression of eight DE LncRNAs in TB and control samples; F: The co-expression network between DE LncRNAs and DE mRNAs involved in the top 10 GO terms shown in (B). LncRNAs are on the left, co-expressed mRNAs are in the center, and the GO terms enriched among the mRNAs are on the right

Verification of screened mRNAs and LncRNAs

Among the screened mRNAs, the expression levels of CXCL9, CXCL10, PLA2G2D, and IL1B in the TB group were significantly elevated compared to the control group (P < 0.01). Among the screened LncRNAs, the expression of RP11-451G4.2, RP11-701P16.5, AC079767.4, AC017007.1, LINC01094, and CTA-384D8.35 in the TB group was significantly increased compared to the control group (P < 0.01), while the expression of RP11-2C24.7 was significantly decreased (P < 0.01) (Fig. 3).

Fig. 3

Verification of screened mRNAs and LncRNAs. qPCR was used to detect the levels of CXCL9 (A), CXCL10 (B), PLA2G2D (C), IL1B (D), RP11-451G4.2 (E), RP11-701P16.5 (F), AC079767.4 (G), AC017007.1 (H), LINC01094 (I), CTA-384D8.35 (J) and RP11-2C24.7 (K).

Discussion

This study identified DEGs and long non-coding RNAs (LncRNAs) in human bone tuberculosis tissue. By establishing the co-expression relationship between DE mRNAs and LncRNAs, we discovered 8 LncRNAs that are significantly differentially expressed between bone tuberculosis and control samples. The upregulated LncRNAs include RP11-451G4.2, RP11-701P16.5, AC079767.4, AC017007.1, LINC01094, CTA-384D8.35, and AC092484.1, while the downregulated lncRNA is RP11-2C24.7. The aberrant expression of these LncRNAs may influence the pathogenesis and progression of bone tuberculosis by regulating the expression of genes related to downstream signaling pathways involved in muscle contraction, striated muscle contraction, inflammatory response, and immune response. The findings provide valuable insights for further validation and elucidation of the role and mechanisms of LncRNAs in bone tuberculosis, which is significant for the identification of new therapeutic targets for the disease.

High-throughput sequencing is now widely used to search for novel diagnostic and therapeutic targets in various diseases, with promising results. Our research group conducted transcriptome sequencing on clinical samples of bone tuberculosis and identified multiple potential molecular markers and therapeutic targets through bioinformatics analysis. These include the immune-related genes CXCL9, CXCL10, PLA2G2D, and IL1B, which are significantly overexpressed in spinal tuberculosis. CXCL9 and CXCL10 can promote protective immune responses [12]; PLA2G2D facilitates cell metabolism [13] and regulates immune cell infiltration and the expression of immune checkpoint genes [14]; and IL1B mediates inflammatory responses to promote tissue regeneration [15]. The establishment and evasion of innate immunity are crucial components of the interaction between Mycobacterium tuberculosis and its host [16]. The earliest encounter between the host and the pathogen in TB occurs between innate immune cells and Mycobacterium tuberculosis. Innate immunity is vital not only for early anti-mycobacterial responses but also for the progression and long-term control of tuberculosis infection, because it continuously stimulates and sustains adaptive immune responses and regulates inflammation. However, Mycobacterium tuberculosis employs multiple strategies to disrupt innate immune responses and establish chronic infections [17,18,19]. Current research on the host’s innate immunity in bone tuberculosis is limited. This study’s transcriptome findings suggest that differentially expressed immunity-related molecules may also play a significant role in the occurrence and development of bone tuberculosis. Further validation using clinical samples and molecular biology methods could therefore provide a stronger foundation for identifying potential molecular markers and therapeutic targets for bone tuberculosis.

Host genes related to immunity, cell death, and phagocytosis are upregulated upon stimulation by Mycobacterium tuberculosis [20]. In addition to mRNAs, miRNAs and a large number of other non-coding RNAs are also involved in host-pathogen interactions. The characteristics of RNA regulatory networks are an emerging topic in the field of host-pathogen interactions [20, 21]. Previous studies have found interactions between various RNAs in tuberculosis patients, but there is relatively little research on bone tuberculosis [22, 23]. Long non-coding RNAs (LncRNAs) can regulate the expression of protein-coding genes and participate in gene silencing, cell cycle regulation, and cell differentiation processes [24]. LncRNAs can also regulate immune responses, and their dysregulation is associated with many diseases, including cancer and infectious diseases [25]. Numerous studies have found that lncRNA expression patterns may play a significant role in tuberculosis infection and can serve as novel molecular biomarkers and therapeutic targets for tuberculosis [26]. For instance, lncRNA SNHG15 is significantly increased in spinal tuberculosis tissues. Downregulation of lncRNA SNHG15 expression can inhibit the secretion of inflammatory cytokines by regulating the RANK/RANKL pathway, thereby modulating osteoclasts. Our group’s further analysis of LncRNAs in the sequencing data revealed 540 DE LncRNAs, with 311 DE LncRNAs potentially cis-regulating 777 target genes. These target genes are enriched in biological processes such as negative regulation of RNA polymerase II promoter transcription, DNA-dependent translation, DNA-dependent regulatory translation, multicellular tissue development, innate immune response, and small molecule metabolism. Among them, the upregulated LncRNAs RP11-451G4.2, RP11-701P16.5, AC079767.4, AC017002.1, LINC01094, CTA-384D8.35, and AC092484.1 and the downregulated RP11-2C24.7, which are involved in the regulation of immune response genes, along with the cis-regulated genes C3, AGO4, CTSB, IGHE, and FGFR1, are of significant interest due to their potential roles in immune modulation.

In summary, the DE mRNAs and LncRNAs in bone tuberculosis are both associated with immune regulatory pathways, which play a crucial role in the occurrence and development of tuberculosis. These pathways may not only promote or inhibit tuberculosis infection and progression at the mechanistic level but may also play an important role in the spread of tuberculosis to bone tissue. The immune regulation-related molecules identified in this study may serve as biomarkers for early diagnosis and as therapeutic targets for bone tuberculosis. However, because the controls in this study were patients with lumbar intervertebral disc degeneration rather than clinically healthy subjects, their gene expression and signaling pathways may already differ from those of a normal population. The evidence that the DEGs and signaling pathways identified here can serve as biomarkers for the early diagnosis of bone tuberculosis and as molecular targets for treatment may therefore be insufficient. Future research will collect tissues from healthy subjects as controls to screen more accurately for markers and molecular targets of bone tuberculosis, and will further verify them in animal experiments.