Introduction

Almost 2.5 million people around the world suffer from multiple sclerosis (MS), a chronic inflammatory disease of the central nervous system (CNS). MS is characterized by demyelination of axons in the brain and spinal cord (Dendrou et al. 2015). Genetic and environmental factors influence the onset and development of MS (Olsson et al. 2017). Although it has been proposed that MS is caused by the action of CD4+ T cells (McFarland and Martin 2007), recent evidence has revealed that memory B cells also play an important role in the pathogenesis of MS (Corcione et al. 2004; Cepok et al. 2005; Baker et al. 2017). B cells participate in the progression of CNS lesions by producing auto-antibodies, secreting pro-inflammatory cytokines, and presenting auto-antigens to activated T cells (Krumbholz et al. 2012; Fillatreau et al. 2002; Bar-Or et al. 2010; Tintore et al. 2008; Ignacio et al. 2010; Berger et al. 2003; Sellebjerg et al. 2009). Depletion of memory B cells is a high-efficacy treatment for relapsing MS, while increasing memory B cells exacerbates MS (Baker et al. 2017). In clinical trials, rituximab, a monoclonal antibody against CD20, was effective against MS progression through the targeting of memory B cells (Bar-Or et al. 2008; Duddy et al. 2007).

Long non-coding RNAs (lncRNAs) are defined as transcripts longer than 200 nucleotides that lack a significant open reading frame for encoding protein (Ulitsky and Bartel 2013; Liao et al. 2011). lncRNAs play important roles in multiple biological processes through the regulation of gene expression (Fatica and Bozzoni 2014; Lee 2012). As disease biomarker candidates, they have advantages over protein-coding genes because of their relative stability in body fluids and the relative ease of their detection by highly sensitive and specific PCR methods (Geisler and Coller 2013; Tong and Lo 2006).

Around 50% of protein-coding genes are co-expressed with a lncRNA located less than 50 kb away. Moreover, this correlation does not depend on the orientation of transcription of the lncRNA genes relative to the co-expressed protein-coding genes (Spurlock III et al. 2015).

Several lines of evidence indicate that miRNA levels are modulated in autoimmune diseases; regulation of such miRNAs may therefore prevent the development of autoimmune diseases, as reported for MS pathogenesis (Tufekci et al. 2011). LncRNAs could exert their functions by interacting with such miRNAs (Paraskevopoulou et al. 2012; Li et al. 2013). The influence of lncRNAs and microRNAs (miRNAs) on each other has emerged rapidly in recent studies. In some cases, the stability of a lncRNA is reduced by interaction with specific miRNAs. In other cases, lncRNAs act as decoys for miRNAs, suppressing miRNA repression of target messenger RNAs (mRNAs). Other lncRNAs compete with miRNAs for interaction with a shared target and thereby influence mRNA expression. Additionally, some lncRNAs repress target mRNAs by producing miRNAs (Chen et al. 2012; Yoon et al. 2014).

In the current study, we analyzed a subset of lncRNAs that are specifically expressed in memory B cells and located less than 50 kb from genes differentially expressed in peripheral blood mononuclear cells (PBMCs) of relapsing-remitting multiple sclerosis (RRMS) patients.

Materials and Methods

The approaches used to identify candidate lncRNAs are illustrated in Fig. 1 and described in more detail below.

Fig. 1 In silico workflow of the study. Flow chart showing the methodological approaches used to identify lncRNAs linked to MS in B cells. In the first step, memory B cell lineage-specific lncRNAs were retrieved from whole-genome RNA-seq data, and the chromosomal locations and transcripts of these lncRNAs were extracted from Ensembl GRCh37. a Differentially expressed genes with adjusted p values < 0.05 and absolute log fold change > 1 were obtained from the GSE21942 GEO dataset, comparing MS patients to healthy individuals. Finally, using the Python programming language, lncRNAs located less than 50 kb from these differentially expressed protein-coding genes were selected as candidate lncRNAs for this study. b The sequences of the lncRNA transcripts were entered into the LncDisease software, and one of the lncRNAs was predicted to interact with miRNAs involved in MS

LncRNAs Selection

In this study 48 memory B cell lineage-specific lncRNAs were retrieved from whole-genome RNA-seq data (Ranzani et al. 2015). In the next step, the chromosomal locations and the identified transcripts for these lncRNAs were mapped using Ensembl (genome assembly GRCh37) for further analyses.

LncRNA Adjacent to Differentially Expressed Genes in MS

We used the Gene Expression Omnibus (GEO) series GSE21942 (Platform: GPL570, Affymetrix Human Genome U133 Plus 2.0 Array; 29 samples) to extract genes differentially expressed in MS patients compared to healthy individuals. The GEO2R analyzer was used to compare the two groups of samples, control and patient, and to identify differentially expressed genes. Genes with adjusted p values < 0.05 and an absolute log fold change greater than 1 were selected. Finally, using the Python programming language (v3.6.0), the selected lncRNAs located on the same chromosome as the relevant protein-coding genes were identified, and those lncRNAs located less than 50 kb from the protein-coding genes were considered appropriate candidate lncRNAs for this study.
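The two filters described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the script used in the study: the `Feature` class, the function names, and the convention that overlapping features have a distance of 0 are hypothetical, and distance is measured here as the gap between the nearest ends of the two intervals.

```python
from dataclasses import dataclass

MAX_DISTANCE_BP = 50_000  # the 50 kb window stated in the text


@dataclass(frozen=True)
class Feature:
    """A genomic interval (hypothetical container; inclusive coordinates assumed)."""
    name: str
    chrom: str
    start: int
    end: int


def is_differentially_expressed(adj_p, log_fc, alpha=0.05, min_abs_log_fc=1.0):
    """Apply the stated cut-offs: adjusted p < 0.05 and |log fold change| > 1."""
    return adj_p < alpha and abs(log_fc) > min_abs_log_fc


def gap_bp(a, b):
    """Gap between two intervals on the same chromosome; 0 if they overlap (assumed convention)."""
    return max(0, max(a.start, b.start) - min(a.end, b.end))


def neighbouring_pairs(lncrnas, genes, max_distance=MAX_DISTANCE_BP):
    """Return (lncRNA, gene, distance) for same-chromosome pairs closer than max_distance."""
    pairs = []
    for lnc in lncrnas:
        for gene in genes:
            if lnc.chrom != gene.chrom:
                continue  # only features on the same chromosome are compared
            distance = gap_bp(lnc, gene)
            if distance < max_distance:
                pairs.append((lnc.name, gene.name, distance))
    return pairs
```

With 48 lncRNAs and a few hundred genes, the nested loop is fast enough; an interval index would only matter for much larger feature sets.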

LncRNA–miRNA Interactions and Associated Diseases

To identify possible interactions between lncRNAs and miRNAs, the LncDisease software (a sequence-based bioinformatics tool) was used to find the lncRNA transcripts associated with MS through interaction with miRNAs involved in the disease (Wang et al. 2016). LncDisease uses TargetScan and miRanda criteria, together with the human miRNA-disease associations in the HMDD database, to reach the final results.

Ethical Issue

The human subject protocol used for this study was approved by an Institutional Review Board of the Royan Institute (Project ID. No. 91000573). All study procedures were carried out in accordance with the approved guidelines.

Human Subjects

The present case–control study was designed at the Royan Institute of Isfahan. Written informed consent was obtained from each individual. The datasets analyzed during the current study are available in the GSE21942 dataset in the GEO database, and all data generated during this study are included in this article and its supplementary information files. Patients were clinically diagnosed with multiple sclerosis by a neurologist through clinical and laboratory parameters based on the revised McDonald criteria (Polman et al. 2011). In this study, PBMCs were isolated from 50 RRMS patients and 25 healthy controls. A total of 25 MS patients were in the remitting phase and under regular treatment with interferon beta-1a (CinnoVex®), and another 25 were in the relapsing phase. Age- and sex-matched healthy controls with no history of autoimmune diseases or malignancies and no acute or chronic infections were sampled. Disability was graded using the Expanded Disability Status Scale (EDSS), a method of quantifying disability in MS patients (Kurtzke 1983).

PBMC Preparation

Peripheral blood was collected from all subjects in tubes containing ethylenediaminetetraacetic acid (EDTA) to prevent coagulation. Human PBMCs were isolated by density gradient centrifugation on Ficoll-Hypaque lymphocyte separation medium (STEMCELL Technologies, USA).

RNA Extraction and cDNA Synthesis

Total RNA was isolated with Trizol® reagent (Invitrogen, USA) following the manufacturer’s instructions. RNA quality and quantity were assessed using a NanoDrop spectrophotometer (Nanodrop 1000, Thermo Scientific, USA) and by electrophoresis on 1% agarose gels. Next, to remove DNA contamination, RNA was treated with deoxyribonuclease I (DNase I) (Thermo Scientific, USA). Total RNA (1 μg) was used for the synthesis of complementary DNA (cDNA) following the manufacturer’s recommended procedures (Thermo Scientific, USA). cDNAs were stored at −80 °C until use.

Quantitative Reverse Transcription PCR (RT-qPCR)

Quantitative reverse transcription PCR (RT-qPCR) was carried out with the StepOne™ RT-qPCR System (Applied Biosystems, USA). RT-qPCR amplifications were performed in triplicate. Each reaction was carried out in a final volume of 10 μL and included SYBR Premix Ex Taq II (TaKaRa, Japan) and specific primer pairs for each lncRNA and coding gene. To normalize gene expression in healthy controls and in relapsing and remitting patients, UBC and YWHAZ mRNAs were used as reference genes (Oturai et al. 2016). The primer sequences used for PCR are listed in Table 1.
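The text does not give the formula used for normalization, so the following is only a sketch of one common approach, assuming a 2^(−ΔCt) calculation with roughly 100% amplification efficiency. Triplicate Ct values are averaged, and the reference level is the mean Ct of the two reference genes (equivalent to the geometric mean of their quantities). The function name and the example Ct values are hypothetical.

```python
from statistics import mean


def relative_expression(target_cts, reference_cts):
    """Relative expression of a target versus the mean of several reference genes.

    target_cts:    replicate Ct values for the target transcript
    reference_cts: dict mapping reference gene name -> replicate Ct values
    Assumes 100% PCR efficiency, i.e. one cycle corresponds to a two-fold change.
    """
    target_ct = mean(target_cts)
    reference_ct = mean(mean(cts) for cts in reference_cts.values())
    delta_ct = target_ct - reference_ct
    return 2.0 ** (-delta_ct)


# Hypothetical triplicate Ct values for one sample
example = relative_expression(
    [26.0, 26.0, 26.0],
    {"UBC": [22.0, 22.0, 22.0], "YWHAZ": [24.0, 24.0, 24.0]},
)
```

In this made-up sample the target crosses the threshold three cycles later than the averaged references, giving a relative expression of 2^(−3).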

Table 1 Primers used for quantitative reverse transcription PCR

Statistical Analysis

The statistical analyses were carried out using SPSS 17 software (SPSS, Chicago, IL, USA) and GraphPad Prism (version 6; GraphPad Software). Data normality was checked using the Shapiro–Wilk test, and differences between groups were assessed by one-way ANOVA followed by pairwise comparisons with Tukey’s correction. The correlation between lncRNAs and coding genes was assessed using Pearson’s correlation. p values less than 0.05 were considered statistically significant. Receiver Operating Characteristic (ROC) curve analysis was used to determine the power of the identified biomarkers to discriminate between patients and controls.
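For readers unfamiliar with the group comparison, the F statistic of a one-way ANOVA across the three groups (relapsing, remitting, control) can be computed as in the sketch below. The analyses in this study were run in SPSS and GraphPad Prism; this standard-library version is illustrative only, and it stops at the F statistic because the p value and Tukey's post hoc comparisons require distribution functions that are not included here. The function name is hypothetical.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of measurements."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more observations than groups")

    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of group means around the grand mean, weighted by group size
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Variation of observations around their own group mean
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within
```

A large F means that the spread between group means is large relative to the spread within groups; the corresponding p value is read from the F distribution with the two returned degrees of freedom.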

Results

LncRNAs Selection

From the available RNA-Seq data, 48 lncRNAs that were lineage-specific to memory B cells were identified and mapped to chromosomes using the GRCh37 genome assembly in Ensembl. Next, we considered the protein-coding genes that were differentially expressed in the PBMCs of MS patients compared to healthy controls (497; Supplementary Table 1), retrieved from the GEO database. We examined the proximity of this set of differentially expressed genes to the 48 selected lncRNAs and identified 2 lncRNAs located less than 50 kb from the respective protein-coding genes. AL928742.12 was 10 kb away from immunoglobulin heavy constant alpha 2 (IGHA2) and RP11-530C5.1 overlapped with PRKC apoptosis WT1 regulator (PAWR), indicating potential cis-regulatory relationships between these neighboring genes (Figs. 2 and 3). To select the appropriate candidate among the three annotated transcripts of AL928742.12, the respective exonic sequences (in FASTA format) were entered into the ‘LncRNAs Input’ panel of the LncDisease software. AL928742.12-001 was scored as an MS-associated transcript according to predicted miRNA-lncRNA interactions. In the case of RP11-530C5.1, there was only one annotated transcript.

Fig. 2 Physical proximity of lncRNAs to differentially expressed protein-coding genes from array data. Proximities of differentially expressed protein-coding genes relative to lncRNAs in MS patients were determined using the Python programming language. mRNAs retrieved from GEO were submitted to the STRING database, then combined with the selected lncRNAs and visualized using Cytoscape. Protein-coding gene–lncRNA pairs located within the distance threshold (50 kb) are indicated with thick lines

Fig. 3 Chromosomal organization of protein-coding genes and lncRNA clusters. Genomic data were obtained from the UCSC genome browser. Protein-coding genes are in blue, and lncRNA genes in green. The region on the respective chromosomes displayed in detail is indicated with a red bar in each chromosome idiogram

Demographic and Clinical Characteristics of Enrolled Samples

MS patients and healthy individuals enrolled in this study completed questionnaires. Subject information that was included in the final analyses is shown in Table 2, and additional information on all individuals is provided in Supplementary Table 2. Statistical analyses showed no significant differences between the groups with regard to sex and age.

Table 2 Demographic features of MS patients and controls

LncRNAs Expression Levels in RRMS Compared to the Healthy Controls

After lncRNA expression was measured in each group, statistical analyses showed a significant up-regulation of RP11-530C5.1 in relapsing MS patients compared to remitting patients (p value = 0.046) and healthy controls (p value = 0.002). In contrast, there was a significant decrease of AL928742.12 expression in relapsing MS patients compared to the controls (p value < 0.001) (Fig. 4).

Fig. 4 Expression level analyses of AL928742.12 and RP11-530C5.1 lncRNAs in relapsing–remitting and control samples. Scatter-plots of the expression levels of a AL928742.12 and b RP11-530C5.1 in MS and control samples, measured by RT-qPCR. Values are given as the mean normalized expression relative to UBC and YWHAZ. (*p < 0.05, **p < 0.01 and ***p < 0.001)

Evaluating Expression Levels of Coding Genes (IGHA2 and PAWR)

In the next step, we measured the expression levels of the protein-coding genes PAWR and IGHA2, which were located less than 50 kb from the differentially expressed lncRNAs. No statistically significant changes in PAWR and IGHA2 expression levels were detected when RRMS patients were compared to healthy controls (Fig. 5).

Fig. 5 Expression of IGHA2 and PAWR in MS patients compared to the controls. Scatter-plots of the differential expression level of a PAWR and b IGHA2 in relapsing, remitting, and control samples. Values are given as mean normalized expression relative to UBC and YWHAZ. (*p < 0.05, **p < 0.01 and ***p < 0.001)

Correlation Coefficient Between lncRNA and Coding Genes

Pearson’s correlation tests showed positive correlations between the expression levels of RP11-530C5.1 and PAWR (Pearson’s correlation = 0.269, p value = 0.047). Likewise, there was a significant correlation between AL928742.12 and IGHA2 (Pearson’s correlation = 0.476, p value < 0.001) (Fig. 6).
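The correlation coefficient reported here is the standard Pearson r, which can be computed from paired expression values as in the sketch below. The function name and any input values are hypothetical; the study's own values were obtained with statistical software, and the significance test for r is not reproduced.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of the centred values
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)
```

Here `x` would hold the normalized expression of a lncRNA and `y` that of its neighboring coding gene, one pair of values per subject.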

Fig. 6 Correlation analyses between lncRNAs and protein-coding genes in MS. Pearson correlation of expression between a AL928742.12 and IGHA2 and b RP11-530C5.1 and PAWR

ROC Curve Analysis Results

ROC curve analysis showed biomarker potential for AL928742.12 (AUC = 0.723, p value = 0.006) as well as for RP11-530C5.1 (AUC = 0.825, p value < 0.0001) (Fig. 7). These findings indicate that these two lncRNAs have the potential to serve as diagnostic biomarkers to distinguish healthy controls from relapse-phase MS patients.
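The area under the ROC curve (AUC) has a direct interpretation: it is the probability that a randomly chosen patient has a higher marker value than a randomly chosen control, with ties counted as one half. A minimal sketch of that calculation follows; the function name is hypothetical, and this pairwise form is an illustration rather than the procedure run in the statistical software.

```python
def roc_auc(case_values, control_values):
    """AUC as the fraction of (case, control) pairs in which the case scores higher.

    Ties contribute 0.5. Equivalent to the Mann-Whitney U statistic divided by
    the number of pairs.
    """
    if not case_values or not control_values:
        raise ValueError("both groups must contain at least one value")
    wins = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))
```

For a marker that is lower in patients, as reported for AL928742.12, this function returns a value below 0.5; the discriminatory power is then read as 1 minus the result, or the two groups are swapped.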

Fig. 7 ROC curve analysis indicates lncRNAs as likely biomarkers for MS. Discriminatory power of the individual lncRNAs, a AL928742.12 and b RP11-530C5.1, as biomarkers for distinguishing MS patients from controls

Discussion

Several specific lncRNAs were recently shown to be deregulated in the PBMCs of MS patients, leading to a proposed role for specific lncRNAs in the progression of MS pathogenesis (Zhang and Cao 2016; Teimuri et al. 2018; Hosseini et al. 2019). LncRNAs have important roles in regulating gene expression and abnormal expression of lncRNAs has recently been linked to the pathogenesis and progression of multiple diseases (DiStefano 2018). Recent studies also demonstrated control of the immune system by lncRNAs (Zhang and Cao 2016; Heward and Lindsay 2014). B lymphocytes have a key role in the normal immune response by secreting antibodies in humoral immunity. Therapies that target memory B cells have become an important focus in MS disease research (Baker et al. 2017).

In this study, we aimed to identify candidate memory B cell-specific lncRNAs involved in MS pathogenesis. To accomplish this, we selected lncRNAs specifically expressed in the memory B cell lineage that were also located less than 50 kb from genes differentially expressed in the PBMCs of MS patients. Because these differentially expressed genes might be involved in MS pathogenesis, their close physical association with differentially expressed lncRNAs suggests an associated involvement of the lncRNAs with MS as well. We evaluated the expression levels of the identified lncRNAs and their neighboring mRNAs in relapsing and remitting phase MS patients compared to healthy individuals.

Analysis of microarray data from gene expression profiling of PBMCs from relapsing–remitting MS patients demonstrated that PAWR is among the top differentially expressed genes in B cells (Comabella et al. 2015). PAWR is a pro-apoptotic gene, and a recent study showed that B cells derived from patients with RRMS induce apoptosis in oligodendrocytes and neurons via unknown secreted factors (Lisak et al. 2017). IGHA2, on the other hand, encodes the constant region of immunoglobulin heavy chains. Immunoglobulins serve as receptors that initiate B lymphocyte differentiation into antibody-secreting plasma cells. Secreted immunoglobulins mediate the effector phase of humoral immunity, which blocks antigen binding to these receptors (McHeyzer-Williams et al. 2012; Schroeder and Cavacini 2010).

RP11-530C5.1 expression was significantly higher in relapsing MS patients than in remitting-phase patients and healthy controls. The correlation between RP11-530C5.1 and PAWR expression suggests a cis-regulatory role for RP11-530C5.1 on PAWR in memory B cells.

Recently, it was shown that AL928742.12 is down-regulated in inflammatory bowel disease (Mirza et al. 2015). Our results indicated that AL928742.12 was also significantly down-regulated in relapsing MS patients compared to the healthy controls. AL928742.12 expression was also significantly correlated with IGHA2 expression.

The expression of lncRNAs may correlate with that of their adjacent genes, and lncRNAs may exert a positive or negative effect on the expression of these genes at both transcriptional and post-transcriptional levels. Such regulation is important in development, differentiation, and even the progression of human disease (Wilusz et al. 2009; Taft et al. 2010; Yap et al. 2010).

Conclusion

In this study, candidate lncRNAs involved in MS were selected from memory B cell-specific lncRNAs on the basis of their proximity to genes differentially expressed in the PBMCs of MS patients compared to healthy controls, and of their predicted interactions with miRNAs involved in MS. We propose that deregulated lncRNAs identified from these associations could provide a valuable resource for studies that aim to discern the important lncRNAs in diseases involving specific cell types, together with their associations with, and potential regulation of, nearby protein-coding genes.