Introduction

Noncoding RNAs (ncRNAs) are transcripts with no protein-coding potential and account for over 98% of all genome transcripts [1]. Ubiquitously detected in eukaryotic organisms, ncRNAs were once regarded as transcriptional noise with no specific biological function [2]. Long noncoding RNAs (lncRNAs), defined as RNA transcripts longer than 200 nucleotides that do not encode proteins, show greater tissue specificity than protein-coding mRNAs and might therefore serve as promising biomarkers for multiple diseases [3]. Emerging evidence suggests that lncRNAs participate in various critical biological processes, including imprinting of genomic loci, shaping of chromosome conformation, cell differentiation and development, as well as in diverse diseases [4].

lncRNAs use a variety of mechanisms to modulate the expression, degradation and modification of proteins. Among the most important is the competing endogenous RNA (ceRNA) hypothesis proposed by Salmena et al. [5]. This hypothesis describes a complex posttranscriptional communication network among all RNA transcript species, including lncRNAs, which can act as natural miRNA sponges and inhibit miRNA function by sharing miRNA response elements (MREs) [6]. Subsequent investigations confirmed the importance of the lncRNA–miRNA–mRNA ceRNA regulatory network in different diseases [7].

Colorectal cancer (CRC), one of the most common cancers of the digestive tract, is the third leading cause of cancer-related deaths worldwide [8]. Colorectal carcinogenesis is a multi-step process driven by the accumulation of various genetic and epigenetic alterations and their complicated interactions [9]. In recent years, an increasing number of lncRNAs have been reported to modulate the biological behavior of CRC tumor cells through ceRNA regulation. For example, lncRNA CCAT1 functions as a ceRNA to antagonize the effect of miR-410 on the down-regulation of ITPKB in human HCT116 colon cancer cells [10]. lncRNA UICLM has been reported to promote CRC liver metastasis by acting as a ceRNA for miRNA-215 to regulate ZEB2 expression [11]. In addition, the novel lncRNA TUSC7 inhibited cell proliferation by sponging miRNA-211 in CRC [12].

Previously, gene expression data of colon cancer in The Cancer Genome Atlas (TCGA) have been analyzed in relation to pathological stage [13], and potential prognostic miRNA biomarkers have been identified for predicting overall survival in colon cancer [14]. However, the overall picture of the lncRNA–miRNA–mRNA ceRNA regulatory mechanism in CRC remains unclear. Despite considerable improvement in the understanding of lncRNAs over the past decade, the biological function of only a fraction of annotated lncRNAs has been well characterized in CRC. In this study, we used RNA sequencing data of colon adenocarcinoma (COAD) and rectum adenocarcinoma (READ) from TCGA to identify significantly altered lncRNAs, miRNAs and mRNAs in colorectal carcinogenesis. In addition, we constructed the ceRNA regulatory network of these lncRNAs, miRNAs and mRNAs, which may help elucidate the molecular mechanisms involved in the initiation and progression of CRC and thus provide novel clues for clinical diagnosis and therapy.

Materials and Methods

Access of Raw Data

The raw RNA sequencing data of CRC patients were downloaded from The Cancer Genome Atlas (TCGA), a publicly available database (cancergenome.nih.gov). TCGA is a collaboration between the National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI) that has generated comprehensive, multi-dimensional maps of the critical genomic changes in 33 types of cancer [15]. It holds raw data from tumor tissue and matched normal tissue of more than 11,000 patients. In this study, RNA sequencing data of 478 colon adenocarcinoma cases and 41 controls, as well as 166 rectum adenocarcinoma cases and 10 controls, were obtained to investigate the significant changes in lncRNAs, miRNAs and mRNAs and their complex interactions in relation to the carcinogenesis of CRC.

Screening of Differentially Expressed lncRNAs, miRNAs and mRNAs

The raw count data of lncRNAs, miRNAs and mRNAs were processed with edgeR, an R package for differential expression analysis. All P values were corrected for multiple testing using the false discovery rate (FDR). Differentially expressed genes (DEGs) and differentially expressed lncRNAs (DELs) were screened with cutoffs of FDR-adjusted P value < 0.001 and |log fold change (logFC)| > 3. Differentially expressed miRNAs (DEMs) with FDR-adjusted P value < 0.001 and |logFC| > 4 were considered significant. The Ensembl database (www.ensembl.org) was used to define and annotate the DELs. To identify the ceRNA regulatory network common to CRC, we subsequently used the “MATCH” function to obtain the DELs, DEMs and DEGs that overlapped between the two profiling datasets of COAD and READ.
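The screening and overlap steps can be illustrated with a minimal Python sketch. It does not reproduce edgeR's statistical model; it only shows the thresholding and intersection logic applied to a table of logFC and raw P values. The Benjamini-Hochberg procedure is assumed as the FDR method (the text states only that FDR correction was used), the requirement that a transcript change in the same direction in both cohorts is likewise an assumption, and all transcript names and numbers are hypothetical.

```python
from typing import Dict, List, Tuple

# transcript name -> (logFC, raw P value); this layout is an assumption of the sketch
DETable = Dict[str, Tuple[float, float]]


def bh_adjust(pvalues: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def call_significant(table: DETable, min_abs_log_fc: float,
                     max_fdr: float = 0.001) -> Dict[str, str]:
    """Return {name: 'up' | 'down'} for transcripts passing both cutoffs."""
    names = list(table)
    fdr = bh_adjust([table[name][1] for name in names])
    calls: Dict[str, str] = {}
    for name, q in zip(names, fdr):
        log_fc = table[name][0]
        if q < max_fdr and abs(log_fc) > min_abs_log_fc:
            calls[name] = "up" if log_fc > 0 else "down"
    return calls


def shared_calls(a: Dict[str, str], b: Dict[str, str]) -> Dict[str, str]:
    """Keep transcripts called in both cohorts with the same direction."""
    return {name: d for name, d in a.items() if b.get(name) == d}


# Hypothetical toy tables standing in for the COAD and READ lncRNA results.
coad = {"LNC_A": (4.2, 1e-9), "LNC_B": (-3.6, 1e-7),
        "LNC_C": (1.5, 1e-8), "LNC_D": (3.8, 0.2)}
read = {"LNC_A": (3.4, 1e-6), "LNC_B": (3.9, 1e-6), "LNC_E": (-5.0, 1e-8)}

coad_calls = call_significant(coad, min_abs_log_fc=3)
read_calls = call_significant(read, min_abs_log_fc=3)
common = shared_calls(coad_calls, read_calls)
print(common)
```

In this toy input only LNC_A survives: LNC_C fails the fold-change cutoff, LNC_D fails the FDR cutoff, LNC_B changes in opposite directions in the two cohorts, and LNC_E is absent from one cohort. For miRNAs the same function would be called with `min_abs_log_fc=4`.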

Prediction of Targets of lncRNA–miRNA and miRNA–mRNA

The lncRNA–miRNA (DEL–DEM) and miRNA–mRNA (DEM–DEG) targeting relationships were predicted with miRWalk, a comprehensive online resource that provides predicted and validated miRNA binding sites on target genes for human, mouse and rat (http://zmf.umm.uni-heidelberg.de/apps/zmf/mirwalk/index.html) [16, 17]. A P value < 0.001 was taken to indicate a significant lncRNA–miRNA or miRNA–mRNA binding probability.

Construction of lncRNA–miRNA–mRNA ceRNA Regulatory Network

To construct the lncRNA–miRNA–mRNA regulatory network in CRC, we overlapped the DELs with the predicted targets of the DEMs to obtain lncRNA–miRNA regulatory pairs, and overlapped the DEGs with the predicted targets of the DEMs to obtain miRNA–mRNA regulatory pairs. Because the ceRNA theory holds that lncRNAs act as natural miRNA sponges that inhibit miRNA function, only lncRNA–miRNA and miRNA–mRNA pairs whose expression changed in opposite directions were retained. Finally, the lncRNA–miRNA–mRNA ceRNA regulatory network in colorectal carcinogenesis was constructed with Cytoscape software.
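The overlap logic described above can be sketched as follows. This is an illustration under stated assumptions rather than the pipeline actually used: target predictions are represented as a plain mapping from each miRNA to a set of target names, direction of change is encoded as "up" or "down", and the edge list is written as SIF-style lines (source, interaction type, target), which is one of the simple network formats Cytoscape can import. All identifiers are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str, str]  # (source, interaction type, target)


def build_cerna_edges(mirna_dir: Dict[str, str],
                      lncrna_dir: Dict[str, str],
                      mrna_dir: Dict[str, str],
                      predicted_targets: Dict[str, Set[str]]) -> List[Edge]:
    """Keep predicted miRNA targets whose direction of change opposes the miRNA's."""
    edges: List[Edge] = []
    for mirna, direction in sorted(mirna_dir.items()):
        opposite = "down" if direction == "up" else "up"
        for target in sorted(predicted_targets.get(mirna, set())):
            if lncrna_dir.get(target) == opposite:
                # The lncRNA is drawn as the sponge acting on the miRNA.
                edges.append((target, "lncRNA-miRNA", mirna))
            elif mrna_dir.get(target) == opposite:
                edges.append((mirna, "miRNA-mRNA", target))
    return edges


def to_sif(edges: List[Edge]) -> str:
    """Render edges as tab-separated SIF-style lines."""
    return "\n".join("\t".join(edge) for edge in edges)


# Hypothetical differential-expression calls and target predictions.
mirna_dir = {"miR_X": "down", "miR_Y": "up"}
lncrna_dir = {"LNC_A": "up", "LNC_B": "down"}
mrna_dir = {"GENE_1": "up", "GENE_2": "down", "GENE_3": "up"}
predicted_targets = {
    "miR_X": {"LNC_A", "GENE_1", "GENE_2"},
    "miR_Y": {"LNC_B", "GENE_2", "GENE_3"},
}

edges = build_cerna_edges(mirna_dir, lncrna_dir, mrna_dir, predicted_targets)
print(to_sif(edges))
```

In the toy input, GENE_2 is a predicted target of the down-regulated miR_X but is itself down-regulated, so that pair is dropped; the same holds for GENE_3 and the up-regulated miR_Y. A stricter variant could additionally require every retained miRNA to have at least one lncRNA partner and one mRNA partner.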

Functional Enrichment Analysis

The biological functions and pathways of the genes involved in the ceRNA regulatory network can provide instructive information. We therefore performed functional and pathway enrichment analysis with the Database for Annotation, Visualization and Integrated Discovery (DAVID). DAVID (https://david.ncifcrf.gov/), a bioinformatics resource with an integrated biological knowledge base and comprehensive analysis tools, helps researchers discover the biological meaning behind large lists of genes [18]. Gene ontology (GO) analysis of cellular component, molecular function and biological process [19] and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis [20] were conducted with DAVID for the differentially expressed genes identified in the ceRNA network. A P value < 0.05 was considered statistically significant.
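To make the enrichment P value concrete, the sketch below computes a one-sided hypergeometric tail probability, the standard over-representation test for a gene list against an annotation term. It is not DAVID's own calculation (DAVID reports a more conservative, modified Fisher exact statistic), and the function name and counts are illustrative assumptions.

```python
from math import comb


def enrichment_p_value(hits: int, list_size: int,
                       term_size: int, background_size: int) -> float:
    """P(X >= hits) when list_size genes are drawn without replacement
    from background_size genes, of which term_size carry the annotation."""
    upper = min(list_size, term_size)
    total = comb(background_size, list_size)
    tail = sum(
        comb(term_size, i) * comb(background_size - term_size, list_size - i)
        for i in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical counts: a 3-gene list with 2 genes annotated to a term,
# in a background of 10 genes of which 4 carry the term.
p = enrichment_p_value(hits=2, list_size=3, term_size=4, background_size=10)
print(round(p, 4))
```

With these toy counts the tail probability is (36 + 4) / 120 = 1/3, so the term would not pass the P < 0.05 cutoff; a term is reported as enriched only when observing at least that many annotated genes by chance is sufficiently unlikely.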

Results

CRC-Specific lncRNAs, miRNAs and mRNAs

Using the screening criteria of P value < 0.001 and |logFC| > 3, a total of 36 up-regulated and 12 down-regulated lncRNAs were obtained by overlapping the significantly differentially expressed lncRNAs in COAD and READ. Up-regulated lncRNAs included BBOX1-AS1, CCAT2, UCA1, FEZF1-AS1, LINC00698, LINC00858, LINC00941, MAFG-AS1, LINC01234, LINC01411, NPSR1-AS1 and SLCO4A1-AS1; down-regulated lncRNAs included B3GALT5-AS1, BVES-AS1, PGM5-AS1, LINC00974, HAND2-AS1, CDKN2B-AS1, LINC00682 and LINC00507. The most significantly altered lncRNAs and their target miRNAs in CRC are summarized in Table 1, and all altered lncRNAs are listed in Supplementary Table 1. In total, 49 up-regulated miRNAs, such as mir-21, mir-141, mir-374a, mir-19b-1, mir-126, mir-29b-1, mir-542 and mir-98, and 15 down-regulated miRNAs, such as mir-197, mir-139, mir-504, mir-6511b-1, mir-6511b-2, mir-3622a and mir-129-1, were significantly altered in both COAD and READ. Some of the most significantly changed miRNAs and their target genes in CRC are shown in Table 2, and all altered miRNAs are listed in Supplementary Table 2. In addition, 260 up-regulated and 294 down-regulated mRNAs were detected by overlapping the significantly differentially expressed genes in COAD and READ.

Table 1 Differentially expressed lncRNAs between colorectal cancer and normal tissues
Table 2 Differentially expressed miRNAs between colorectal cancer and normal tissues

ceRNA Regulatory Network of lncRNA–miRNA–mRNA

To construct the lncRNA–miRNA–mRNA ceRNA regulatory network, we next predicted the interactions of miRNAs with lncRNAs and mRNAs. According to the ceRNA theory, the expression of miRNAs should be negatively correlated with the expression of their target lncRNAs and mRNAs. Therefore, we overlapped the predicted targets of up-regulated DEMs with down-regulated DELs and DEGs, and the predicted targets of down-regulated DEMs with up-regulated DELs and DEGs. This yielded two ceRNA networks: one comprising 22 up-regulated lncRNAs, 12 down-regulated miRNAs and 122 up-regulated mRNAs (Fig. 1a), and one comprising 8 down-regulated lncRNAs, 43 up-regulated miRNAs and 139 down-regulated mRNAs (Fig. 3a).

Fig. 1

a Competitive endogenous RNA (ceRNA) regulation network of 22 up-regulated lncRNAs, 12 down-regulated miRNAs and 122 up-regulated mRNAs in colorectal cancer. Blue circle indicates down-regulated miRNA; orange triangle indicates up-regulated mRNA; red rhombus indicates up-regulated lncRNA. b–e Representative altered pathways. b Collagen catabolic process (involved genes: KLK6, MMP7, MMP13, MMP3, MMP1, MMP11, COL10A1, COL11A1); c Wnt pathway (involved genes: MMP7, WNT2, DKK4, WNT7B, PRKCG, NKD1, NKD2); d calcium ion binding (involved genes: LPO, COMP, MMP1, MMP11, NKD1, NKD2, DSC3, DSG3, PRKCG, CACNG4, TESC, GRIN2D); e organic anion transport (involved genes: SLCO1B3, SLC22A11, SLC26A9, SLC13A3, SLC4A11)

GO Functional Enrichment

We then performed GO functional enrichment analysis for the genes differentially expressed between CRC and normal tissues that were involved in the ceRNA regulatory network. The up-regulated genes of the ceRNA network were mainly enriched in biological processes (BP) including organic anion transport, collagen catabolic process, negative regulation of RNA metabolic process, wound healing and Wnt receptor signalling pathway (Fig. 2). Cellular component (CC) analysis indicated enrichment in extracellular region, cell junction, proteinaceous extracellular matrix, anchoring junction and extracellular matrix. In addition, these up-regulated genes showed significant enrichment in the molecular functions (MF) of calcium ion binding, anion antiporter activity, serine-type peptidase activity and serine hydrolase activity (Table 3).

Fig. 2

Gene ontology analysis of up-regulated genes involved in the ceRNA regulation network of colorectal cancer. A total of 122 up-regulated mRNAs were chosen for GO analysis of biological process, cellular component and molecular function. The bar graphs show the enrichment of these mRNAs, with GO terms on the horizontal axis and −log P value on the vertical axis

Table 3 Gene ontology analysis of genes involved in the ceRNA regulation network of colorectal cancer

For the down-regulated genes involved in the ceRNA regulatory network (Fig. 3), GO analysis indicated significant enrichment in the biological processes (BP) of metal ion homeostasis, transmission of nerve impulse, cell–cell signalling, transmembrane transport and cell surface receptor-linked signal transduction (Fig. 4). Cellular component (CC) analysis indicated enrichment in sarcoplasmic reticulum, sarcoplasm, cell fraction, extracellular region and synapse. Additionally, the down-regulated genes showed significant enrichment in the molecular functions (MF) of peptide receptor activity, neuropeptide receptor activity, peptide binding, ATPase activity and hormone activity (Table 3).

Fig. 3

a Competitive endogenous RNA (ceRNA) regulation network of 8 down-regulated lncRNAs, 43 up-regulated miRNAs and 139 down-regulated mRNAs in colorectal cancer. Orange circle indicates up-regulated miRNA; blue triangle indicates down-regulated mRNA; green rhombus indicates down-regulated lncRNA. b–e Representative altered pathways. b ABC transporters (involved genes: ABCB11, ABCA8, ABCG2, ABCB5); c neuroactive ligand–receptor interaction (involved genes: ADCYAP1R1, GLP2R, NPY2R, GRIK3, GALR1, AGTR1, AVPR1B, TACR2); d calcium signalling pathway (involved genes: CHP2, NOS1, MYLK, AGTR1, AVPR1B, TACR2); e steroid hormone biosynthesis (involved genes: HSD3B2, CYP3A4, UGT2B17)

Fig. 4

Gene ontology analysis of down-regulated genes involved in the ceRNA regulation network of colorectal cancer. A total of 139 down-regulated mRNAs were chosen for GO analysis of biological process, cellular component and molecular function. The bar graphs show the enrichment of these mRNAs, with GO terms on the horizontal axis and −log P value on the vertical axis

KEGG Pathway Analysis

KEGG analysis showed that the up-regulated genes of the ceRNA network were enriched in the Wnt signalling pathway, tyrosine metabolism, taurine and hypotaurine metabolism, melanogenesis and phenylalanine metabolism (Table 4). The down-regulated genes of the ceRNA network showed significant enrichment in ABC transporters, neuroactive ligand–receptor interaction, retinol metabolism, nitrogen metabolism, calcium signalling pathway and steroid hormone biosynthesis (Table 4). Representative altered pathways are shown in Figs. 1b–e and 3b–e.

Table 4 Kyoto Encyclopedia of Genes and Genomes pathway enrichment analysis of genes involved in the ceRNA regulation network of colorectal cancer

Discussion

In 2011, Salmena et al. [5] proposed a unifying hypothesis on how mRNAs and lncRNAs “talk” to each other by using microRNA response elements (MREs) as letters of a new language. This “competing endogenous RNA” (ceRNA) communication forms a large-scale regulatory network across the transcriptome, which greatly expands the functional genetic information within the human genome and plays essential roles in pathological conditions including cancer [21]. Previously, gene expression data of colon cancer in The Cancer Genome Atlas (TCGA) have been analyzed in relation to pathological stage [13], and potential prognostic miRNA biomarkers have been identified for predicting overall survival in colon cancer [14]. In this study, for the first time, we used TCGA data to screen for significantly altered lncRNAs, miRNAs and mRNAs in CRC. In addition, we constructed a ceRNA regulatory network from these lncRNAs, miRNAs and mRNAs, which may shed new light on the molecular mechanisms involved in the initiation and progression of CRC.

lncRNAs have been implicated in regulation at the epigenetic, transcriptional and posttranscriptional levels and in the pathogenesis of cancers [22, 23, 24]. In this study, we identified a total of 36 up-regulated and 12 down-regulated lncRNAs whose expression was significantly changed in both COAD and READ. Some of these lncRNAs have been confirmed by previous molecular experiments. For example, lncRNA CCAT2 was found to be overexpressed in CRC tissues compared with adjacent normal tissues and to be an independent factor for poor prognosis [25]. In addition, CCAT2 was found to underlie metastatic progression and chromosomal instability through a novel mechanism of MYC and WNT modulation in colon cancer [26]. Another lncRNA, UCA1, was also found to be up-regulated in CRC, where it influences cell proliferation, apoptosis and cell cycle distribution [27]. Moreover, increased UCA1 was associated with tumor proliferation and metastasis and predicted worse survival in CRC [28]. These results support the findings of this study to some degree. The lncRNAs we identified might therefore serve as biomarkers with potential application in the diagnosis, monitoring of progression and therapy of CRC.

MicroRNAs are single-stranded noncoding RNAs, 18–25 nucleotides long, that posttranscriptionally modulate gene expression in a variety of cancer-related signalling pathways and processes [29]. By analyzing TCGA data, we identified miRNAs that were differentially expressed in both COAD and READ, including 49 up-regulated and 15 down-regulated miRNAs. Several miRNAs in our list have been investigated in CRC. For example, the mir-21 expression level in CRC was linked with worse clinical outcome, especially in carcinomas with a high PTGS2 level, indicating a complex involvement of immunity and inflammation in CRC progression [30]. In addition, overexpression of miR-141 correlated with liver metastasis of CRC through inhibition of apoptosis and induction of migration [31]. Current understanding of the other miRNAs identified in this study is limited, and molecular studies are required to confirm their roles. These CRC-specific miRNAs might become specific biomarkers for the diagnosis and progression of CRC in the future.

We next constructed the lncRNA–miRNA–mRNA ceRNA regulatory network by overlapping the targets of the DEMs with the DELs and DEGs. One network comprised 22 up-regulated lncRNAs, 12 down-regulated miRNAs and 122 up-regulated mRNAs; the other comprised 8 down-regulated lncRNAs, 43 up-regulated miRNAs and 139 down-regulated mRNAs. Collectively, the ceRNA network mainly regulated important metabolic processes involving amino acids, hormones and nitrogen, and controlled critical functions such as cell signalling, transmembrane transport and the Wnt pathway. Among the identified pathways, the Wnt signalling pathway and its complex regulation have long been recognized as a critical event in colorectal carcinogenesis [32], and interest in potential therapeutic targets within this pathway has increased considerably [33]. Of the screened biological processes, metal ion homeostasis has also been implicated in colorectal cancer [34], which suggests metabolic balance as a promising research direction. Novel transmembrane transport patterns such as exosomes are strongly involved in the initiation and progression of CRC [35, 36] and, according to our findings, might also be regulated by the ceRNA network. These biological processes play essential roles in colorectal carcinogenesis [37] and warrant further investigation. A potential limitation of this study is that clinical information such as TNM classification, tumor stage and histology was not analyzed, which requires further study. In the future, molecular biology methods including qPCR, luciferase reporter systems and co-immunoprecipitation assays will be helpful to validate our findings and unravel the molecular mechanisms of the ceRNA networks.

Conclusion

We identified significantly altered lncRNAs, miRNAs and mRNAs in colorectal carcinogenesis by using RNA sequencing data of colon carcinoma and rectum carcinoma from TCGA. Differentially expressed lncRNAs and miRNAs might serve as potential biomarkers for tumorigenesis of CRC. In addition, the ceRNA regulatory network of lncRNA–miRNA–mRNA was constructed, which would elucidate novel molecular mechanisms involved in initiation and progression of CRC, thus providing promising clues for clinical diagnosis and therapy.