Introduction

The increasing availability of multi-level expression data from cancer and normal tissue has created a new opportunity to integrate and extract knowledge from large repositories such as the Gene Expression Omnibus (GEO), which promises a more comprehensive understanding of cancer. Previous data integration efforts in pancreatic cancer have focused on integrating a subset of profiles. For example, Tahira et al. [19] used a custom complementary DNA (cDNA) microarray comprising protein-coding messenger RNAs (mRNAs) and long noncoding RNAs (lncRNAs) to identify significant expression signatures correlated with pancreatic cancer and metastasis. In addition, Frampton et al. [7] combined microRNA (miRNA) and mRNA expression profiles with bioinformatic analyses to identify functional miRNA–mRNA interactions that contribute to the growth of pancreatic ductal adenocarcinomas (PDACs). Similarly, Donahue et al. [5] developed a method to identify prognostically significant genes based on analysis of DNA copy number and of mRNA and miRNA expression. However, the relationships among these expression levels, and how to integrate profiles from different levels efficiently, remain unknown.

Recent studies have revealed that miRNA dysregulation is often due to the aberrant expression of lncRNAs and transcription factors (TFs). Because of their greater length (>200 nucleotides), lncRNAs can regulate miRNA abundance by binding and sequestering miRNAs, acting as so-called miRNA sponges and thereby regulating the expression of target mRNAs. Wang et al. closed the circle by describing a lncRNA–miRNA–mRNA (CHRF–miR-489–Myd88) trio that functions interdependently to regulate cardiac hypertrophy. HULC, a lncRNA that is highly up-regulated in liver cancer, is transcribed from human chromosome 6p24.3. The HULC gene consists of two exons and a single intron, and the HULC transcript contains a polyA tail and, notably, a conserved target site for miR-372 [16, 20]. The expression of miRNAs is also controlled by TFs. For example, miR-122 is transcriptionally controlled by TFs enriched in the liver, such as hepatocyte nuclear factors (HNFs) and CCAAT/enhancer-binding proteins (C/EBPs), which play pivotal roles in regulating the expression of liver-specific genes [22].

Recent advances in high-throughput sequencing of immunoprecipitated RNAs after crosslinking (CLIP-Seq, HITS-CLIP, PAR-CLIP, CLASH, and iCLIP) and chromatin immunoprecipitation with massively parallel DNA sequencing (ChIP-Seq) provide powerful ways to identify biologically relevant miRNA–mRNA, lncRNA–miRNA, and TF–miRNA interactions [13, 23, 24]. The application of CLIP-Seq methods has reliably identified the binding sites of Argonaute (Ago) and other RNA-binding proteins (RBPs) to characterize miRNA–mRNA and lncRNA–miRNA interactions [1, 3, 9, 12]. The application of the ChIP-Seq technique has significantly decreased false-positive predictions of transcription factor binding sites (TFBSs) when identifying TF–miRNA interactions [6, 17, 18].

The involvement of many lncRNAs and TFs in the transcriptional regulation of miRNAs has not been reported in pancreatic cancer. There is a great need to integrate these large-scale datasets to explore the TF–miRNA–mRNA and lncRNA–miRNA–mRNA regulatory mechanisms. In this study, we primarily used bioinformatics methods to predict these two mechanisms of miRNA dysregulation in pancreatic cancer. We integrated expression microarray data to identify differentially expressed mRNAs, miRNAs, and lncRNAs in pancreatic cancer. Combining these differentially expressed RNAs, we constructed a miRNA–mRNA regulatory network based on CLIP-Seq data to reveal the biological effects of interrelated miRNAs. Furthermore, analysis of the TF–miRNA–mRNA and lncRNA–miRNA–mRNA regulatory networks identified several miRNAs, lncRNAs, and TFs that are possibly involved in these two regulatory mechanisms in pancreatic cancer.

Materials and Methods

Selection of Studies and Datasets

GEO (http://www.ncbi.nlm.nih.gov/geo/) was searched for PDAC mRNA, miRNA, and lncRNA expression profiling studies. We included only original experimental articles that compared RNA expression in PDAC tissue and noncancerous pancreatic tissue in humans. The titles and abstracts of the articles were screened, and the full text of the articles of interest was evaluated. Finally, we selected three datasets as the original mRNA datasets: GSE32676 (25 pancreatic cancer samples and seven normal pancreas controls), GSE30134 (18 pancreatic cancer samples and nine normal pancreas controls), and the Pancreatic Expression Database [4] (PED, http://www.pancreasexpression.org/) (96 pancreatic cancer samples and four normal pancreas controls). GSE30134 was also used as the original lncRNA dataset. Another three datasets, GSE24279 (136 pancreatic cancer samples and 22 normal pancreas controls), GSE28862 (three pancreatic cancer samples and three samples of normal pancreas adjacent to cancer), and GSE32678 (25 pancreatic cancer samples and seven normal pancreas controls), were selected as the original miRNA datasets. GSE32678 and GSE32676 came from the same group of patients.

Data Processing

Differential Expression Analysis

We downloaded the original data and documentation for these datasets in CEL or TXT format. If a dataset included raw CEL data, we used the GC-RMA method from Bioconductor (http://www.bioconductor.org/) to normalize and summarize the probe set information. For datasets without CEL data, we imported the standard TXT files. The data files were then imported into BRB-ArrayTools version 4.2 [25] (National Cancer Institute), available at http://linus.nci.nih.gov/BRB-ArrayTools.html. RNAs that passed the filtering and normalization criteria were analyzed using BRB-ArrayTools, which compares RNA expression among predefined classes and assumes that the data consist of experiments on different samples representative of those classes. We identified differentially expressed RNAs using a multivariate permutation test.
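The multivariate permutation test in BRB-ArrayTools works across all RNAs at once and is not reproduced here. As a minimal illustration of the underlying idea only, the following Python sketch computes an exact permutation p-value for a single RNA by relabeling samples between the two classes. The function name and the expression values are hypothetical and are not taken from the study; full enumeration is feasible only for small sample sizes, and random sampling of relabelings would be needed for larger datasets.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(tumor, normal):
    """Exact two-sided permutation p-value for a difference in class means."""
    pooled = list(tumor) + list(normal)
    n_tumor = len(tumor)
    n_normal = len(normal)
    observed = abs(mean(tumor) - mean(normal))
    total = sum(pooled)

    extreme = 0
    n_relabelings = 0
    # Enumerate every way of labeling n_tumor of the pooled samples as "tumor".
    for idx in combinations(range(len(pooled)), n_tumor):
        group_sum = sum(pooled[i] for i in idx)
        diff = abs(group_sum / n_tumor - (total - group_sum) / n_normal)
        # Small tolerance guards against floating-point rounding.
        if diff >= observed - 1e-12:
            extreme += 1
        n_relabelings += 1
    return extreme / n_relabelings


# Hypothetical log2 expression values for one RNA (not data from the study).
tumor_values = [8.1, 8.4, 7.9, 8.6]
normal_values = [6.2, 6.5, 6.0, 6.4]
p = permutation_p_value(tumor_values, normal_values)
```

With four samples per class there are 70 possible relabelings, so the smallest attainable two-sided p-value is 2/70, which is what completely separated classes such as these hypothetical values produce.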

Vote-Counting Strategy

The RNAs were ranked by importance according to the following criteria: (i) the number of expression profiling datasets in which the same RNA was differentially expressed; (ii) the number of datasets in which the same RNA showed a consistent direction of change; and (iii) RNAs that were differentially expressed in only two datasets, with inconsistent directions of change, were excluded.
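The vote-counting criteria can be sketched in Python as follows. This is a minimal illustration and not the pipeline used in the study: the function name, the input layout, and the RNA identifiers are hypothetical, and taking the majority direction for an RNA reported in three datasets is an assumption that the text does not spell out.

```python
from collections import defaultdict


def vote_count(datasets):
    """Rank RNAs by the number of datasets reporting them and by consistency.

    `datasets` maps a dataset name to {rna_id: "up" or "down"} for the RNAs
    called differentially expressed in that dataset.
    """
    calls = defaultdict(list)
    for de_calls in datasets.values():
        for rna, direction in de_calls.items():
            calls[rna].append(direction)

    kept, excluded = {}, []
    for rna, directions in calls.items():
        n_up = directions.count("up")
        n_down = directions.count("down")
        # Criterion (iii): reported in exactly two datasets, opposite directions.
        if len(directions) == 2 and n_up == 1:
            excluded.append(rna)
            continue
        kept[rna] = {
            "n_datasets": len(directions),      # criterion (i)
            "n_consistent": max(n_up, n_down),  # criterion (ii)
            # Assumption: majority direction; ties cannot occur with <= 3 datasets.
            "direction": "up" if n_up >= n_down else "down",
        }
    ranked = sorted(
        kept, key=lambda r: (-kept[r]["n_datasets"], -kept[r]["n_consistent"], r)
    )
    return ranked, kept, sorted(excluded)


# Hypothetical calls from three datasets (identifiers are placeholders).
example = {
    "dataset_1": {"RNA_A": "up", "RNA_B": "down", "RNA_C": "up"},
    "dataset_2": {"RNA_A": "up", "RNA_B": "down", "RNA_C": "down"},
    "dataset_3": {"RNA_A": "up"},
}
ranked, kept, excluded = vote_count(example)
```

In this toy input, RNA_A is ranked first because it is reported consistently in all three datasets, RNA_B follows with two consistent calls, and RNA_C is excluded because its two calls disagree.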

Construction of Regulatory Network

A total of 606408 miRNA–mRNA interactions and 10212 lncRNA–miRNA interactions based on CLIP-Seq data were downloaded from starBase [13, 23] (http://starbase.sysu.edu.cn/index.php/) in April 2014. Of these, 4811 miRNA–mRNA pairs involved miRNAs and mRNAs that were both identified as differentially expressed in pancreatic cancer after preprocessing of the expression profiles. The following parameters were applied to reduce false positives: (i) number of supporting experiments ≥1, meaning that at least one CLIP-Seq experiment supported the predicted miRNA target site; (ii) Pan-Cancer ≥1, meaning that the expression of the miRNA and its target gene was anti-correlated (Pearson correlation: r < 0, p value < 0.05) in at least one cancer type; and (iii) only miRNA–mRNA pairs with opposite expression patterns (up-down or down-up) were included. A total of 55675 TF–miRNA interactions based on ChIP-Seq data were downloaded from ChIPBase [24] (http://deepbase.sysu.edu.cn/chipbase/index.php/) in April 2014. The above interaction information was imported into Cytoscape version 2.8.3 [2] (http://www.cytoscape.org/) to construct the regulatory networks.
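The three filtering parameters can be expressed as a simple predicate over interaction records. The Python sketch below is illustrative only: the record fields, function name, and identifiers are hypothetical and do not reflect the actual starBase export format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interaction:
    mirna: str
    mrna: str
    n_clip_experiments: int  # CLIP-Seq experiments supporting the target site
    n_pan_cancer: int        # cancer types with significant anti-correlation


def filter_interactions(interactions, mirna_direction, mrna_direction):
    """Keep interactions that satisfy filters (i)-(iii).

    `mirna_direction` and `mrna_direction` map differentially expressed RNAs
    to "up" or "down"; RNAs absent from these maps are not differentially
    expressed and their interactions are dropped.
    """
    kept = []
    for it in interactions:
        mi_dir = mirna_direction.get(it.mirna)
        m_dir = mrna_direction.get(it.mrna)
        if mi_dir is None or m_dir is None:
            continue  # both partners must be differentially expressed
        if it.n_clip_experiments < 1:
            continue  # (i) at least one supporting CLIP-Seq experiment
        if it.n_pan_cancer < 1:
            continue  # (ii) anti-correlated in at least one cancer type
        if mi_dir == m_dir:
            continue  # (iii) only up-down or down-up pairs
        kept.append(it)
    return kept


# Hypothetical records and directions (placeholders, not study data).
records = [
    Interaction("miR_X", "GENE_1", 2, 1),  # passes all filters
    Interaction("miR_X", "GENE_2", 0, 1),  # fails (i)
    Interaction("miR_Y", "GENE_1", 1, 0),  # fails (ii)
    Interaction("miR_Y", "GENE_3", 3, 2),  # fails (iii): both up
    Interaction("miR_Z", "GENE_1", 1, 1),  # miR_Z not differentially expressed
]
mirna_dir = {"miR_X": "down", "miR_Y": "up"}
mrna_dir = {"GENE_1": "up", "GENE_2": "up", "GENE_3": "up"}
kept_pairs = filter_interactions(records, mirna_dir, mrna_dir)
```

The surviving pairs could then be written out as a two-column edge list for import into a network tool such as Cytoscape.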

Pathway Analysis

To explore the biological effects of interrelated miRNAs in pancreatic cancer, we submitted the candidate genes to the DAVID database [8] (http://david.abcc.ncifcrf.gov/) for pathway enrichment analysis.
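Pathway enrichment of this kind asks whether a gene list overlaps a pathway more than expected by chance. DAVID reports a modified Fisher's exact p-value (the EASE score); the Python sketch below shows only the plain hypergeometric upper tail on which such tests are built, with a hypothetical function name and toy numbers, and it is not a reimplementation of DAVID.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_pathway, n_list, n_overlap):
    """P(X >= n_overlap) when n_list genes are drawn without replacement
    from n_background genes, n_pathway of which belong to the pathway."""
    upper = min(n_pathway, n_list)
    total = comb(n_background, n_list)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_list - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Toy numbers: 10 background genes, 4 in the pathway, a list of 3 genes,
# 2 of which fall in the pathway.
p_toy = hypergeom_enrichment_p(10, 4, 3, 2)
```

For the toy numbers the tail is (C(4,2)·C(6,1) + C(4,3)·C(6,0)) / C(10,3) = 40/120, so the p-value is 1/3. In practice, p-values across many pathways would also need correction for multiple testing.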

Results

Identification of Differentially Expressed mRNAs, miRNAs and lncRNAs

We collected pancreatic cancer-related microarray expression data for mRNAs, miRNAs, and lncRNAs from GEO: 139 pancreatic cancers and 20 normal pancreas samples in three mRNA expression datasets; 164 pancreatic cancers and 32 normal pancreas samples in three miRNA expression datasets; and 18 pancreatic cancers and nine normal pancreas samples in one lncRNA expression dataset. We developed a computational pipeline to analyze these data (Fig. 1). The results show that 4385 mRNAs, 500 miRNAs, and 21 lncRNAs were differentially expressed in pancreatic cancer (Tables S1, S2, Table 1). Of these, 18 mRNAs and 54 miRNAs were of high confidence (Tables 2, 3). However, 325 mRNAs and 45 miRNAs with an inconsistent direction of change in two studies were excluded (Tables S3, S4).

Fig. 1

Flowchart of the data analysis. First, we identified differentially expressed mRNAs, miRNAs, and lncRNAs in pancreatic cancer by integrating expression microarray data. Then we constructed a miRNA–mRNA network based on CLIP-Seq data. Furthermore, we constructed a lncRNA–miRNA–mRNA regulatory network based on CLIP-Seq data and a TF–miRNA–mRNA regulatory network based on ChIP-Seq data from ChIPBase

Table 1 lncRNAs (n = 21) in one expression profiling dataset
Table 2 mRNAs (n = 18) in three expression profiling datasets
Table 3 miRNAs (n = 54) in at least two expression profiling datasets

Regulatory Network of lncRNA–miRNA–mRNA and TF–miRNA–mRNA

We constructed a miRNA–mRNA regulatory network. Interaction analysis showed that 36 differentially expressed miRNAs targeted 1779 mRNAs that were up- or down-regulated. In detail, 18 down-regulated miRNAs deregulated 1170 mRNAs and 18 up-regulated miRNAs deregulated 609 mRNAs (Fig. 2). As a typical example, the miR-217–KRAS interaction was found in this miRNA–mRNA regulatory network. miR-217 has been found to be down-regulated in PDAC tissues and PDAC cell lines compared with the corresponding normal pancreatic tissue. KRAS was shown to be a direct target of miR-217 by a dual-luciferase reporter gene assay. A previous study showed that miR-217 can regulate KRAS and function as a tumor suppressor in PDAC [26]. The presence of the miR-217–KRAS interaction in this network supports our predicted results. miR-326 has been reported to be down-regulated in glioblastoma specimens, in which PKM2, a target of miR-326, showed high levels of protein expression [10]. The miR-326–PKM2 interaction was found here in pancreatic cancer; however, the role of miR-326 in pancreatic cancer has not been elucidated thus far. miR-125a has been reported to down-regulate SMG1 mRNA expression in human cells [21]. SMG1 is considered an essential factor in the nonsense-mediated mRNA decay pathway. The miR-125a–SMG1 interaction identified in this study has not been investigated in pancreatic cancer.

Fig. 2

The miRNA–mRNA regulatory network. miRNAs are shown as triangles and mRNAs as circles. Red represents high expression and green represents low expression. Arrows indicate miRNAs regulating mRNAs, and straight lines indicate miRNAs deregulating mRNAs (Color figure online)

In the lncRNA–miRNA–mRNA regulatory network, the abnormal expression of 19 miRNAs was regulated by aberrantly expressed lncRNAs (Fig. 3). Additionally, eight abnormally expressed miRNAs were transcriptionally regulated by TFs in the TF–miRNA–mRNA regulatory network (Fig. 4). Three lncRNAs, MALAT1, HOTAIR, and H19, have been reported to participate in gene expression in pancreatic cancer. Expression of MALAT1 was significantly higher in PDAC than in adjacent normal pancreatic tissues, and patients with higher MALAT1 expression had poorer disease-free survival [14]. HOTAIR expression was increased in pancreatic tumors compared with non-tumor tissue and was associated with more aggressive tumors [11]. H19 was reported to act as a sponge that antagonizes let-7 in pancreatic cancer [15]. The involvement of the other lncRNAs and TFs in the transcriptional regulation of miRNAs has rarely been reported in pancreatic cancer.

Fig. 3

The lncRNA–miRNA–mRNA regulatory network. lncRNAs, miRNAs, and mRNAs are shown as deformed triangles, triangles, and circles, respectively. Red represents high expression and green represents low expression. Arrows indicate miRNAs regulating mRNAs, and straight lines indicate miRNAs deregulating mRNAs. Purple lines indicate lncRNAs sponging miRNAs (Color figure online)

Fig. 4

The TF–miRNA–mRNA regulatory network. TFs, miRNAs, and mRNAs are shown as diamonds, triangles, and circles, respectively. Red represents high expression and green represents low expression. Arrows indicate miRNAs regulating mRNAs, and straight lines indicate miRNAs deregulating mRNAs. Azure lines indicate TFs binding to miRNAs (Color figure online)

Pathway Annotation of Differentially Expressed miRNAs

To study the function of the differentially expressed miRNAs in the regulatory network, we performed KEGG pathway annotation with the DAVID database for the 21 miRNAs in the regulatory network; the KEGG pathways of the miRNA target genes are shown in Fig. 5. The pathways most affected by the differentially expressed miRNAs included pathways in cancer, ECM-receptor interaction, focal adhesion, the Wnt signaling pathway, the cell cycle, and the TGF-beta signaling pathway.

Fig. 5

Pathway annotation of the 21 differentially expressed miRNAs. The genes regulated by each of the 21 differentially expressed miRNAs were tested for pathway enrichment using the DAVID database. In total, the target genes of 20 differentially expressed miRNAs were significantly enriched in pathways, as shown. The area of each pathway is proportional to the number of genes enriched in that pathway

Discussion

In this study, we systematically analyzed the complex effects of interrelated miRNAs and provide a framework for revealing the mechanisms of miRNA dysregulation mediated by TF–miRNA–mRNA and lncRNA–miRNA–mRNA regulation. Conventional analysis methods focus on genes and miRNAs that are differentially expressed between biological processes or disease states, which are then selected for target prediction with bioinformatic software such as TargetScan. Here, we integrated expression microarray data to identify differentially expressed mRNAs and miRNAs. In particular, we incorporated public miRNA–mRNA interactions generated by high-throughput CLIP-Seq to reduce the rate of false-positive predictions. We then constructed, for the first time, two regulatory networks, TF–miRNA–mRNA and lncRNA–miRNA–mRNA, in pancreatic cancer. Our results revealed a set of miRNAs (Table 4) that are possibly involved in these two regulatory mechanisms. This study provides new insight into the molecular mechanisms of pancreatic cancer.

Table 4 miRNAs (n = 21) involved in the lncRNA–miRNA–mRNA and TF–miRNA–mRNA regulatory mechanisms

However, this study has some limitations that should be acknowledged. First, the datasets used in the analysis are limited to 14 cancer types, which do not include pancreatic cancer. In the future, we will perform argonaute-2 photoactivatable-ribonucleoside-enhanced crosslinking and immunoprecipitation (AGO2-PAR-CLIP) in pancreatic cancer cells to generate a biochemically validated set of miRNA-binding sites. Second, lncRNA expression microarray data for pancreatic cancer are scarce in public datasets. We identified only 21 differentially expressed lncRNAs, and none of the lncRNAs in the lncRNA–miRNA–mRNA regulatory network could be matched to expression data. Therefore, profiling of lncRNA expression in pancreatic cancer and screening for differentially expressed lncRNAs are urgently needed. Finally, it should be emphasized that the two regulatory mechanisms analyzed in this study were only predicted bioinformatically and therefore require further validation and functional examination with in vivo and in vitro experiments.