1 Introduction

Lung cancer is the most common cancer worldwide among men and women combined (13.0 % of the total). In 2012, lung cancer had the highest incidence of any cancer among men (16.7 % of the total) and was also the most common cause of cancer death (23.6 %). The estimated age-standardized rates (ASRs) of incidence and mortality were 34.2 and 30.0 per 100,000, respectively, both the highest of any cancer [1]. Among women, it was the third most common cancer (8.7 % of the total) and the second most common cause of cancer death (13.8 %). Because survival in lung cancer is poor, the 5-year prevalence (1.9 million) is very close to the annual mortality (1.6 million) [1]. Squamous cell lung carcinoma (SCC), lung adenocarcinoma (AD) and large cell lung carcinoma (LCLC) are the three main types of non-small cell lung cancer (NSCLC), which together account for 83.2 % of lung cancers [2].

miRNAs are small noncoding RNAs (20–25 nucleotides in length) that silence gene expression posttranscriptionally by binding to the 3′ untranslated regions (3′UTRs) of their target mRNAs. Mature miRNAs and argonaute proteins are incorporated into an RNA-induced silencing complex (RISC) [3, 4]. The associated miRNA guides RISC to its mRNA target in the cytoplasm. Two primary mechanisms control mRNA expression: inhibition of mRNA translation and mRNA degradation. The higher the degree of base pairing between the miRNA and the mRNA, the more likely the target mRNA is to be degraded [5]. Under some conditions, miRNAs can positively regulate gene expression; however, the underlying mechanisms have not been clearly elucidated [4].

miRNAs modulate many biological processes, including development, differentiation, proliferation and cell death, and play an important role in the pathogenesis of different tumor types [3, 6–9]. In lung cancer, Yanaihara et al. [10] reported that poor overall survival in AD was related to high miR-155 and low let-7a-2 expression. Hu et al. [11] reported that a four-miRNA signature (miR-486, miR-30d, miR-1 and miR-499) could independently predict overall survival in NSCLC. Yu et al. [12] found that a five-miRNA signature (miR-137, miR-372, miR-182*, miR-221 and let-7a) was related to disease-free survival in NSCLC. Lebanony et al. [13] discovered that miR-205 was a highly specific biomarker for SCC.

However, a global analysis of the miRNA–mRNA interaction network in lung cancer is still lacking. Such a systems-level approach offers a new way to understand these complex biological processes. In this study, based on pair-matched miRNA–mRNA expression profiles of NSCLC samples, we constructed and analyzed the miRNA–mRNA interaction network using a number of bioinformatics tools and platforms.

2 Materials and Methods

2.1 miRNA and mRNA Expression Profiles

Over the past dozen years, the number of miRNA and mRNA expression profiles has grown exponentially. Large numbers of expression profiles are freely available from databases such as the Gene Expression Omnibus (GEO; www.ncbi.nlm.nih.gov/geo) [14]. We downloaded the miRNA and mRNA expression profiles from GEO (Accession Number GSE29250 [15]). This data set comprises 12 pair-matched samples: 6 NSCLC tissues and their matched normal controls from adjacent tissue. Genome-wide miRNA expression and gene expression in NSCLC were measured in parallel on the Illumina Human v2 MicroRNA expression beadchip and HumanHT-12 V4.0 expression beadchip platforms. When multiple probes mapped to a particular gene, we calculated its signal intensity as the mean of the intensities of all these probes in the sample. The raw signal was normalized by the robust multi-array average (RMA) procedure to produce expression values [16].
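The probe-averaging step described above can be sketched as follows. This is a minimal illustration, assuming the expression values are held in plain dictionaries; the function name `collapse_probes`, the probe identifiers and the intensities are hypothetical and are not taken from GSE29250.

```python
from collections import defaultdict
from statistics import mean


def collapse_probes(probe_values, probe_to_gene):
    """Average the intensities of all probes that map to the same gene.

    probe_values: dict probe_id -> list of intensities, one per sample.
    probe_to_gene: dict probe_id -> gene identifier (hypothetical annotation).
    Returns a dict gene -> list of per-sample mean intensities.
    """
    grouped = defaultdict(list)
    for probe_id, values in probe_values.items():
        gene = probe_to_gene.get(probe_id)
        if gene is not None:  # skip probes without an annotation
            grouped[gene].append(values)
    # zip(*rows) walks the samples column by column across a gene's probes.
    return {
        gene: [mean(sample_values) for sample_values in zip(*rows)]
        for gene, rows in grouped.items()
    }


# Toy data: two probes for GENE_A, one probe for GENE_B, three samples each.
probes = {
    "p1": [8.0, 9.0, 10.0],
    "p2": [10.0, 11.0, 12.0],
    "p3": [5.0, 5.5, 6.0],
}
annotation = {"p1": "GENE_A", "p2": "GENE_A", "p3": "GENE_B"}
gene_matrix = collapse_probes(probes, annotation)
```

Averaging is done per sample, so each gene keeps one value per array and the pairing between tumor and normal samples is preserved.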

2.2 miRNA–mRNA Interactions Analysis and Visualization

We integrated the above miRNA and mRNA expression profiles with the MAGIA tool (http://gencomp.bio.unipd.it/magia) [17]. MAGIA first predicts miRNA targets with PITA, miRanda, TargetScan or Boolean combinations of these algorithms. It then combines the predicted targets with different statistical measures, such as Spearman and Pearson correlation and mutual information, to construct miRNA–mRNA bipartite networks. The complete list of significant interactions can be imported into other software for further visualization and processing. To account for multiple hypothesis testing, the q value is used to select significant results. The q value is a variant of the traditional p value that is corrected for multiple comparisons.
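To illustrate how a p value can be corrected for multiple comparisons, the sketch below applies the Benjamini–Hochberg step-up adjustment. This is an assumption made for illustration only: the text does not state which estimator MAGIA uses for its q values, and the function name `benjamini_hochberg` and the p values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Stand-in for a q value calculation; not necessarily MAGIA's estimator.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that each adjusted
    # value is never larger than the one ranked just above it.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        candidate = p_values[index] * m / rank
        running_min = min(running_min, candidate)
        adjusted[index] = running_min
    return adjusted


# Hypothetical p-values for four miRNA-mRNA pairs.
raw_p = [0.01, 0.04, 0.03, 0.005]
adjusted_p = benjamini_hochberg(raw_p)
significant = [i for i, q in enumerate(adjusted_p) if q < 0.05]
```

The adjusted values are compared against the chosen significance level in the same way an unadjusted p value would be, but the threshold now controls the expected fraction of false discoveries across all tested pairs.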

In this study, we used EntrezGene IDs, the Pearson correlation measure and the intersection of the PITA (score filter: −10) and miRanda (score filter: 500) target prediction algorithms. Because the data are normally distributed and the sample size is medium to large (>5), the Pearson correlation was used to measure miRNA–mRNA expression correlation. The miRNA–mRNA interaction network was visualized with Cytoscape [18, 19]. This software renders nodes and edges as a two-dimensional network and can change network properties, such as shape, color and size, based on attribute values. It also supports a number of automated network layout algorithms.
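The selection logic described above (intersecting two target predictions, then keeping pairs with a strongly negative Pearson correlation) can be sketched as follows. This is not MAGIA's implementation: the function names, the identifiers `miR-x`, `G1`, `G2`, `G3` and the expression values are hypothetical, and the prediction sets are assumed to have already been filtered by score.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:  # a constant profile has no defined correlation
        return 0.0
    return cov / sqrt(var_x * var_y)


def negative_interactions(mirna_expr, mrna_expr, predicted_a, predicted_b,
                          cutoff=-0.75):
    """Keep (miRNA, gene) pairs predicted by both algorithms with r < cutoff."""
    candidates = predicted_a & predicted_b  # intersection of the two predictions
    edges = []
    for mirna, gene in sorted(candidates):
        if mirna in mirna_expr and gene in mrna_expr:
            r = pearson(mirna_expr[mirna], mrna_expr[gene])
            if r < cutoff:
                edges.append((mirna, gene, r))
    return edges


# Toy profiles measured on the same six samples.
mirna_expr = {"miR-x": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]}
mrna_expr = {
    "G1": [6.0, 5.0, 4.0, 3.0, 2.0, 1.0],  # anti-correlated with miR-x
    "G2": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],  # positively correlated
    "G3": [6.0, 5.0, 4.0, 3.0, 2.0, 1.0],  # anti-correlated, one prediction only
}
pita_like = {("miR-x", "G1"), ("miR-x", "G2"), ("miR-x", "G3")}
miranda_like = {("miR-x", "G1"), ("miR-x", "G2")}
edges = negative_interactions(mirna_expr, mrna_expr, pita_like, miranda_like)
```

In the toy data only the pair (`miR-x`, `G1`) survives: `G2` fails the correlation cutoff and `G3` is supported by a single prediction set.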

2.3 Functional Analysis Via GO, Pathways and Human Disease Resources

To elucidate the functional significance of the identified miRNA–mRNA interactions, we combined GeneDecks, DAVID and MalaCards to analyze gene ontology, pathway and human disease information for the list of significantly down-regulated target genes in the lung cancer miRNA–mRNA interaction network.

DAVID [20] (the Database for Annotation, Visualization and Integrated Discovery) Bioinformatics Resources (http://david.abcc.ncifcrf.gov/) is a tool for extracting biological features and meaning from large gene lists, which helps in understanding biological themes. GeneDecks [21] (http://www.genecards.org/genedecks) analyzes genes through shared descriptors, using the rich built-in annotation of human genes in GeneCards (a searchable, integrated resource of annotation information about human genes [22]). Given a gene list, GeneDecks can provide information about disorders, drug relationships, pathways, etc. MalaCards (http://www.malacards.org/) [23] is an integrated database of human disorders and their annotations. This database lists the genes related to lung cancer, which we compared with the down-regulated target genes provided by MAGIA. We used the default settings throughout the GeneDecks, DAVID and MalaCards analyses.

3 Results

3.1 Construction of miRNA–mRNA Interaction Network

For each patient, four profiling arrays measured miRNA and mRNA expression in the tumor and in the adjacent normal tissue. The characteristics of the samples are given in Table S1. After normalization, the expression matrix was submitted to the MAGIA pipeline, which identified a total of 290, 6230 and 31,249 miRNA–mRNA interactions with correlations <−0.75, <−0.50 and <−0.25, respectively. Figure 1, Fig. S1 and Fig. S2 show the global miRNA–mRNA interaction network in lung cancer based on correlations <−0.75, <−0.50 and <−0.25, respectively. We chose the genes with correlations <−0.75 as highly negatively correlated genes for follow-up analysis.

Fig. 1 Interaction network of miRNAs–mRNAs in lung cancer, drawn with Cytoscape 3. miRNAs are shown as pink triangles and mRNAs as sky-blue circles

This network consists of 249 mRNA nodes, 90 miRNA nodes and 290 edges representing regulation between these miRNAs and mRNAs. In total, the network contains 61 connected components: 36 with a single edge, 11 with two edges and 14 with three or more edges. These three types of connected components involve 72, 33 and 234 nodes, respectively. Statistics of the connected components are shown in Fig. 2. These results indicate that the majority of miRNAs and their targets can be integrated into one large interlocking network.
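Component statistics of this kind can be derived from an edge list with a simple graph traversal. The sketch below is a minimal illustration, assuming the network is given as (miRNA, mRNA) pairs; the function names and the toy edges are hypothetical and do not reproduce the actual network.

```python
from collections import defaultdict


def connected_components(edges):
    """Return a list of (node_set, edge_count), one entry per component."""
    adjacency = defaultdict(set)
    for mirna, mrna in edges:
        # Tag each node with its type so identical names cannot collide.
        adjacency[("miRNA", mirna)].add(("mRNA", mrna))
        adjacency[("mRNA", mrna)].add(("miRNA", mirna))
    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        stack, nodes = [start], set()
        seen.add(start)
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            nodes.add(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        # Every edge contributes to the degree of exactly two nodes.
        edge_count = sum(len(adjacency[n]) for n in nodes) // 2
        components.append((nodes, edge_count))
    return components


def summarize_by_edge_count(components):
    """Count components and nodes for 1-edge, 2-edge and >=3-edge components."""
    summary = {"single": [0, 0], "double": [0, 0], "three_or_more": [0, 0]}
    for nodes, edge_count in components:
        if edge_count == 1:
            key = "single"
        elif edge_count == 2:
            key = "double"
        else:
            key = "three_or_more"
        summary[key][0] += 1
        summary[key][1] += len(nodes)
    return {key: tuple(value) for key, value in summary.items()}


toy_edges = [
    ("m1", "g1"),                                            # one edge
    ("m2", "g2"), ("m2", "g3"),                              # two edges
    ("m3", "g4"), ("m3", "g5"), ("m4", "g5"), ("m4", "g6"),  # four edges
]
summary = summarize_by_edge_count(connected_components(toy_edges))
```

The reported figures are internally consistent with this view: a single-edge component always has 2 nodes (36 × 2 = 72), a two-edge component always has 3 nodes (11 × 3 = 33), and 72 + 33 + 234 = 339 equals the 249 mRNA plus 90 miRNA nodes.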

In this network, miRNAs (pink triangles in Fig. 1) are usually located at the center of the interaction network or module, while mRNAs (sky-blue circles in Fig. 1) lie at its periphery.

From Fig. 1, we found three miRNAs (miR-1207-5p, miR-1228* and miR-939) that together regulate 105 target mRNAs, or 42.17 % of all 249 target mRNAs. In addition, 30 mRNAs are regulated by more than one miRNA. These mRNAs and their interacting miRNAs are listed in Table 1. In the expression profiles, hsa-miR-1228* and hsa-miR-939 were down-regulated in NSCLC (Table S2), suggesting a protective role in normal tissues.
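Hub miRNAs and multiply regulated mRNAs can be read off the edge list by counting node degrees. The sketch below is a minimal illustration under that assumption; `degree_summary` and the toy identifiers are hypothetical, and only the final percentage uses figures from the text.

```python
from collections import Counter


def degree_summary(edges):
    """Return miRNA out-degrees and the mRNAs targeted by more than one miRNA."""
    unique_edges = set(edges)  # ignore duplicated pairs
    mirna_degree = Counter(mirna for mirna, _ in unique_edges)
    mrna_degree = Counter(gene for _, gene in unique_edges)
    multi_regulated = sorted(g for g, d in mrna_degree.items() if d > 1)
    return mirna_degree, multi_regulated


toy_edges = [
    ("miR-a", "G1"), ("miR-a", "G2"), ("miR-a", "G3"),
    ("miR-b", "G3"),
    ("miR-c", "G4"),
]
mirna_degree, multi_regulated = degree_summary(toy_edges)
top_hubs = mirna_degree.most_common(1)  # miRNA with the most targets

# Share of target mRNAs covered by the three hub miRNAs, as reported.
hub_share = round(100 * 105 / 249, 2)
```

Ranking miRNAs by `mirna_degree` gives the hub candidates, while `multi_regulated` corresponds to the kind of list shown in Table 1.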

Fig. 2 Statistics of the connected components. There are 36 components with a single edge, 11 with two edges and 14 with three or more edges. These three types of connected components involve 72, 33 and 234 nodes, respectively

Table 1 The genes in the miRNA–mRNA network, which are regulated by more than one miRNA

3.2 mRNA Analysis in Lung Cancer

MalaCards is an integrated database of human disorders and their annotations that quantifies the degree of relationship between genes of interest and particular disorders. Using MalaCards, we found that a total of 35 genes had a confirmed role in lung cancer, or 14.06 % of the 249 mRNAs. The results are given in Table 2.

Table 2 The genes in the miRNA–mRNA network that are related to lung cancer (provided by MalaCards)

HDAC4, MED1, SPN and ST8SIA2 appear in both Tables 1 and 2. These four genes are regulated by more than one miRNA in lung cancer (Table 1), and their relationship to lung cancer was confirmed in previous reports (Table 2). Compared with the adjacent normal tissue, these mRNAs increased by up to 2.5-fold, indicating that they potentially play an important role in the miRNA–mRNA interaction network in lung cancer (Table S2).

3.3 Functional Annotation Analysis

Using DAVID and GeneDecks, we studied the functional annotations for mRNAs in the miRNA–mRNA interaction network. The results are given in Table 3.

Table 3 GO categories for the target genes that participate in miRNA–mRNA interactions in cancer

Using DAVID, we found that the significant Gene Ontology (GO) molecular function category includes seven terms. The most important molecular function is transferase activity (35 genes). The other identified categories, ranked 2–6, are subsets of transferase activity (Table 3). The most important biological processes are cellular process (143 genes), localization (49 genes) and regulation of gene expression (47 genes). The main cellular components are intracellular part (147 genes), plasma membrane part (38 genes) and organelle membrane (23 genes). The complete GO categories and the genes they include are given in Table S3.
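Whether a GO term is over-represented in a gene list is commonly judged with a hypergeometric (one-sided Fisher) test. The sketch below shows that calculation as a plain illustration; DAVID reports a more conservative modified version of this test (the EASE score), so the numbers will not match DAVID's output. The background sizes used in the example are hypothetical placeholders, not values from this study.

```python
from math import comb


def hypergeom_enrichment(list_hits, list_size, background_hits, background_size):
    """P(X >= list_hits) when list_size genes are drawn without replacement
    from a background in which background_hits genes carry the annotation."""
    total = comb(background_size, list_size)
    upper = min(list_size, background_hits)
    tail = sum(
        comb(background_hits, k)
        * comb(background_size - background_hits, list_size - k)
        for k in range(list_hits, upper + 1)
    )
    return tail / total


# 35 of the 249 target genes carry the term (from the text); the background
# counts (2000 annotated genes out of 20000) are assumptions for illustration.
p_transferase = hypergeom_enrichment(35, 249, 2000, 20000)
```

A small p value indicates that the term occurs in the gene list more often than expected by chance given the background; in practice it would then be corrected for the number of terms tested.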

When these 249 target mRNAs were analyzed with GeneDecks, the genes were classified into seven types of descriptors under phenotype attribution. The three most important descriptors are mortality/aging (67 genes), nervous system phenotype (63 genes) and growth/size/body phenotype (54 genes). Only one pathway, axon guidance, was found to be significant in our analysis (six genes). The complete analytical results are given in Table S4.

4 Discussion

4.1 Significant miRNA and mRNA in the Interaction Network

Many miRNAs are abnormally expressed in lung cancer and play a core role in malignant transformation, angiogenesis and tumor metastasis [10–13, 24]. An individual miRNA may regulate multiple target mRNAs involved in different oncogenic or tumor suppressor pathways. Analyzing the miRNA–mRNA interaction network at the systems level may open a new chapter in lung cancer research, helping to elucidate disease mechanisms, provide better targeted agents and find sensitive early biomarkers [6].

In this study, we constructed and analyzed the interaction network of miRNAs and their target mRNAs in lung cancer using bioinformatics tools and matched mRNA and miRNA expression profiles. We found that 105 of the 249 mRNAs in the network are down-regulated by three miRNAs: miR-1207-5p, miR-1228* and miR-939. miR-1207-5p regulated the largest number of mRNAs in the interaction network. Chen et al. [25] reported that the expression of miR-1207-5p was significantly decreased in gastric cancer tissues compared with adjacent tissues. miR-1207-5p can bind to the 3′ UTR of human telomerase reverse transcriptase (hTERT) and down-regulate its expression. hTERT, the catalytic subunit of telomerase, reverses telomere shortening; by suppressing it, miR-1207-5p inhibited proliferation, reduced invasion and induced cell cycle arrest in gastric cancer cells in vitro. Loss of miR-1207-5p may therefore boost hTERT protein expression and promote the development of gastric cancer. Previous reports indicated that hTERT gene amplification causes hTERT overexpression in lung adenocarcinoma and is an independent marker of poor prognosis in NSCLC [26]. However, there is no clear evidence on the relationship between miR-1207-5p and hTERT in NSCLC. Our analysis is the first to point to miR-1207-5p as an important regulator in the miRNA–mRNA interaction network.

miR-1228* was significantly decreased in human gastric cancer tissues compared with normal tissues. Overexpression of miR-1228* prevented tumor formation in vivo in a xenograft model. Selective restoration of miR-1228* might therefore be advantageous for the therapy of gastric cancer [27]. miR-939 is expressed at substantially higher levels in ADC sera than in age- and gender-matched control sera [28] and may be relevant as an early diagnostic biomarker of ADC, but its mRNA targets and molecular mechanism are unknown. Both miR-1228* and miR-939 emerged as essential nodes in our network analysis. Furthermore, our analysis provides refined mRNA targets and a potential mechanistic understanding of the two miRNAs.

4.2 Significant mRNAs in the Interaction Network

HDAC4, MED1, SPN and ST8SIA2 were identified as the most important regulated genes in this interaction network because they are regulated by more than one miRNA in lung cancer (Table 1) and their contribution to lung cancer was confirmed in previous reports (Table 2).

SPN (sialophorin) is also known as CD43. In the interaction network, it is linked to four miRNAs (miR-939, miR-1224-3p, miR-1236 and miR-1207-5p). Normally, SPN is produced by white blood cells as a transmembrane sialoglycoprotein. Its main cellular functions are intercellular adhesion, intracellular signaling, apoptosis, migration and proliferation [29]. However, abnormal expression has been found in cancers, including adenoid cystic carcinoma [30, 31], small cell lung cancer (SCLC) and NSCLC. Fu et al. [29] found that SPN could contribute to lung cancer pathogenesis in various ways, including protecting malignant cells from natural killer (NK) cell attack and apoptosis, and driving metastasis through anti-adhesion, pro-adhesion and migration mechanisms.

MED1 (mediator complex subunit 1) is regulated by three miRNAs (miR-1207-5p, miR-1224-3p and miR-30c-2*). Kim et al. [32] demonstrated that the loss of MED1 expression was highly correlated with increased rates of invasion and metastasis in NSCLC. Knockdown of MED1 in NSCLC cell lines changes the mRNA levels of metastasis-related genes. These results indicate that MED1 regulates the invasion and metastasis of NSCLC by regulating the expression of multiple metastasis-related genes.

The HDAC4 (histone deacetylase 4) gene is linked to two miRNAs (miR-1270 and miR-1207-5p). HDAC4 encodes a histone deacetylase and represses gene transcription by influencing transcription factor access to DNA [33]. HDAC4 has previously been reported to influence cell differentiation in other types of cancer, such as leukemia [34, 35]. However, the role of HDAC4 in lung development is unclear.

ST8SIA2 (ST8 alpha-N-acetyl-neuraminide alpha-2,8-sialyltransferase 2) is regulated by two miRNAs (miR-623 and miR-1182). ST8SIA2 can synthesize polysialic acid (PSA) independently and transfer sialic acid through alpha-2,8-linkages to the alpha-2,3-linked and alpha-2,6-linked sialic acid of N-linked oligosaccharides of glycoproteins [36]. Tanaka et al. [37, 38] indicated that PSA played an important role in tumor development, particularly in the formation of metastatic foci, was associated with a poor postoperative prognosis and was specifically expressed in advanced-stage NSCLC.

Our miRNA–mRNA analysis provided many potential miRNA regulators of these four genes, uncovering another layer of modulation of lung cancer initiation and development at the global level. Notably, all of these genes are targeted by more than one miRNA, indicating that the combined effects of multiple miRNAs may play an important role in lung cancer.

4.3 Significant Gene Ontology Annotations

The gene ontology annotation showed that molecular function is concentrated in transferase activity. Five molecular function terms are subsets of transferase activity. The smallest subset is protein serine/threonine kinase activity, which is a subset of all the other transferase-related terms in this analysis. Some serine/threonine kinases, such as protein kinase C (PKC), are involved in NSCLC [39]. They play a key role in cancer cell proliferation, polarity and survival [40].

Seven GO terms involve the regulation of biological processes. The term regulation of nucleobase, nucleoside, nucleotide and nucleic acid metabolic process and the term regulation of nitrogen compound metabolic process share exactly the same genes. The term regulation of transcription is a subset of these two terms. The terms regulation of cellular carbohydrate catabolic process and regulation of carbohydrate catabolic process also contain the same genes. The remaining regulation terms involve gene expression and macromolecule biosynthetic process. Most genes in the above terms also belong to the term cellular process. Overall, these results suggest that lung cancer derives from systemic dysfunction of various cellular aspects.

Lung cancer is a disorder of the respiratory system. Interestingly, we found that 63 genes in the network were classified under nervous system phenotype and 25 genes under nervous system development (Table 3). Combining these two lists gives 67 non-redundant genes associated with the nervous system. Among them, 21 genes have been associated with lung cancer in other reports (marked with an asterisk (*) in the Gene name column of Table 2). Notably, ST8SIA2, MED1 and HDAC4, which are regulated by more than one miRNA in this miRNA–mRNA interaction network, appear again here. Thus, these three genes may influence both lung cancer and the nervous system.

The relationship between the nervous system and lung cancer has been noted in other studies. For example, previous reports [41, 42] found that NSCLC often affects the central nervous system (CNS). As a complication, CNS involvement frequently reduces quality of life and shortens survival, while the available treatment options are limited. Focusing on genes that are related to both NSCLC and the nervous system may therefore provide a novel approach to treating these complications.

5 Conclusion

The expression profiles of miRNAs and mRNAs in matched samples provided a good opportunity to construct a miRNA–mRNA interaction network with bioinformatics tools. In this systems-level analysis, we found a number of master miRNAs and mRNAs that are potentially important for lung cancer initiation and development. This global miRNA–mRNA interaction network will help to refine miRNA target predictions and to develop novel therapeutic candidates.