1 Introduction

MicroRNAs (miRNAs) are a class of small non-coding RNAs, approximately 22 nucleotides in length, that regulate gene expression at the post-transcriptional level through translation inhibition or mRNA cleavage (Bartel 2004). Since the discovery of the first miRNA, lin-4, in C. elegans (Lee et al. 1993; Wightman et al. 1993), numerous studies have been conducted to identify miRNAs in a wide range of organisms, from viruses to higher mammals. The most popular miRNA repository, miRBase (Release 20, June 2013) (Griffiths-Jones 2004), currently holds 24,521 entries representing hairpin precursor miRNAs, which express 30,424 mature miRNA products in 206 distinct species. Most miRNAs are reported to be either independently encoded in intergenic regions (Lee et al. 2004) or co-encoded within the introns of “host” protein-coding genes (Lin et al. 2008). Recent studies also suggest that transfer RNAs (tRNAs) might be another origin of miRNA biogenesis (Schopman et al. 2010; Maute et al. 2013). The principal mechanism by which miRNAs exert their biological functions is binding to the 3′ untranslated region (3′UTR) of their target genes, through imperfect base pairing in animals (Reinhart et al. 2000) or perfect base pairing in plants (Dugas and Bartel 2004). Generally, the binding of a miRNA to its target induces mRNA degradation or inhibits protein translation, although several studies have shown that it can also stabilize target transcripts (Place et al. 2008).

To date, more than 2,500 mature miRNAs have been identified in humans, and they have the potential to regulate approximately 33 % of human protein-coding genes (Lewis et al. 2005). miRNA regulation is involved in a wide variety of cellular processes, from cell proliferation, differentiation, and development to apoptosis (Ambros 2004; Bartel 2004). Alterations in miRNA expression have been associated with the pathogenesis and progression of various diseases, especially cancers (Jay et al. 2007). Abnormally expressed miRNAs have been reported to serve as classifiers that distinguish tumour samples from normal tissues (Raponi et al. 2009).

As discussed in a previous study (Bielekova and Martin 2004), a disease biomarker should have several features, including biological rationale, clinical relevance, practicality, and correlation with disease activity. miRNAs are involved in various biological processes and are clinically related to disease pathogenesis, as discussed above. In addition, other advantageous features of miRNAs, such as manageability, durability, and ease of detection, make them promising cancer biomarkers (Guerau-de-Arellano et al. 2012). Indeed, many attempts have been made to identify candidate miRNA biomarkers in a range of cancers, such as breast cancer (Heneghan et al. 2010; Ramshankar and Krishnamurthy 2013), lung cancer (Gao et al. 2012), and gastric cancer (Li et al. 2012, 2013).

2 Advances of miRNA Expression Profiling Methods

The most straightforward approach to finding candidate cancer miRNA biomarkers is to screen for miRNAs that are abnormally expressed in cancer tissues relative to normal tissues, using current miRNA expression profiling platforms. These techniques fall into two main experimental categories: high-throughput screening approaches and low-throughput detection methods. The first category consists of nucleic acid hybridization-based array technologies and cloning- and sequencing-based approaches, which can detect many miRNAs simultaneously in a single experiment. The second category includes small-scale detection methods, such as northern blot, real-time quantitative PCR (RT-qPCR), and in situ hybridization (ISH). A comparison of these technologies is presented in Table 8.1.

Table 8.1 Currently available miRNA expression profiling approaches

2.1 High-Throughput Screening Approaches

Currently, microarray (also called gene chip) technology may be the most widely used technique in the field of transcript expression profiling. The method is based on hybridization to slide-localized miRNA-specific probes (Babak et al. 2004). miRNAs in the sample are first labeled with a fluorescent dye and then hybridized to probes printed on a glass slide. After unstable hybrids are washed off, miRNA abundance is measured from the fluorescence intensity. The main advantage of this approach is the parallel detection of hundreds of miRNAs in a single experiment.

Another large-scale miRNA expression profiling method is Beadarray technology. Unlike the classical microarray approach, it uses magnetic microspheres, each tagged with a unique DNA sequence that identifies the bead. Each bead binds specifically to a chimeric probe, which recruits a specific miRNA into a complex. The complex is fluorescently labeled via the biotin on the probe. The abundance of a miRNA can therefore be quantified from the amount of fluorescence on the microspheres, as measured with a flow instrument.

The aforementioned array-based technologies can only be used to profile the expression of known miRNAs. Deep sequencing techniques are newer profiling approaches that can detect all the miRNAs expressed in the target sample, including miRNAs that have never been reported. The detailed sequencing procedures differ between platforms, but the first step in all of them is the generation of a miRNA library: miRNAs are ligated to 5′ and 3′ adaptors, reverse transcribed, and PCR amplified. The miRNAs in the library are then sequenced in parallel and their abundance is quantified. Because of their higher specificity and accuracy, deep sequencing technologies are likely to displace microarrays as the dominant technique for transcript expression profiling.

2.2 Low-Throughput Detection Methods

Despite their limited capacity, low-throughput miRNA expression profiling methods are considered more reliable than high-throughput techniques, so choosing between them involves a trade-off. In particular, northern blot is regarded as the “gold standard” for characterizing miRNA expression (Ahmed 2007) because of its high specificity. The method resembles a miniature version of a microarray: miRNA abundance in the sample is determined from the hybridization signal of the complex formed by the query miRNA and a pre-set probe. Besides its low throughput, another drawback of this method is its low sensitivity for low-abundance samples.

In contrast to northern blot, RT-qPCR is a relatively sensitive approach to miRNA expression characterization. The basic idea is to make miRNA molecules amplifiable by adding adapters to their ends, after which the amount of miRNA in the sample is quantified in relative terms. PCR amplification makes it possible to measure low-abundance specimens. This technology can be used to quantify both miRNA precursors and mature miRNAs.

Another small-scale miRNA profiling method is in situ hybridization, which was invented by American cell biologist Joseph Grafton Gall in 1969 (Gall and Pardue 1969). This technology can be used to measure and localize miRNAs within tissue sections and cells.

With the recent progress in miRNA expression profiling technologies, interest in cancer miRNA biomarker discovery has grown rapidly, as documented by the number of related publications retrieved from the NCBI PubMed search engine over the last 10 years (Fig. 8.1). The following sections review these cancer miRNA biomarker discovery studies.

Fig. 8.1
Number of publications related to cancer miRNA biomarker discovery during the past decade. Bars represent the number of NCBI PubMed hits for the queries “(cancer[ti] OR carcinoma[ti] OR tumor[ti]) AND (miRNA*[ti] OR microRNA*[ti])” (panel a) and “(cancer[ti] OR carcinoma[ti] OR tumor[ti]) AND (miRNA*[ti] OR microRNA*[ti]) AND (biomarker*[tiab] OR marker*[tiab])” (panel b). The numbers for 2013 were counted up to 18 July

3 Traditional Approaches for Cancer miRNA Biomarker Discovery

Traditional approaches to novel cancer miRNA biomarker discovery generally proceed in three steps: (a) detection of miRNAs that are differentially expressed between cancer samples and control groups with a high-throughput method (e.g., microarray); (b) validation of the outlier miRNAs detected in step (a) with a low-throughput technology (e.g., RT-qPCR); and (c) further confirmation of the potential miRNA biomarkers on a large number of case and control specimens via low-throughput experiments. The simplest way to identify outlier miRNAs in the first step is fold-change filtering (usually with a 2-fold cut-off). Beyond this, many well-established statistical tools are employed, such as the z-score, t-test, and Mann–Whitney test. None of these approaches, however, takes the heterogeneity of cancer samples into account. MacDonald and Ghosh (2006) proposed a bioinformatics tool to infer chromosomal translocations that exist only in a subset of disease samples. This concept was subsequently used to develop several other outlier gene detection algorithms (Tibshirani and Hastie 2007; Wu 2007; Lian 2008). These methods could also be applied to outlier miRNA screening in high-throughput experiments.
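The first screening step can be sketched in a few lines of Python. The example below is only an illustration under stated assumptions: the miRNA names and intensities are hypothetical, the cut-offs (a 2-fold change, i.e. |log2 FC| ≥ 1, and an absolute t statistic of 2) are arbitrary, and Welch's form of the t statistic is used for convenience. A real analysis would convert the statistic to a p-value and correct for multiple testing.

```python
import math
import statistics

def log2_fold_change(cancer, normal):
    """log2 of the ratio of mean expression (cancer over normal)."""
    return math.log2(statistics.mean(cancer) / statistics.mean(normal))

def welch_t(cancer, normal):
    """Welch's t statistic for two groups with possibly unequal variances."""
    var_c = statistics.variance(cancer) / len(cancer)
    var_n = statistics.variance(normal) / len(normal)
    return (statistics.mean(cancer) - statistics.mean(normal)) / math.sqrt(var_c + var_n)

def screen_outliers(profiles, min_abs_log2fc=1.0, min_abs_t=2.0):
    """Return the miRNAs that pass both the fold-change and the t cut-offs."""
    hits = []
    for mirna, (cancer, normal) in profiles.items():
        lfc = log2_fold_change(cancer, normal)
        t = welch_t(cancer, normal)
        if abs(lfc) >= min_abs_log2fc and abs(t) >= min_abs_t:
            hits.append((mirna, round(lfc, 2), round(t, 2)))
    return hits

# Hypothetical linear-scale intensities: (cancer samples, normal samples).
profiles = {
    "miR-A": ([8.0, 9.0, 10.0, 9.0], [2.0, 3.0, 2.0, 3.0]),
    "miR-B": ([5.0, 6.0, 5.0, 6.0], [5.0, 5.0, 6.0, 6.0]),
}
hits = screen_outliers(profiles)
print(hits)
```

Only the hypothetical miR-A passes both filters here; miR-B has identical group means and is discarded. Because this kind of test compares group means, it can miss miRNAs that are altered in only a subset of tumours, which is the limitation the outlier-detection methods cited above try to address.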

The following examples illustrate the general routine of traditional approaches to cancer miRNA biomarker discovery. Following the procedures described above, Li et al. (2012) showed that miR-199a-3p in plasma is a potential diagnostic biomarker for gastric cancer. Another research group performed genome-wide miRNA expression profiling followed by real-time quantitative RT-PCR (qRT-PCR) assays on gastric cancer samples and normal samples, and discovered three miRNAs with elevated expression (miR-187(*), miR-371-5p, and miR-378) in gastric cancer. A further validation study showed that miR-378 alone yielded 87.5 % sensitivity and 70.73 % specificity in discriminating gastric cancer patients from healthy controls, suggesting that this miRNA is a potential diagnostic biomarker for gastric cancer.
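For readers less familiar with these two figures of merit, the short sketch below shows how sensitivity and specificity are computed from a confusion matrix. The counts are invented purely to illustrate the arithmetic and are not taken from any of the studies cited here.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of patients correctly called positive by the marker."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Fraction of healthy controls correctly called negative by the marker."""
    return true_neg / (true_neg + false_pos)

# Hypothetical validation cohort: 50 patients and 50 healthy controls.
sens = sensitivity(true_pos=40, false_neg=10)
spec = specificity(true_neg=45, false_pos=5)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```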

4 The Reconstruction of miRNA–mRNA Network

Because expression profiling data from high-throughput technologies (e.g., microarray) tend to have a high false positive rate, integrative analysis of high-throughput expression profiling data and miRNA–mRNA target network information may be a plausible approach to novel cancer miRNA biomarker discovery (Xu et al. 2011). The miRNA–mRNA network is, more precisely, a directed graph that reflects the regulatory relationships from miRNAs to their target genes, as presented in Fig. 8.2. Because the number of experimentally validated miRNA–mRNA target pairs is currently limited (Sethupathy et al. 2006; Xiao et al. 2009; Hsu et al. 2011), the main resources for miRNA–mRNA network reconstruction come from computational prediction. Despite potential false positives and false negatives, many of the predicted miRNA–mRNA interactions have been confirmed to be credible (Arora and Simpson 2008). Usually, the network is built from the miRNA–mRNA target pairs shared by the predictions of multiple computational approaches, such as PicTar (Krek et al. 2005), miRanda (Enright et al. 2003), and TargetScan (Lewis et al. 2003, 2005). In addition, negative expression correlations between miRNAs and their putative targets, computed from matched miRNA and mRNA expression profiles, can be used to refine the reconstructed network (Tran et al. 2008; Zhang et al. 2013).
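The two reconstruction ideas described above, keeping target pairs shared by several predictors and then retaining only negatively correlated pairs, can be sketched as follows. Everything in this example is an assumption made for illustration: the tool names are placeholders rather than real prediction outputs, the expression vectors are invented, and the thresholds (support from at least two tools, Pearson r ≤ −0.5) are arbitrary.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def consensus_pairs(predictions, min_tools=2):
    """Keep (miRNA, gene) pairs predicted by at least `min_tools` tools."""
    counts = {}
    for pairs in predictions.values():
        for pair in pairs:
            counts[pair] = counts.get(pair, 0) + 1
    return {pair for pair, n in counts.items() if n >= min_tools}

def refine_by_correlation(pairs, mirna_expr, mrna_expr, max_r=-0.5):
    """Keep pairs whose matched expression profiles are negatively correlated."""
    return {(m, g) for (m, g) in pairs
            if pearson(mirna_expr[m], mrna_expr[g]) <= max_r}

# Hypothetical prediction sets from three placeholder tools.
predictions = {
    "tool_1": {("miR-A", "GENE1"), ("miR-A", "GENE2"), ("miR-B", "GENE3")},
    "tool_2": {("miR-A", "GENE1"), ("miR-A", "GENE2")},
    "tool_3": {("miR-A", "GENE1"), ("miR-B", "GENE3")},
}
# Hypothetical matched expression profiles across four samples.
mirna_expr = {"miR-A": [1.0, 2.0, 3.0, 4.0], "miR-B": [2.0, 1.0, 4.0, 3.0]}
mrna_expr = {
    "GENE1": [8.0, 6.0, 5.0, 2.0],
    "GENE2": [1.0, 2.0, 2.0, 4.0],
    "GENE3": [4.0, 5.0, 1.0, 3.0],
}

network = refine_by_correlation(consensus_pairs(predictions), mirna_expr, mrna_expr)
print(sorted(network))
```

In this toy input all three pairs have consensus support, but the miR-A/GENE2 pair is positively correlated and is therefore dropped from the final directed edge set.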

Fig. 8.2
Schematic graph of miRNA–mRNA target network

5 Computational Based Approaches for Individual miRNA Biomarker Discovery

5.1 Based on Gene Expression Data and miRNA–mRNA Network Information

Because miRNAs generally function by regulating gene expression at the mRNA level, it is reasonable to infer miRNA deregulation from expression changes in their target genes. Indeed, a number of studies have reported inverse correlations between the expression of miRNAs and that of their target genes (Krutzfeldt et al. 2005; Wang and Wang 2006). Based on this observation, Cheng and Li (2008) proposed an algorithm that infers miRNA activities by combining gene expression data with miRNA–mRNA network information. The basic idea is to analyze the expression changes of each miRNA’s target genes: the activity of a miRNA is inferred to be elevated if its target genes tend to be down-regulated, and reduced if they tend to be up-regulated. With this approach, cancer miRNA expression patterns can be deduced from gene expression profiles of cancer and normal samples.
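The intuition behind target-based activity inference can be illustrated with a deliberately simple score. The sketch below is not the statistic used by Cheng and Li (2008); it merely contrasts the mean log2 fold change of non-target genes with that of a miRNA's targets, so that a positive score suggests elevated activity (targets repressed). The gene names, fold changes, and target sets are hypothetical.

```python
import statistics

def activity_score(mirna, targets, gene_log2fc):
    """Mean log2 fold change of non-targets minus that of the miRNA's targets.

    A positive score means the targets are, on average, more down-regulated
    than the remaining genes, which is read as elevated miRNA activity.
    """
    target_set = targets[mirna]
    in_set = [v for g, v in gene_log2fc.items() if g in target_set]
    out_set = [v for g, v in gene_log2fc.items() if g not in target_set]
    return statistics.mean(out_set) - statistics.mean(in_set)

# Hypothetical cancer-versus-normal log2 fold changes of five genes.
gene_log2fc = {"GENE1": -1.5, "GENE2": -1.0, "GENE3": 0.5, "GENE4": 0.0, "GENE5": 1.0}
# Hypothetical predicted target sets.
targets = {"miR-A": {"GENE1", "GENE2"}, "miR-B": {"GENE3", "GENE5"}}

scores = {m: activity_score(m, targets, gene_log2fc) for m in targets}
print(scores)
```

Here the hypothetical miR-A receives a positive score because its targets are down-regulated, whereas miR-B receives a negative one. A usable method would additionally need a significance estimate, for example by permuting gene labels.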

Similarly, another approach, co-inertia analysis (CIA), has been proposed for this purpose. CIA is a multivariate coupling approach that was initially introduced for ecological research (Doledec and Chessel 1994; Dray et al. 2003), where it is used to explore the relationship between two sets of variables from two linked data tables. Madden et al. (2010) applied this method to detect miRNA activity in different biological conditions. In this case, the two linked tables were gene microarray expression data and a miRNA frequency table on the same set of genes. Two simultaneous non-symmetric correspondence analyses (NSCs) were performed on the linked tables, reducing each data table to a low-dimensional space by projecting each variable onto the axes that best discriminate the coordinates of the projected points. The two reduced tables were then linked to associate miRNA activity with biological samples. This methodology can be used to identify miRNA deregulation patterns that distinguish disease groups from normal groups.

5.2 Based on miRNA and Gene Expression Profiles and miRNA–mRNA Network Information

Because miRNA–mRNA target relationships can be represented as a simplified network, the topological features of this network may help identify candidate cancer miRNA biomarkers. Xu et al. (2011) introduced an approach based on the miRNA-target dysregulated network (MTDN) to prioritize candidate disease miRNAs, and applied it to predict novel miRNA biomarkers in prostate cancer. In this methodology, miRNA expression data, mRNA expression data, and miRNA–mRNA interaction data were combined to construct the MTDN in tumor and non-tumor conditions. A support vector machine (SVM) was then trained on the expression fold changes and network topological features of known prostate cancer miRNAs and non-prostate cancer miRNAs in the MTDN. Finally, novel prostate cancer miRNA biomarkers were prioritized with this SVM and validated experimentally in vitro in prostate cancer cell lines. This study also showed functional synergism among miRNAs involved in a specific disease or biological process.

In contrast to this functional cooperation, Zhang et al. (2013) described another distinctive characteristic of cancer miRNAs: strong independent regulation power, defined as the number of genes regulated exclusively by an individual miRNA. This research group also proposed a novel pipeline to infer candidate cancer miRNA biomarkers. Negative correlations from paired miRNA and mRNA expression profiles were combined with computationally predicted miRNA–mRNA target pairs to generate a reliable miRNA–mRNA network. This network was then reduced to a sub-network containing only the miRNA nodes that showed deregulation in the miRNA expression profiles. In this sub-network, the independent regulation power was calculated for each miRNA. Finally, miRNAs with significantly high independent regulation power were predicted as potential cancer miRNA biomarkers. Subsequent in vitro experimental validation and systematic analysis confirmed the accuracy of this approach.
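Given the definition above, the independent regulation power of each miRNA is a simple count over the network. The sketch below assumes the sub-network is stored as a dictionary from miRNA to its set of target genes; the names and edges are hypothetical, and the significance assessment used to call a count "significantly high" is not reproduced here.

```python
def independent_regulation_power(network):
    """Count, for each miRNA, the target genes regulated by that miRNA alone.

    `network` maps a miRNA name to the set of genes it targets.
    """
    # Invert the network: gene -> set of miRNAs that regulate it.
    regulators = {}
    for mirna, genes in network.items():
        for gene in genes:
            regulators.setdefault(gene, set()).add(mirna)
    # A gene is exclusively regulated when it has exactly one regulator.
    return {
        mirna: sum(1 for gene in genes if len(regulators[gene]) == 1)
        for mirna, genes in network.items()
    }

# Hypothetical sub-network of deregulated miRNAs and their targets.
network = {
    "miR-A": {"G1", "G2", "G3"},
    "miR-B": {"G3", "G4"},
    "miR-C": {"G3"},
}
power = independent_regulation_power(network)
print(power)
```

In this toy network G3 is shared by all three miRNAs, so it contributes to none of the counts, while G1 and G2 count towards miR-A and G4 towards miR-B.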

6 Computational Based Approaches for miRNA Network Biomarker Discovery

6.1 The Discovery of Cancer-Related miRNA–mRNA Regulatory Modules

The concept of miRNA regulatory modules (mRMs) was first proposed by Yoon and De Micheli (2005a, b) to denote groups of miRNAs and target genes that are completely connected within the group and cooperate functionally in specific biological processes. This notion was later applied to cancer studies.
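One way to read the "completely connected" condition is that every miRNA in a module targets every gene in the module, i.e. the module is a complete bipartite subgraph of the miRNA–mRNA network. Under that assumption, which is an interpretation rather than a restatement of the original algorithm, a candidate module can be checked as follows; the network and module members are hypothetical.

```python
def is_complete_module(mirnas, genes, network):
    """True if every miRNA in `mirnas` targets every gene in `genes`.

    `network` maps a miRNA name to the set of genes it targets.
    """
    return all(gene in network.get(mirna, set())
               for mirna in mirnas
               for gene in genes)

# Hypothetical miRNA -> target gene network.
network = {
    "miR-A": {"G1", "G2", "G3"},
    "miR-B": {"G1", "G2"},
    "miR-C": {"G3"},
}
complete = is_complete_module({"miR-A", "miR-B"}, {"G1", "G2"}, network)
incomplete = is_complete_module({"miR-A", "miR-B"}, {"G1", "G3"}, network)
print(complete, incomplete)
```

The second candidate fails because the hypothetical miR-B does not target G3.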

Through integrative analysis of matched miRNA and mRNA expression profiles, two algorithms for discovering cancer mRMs were proposed, based on a fuzzy decision tree model (Bonnet et al. 2010) and correspondence latent Dirichlet allocation (Liu et al. 2010). Subsequently, Jayaswal et al. (2011) introduced a clustering method to infer miRNA regulatory modules involved in cancers by deducing miRNA activities from microRNA gene expression data and computational miRNA–mRNA target information. Given how miRNAs function, the expression of an individual miRNA should be negatively correlated with that of its target genes. On this basis, Joung et al. and Tran et al. developed two approaches to discover functional mRMs by combining paired miRNA and mRNA expression data with miRNA–mRNA binding information, using a population-based probabilistic learning method (Joung et al. 2007) and a rule induction method (Tran et al. 2008), respectively. Finally, a computational framework for discovering cancer-related miRNA–gene modules was proposed that simultaneously integrates multiple types of genomic data, including matched miRNA and mRNA expression profiles, computational miRNA–mRNA target information, and a gene–gene interaction network generated by integrating protein–protein interaction data with DNA–protein interaction data (Zhang et al. 2011). These approaches to discovering cancer miRNA–mRNA regulatory modules are summarized in Table 8.2.

Table 8.2 Computational approaches on cancer mRMs discovery

6.2 The Discovery of Cancer-Related MicroRNA Network Biomarkers

The regulation of mRNAs by miRNAs is only a small part of the whole biological regulatory network. Incorporating other information, such as transcription factor (TF) regulation, may give a better understanding of a specific biological process. Lu et al. (2011) designed a computational approach to identify potential microRNA network biomarkers for the progression stages of gastric cancer. In this approach, computational miRNA–mRNA target information and TF–miRNA regulation data were combined to generate a miRNA network for each individual miRNA. The significance of each miRNA network was evaluated according to its gene set enrichment analysis (GSEA) score, and miRNA networks whose GSEA scores exceeded a pre-set threshold were proposed as potential gastric cancer miRNA network biomarkers.
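To give a sense of what a GSEA-type score measures, the sketch below computes an unweighted running-sum enrichment score for a gene set against a ranked gene list. This is a simplified, Kolmogorov–Smirnov-like variant chosen for brevity; it is not claimed to be the exact scoring used by Lu et al. (2011), and standard GSEA additionally weights hits by their ranking metric and assesses significance by permutation. The ranked list and gene sets are hypothetical.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score of `gene_set` in a ranked list.

    Walking down the list, the running sum increases at each gene in the set
    and decreases otherwise; the score is the largest deviation from zero.
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must cover some, but not all, ranked genes")
    running, best = 0.0, 0.0
    for is_hit in hits:
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best

# Hypothetical genes ranked from most up- to most down-regulated.
ranked = ["G1", "G2", "G3", "G4", "G5", "G6"]
es_top = enrichment_score(ranked, {"G1", "G2"})      # members at the top
es_bottom = enrichment_score(ranked, {"G5", "G6"})   # members at the bottom
es_mixed = enrichment_score(ranked, {"G1", "G4"})    # members spread out
print(es_top, es_bottom, es_mixed)
```

A network whose member genes cluster at one end of the ranking obtains a score near ±1, while a scattered set scores closer to zero; thresholding such a score is the kind of filtering step described above.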

Beyond the regulation exerted by miRNAs, the expression levels of miRNAs themselves are determined by many factors, e.g., TF regulation. Through integrative analysis of miRNA–gene binding information and TF–gene binding information, Tran et al. introduced a novel way to discover miR–TF regulatory modules in the human genome. Many of the modules identified in this study had previously been reported to be involved in cancer genesis and development (Tran et al. 2010).

7 Databases on Potential Cancer miRNA Biomarkers

With the accumulation of data from cancer–miRNA association studies, several online databases now collect cancer–miRNA associations from previous publications via text-mining approaches. These databases are summarized in Table 8.3.

Table 8.3 Databases on cancer–miRNA association

miR2Disease is a manually curated database that collects miRNA deregulation patterns in various human diseases, including cancers (Jiang et al. 2009). It provides detailed information about miRNA–disease relationships, experimentally validated miRNA targets, and the corresponding literature references. Similarly, PhenomiR is a comprehensive repository of profiling data on deregulated miRNAs in different human diseases and biological processes (Ruepp et al. 2010, 2012). Based on self-defined text-mining rules, miRCancer (Xie et al. 2013) and dbDEMC (Yang et al. 2010) focus specifically on collecting information about cancer-related differentially expressed miRNAs. There are also databases that provide miRNA expression profiling data for a particular tumor type only, such as S-MED (Sarver et al. 2010), CC-MED (Sarver et al. 2009), and others.

8 Future Directions

Although current cancer miRNA studies have shed some light on the mechanisms of cancer genesis and development, there is still a long way to go in this emerging research area. The heterogeneity of cancer is the main concern. Besides the improvements expected from more advanced miRNA expression detection technologies, two other research directions might offer solutions to this issue.

Recently, the concept of personalized medicine (PM) has been proposed to denote the customization of healthcare. The importance and urgency of PM were emphasized years ago (Long 2007). Patient-specific information, such as race and genetic makeup, should therefore also be integrated into the future discovery of cancer miRNA biomarkers.

From a network point of view, disorders (diseases, including cancers) may be attributable to the deregulation of a specific biological process rather than simply to the alteration of an individual biological molecule. Network biomarkers should therefore be a better choice for cancer diagnosis. As reviewed above, several studies have sought to identify cancer miRNA network biomarkers. In the future, the integration of multi-layer information, such as genomic, epigenomic, and clinical information, will be needed for the discovery of miRNA network biomarkers.

9 Conclusions

This chapter summarizes current advances in miRNA expression detection technologies and the traditional approaches to cancer miRNA biomarker discovery based on these techniques. Computational methods for identifying individual miRNA biomarkers and miRNA network biomarkers for cancer diagnosis and prognosis are also reviewed. Although studies of cancer miRNA biomarkers are still in their infancy, evolving miRNA profiling technologies, miRNA network information, and computational algorithms offer new insights into cancer mechanisms. We can expect the clinical application of miRNA biomarkers for the diagnosis, staging, and prognosis of cancers in the near future.