1 Introduction

MicroRNAs (miRNAs) are small endogenous noncoding RNAs, 19–24 nucleotides in length [1]. Since the discovery of the first miRNA (lin-4) in C. elegans, thousands of miRNAs have been identified by experimental or computational approaches in a variety of species (XX). miRNAs play important roles in regulating the function of protein-coding genes by binding to 3′-UTR sequences. The discovery of miRNAs and their biological functions could be one of the most exciting scientific breakthroughs of the last decade. For example, although miRNAs comprise only up to 5 % of animal sequences, they can regulate approximately 30 % of protein-coding genes, which makes them one of the most abundant classes of regulators. Furthermore, given their important biological roles, miRNAs may have oncogenic functions in the development and progression of tumors and could be used as biomarkers for malignancies. However, investigating the biological functions and clinical applications of miRNAs depends on the development of miRNA expression profiling methods. Indeed, miRNA profiling has helped to identify miRNAs that regulate a range of processes, including organismal development and various diseases. In addition, the ability to profile miRNAs effectively could lead to the discovery of disease- or tissue-specific miRNA biomarkers and to a deeper understanding of how miRNAs regulate cell differentiation and function. miRNA expression profiling is therefore crucial for both purposes. Several major profiling approaches for the identification and validation of miRNAs are discussed below [26].

2 Profiling Approaches

2.1 cDNA Library-Based Platforms

Lee and Ambros [1] first proposed cDNA library-based platforms for finding miRNAs, building on the discovery of lin-4 and let-7 in C. elegans. Briefly, the cDNA is cloned and sequenced. The cloned sequences are then compared against the genome database of the species using NCBI BLAST to find homologous genomic sequences. The secondary structure of each homologous genomic sequence is predicted with the program mfold. A small RNA with a hairpin structure is then detected by Northern blot. Although this method holds great promise, there are several challenges to overcome. These include the low abundance of miRNA expression and its specificity to particular tissues and stages of development. Furthermore, degradation products of endogenous mRNAs and other noncoding RNAs can interfere with detection.

2.2 Computation for the Prediction of miRNAs

Alternatively, computational methods for the prediction of miRNAs have gained popularity. Currently, two computational tools are commonly used to support these approaches. The first is MiRscan [7, 8]. It produced an initial set of candidates by scanning the genome of C. elegans with a sliding window of 110 nt. The regions were folded and filtered according to more permissive structural criteria. Potential homologues were then sought in C. briggsae sequences and only conserved hairpins were retained, yielding a total of ∼36,000 candidates. The second is miRseeker [4], which represents the first attempt to identify stem-loops that are conserved because of selection rather than as an artifact of comparing genomes that are not sufficiently distant. It aligns the non-annotated intergenic and intronic sequences of the D. melanogaster and D. pseudoobscura genomes. Both tools have successfully identified a large number of miRNA genes that were subsequently confirmed by experiments. Furthermore, some researchers have combined high-throughput experimental methods with computational procedures in order to identify a wider range of miRNAs [9]. However, computer-generated data need to be rigorously and reliably validated by conventional, gold-standard experimental approaches.
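The window-scanning step described above can be illustrated with a minimal Python sketch. It is not the MiRscan implementation: the step size, the 30-nt arm length, and the `arm_pairing_fraction` score are assumptions introduced here for illustration, and the score is only a crude stand-in for real secondary-structure folding (it ignores G-U wobble pairs, bulges, and loop geometry).

```python
from typing import Iterator, Tuple

# Watson-Crick complement table for a DNA alphabet.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def sliding_windows(genome: str, size: int = 110, step: int = 10) -> Iterator[Tuple[int, str]]:
    """Yield (start, window) pairs of fixed size; the step of 10 nt is an assumption."""
    for start in range(0, len(genome) - size + 1, step):
        yield start, genome[start:start + size]


def arm_pairing_fraction(window: str, arm: int = 30) -> float:
    """Hypothetical hairpin score: fraction of 5' arm bases that can pair
    with the mirrored base on the 3' arm (a stand-in for folding)."""
    five_prime = window[:arm]
    three_prime = window[-arm:][::-1]  # read the 3' arm inward from the end
    paired = sum(
        1 for a, b in zip(five_prime, three_prime)
        if a.translate(COMPLEMENT) == b
    )
    return paired / arm


def candidate_hairpins(genome: str, min_fraction: float = 0.8):
    """Return windows whose arms are mostly complementary (threshold is an assumption)."""
    return [
        (start, window)
        for start, window in sliding_windows(genome)
        if arm_pairing_fraction(window) >= min_fraction
    ]
```

In a realistic pipeline the scoring function would be replaced by a folding program, and the surviving candidates would then be checked for conservation in a second genome, as the text describes.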

2.3 Quantitative Reverse Transcription PCR-Based Methods

Because mature miRNAs are very small, they require appropriately short primers for quantification, which made it challenging to use qPCR for the analysis of miRNAs. However, real-time RT-PCR technologies have recently been developed that successfully amplify and quantify both precursor and mature miRNAs [10]. One major approach relies on reverse transcription of miRNA to cDNA, followed by qPCR with real-time monitoring of reaction product accumulation. An appealing aspect of this approach is how easily it fits into the workflow of laboratories that are familiar with real-time PCR. To scale this approach for miRNA profiling, reactions are carried out in a highly parallel, high-throughput format. qRT-PCR methods designed for miRNAs include SYBR green and TaqMan assays. Several manufacturers offer SYBR green detection for small RNA species. Generally, these methods, including the Qiagen miScript and WaferGen systems, rely on polyadenylation of small RNAs, followed by reverse transcription using an oligo-dT primer that carries a tag. This tag sequence is then used as a universal reverse primer site for SYBR green detection [11]. Qiagen maintains specificity for small RNA species using a proprietary Hi-Spec buffer, which inhibits the reverse transcription of longer coding and noncoding RNAs. Exiqon’s microRNA qPCR system combines the speed of a universal RT reaction with the sensitivity and specificity of LNA™-enhanced PCR primers, and is based on SYBR green reagents [12]. Because of their ribose modifications, locked nucleic acids increase the affinity of Watson-Crick binding and the specificity of primers, allowing for similar primer Tms with short sequences.

The TaqMan probe method is designed to detect and accurately quantify mature miRNAs using a real-time PCR system [10]. The principle of the TaqMan™ microRNA assays is similar to that of conventional TaqMan™ RT-PCR assays. A major difference is the use of a novel target-specific stem-loop reverse transcription primer during the RT reaction, which addresses the challenge posed by the short length of mature miRNA. The primer extends the 3′ end of the target to produce a template that can be used in standard TaqMan® Assay-based real-time PCR. The stem-loop structure in the tail of the primer also confers a key advantage on these assays: specific detection of the mature, biologically active miRNA. Moreover, TaqMan technology can distinguish mature miRNAs that differ by as little as one nucleotide.

Since mature miRNA exerts its activity by binding to the 3′ untranslated region of mRNA, quantification of the active, mature miRNA, rather than the inactive pre-miRNA, is generally preferred. Pre-miRNA exists as a stable hairpin of approximately 70 nt in length [13]. To amplify the pre-miRNA, forward and reverse primers are designed to anneal to the stem portion of the hairpin. Isoforms present another issue that needs to be carefully considered when designing miRNA quantification assays. Numerous miRNAs exist as isoforms with identical or nearly identical mature and precursor sequences. With SYBR green detection, PCR primers designed to the hairpin often cannot discriminate among the various isoforms. However, TaqMan™ minor groove binding (MGB) probes can be used to detect a family of different isoforms [14]. Sequences of the primers and TaqMan™ MGB probes for the analysis of miRNAs can be found on the website [15].

2.4 Microarray-Based Techniques for Quantification of miRNAs

Microarrays have been widely used to profile large numbers of mRNAs [16, 17]. cDNA microarrays are an increasingly popular technology for profiling miRNAs [18]. The workflow includes synthesis of cDNA and labeling of the product with a fluorophore, followed by dissociation and hybridization to complementary probes immobilized on a surface. It is also practical to profile miRNA expression using real-time PCR in 384-well reaction plates. Expression profiling by real-time PCR has better sensitivity, which translates into smaller sample requirements. However, a disadvantage of real-time PCR profiling is the difficulty of efficiently and accurately transferring small volumes of liquid into 384-well plates. Furthermore, some challenges also exist in microarray primer design.

The major commercial hybridization-based platforms, such as Affymetrix, Agilent, Exiqon (miRNA only), and Illumina BeadChip, have all been demonstrated to provide similar data quality [19–21]. The commercial microarrays typically come with software for extracting probe intensities from hybridization images and for preprocessing the data, including background correction and normalization. In addition, users need to acquire their own bioinformatics tools for downstream data management and analysis. Table 1 compares some commonly used platforms on various qualities, providing information to help decide which platform to choose for a given purpose.

Table 1 The commonly used microarray platforms for profiling miRNAs

2.5 Deep Sequencing/Next-Generation Sequencing

qRT-PCR and hybridization-based microarray platforms have been used to identify cancer-associated miRNA aberrations [22]. Yet these technologies only measure relatively abundant and known miRNA sequences, and have limited capacity to identify novel miRNAs whose aberrations are associated with cancer. Next-generation deep sequencing has emerged as a powerful tool for global miRNA analysis. DNA sequencing was first reported by Sanger [23], providing a tool to decipher genes. However, low throughput and high cost stalled its use for deciphering the human genome. A more cost-effective sequencing technology was developed by 454 Life Sciences [24]. Since then, several next-generation sequencing (NGS) platforms, such as the Illumina Genome Analyzer (Illumina, Inc., San Diego, CA, USA) and SOLiD™ (Life Technologies Corporation, Carlsbad, CA, USA), have been developed. The newly developed NGS platforms have been applied to various fields of biological and medical research, including measuring expression levels of known miRNAs and detecting unknown miRNAs, as shown in Table 2. Deep sequencing processes millions of independent sequencing events, providing billions of nucleotides of sequence information within a single experiment. Furthermore, deep sequencing enables comprehensive analysis of large amounts of sequence data, which dramatically accelerates research compared with traditional labor-intensive efforts, and is a powerful approach for accurately determining the information encoded in nucleotide fragments [25]. Its advantages over the other current techniques therefore include pooling of samples for high-throughput purposes, a wide detectable expression range, the ability to analyze expression of all annotated miRNAs, and the detection of novel miRNAs [26].

Table 2 Next-generation sequencing platforms

Hu et al. [27] used Solexa sequencing to evaluate miRNA profiles in the serum of patients with stage I to IIIa NSCLC. Levels of four serum miRNAs (miR-486, miR-30d, miR-1, and miR-499) were significantly associated with overall survival. Using SOLiD transcriptome sequencing of miRNAs in the peripheral blood of lung cancer patients, Keller et al. [28] identified 32 annotated and seven unknown miRNAs that were altered in the blood specimens of cancer patients. We recently used next-generation deep sequencing to comprehensively characterize miRNA profiles in eight lung tumor tissues comprising the two major types of NSCLC. We successfully identified 896 known miRNAs and 14 novel miRNAs, of which 24 miRNAs displayed dysregulation with fold change ≥4.5 in either stage I ACs or SCCs, or both, relative to normal tissues [29]. In comparison with NGS platforms, a microarray only covers known genes, and its probe design is based on the reference sequence. A microarray can therefore only measure the concentration of known sequence fragments. Microarrays may have better accuracy and precision than NGS, which relies on PCR amplification and sequencing-by-synthesis (SBS) technology. With NGS, all fragments can be detected without a reference sequence, and the sequence of each fragment is fully represented. In addition, PCR amplification during sequencing library construction increases detection sensitivity, but because amplification is uneven, quantitative accuracy suffers. NGS can therefore discover new and small fragments without tedious probe design. However, for the detection of gene expression levels, we would choose microarray analysis in our studies.
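The fold-change screen mentioned above (fold change ≥4.5 relative to normal tissue) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the analysis used in [29]: the input values are taken to be already normalized expression levels, the pseudocount of 1.0 is an assumption added to avoid division by zero, "dysregulation" is read here as a change in either direction, and the miRNA names are hypothetical placeholders.

```python
import math
from typing import Dict, List


def dysregulated(
    tumor: Dict[str, float],
    normal: Dict[str, float],
    threshold: float = 4.5,
    pseudocount: float = 1.0,  # assumption: guards against zero counts
) -> List[str]:
    """Return miRNAs whose tumor/normal ratio is >= threshold in either direction."""
    hits = []
    for name in sorted(tumor.keys() & normal.keys()):
        ratio = (tumor[name] + pseudocount) / (normal[name] + pseudocount)
        # Compare on the log scale so up- and down-regulation are treated symmetrically.
        if abs(math.log2(ratio)) >= math.log2(threshold):
            hits.append(name)
    return hits


# Hypothetical normalized expression values, for illustration only.
tumor_levels = {"miR-a": 89.0, "miR-b": 9.0, "miR-c": 19.0}
normal_levels = {"miR-a": 9.0, "miR-b": 99.0, "miR-c": 9.0}
flagged = dysregulated(tumor_levels, normal_levels)
```

Here `miR-a` (9-fold up) and `miR-b` (10-fold down) pass the threshold, while `miR-c` (2-fold up) does not.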

The 454 deep sequencing system from Roche was one of the first NGS platforms on the market, launching in 2005. The system uses emulsion PCR (emPCR) to clonally amplify the fragments, which are then sequenced via sequencing-by-synthesis (SBS) technology [30]. In contrast to the 454 system, Illumina sequencing is a base-by-base technology using a reversible terminator-based method, which enables detection of each single base as it is incorporated into growing DNA strands complementary to the template [31]. Since this technology reads out one base at a time, the main error mode is substitution rather than insertion or deletion. Applied Biosystems’ SOLiD sequencing technology, by comparison, is based on ligation of oligonucleotides. Sixteen different dinucleotides are encoded with four fluorescent dyes, each dye encoding four dinucleotides. By combining this four-dye encoding scheme with the ligation-based sequencing assay, SOLiD interrogates every base in a sample twice [32].
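The two-base encoding described above (16 dinucleotides, four colors, four dinucleotides per color) can be made concrete with a short Python sketch. The numeric base codes and the XOR rule below follow the commonly described form of this scheme; the assignment of numbers to specific dyes is not given in the text and is left abstract here, so the color indices 0–3 should be treated as illustrative labels.

```python
from itertools import product
from typing import Dict, List

# 2-bit codes per base; a dinucleotide's color is the XOR of its two codes.
BASE_CODE: Dict[str, int] = {"A": 0, "C": 1, "G": 2, "T": 3}
CODE_BASE: Dict[int, str] = {code: base for base, code in BASE_CODE.items()}


def encode_colors(seq: str) -> List[int]:
    """Translate a base sequence into color space (one color per adjacent base pair)."""
    codes = [BASE_CODE[base] for base in seq]
    return [left ^ right for left, right in zip(codes, codes[1:])]


def decode_colors(first_base: str, colors: List[int]) -> str:
    """Recover the base sequence from a known first base and the color calls."""
    code = BASE_CODE[first_base]
    bases = [first_base]
    for color in colors:
        code ^= color  # XOR with the color gives the next base's code
        bases.append(CODE_BASE[code])
    return "".join(bases)


# Group all 16 dinucleotides by color to show that each color covers four of them.
DINUCLEOTIDES_BY_COLOR: Dict[int, List[str]] = {c: [] for c in range(4)}
for left, right in product("ACGT", repeat=2):
    DINUCLEOTIDES_BY_COLOR[BASE_CODE[left] ^ BASE_CODE[right]].append(left + right)
```

Because every base contributes to two adjacent color calls, a true single-base variant changes two consecutive colors, whereas an isolated color change is more likely a measurement error. This is the sense in which each base is interrogated twice.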

2.6 Droplet Digital PCR (ddPCR)

Quantitative polymerase chain reaction (qPCR) is one of the most commonly used techniques for estimating expression levels of miRNAs in clinical specimens [22, 33–35]. However, qPCR faces two major challenges in the assessment of plasma miRNAs [36, 37]. First, qPCR is an indirect and labor-intensive approach to analyzing miRNAs, as it relies on an increase in fluorescence signal that is proportional to the polymerase reaction product, and uses the cycle threshold (CT) as a metric. CT values for miRNA targets are referenced to endogenous small RNA controls across samples for normalization. This can become problematic, because expression levels of the endogenous controls and their transcripts may differ between samples [36, 38]. Furthermore, numerous endogenous genes have been evaluated as references for the determination of target miRNAs, including U6, U6B, 18S rRNA, 5S RNA, RNU38B, and RNU43; yet none has been widely accepted as a standard control [22, 33]. These problems can be partially solved by using an exogenous “spike-in” control, which, however, does not account for any template-specific effect or bias introduced through primer design. Moreover, to estimate the absolute abundance of a given miRNA, data must be compared with a previously generated standard curve of the same template, obtained with identical primers and conditions. These additional manipulations are labor intensive, and extreme care should be taken when measuring the reference samples and comparing the reference and experimental standard curves [36]. Second, the sensitivity of qPCR for detecting low-copy-number genes is not high enough, as it only resolves ~1.5-fold changes in nucleic acids [37]. Given that a proportion of cancer-associated miRNAs is derived from the primary tumor and could be “diluted” in a background of normal miRNAs [39–41], miRNAs present at low levels in plasma could be undetectable by qPCR.
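The dependence of qPCR results on the endogenous control can be illustrated with the widely used comparative CT (2^−ΔΔCT) calculation. The text above does not name a specific normalization formula, so this is a swapped-in, standard example rather than the method of any cited study. It assumes 100 % amplification efficiency (a doubling per cycle), and all CT values below are hypothetical.

```python
def delta_ct(ct_target: float, ct_reference: float) -> float:
    """CT of the target miRNA normalized to the endogenous control."""
    return ct_target - ct_reference


def relative_expression(
    ct_target_sample: float,
    ct_reference_sample: float,
    ct_target_control: float,
    ct_reference_control: float,
) -> float:
    """Fold change of sample vs. control by the comparative CT method.

    Assumes perfect doubling per cycle (100 % efficiency).
    """
    ddct = delta_ct(ct_target_sample, ct_reference_sample) - delta_ct(
        ct_target_control, ct_reference_control
    )
    return 2.0 ** (-ddct)


# Hypothetical CT values: the reference is stable across the two specimens.
stable = relative_expression(24.0, 18.0, 26.0, 18.0)

# Same target CTs, but the reference drifts by one cycle in the sample.
drifted = relative_expression(24.0, 19.0, 26.0, 18.0)
```

With a stable reference the estimate is a 4-fold increase; a one-cycle shift in the reference alone turns it into an 8-fold increase, which is why the lack of an accepted standard control matters.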

ddPCR is a direct method for quantitatively measuring nucleic acids [42–49]. It depends on partitioning the PCR volume at limiting dilution into a large number of microreactions, where a positive result in a given microreaction indicates the presence of the target molecule in that reaction. The number of positive reactions, together with the Poisson distribution, can be used to produce a direct, high-confidence measurement of the original target concentration [47]. ddPCR therefore does not rely on rate-based measurements (CT values), endogenous controls, or calibration curves. Furthermore, previous studies targeting low-copy-number nucleic acids have demonstrated that ddPCR has higher sensitivity and precision than qPCR [50–52]. We recently investigated the efficacy of ddPCR for quantitative detection of two miRNAs (miR-21-5p and miR-335-3p) in artificially seeded samples, RNA of cancer cells, and clinical plasma samples. These two miRNAs were chosen because our previous studies [22, 33–35] showed that miR-21-5p displayed a high expression level in plasma, whereas miR-335-3p had an endogenously low level. We then used ddPCR to quantify the copy number of plasma miR-21-5p and miR-335-3p in 36 lung cancer patients and 38 controls. ddPCR showed a high degree of linearity and quantitative correlation (R² = 0.96–0.99) in measuring the miRNAs over a dynamic range from one to 10,000 copies/μl of input, with high reproducibility. qPCR exhibited a dynamic range from 100 to 1 × 10⁷ copies/μl of input. ddPCR was more sensitive than qPCR in detecting the copy number of the miRNAs (one vs. 100 copies/μl, P < 0.05). In plasma, ddPCR could detect the copy number of both miR-21-5p and miR-335-3p, whereas qPCR was only able to assess miR-21-5p. Quantification of the plasma miRNAs by ddPCR provided 71.8 % sensitivity and 80.6 % specificity in distinguishing lung cancer patients from cancer-free subjects. Therefore, as ddPCR becomes more established, it might become a robust tool for quantitative assessment of miRNA copy number in cancer diagnosis.
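The Poisson step that converts droplet counts into a concentration can be written out as a short Python sketch. If a fraction p of droplets is positive, the mean number of target copies per droplet is λ = −ln(1 − p), and dividing λ by the droplet volume gives copies per unit volume. The droplet volume of 0.85 nl used as the default below is an assumption (it depends on the instrument), and the droplet counts in the example are hypothetical.

```python
import math


def ddpcr_copies_per_ul(
    positive: int,
    total: int,
    droplet_volume_nl: float = 0.85,  # assumption: instrument-dependent
) -> float:
    """Estimate target concentration (copies/ul of reaction) from droplet counts."""
    if total <= 0 or not 0 <= positive < total:
        # If every droplet is positive the sample is saturated and
        # the Poisson estimate is undefined.
        raise ValueError("need 0 <= positive < total and total > 0")
    # Mean copies per droplet: -ln(1 - p), written as ln(total / negatives).
    copies_per_droplet = math.log(total / (total - positive))
    return copies_per_droplet / (droplet_volume_nl * 1e-3)  # nl -> ul


# Hypothetical run: 2,000 positive droplets out of 20,000 accepted droplets.
concentration = ddpcr_copies_per_ul(2000, 20000)
```

The logarithm corrects for droplets that received more than one copy, which is why the estimate does not need CT values or a calibration curve. The result is the concentration in the PCR reaction itself; relating it to the original specimen requires the dilution and input volumes of the particular protocol.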

3 In Summary

Each platform has its advantages and disadvantages. In addition to carefully selecting the appropriate one for each research or clinical application with regard to efficacy and cost, we also need to pay attention to intra- and interlaboratory reproducibility, to developing standardized methods, and to establishing guidelines for sample collection and preparation. For instance, one of the major technical challenges in applying these techniques in clinical settings is how to standardize protocols for miRNA extraction from biological specimens such as serum or plasma, and how to normalize measured values and controls. Nevertheless, the rapid advance in the sophistication of miRNA profiling tools provides the technical capabilities required for functional analysis of miRNAs and for miRNA biomarker discovery and validation. Future use of these techniques will dramatically deepen our understanding of the biological functions of miRNAs and help develop these small molecules into important diagnostic and therapeutic targets for various human diseases.