Abstract
The accurate prediction and validation of microRNA targets is essential to understanding the function of microRNAs. Computational predictions indicate that all human genes may be regulated by microRNAs, with each microRNA possibly targeting thousands of genes. Here we discuss computational and experimental methods for identifying mammalian microRNA targets. We describe microRNA target prediction resources and procedures that are suitable for experiments where more accurate prediction of microRNA targets is more important than detecting all putative targets. We then discuss experimental methods for identifying and validating microRNA target genes, with an emphasis on the target reporter assay as the method of choice for specifically testing functional microRNA target sites.
Keywords
- microRNA
- Target genes
- microRNA expression
- Gene regulation
- Experimental validation
- Target prediction algorithms
- Bioinformatics
3.1 Introduction
In 2002, Eric Lai [1] compared the sequences of 11 microRNAs to the K box and Brd Box motifs that were known to mediate post-transcriptional regulation in Drosophila. He demonstrated that the first eight nucleotides, now called the seed region, of microRNAs (miRNAs), were perfectly complementary to these motifs and concluded that this complementarity may be essential in post-transcriptional regulation by microRNAs. This simple bioinformatics analysis established one of the strongest predictive features used in target prediction to date. Since then, the microRNA repertoire has grown exponentially and numerous experimental methods have been developed to confirm microRNA targets. None of these advances has produced a unique feature of microRNA targeting that is more telling than the seed region. They have instead led to the conclusion that microRNA regulation is very intricate and diverse. For this reason, the computational and experimental methods that have been developed generally focus on specific aspects of microRNA regulation and are used to either investigate the physical interaction between microRNAs and their putative targets or the functional outcome of microRNA targeting. Here we describe these computational and experimental methods and explain which specific aspects of microRNA regulation they focus on.
3.2 Computational Methods to Identify microRNA Targets
Despite a plethora of different algorithms and methods to predict microRNA targets, most rely on similar sequence-based approaches for their starting point. These algorithms initially search for some degree of sequence complementarity between the miRNA of interest and the 3′ untranslated region (3′UTR) of mRNAs with emphasis on the miRNA seed region (nt 2–8). Because the miRNA:mRNA duplex can contain mismatches, gaps and G:U pairs, the number of possible targets based uniquely on this alignment is too large to be informative. Additional steps are therefore required to refine target predictions and rank them according to statistical confidence. Here we describe the most commonly used methods for detecting miRNA targets, classified according to the criteria used to refine the initial sequence analysis (Fig. 3.1). For each approach we provide examples of commonly used algorithms and discuss their limitations.
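The initial sequence search described above can be sketched in a few lines. This is a minimal illustration, not the implementation of any published program: it only finds perfect Watson-Crick matches to nucleotides 2–8 of the miRNA and ignores mismatches, gaps and G:U pairs. The function names and the toy 3′UTR are hypothetical; the miRNA is the mature let-7a sequence.

```python
# Minimal seed-match scan (illustrative sketch; function names are hypothetical).
# Only perfect Watson-Crick matches to miRNA nt 2-8 are reported.

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed_site(mirna: str, start: int = 2, end: int = 8) -> str:
    """Return the mRNA sequence (5'->3') that pairs with miRNA nt start..end."""
    seed = mirna.upper().replace("T", "U")[start - 1:end]
    # The target site is the reverse complement of the seed.
    return "".join(COMPLEMENT[base] for base in reversed(seed))


def find_seed_matches(mirna: str, utr: str) -> list[int]:
    """Return 0-based start positions of perfect seed matches in a 3'UTR."""
    site = seed_site(mirna)
    utr = utr.upper().replace("T", "U")
    hits = []
    pos = utr.find(site)
    while pos != -1:
        hits.append(pos)
        pos = utr.find(site, pos + 1)  # step by one so overlapping sites are kept
    return hits


let7a = "UGAGGUAGUAGGUUGUAUAGUU"
toy_utr = "AAACUACCUCAGGGCUACCUCUU"  # made-up sequence with two sites
print(seed_site(let7a))
print(find_seed_matches(let7a, toy_utr))
```

Even this strict search returns many hits on real 3′UTRs, which is why the refinement steps discussed in the following sections are needed.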
3.2.1 Thermodynamic Stability of the microRNA:mRNA Duplex
miRanda [2], the first freely available prediction program, measures the thermodynamic stability of the duplex formed by a miRNA and its putative target to increase prediction accuracy. Different scores for the C:G, A:U, and G:U pairs are used to measure stability, with a requirement for more stable energy scores at the 5′ end of the miRNA. A user-defined threshold can then be set to eliminate unstable duplexes. Since miRanda became available, more complete models to calculate the stability of RNA duplexes have been published and successfully used to predict miRNA targets. The standalone algorithm RNAhybrid [3], for example, calculates the most stable hybridization site between two sequences and can easily be incorporated into existing prediction algorithms. The PITA algorithm [4] also uses the thermodynamic stability of a miRNA:mRNA duplex but compares it to the stability of local structures within the 3′UTR of the target mRNA. If the duplex is predicted to occur within a region of the 3′UTR that is already involved in a stable structure, the miRNA is less likely to bind to its target. This approach is limited by the accuracy of secondary structure prediction, which becomes unreliable when considering long-distance interactions and therefore larger RNA structures.
3.2.2 Sequence Conservation of the Target Site Between Multiple Species
Evaluating sequence conservation of predicted targets between distantly related species efficiently reduces the number of false positive predictions. Most algorithms require that the predicted miRNA target site be located in homologous regions of the 3′UTR, and that the seed binding region be highly conserved. TargetScan [5] initially searches for conserved seed pairing regions in 3′UTR alignments between 28 vertebrate species. This set of putative targets is then refined using a context score based on the target position in the 3′UTR and surrounding sequence composition, and further refined by considering 3′ pairing of the miRNA [6]. This approach is of little use in detecting species-specific binding sites or binding sites of species-specific miRNAs, although TargetScan also provides non-conserved targets on its website.
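A conservation filter of this kind can be illustrated with a toy alignment. The sketch below is not TargetScan's procedure: it assumes a pre-computed multiple alignment of orthologous 3′UTRs (gaps written as "-") and simply keeps seed sites that occupy the same alignment columns in every species. The function name, species labels and sequences are hypothetical.

```python
# Toy conservation filter (illustrative sketch, not TargetScan's algorithm).
# Input: orthologous 3'UTRs already aligned to equal length, gaps as "-".


def conserved_sites(site: str, alignment: dict[str, str], reference: str) -> list[int]:
    """Return alignment columns where `site` starts in every aligned species."""
    ref_seq = alignment[reference]
    width = len(site)
    hits = []
    for col in range(len(ref_seq) - width + 1):
        if ref_seq[col:col + width] != site:
            continue
        # Keep the site only if all species carry it in the same columns.
        if all(seq[col:col + width] == site for seq in alignment.values()):
            hits.append(col)
    return hits


alignment = {
    "human": "AACUACCUCGG-CUACCUCA",
    "mouse": "AACUACCUCGGACUACGUCA",  # second site disrupted (C->G)
    "dog":   "A-CUACCUCGG-CUACCUCA",
}
print(conserved_sites("CUACCUC", alignment, reference="human"))
```

Sites interrupted by alignment gaps are missed by this simple column comparison, and, as noted above, any species-specific site is discarded by design.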
3.2.3 Multiple Targets in the Same 3′UTR
Recent analysis demonstrates that numerous mRNAs are targeted by the same miRNA at different sites within their 3′UTR. This multi-targeting occurs at a significantly higher rate than expected. Focusing on mRNAs that have more than one predicted site for the same miRNA in the 3′UTR can therefore increase the signal-to-noise ratio for different algorithms [7, 8]. Although this approach will eliminate numerous true target sites, it has the advantage of producing a list of high-confidence gene targets. This method requires the user to first select one or more target prediction programs and subsequently refine their results for multi-targeting. This last step can be performed on the mimiRNA website [8] (http://mimirna.centenary.org.au). The PicTar [9] algorithm uses a combinatorial approach that not only accounts for multiple binding sites of the same miRNA but also computes the likelihood that a sequence is bound by a combination of input miRNA sequences. Filtering predictions based on multi-targeting drastically reduces the number of predicted targets and, because it increases the probability of discovering true target genes, it is useful for studies where experimental validation of miRNA targets is necessary.
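The multi-targeting filter amounts to counting predicted sites per 3′UTR and discarding genes below a threshold. The sketch below assumes that sites are perfect matches to a single seed-complementary sequence; in practice the counts would come from whichever prediction program was chosen. The gene names, sequences and the `min_sites` parameter are hypothetical.

```python
# Multi-targeting filter (illustrative sketch; names and data are hypothetical).


def count_sites(site: str, utr: str) -> int:
    """Count occurrences of `site` in `utr`, including overlapping ones."""
    count = 0
    pos = utr.find(site)
    while pos != -1:
        count += 1
        pos = utr.find(site, pos + 1)
    return count


def multi_targeted(site: str, utrs: dict[str, str], min_sites: int = 2) -> list[str]:
    """Return genes whose 3'UTR carries at least `min_sites` copies of `site`."""
    return sorted(gene for gene, utr in utrs.items()
                  if count_sites(site, utr) >= min_sites)


toy_utrs = {
    "geneA": "AACUACCUCGGGCUACCUCAA",  # two sites
    "geneB": "AACUACCUCGGG",           # one site
    "geneC": "GGGGAAAA",               # no site
}
print(multi_targeted("CUACCUC", toy_utrs))
```

Here geneB is dropped even though it carries a site, which mirrors the trade-off described above: true single-site targets are lost in exchange for a shorter, higher-confidence list.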
3.2.4 Functional Category of Targets
Because miRNAs can often affect genes in a biochemical pathway or biological process [10], considering the function of target genes may eliminate biologically irrelevant predictions. mirBridge [11] starts with a set of genes with a known function and searches for enrichment of putative targets based on sequence analysis amongst this gene set. This approach is useful for experiments where a specific function or pathway is being dissected but may prove limiting in studies where a specific miRNA or mRNA is being analysed with no prior knowledge of its function.
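Enrichment of predicted targets within a functional gene set is commonly assessed with a one-sided hypergeometric test. The sketch below shows that generic test only; it is not the statistic used by mirBridge, and the function name and the numbers in the example are made up for illustration.

```python
# Generic hypergeometric enrichment test (illustrative; not mirBridge's method).
from math import comb


def enrichment_p(population: int, targets: int, set_size: int, overlap: int) -> float:
    """P(X >= overlap) when `set_size` genes are drawn from `population` genes,
    of which `targets` are predicted targets of the miRNA."""
    upper = min(targets, set_size)
    favourable = sum(
        comb(targets, i) * comb(population - targets, set_size - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / comb(population, set_size)


# Hypothetical numbers: 20,000 genes, 500 predicted targets,
# a 40-gene pathway containing 6 predicted targets.
print(enrichment_p(20000, 500, 40, 6))
```

A small p-value suggests that the pathway contains more predicted targets than expected by chance, but it says nothing about whether any individual site is functional.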
3.2.5 Combining microRNA and mRNA Expression Data
Numerous miRNAs inhibit gene expression by destabilizing mRNAs [12]. As a consequence, mRNA targets should be expressed at lower levels in tissues where the miRNA is expressed. Correlating mRNA and miRNA expression across multiple tissues and selecting those pairs that are negatively correlated can successfully detect target genes [13]. Because this method is independent of any sequence analysis, it can be used to filter predictions made by any of the aforementioned algorithms. Another advantage of this approach is that it is not restricted to targets located in the 3′UTR. Although there are fewer published examples of miRNA targets in other regions of mature mRNAs, there may be numerous targets in the coding region that have been overlooked because the high level of sequence conservation in exons prohibits the use of sequence conservation-based techniques (see Sect. 3.3.3). The major drawback of this approach is that miRNAs that do not affect mRNA levels or that only “fine-tune” gene expression will not be identified. The mimiRNA website [8] provides correlation analysis in human samples and displays the predicted targets from TargetScan, miRanda, and PicTar.
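The expression-based filter can be sketched as a Pearson correlation between one miRNA profile and each mRNA profile measured across the same tissues. The threshold of −0.8, the gene names and the expression values below are arbitrary assumptions for illustration; they are not the settings of the mimiRNA website.

```python
# Negative-correlation filter (illustrative sketch; threshold and data are made up).
from math import sqrt


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation of two equal-length profiles (non-zero variance assumed)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def anticorrelated(mirna: list[float], mrnas: dict[str, list[float]],
                   threshold: float = -0.8) -> list[str]:
    """Return genes whose expression correlates with the miRNA at or below `threshold`."""
    return sorted(gene for gene, profile in mrnas.items()
                  if pearson(mirna, profile) <= threshold)


mirna_profile = [1.0, 2.0, 3.0, 4.0]      # one value per tissue
mrna_profiles = {
    "geneA": [8.0, 6.0, 4.0, 2.0],        # falls as the miRNA rises
    "geneB": [1.0, 2.0, 3.0, 4.0],        # rises with the miRNA
    "geneC": [5.0, 1.0, 4.0, 2.0],        # weak negative trend
}
print(anticorrelated(mirna_profile, mrna_profiles))
```

Because the calculation uses no sequence information, the resulting gene list can be intersected with the output of any sequence-based predictor.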
3.2.6 Concluding Remarks Regarding Computational Methods
The goal of these different approaches is to reduce prohibitively large lists of predicted targets without losing too many true targets. Tuning these algorithms to find an optimal tradeoff between accuracy and sensitivity is currently impossible because relatively few targets have been validated experimentally. As a result, the efficiency of these algorithms is often tested by measuring the enrichment for predicted targets amongst a set of mRNAs or proteins whose expression changes when miRNA expression is perturbed. A recent study based on protein expression following both miRNA overexpression and knockdown found that TargetScanS and PicTar gave the best results [14]. However, this type of benchmark does not account for off-target effects, which may be prevalent considering that miRNAs often target transcription enhancers and repressors [13]. One commonly used approach to enhance the quality of target predictions is to consider the overlap between multiple programs. We do not recommend this, as there is no proof that it will increase prediction quality and it will systematically reduce the number of candidates [7].
3.2.7 Future Directions
The degree of sequence conservation of a target or its involvement in a pathway for which other targets are predicted (described in Sects. 3.2.2, 3.2.3 and 3.2.4 above) does not imply the biological mechanism through which a specific miRNA binds to its targets. Binding of miRNA:mRNA pairs is affected by spatial and temporal co-expression of the miRNA:mRNA pair, target site availability, and the formation of a stable duplex at the target site. Future algorithms will be required to investigate these three criteria to discover the whole repertoire of miRNA targets.
Co-expression of miRNA:mRNA pairs is often evaluated by simultaneous sequencing of mRNA enriched libraries and small RNA libraries from the same cells. As more of these experiments are performed on different cell types and even subcellular localizations, prediction tools will be able to integrate co-expression data with increasing efficiency.
Target site availability is currently evaluated by folding a small sequence of RNA around the putative target. As discussed above, this does not take into account long distance interactions between different regions of the same RNA molecule. Such interactions are currently impossible to predict because there is insufficient biochemical data on the stability of large RNA structures and because the number of possible suboptimal structures that could be predicted is prohibitively large. Moreover, target site accessibility should take into account RNA binding proteins, the prediction of which suffers the same limitations as miRNA targets.
The stability of the miRNA:mRNA duplex has been thoroughly investigated through machine learning models and in vivo mutagenesis assays [15]. The results of these studies show that there is no clear-cut rule on the amount of sequence complementarity required between the miRNA and its target or at what position complementarity should occur. These most likely depend on the region of the mature miRNA that is exposed in the active site of Argonaute proteins and are therefore available to interact with its target. Understanding the different conformations of the Argonaute proteins should therefore allow for more accurate target predictions.
3.3 Experimental Identification and Validation of microRNA Targets
The identification of microRNAs and their target genes was originally conducted through classic genetic studies in the worm Caenorhabditis elegans, whereby a miRNA mutant displayed an opposite phenotype to that shown by the corresponding target gene null mutant [16]. Although this method was appropriate for small organisms such as nematodes [17] or the fruit fly Drosophila melanogaster [18], it remains limited for larger animals like mammals. Therefore artificial systems are needed to identify and validate miRNA target genes. Validation of a putative miRNA target site requires that a physical interaction between a miRNA and its target mRNA will lead to decreased production of the corresponding protein. Such physical interaction implies the spatiotemporal co-expression of the regulating miRNA and its target gene. On this basis, modulating miRNA expression levels should result in changes in the amount of a reporter protein such as luciferase or GFP, which are quantified in comparison to controls. Several methods have been designed to experimentally identify targeted mRNAs at various steps along the miRNA regulatory pathway (Fig. 3.2). Since the net result of miRNA-mediated gene regulation is a decrease in the amount of target protein being produced, methods measuring changes in protein output resulting from variations of miRNA expression have become a standard approach to identifying and validating miRNA targets. In addition, a number of biochemical methods have been developed in order to experimentally identify miRNA:mRNA pairs isolated from immunopurified ribonucleoprotein complexes or enriched miRNA:mRNA duplexes. Here we describe some of the methods used to experimentally identify and validate miRNA target genes (see also refs 19–21 for review).
3.3.1 Reporter Assays
In vitro reporter assays have been designed to confirm the interaction between a given miRNA and a putative target mRNA. The rationale is that upon binding to its target site(s) a given miRNA will inhibit reporter protein production, thereby leading to reduced protein amount or activity which can be measured compared to relevant controls [22–25]. Typically the putative miRNA target site is cloned downstream of the open reading frame of a reporter gene, e.g. luciferase (Renilla or firefly) or GFP, and the recombinant plasmid is transfected into mammalian cells. Depending on the size of the 3′UTR to be tested, the full-length UTR or a fragment containing the predicted binding site is used. However, a partial UTR sequence may give erroneous positive results due to higher accessibility of the miRNA consequent to loss of secondary structures in the UTR. The recombinant reporter plasmid and a vector overexpressing the miRNA of interest, or a synthetic double-stranded oligonucleotide (miRNA mimic), are then transiently transfected into mammalian cells, usually HeLa or HEK293 cells, and luciferase activity or fluorescence intensity is measured 24–48 h later. It is important to assess endogenous miRNA expression levels in the cell system used for the assay, as the endogenous expression of miRNAs is not the same from one cell type to another, and some miRNAs display tissue-specific expression (e.g. hematopoietic-, brain-, embryonic stem cell-restricted miRNAs). Alternatively, cells can be transfected with the reporter vector alone if they express suitable endogenous levels of the candidate miRNA. Reduction of miRNA expression can be achieved using miRNA inhibitors such as modified antisense oligonucleotides [26] or sponge vectors [27], which constitute an elegant option when cells have high endogenous miRNA levels.
Importantly, transfection controls must be chosen carefully. These controls include reporter vectors without the UTR sequence, or with a UTR cloned in the antisense orientation. Also, cells must be co-transfected with a control luciferase reporter vector to normalize for variations in transfection efficiencies. Alternatively, dual luciferase reporter systems can be used, in which UTR sequences are cloned downstream to one luciferase gene (Renilla), while the other luciferase reporter (firefly) remains unaltered and is used for normalization. Specificity of miRNA regulation is assessed by co-transfection of an irrelevant miRNA or scrambled RNA duplexes. In these conditions, only transfection with the relevant miRNA should result in a decrease of reporter activity/expression. However, this result could be due to some off-target effect of the miRNA, which is provided in supra-physiological amounts to the cell when overexpressed, or indirect regulation by targeting genes that, in return, affect expression of the reporter. To confirm the specific inhibition of a miRNA on a target gene, it is therefore essential that the predicted binding sites be disrupted and that modified UTR sequences be tested in the reporter assay as well. This strategy not only definitively validates the miRNA:mRNA interaction and regulation, but also identifies which site(s) is/are true functional binding site(s) in the case of multiple predicted miRNA target sites. Last, a modified miRNA mimic harboring the complementary sequence to the mutated UTR can be used to rescue target regulation of the mutated UTR reporter constructs. In summary, a valid reporter assay should be carried out by co-transfecting (1) a reporter plasmid containing the full 3′UTR sequence, and (2) the same reporter construct with a disrupted target site, together with a miRNA overexpressing vector vs. scramble sequence.
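The arithmetic behind a dual luciferase readout can be sketched as follows, assuming the layout described above (UTR downstream of Renilla, firefly as the transfection control). Each well's Renilla signal is divided by its firefly signal, and the mean ratio with the miRNA is expressed relative to the mean ratio with the scrambled control. The function name and all readings are invented for illustration.

```python
# Dual luciferase normalization (illustrative sketch; readings are invented).
from statistics import mean


def relative_activity(mirna_wells: list[tuple[float, float]],
                      scramble_wells: list[tuple[float, float]]) -> float:
    """Each well is (renilla, firefly). Returns the Renilla/firefly ratio with
    the miRNA, expressed relative to the scrambled-control ratio."""
    mirna_ratio = mean(renilla / firefly for renilla, firefly in mirna_wells)
    control_ratio = mean(renilla / firefly for renilla, firefly in scramble_wells)
    return mirna_ratio / control_ratio


scramble = [(1000.0, 1000.0), (1100.0, 1100.0)]
wild_type_utr = [(500.0, 1000.0), (450.0, 900.0)]    # repressed reporter
mutated_utr = [(980.0, 1000.0), (1020.0, 1000.0)]    # target site disrupted

print(relative_activity(wild_type_utr, scramble))
print(relative_activity(mutated_utr, scramble))
```

In this made-up example the wild-type UTR reporter drops to about half of the control while the mutated UTR stays near the control level, which is the pattern expected for a functional site. A real analysis would also include replicate statistics, and the mutated construct would normally be compared to its own scrambled control.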
The reporter assay described above indicates that, when a given miRNA and target gene are expressed simultaneously in the same cell, they are likely to interact and this interaction might result in miRNA-mediated reduced expression of the target gene. It remains, however, an artificial system in which both the miRNA and the targeted UTR are overexpressed in a heterologous system. It is thus recommended to confirm, when possible, that such regulation does occur on the endogenous gene. Changes in protein amounts upon miRNA overexpression/inhibition can be measured by Western blot, flow cytometry, or immunocytochemistry experiments. If antibodies are not available, other validation methods can be used, for example, based on enzymatic activity, ligand binding, etc. Another indication of miRNA-induced gene regulation can be provided by target transcript quantification. Although miRNAs were originally shown to regulate gene expression by repressing mRNA translation without affecting transcript level, it is now widely accepted that miRNA-mediated regulation is frequently accompanied by mRNA destabilization, essentially due to increased deadenylation [28, 29]. Transcripts displaying reduced levels upon miRNA ectopic expression are subsequently analysed for the presence of miRNA target sites in their 3′UTR using prediction algorithms (see Sect. 3.2) in order to identify putative miRNA target genes [12, 26, 30].
3.3.2 Proteomics Methods
Several proteomics studies have been designed to identify miRNA target genes. Vinther et al. [31] used stable isotope labeling by amino acids in cell culture (SILAC), in which proteins are metabolically labeled by cells growing in medium containing heavy isotopes of essential amino acids. Differences in protein synthesis are determined by mass spectrometry as the ratio of peptide peak intensities from light and heavy isotopes [32]. Of 504 proteins investigated by SILAC, they identified a set of 12 proteins with reduced expression in HeLa cells overexpressing miR-1 and grown in medium containing heavy isotopes, as compared to control cells grown with light isotopes. Seed region complementary sites were found in the 3′UTR of corresponding genes for 8 of these proteins, which was a significant enrichment for miR-1 seed motif when compared with entire 3′UTR sequence databases. These investigators used the luciferase reporter assay to confirm miR-1 regulation for 6 out of 11 target genes tested [31].
The SILAC method was subsequently used in two large-scale proteomics studies to identify target genes of several miRNAs [14, 33]. In both cases, HeLa cells were transfected with different miRNA duplexes, and protein output was measured 48 h post-transfection. Selbach et al. used a modified version of SILAC in which cells were pulse-labeled (pSILAC) so that heavy isotopes were primarily incorporated into newly synthesized proteins [14]. In addition, SILAC was used to study the impact of miR-223 deficiency in mouse neutrophils [33] and let-7b knockdown in HeLa cells [14]. The authors concluded that each miRNA regulates hundreds of target proteins, though to a relatively modest degree. Motif analysis revealed a significant enrichment for corresponding miRNA seed complementary sites in the 3′UTR of repressed genes, as compared to an unmodified protein set. While Baek et al. found that most repressed targets displayed detectable mRNA destabilization [33], Selbach et al. identified substantial direct regulation by translation inhibition [14]. Overall, these studies suggested that miRNAs act primarily by fine-tuning expression of a large number of target genes.
Zhu et al. used two-dimensional difference in-gel electrophoresis (2D-DIGE) to identify miR-21 targets in a mouse breast cancer model [34]. Proteins were extracted from tumors derived from human MCF7 cells treated with anti-miR-21 antisense or control oligonucleotide. After labeling with two different fluorescent dyes, both protein samples were separated by 2D-polyacrylamide gel electrophoresis (PAGE) in the same gel. Fluorescence intensity was measured by gel imaging, and differentially expressed proteins were purified from the gel prior to identification by mass spectrometry. This method identified seven proteins that were up-regulated in anti-miR-21 treated tumors, including tropomyosin (TPM) 1, which was further validated by reporter assay and Western blot [34]. Of note, several proteins were also found to be down-regulated upon anti-miR-21 treatment in this study, which suggests an indirect effect of miR-21.
Another approach for target identification combined miRNA and protein expression analysis with computational predictions. miRNA profiling was performed to identify differentially expressed miRNAs between two samples, which were compared to proteomics data generated by 2D-PAGE coupled to mass spectrometry [35] or reverse-phase protein arrays [36]. Reciprocally expressed miRNAs and proteins were then compared to miRNA target predictions to identify relevant target genes. This analysis resulted in the identification of 52 and 17 miRNA:gene target pairs in rat kidney [35] and human cartilage [36], respectively. More recently, a targeted proteomics approach was designed to identify let-7 miRNA target genes in C. elegans [37]. The method combined isotope-coded affinity tag (ICAT) protein labeling [38] and detection by selected reaction monitoring mass spectrometry [39] to quantify protein levels between wild type and let-7 mutant whole worms. By definition, ICAT labeling is restricted to proteins harbouring mass spectrometry-detectable peptides that contain cysteine residues [38]. This limitation meant working on a predefined set of proteins that were predicted as let-7 targets and met these requirements, which reduced proteome coverage. Of 161 proteins analysed, 29 were significantly altered in mutant worms, including ten that were downregulated, suggesting an indirect effect of miRNA regulation [37]. Ten of the identified targets were further validated by genetic analysis and, for one of them, by reporter assay. The authors then used a modified method based on metabolic labeling of worms using heavy isotopes [40], to facilitate full coverage of the C. elegans peptide repertoire. Of 27 predicted miR-58 targets, four were identified as significantly upregulated in a miR-58 mutant using this modified method [37].
3.3.3 Biochemical Approaches
miRNA-mediated gene silencing in mammals requires a functional miRNA-loaded RNA-induced silencing complex (miRISC) machinery (Fig. 3.2). Several studies identified miRNA target transcripts by virtue of their association with miRISC components by co-immunoprecipitation of human or Drosophila Argonaute (AGO) proteins [35, 41–46], human TNRC6 proteins [45], or nematode GW182 protein family AIN1-2 [47]. This strategy was originally used by Mourelatos et al. to identify new miRNAs that were co-immunoprecipitated with AGO2/EIF2C2-containing complex in HeLa cells [48]. Immunoprecipitated mRNAs were then identified by cloning, microarray analysis, or deep sequencing. One strategy consists of purifying all miRISC-associated mRNA species in a given cell type, in order to identify the global “targetome” of that cell type, without preliminary knowledge of the presence of any specific miRNA. Sequence motif analysis is then performed to identify miRNA complementary sites enriched in miRISC-bound mRNAs compared to whole cell mRNAs, thus inferring which miRNAs are co-expressed. Easow et al. used this approach in Drosophila S2 cells stably expressing FLAG/HA-Ago1 [42]. Microarray analysis revealed significant enrichment of transcripts containing complementary sites for miR-184, miR-7 and miR-314, in anti-HA pulled down mRNAs. Similarly, Beitzinger et al. pulled down AGO1- and AGO2-associated transcripts from HEK293 cells and identified immunoprecipitated mRNAs by complementary DNA (cDNA) library preparation and sequencing [41]. Another approach consists of comparing miRISC-associated mRNAs of cells transfected with, or deprived of, a given miRNA to mock-transfected or unmodified control cells. Easow et al. found a significant overrepresentation of miR-1 complementary sequences in Ago1 co-purified transcripts from miR-1 transfected S2 cells compared to untransfected cells [42].
Several studies using this method, also called RIP-Chip (ribonucleoprotein immunoprecipitation-gene chip), reported identification of miRNA targets in 293 cells [43–45], Hodgkin lymphoma cell lines [49], human H4 glioneuronal cells [46], and C. elegans [47]. In this latter study, high-throughput sequencing was used to identify co-immunoprecipitated miRNAs as well. Notably, this experimental procedure allowed the identification of miRNA target genes with stable mRNA levels that are likely to be primarily regulated by translational repression [43].
Recently, the HITS-CLIP method (high-throughput sequencing by crosslinking and immunoprecipitation) was developed to identify direct protein/RNA interactions [50]. This approach uses UV irradiation to crosslink nucleic acids and proteins in close proximity, which are then immunopurified using an antibody to a miRISC component. Partial RNA digestion leaves miRISC-protected RNA fragments, which are then identified by high-throughput sequencing. Chi et al. used HITS-CLIP to purify Ago2-bound mRNA and miRNA species from mouse brain as well as miR-124 transfected HeLa cells [51]. As in other studies, bound mRNAs were enriched for complementary sites to miRNAs that were either highly endogenously expressed or over-expressed following transfection. This approach, also called CLIP-Seq, was used to isolate Argonaute protein ALG-1-bound mRNAs in C. elegans [52] and Ago2-purified transcripts in wild type versus dicer −/− mouse ES cells [53]. An improvement of the method, named photoactivatable-ribonucleoside-enhanced (PAR)-CLIP, was recently described, in which crosslinking efficiency was enhanced by incorporation of the photoactivatable nucleoside analog 4-thiouridine into transcripts of cultured cells [54]. Upon UV crosslinking at 365 nm, the crosslinked 4-thiouridine residues are read as cytidine rather than thymidine in the sequenced cDNA, which allows for the precise identification of RNA-protein binding sites. The PAR-CLIP method was used to identify miRNA target sites of mRNAs associated with AGO and TNRC6 family proteins in 293 cells. Deep sequencing of bound RNAs revealed enrichment of complementary sites for the most highly expressed miRNAs [54].
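In PAR-CLIP data, crosslinked positions show up as T-to-C mismatches between the sequenced cDNA reads and the reference. The sketch below tallies such mismatches for ungapped reads that have already been aligned; it is a simplified illustration rather than the published analysis pipeline, and the function name, thresholds and sequences are hypothetical.

```python
# T-to-C tally for PAR-CLIP-style reads (illustrative; thresholds are arbitrary).
from collections import Counter


def t_to_c_sites(reference: str, reads: list[tuple[int, str]],
                 min_reads: int = 2, min_fraction: float = 0.2) -> list[int]:
    """`reads` holds (0-based offset, sequence) pairs aligned without gaps.
    Returns reference T positions read as C in at least `min_reads` reads
    and in at least `min_fraction` of the reads covering them."""
    coverage: Counter = Counter()
    conversions: Counter = Counter()
    for offset, seq in reads:
        for i, base in enumerate(seq):
            pos = offset + i
            if pos < len(reference) and reference[pos] == "T":
                coverage[pos] += 1
                if base == "C":
                    conversions[pos] += 1
    return sorted(pos for pos, n in conversions.items()
                  if n >= min_reads and n / coverage[pos] >= min_fraction)


reference = "GATTACAGT"
reads = [
    (0, "GATCACA"),  # T->C at position 3
    (1, "ATCACAG"),  # T->C at position 3
    (2, "TTACAGT"),  # no conversion
    (0, "GACTAC"),   # single T->C at position 2, below min_reads
]
print(t_to_c_sites(reference, reads))
```

Requiring several independent reads at a position helps separate crosslink-induced conversions from sequencing errors and single-nucleotide polymorphisms.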
Interestingly, these high-throughput studies revealed that a high proportion (25–50 %) of the binding sites were located within the coding sequence (CDS) region of bound mRNAs [51, 52, 54]. This observation suggests that functional miRNA target sites may not only be located in 3′UTRs as previously thought, in agreement with a number of recent reports identifying miRNA target sites in CDS [55–58]. Furthermore, Schnall-Levin et al. recently demonstrated frequent CDS targeting through repeated miRNA binding sites, of paralogous families of the C2H2 zinc-finger genes, which typically contain many tandem repeats of the finger motif [59]. Similarly, building on previously published microarray data in mammalian cells transfected with, or deprived of, specific miRNA [14, 33], Fang and Rajewsky showed that CDS target sites act synergistically with 3′UTR sites for miRNA-mediated regulation of gene expression [60]. Of importance, most prediction algorithms could not identify this class of miRNA target sites because of the “3′UTR-only” rule. However, the PITA algorithm [4], which mainly identifies target site accessibility, and the rna22 program [57, 61], which identifies over-represented sequence patterns, can be used to detect miRNA binding sites located outside the UTR. In addition, the mimiRNA algorithm [8], which identifies miRNA:mRNA pairs that display conserved negative correlation of expression across several tissues, can be used to select candidate target genes prior to searching for putative binding sites. Of note, CDS target site validation requires a modified reporter assay, whereby the target-site-containing sequence is fused in frame with a reporter CDS [42, 59]. Alternatively, co-transfection of wild type and mutated versions of the targeted CDS associated with two different epitope tags, e.g. Myc and FLAG, has been used to monitor by Western blot the level of protein down-regulation upon miRNA co-expression [58].
An alternative strategy to the aforementioned protein pull-down methods was proposed by Orom and Lund, who developed an affinity-based target gene identification procedure [62]. In this case, transfection of a biotinylated synthetic miRNA allows the purification of miRNA:mRNP complexes using streptavidin-agarose beads. This strategy is attractive since it allows target gene identification of a specific miRNA, whereas other methods seek to isolate virtually all miRNA-regulated transcripts. By purifying a biotin-tagged bantam miRNA in Drosophila S2 cells, the endogenous target gene Hid was efficiently identified [62]. The same group subsequently used this technique to isolate mRNAs bound to biotinylated miR-10a in mouse ES cells. Surprisingly, microarray analysis revealed that 55 of the 100 most enriched mRNAs corresponded to ribosomal protein genes, with no enrichment for known miR-10a targets or transcripts with miR-10a complementary sites [63]. They further showed that miR-10a bound conserved sites in the 5′ UTR of these genes, leading to upregulation of ribosomal protein translation and ribosome formation, resulting in a ∼30 % increase of global protein synthesis [63]. Combined with 4-thiouridine modified nucleotides and UV crosslinking, biotin-tagged miRNA ‘pullout’ was used to demonstrate direct interaction between miR-34a and MYC transcript in human fibroblasts. Similarly, the LAMP (labeled miRNA pull-down) assay was developed [64, 65], in which synthetic miRNAs were labeled with digoxigenin (DIG), and binding RNAs were isolated using anti-DIG agarose beads. The LAMP method was used to isolate known targets of C. elegans let-7 and lin-4, and zebrafish let-7 and miR-1. Specifically, 302 transcripts enriched using DIG-tagged miR-1 pull down (compared to mutated miR-1 control) were identified, including the known miR-1 target Hand2 [66].
An improvement of the method, called TAP-Tar (tandem affinity purification of miRNA target mRNAs), was recently described, which combines immunoprecipitation of HA-tagged AGO1-2 with a subsequent biotinylated miRNA pull-down on streptavidin beads in HeLa cells [67]. This two-step procedure was shown to recover the known miR-20a target E2F1 more efficiently than either pull-down method used alone.
3.3.4 Molecular Methods
Vatolin et al. reported the use of endogenous miRNAs as primers for cDNA synthesis by reverse transcriptase on the targeted mRNA template [68]. Although pairing of the target mRNA to the miRNA 3′ end is usually weaker than to the 5′ end (the seed region), the hypothesis underpinning this work was that the miRNA 3′ end could form a transiently stable duplex with the target mRNA to initiate cDNA synthesis (Fig. 3.2). Using cytoplasmic extracts, a first round of reverse transcription elongates the miRNA sequence to generate cDNA-miRNA molecules, which are purified and used as secondary primers to drive a second round of reverse transcription, thereby increasing the specificity of the reaction. After ligation of an adapter sequence at the 5′ end, cDNAs are PCR amplified using a primer from the adapter and a gene-specific primer corresponding to a target RNA of interest. PCR products are then cloned and sequenced to identify the regulatory miRNA based on homology searches of the appropriate databases. Vatolin et al. recovered partial sequences of miRNAs associated with β-actin, N-Ras and K-Ras mRNAs from human hTERT-RPE1 epithelial cells, and confirmed their functional regulation by Western blot and luciferase assay upon miRNA overexpression [68].
Andachi modified the method by ligating an adapter sequence to the 3′ end of the cDNA and by using a biotinylated, miRNA-specific primer together with an adapter-specific primer for PCR amplification [69]. The amplification product was purified using avidin beads, and further PCR amplified with adapter-specific and nested miRNA-specific primers. When applied to C. elegans, this method isolated the known lin-4 target gene lin-14, and identified the K10C3.4 gene as a new target for let-7, which was further validated through reporter assay and genetic complementation analysis [69]. The two methods described above allow identification of miRNA:mRNA pairs by either target gene- or miRNA-specific analysis, but are not suitable for high-throughput identification of miRNA targets.
In the specific context of miRNA-mediated cleavage of a target gene (Fig. 3.1), several studies identified mRNA cleavage products by RNA ligase-mediated 5′ rapid amplification of cDNA ends (RLM-RACE) [70–77]. In the original method, an RNA adapter was ligated to the 5′ phosphate of cleaved, uncapped poly-A+ RNAs. After reverse transcription with oligo-(dT), cDNAs were amplified using adapter- and gene-specific primers, before cloning and sequencing. This approach was used to validate miR-171-mediated cleavage of several transcripts of the SCL family of transcription factors in Arabidopsis thaliana [70], as well as Hoxb8 mRNA cleavage by miR-196 in mouse embryos [25]. In addition, the 5′ end of the cloned mRNA was shown to map to the nucleotide pairing with the tenth nucleotide of the miRNA.
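The positional relationship described above can be written as simple index arithmetic. The sketch below is a hedged illustration that assumes gapless, antiparallel pairing over the full length of the miRNA, which is reasonable for near-perfect sites but not for bulged duplexes. The function names and the 0-based coordinate convention are hypothetical and are not taken from the cited studies.

```python
# Illustrative sketch: where should the 5' end of a cleavage fragment map?
# Assumptions (not from the chapter): gapless antiparallel pairing over the
# whole miRNA; site_start is the 0-based transcript index of the target
# nucleotide that pairs with the LAST (3'-most) miRNA nucleotide.

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def fragment_start(site_start, mirna_length, paired_nt=10):
    """Transcript index of the target nucleotide paired with miRNA nt `paired_nt`.

    Target index site_start + k pairs with miRNA nt (mirna_length - k),
    so nt 10 pairs with index site_start + mirna_length - 10.
    """
    return site_start + mirna_length - paired_nt


def clone_supports_cleavage(clone_5prime_end, site_start, mirna_length):
    """True if a sequenced RACE clone begins at the expected position."""
    return clone_5prime_end == fragment_start(site_start, mirna_length)


if __name__ == "__main__":
    mirna = "UGAGGUAGUAGGUUGUAUAGUU"   # example miRNA, 22 nt
    site = reverse_complement(mirna)   # idealised, fully paired target site
    index = fragment_start(0, len(mirna))
    # The target base at this index is the one that pairs with miRNA nt 10.
    print(index, site[index], mirna[9])
```

In practice, clones from a real RLM-RACE experiment may start a nucleotide or so away from this position, so an exact equality test like the one above is deliberately strict.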
An improved method, named PARE (parallel analysis of RNA ends), was developed for genome-wide identification of miRNA-induced cleavage products [71, 72]. In this modified protocol, the 5′ RNA adapter was engineered to contain an MmeI restriction site, and after reverse transcription and second strand cDNA synthesis, double-stranded molecules were digested with MmeI, generating 20–21 nt tag sequences attached to the adapter. A DNA adapter was then ligated at the 3′ end of the tag, which was PCR amplified using 5′adapter- and 3′adapter-specific primers. Tags were analysed by high-throughput sequencing and matched to the Arabidopsis genome to identify corresponding target genes and infer regulatory miRNAs. This ‘degradome’ tag analysis identified a large proportion of known Arabidopsis miRNA and trans-acting siRNA (ta-siRNA) target genes, although most of the tags represented mRNA degradation products unrelated to these small RNAs [71, 72]. PARE was also used to identify miRNA and ta-siRNA target genes in rice [75]. A modified RLM-RACE methodology was also developed, in which Arabidopsis cleaved transcripts were linearly amplified by in vitro transcription using a T7 promoter, prior to microarray analysis [73, 78]. Of the 228 candidate targets identified, 14 corresponded to previously known miRNA targets [73].
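The tag-matching step can be sketched as follows. This is a simplified illustration rather than the published PARE pipeline: it assumes exact string matching against a single reference sequence, discards tags that match more than once, and counts how many tags begin at each position, since the first nucleotide of a tag corresponds to the 5′ end of an uncapped fragment. The function name and the toy sequence are invented for the example.

```python
# Illustrative sketch: count degradome tag 5' ends along one reference sequence.
# Assumptions (not from the chapter): exact matching only, one reference
# sequence, tags given in the same alphabet and orientation as the reference.

from collections import Counter


def tag_start_counts(tags, reference):
    """Return a Counter mapping 0-based start position -> number of tags.

    Tags that are absent from the reference, or that match it at more than
    one position, are ignored because their origin is ambiguous.
    """
    counts = Counter()
    for tag in tags:
        first = reference.find(tag)
        if first == -1:
            continue  # tag does not come from this reference
        if reference.find(tag, first + 1) != -1:
            continue  # tag matches more than once; skip it
        counts[first] += 1
    return counts


if __name__ == "__main__":
    # Toy reference (DNA alphabet, as sequencing reads would be).
    reference = "ACGTTGCAAGCTTAGGCTAACGGATCCTTAGCAATCGGCTAAGCTTGACCATGGTACGATCG"
    tags = [reference[10:30]] * 3 + [reference[25:45]] + ["T" * 20]
    for position, n in sorted(tag_start_counts(tags, reference).items()):
        print(position, n)
```

Positions with high counts can then be compared with the cleavage position expected for a candidate miRNA site. As noted above, many tags reflect degradation unrelated to small RNAs, so a count alone is not evidence of miRNA-guided cleavage.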
Although this approach is most suited to plants, in which extensive base pairing between miRNA and mRNA leads to miRISC-mediated cleavage of targeted mRNA, several studies reported PARE analysis of the degradome in mammalian cells [74, 76, 77]. Karginov et al. compared degradome tags from wild type versus Ago2 −/− mouse ES cells, in order to identify miRNA-specific cleavage products. Tag abundance peaked at nucleotide position 10 of the miRNA in wild type cells, whereas no peak was identified in Ago2 −/− cells [74]. This study also identified a number of target genes subjected to direct Drosha-mediated endonucleolytic cleavage, as well as Ago2- and Drosha-independent cleavage sites that were conserved in human 293 cells. In another study, Shin et al. defined a class of metazoan target sites named ‘centered sites’, which lack perfect seed pairing and 3′-compensatory pairing, but instead harbour 11–12 contiguous nucleotides that pair with miRNA nt 4–15 [76]. Using RLM-RACE degradome sequencing, they identified a set of genes targeted for miRNA-mediated cleavage in HeLa cells and human brain, although the cleavage products were of low abundance. Although most of the putative target genes were attributed to three highly expressed miRNAs (miR-196a, -28, -151-5p), a total of 18 additional miRNA target genes were identified [76]. Likewise, Bracken et al. performed degradome analysis on six adult mouse tissues and d16.5 whole mouse embryo, resulting in the identification of 23 putative miRNA-mediated cleavage sites, most of which displayed low read frequency [77]. Although these studies revealed the existence of miRNA-guided cleavage of target mRNAs in mammals, such targeting remains restricted to a limited number of genes. In addition, degradome analyses showed that a substantial proportion of transcripts were subjected to endonucleolytic cleavage, though most of these events were not related to miRNA regulation [76, 77].
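The definition of centered sites given above lends itself to a simple scan. The sketch below is one possible reading of that definition and not the method of Shin et al.: it requires at least 11 contiguous Watson-Crick pairs to miRNA nt 4–14 or 5–15, rejects alignments in which nt 2–7 are all paired (perfect seed pairing), and does not test for 3′-compensatory pairing. G:U wobble pairs are treated as mismatches. All names are hypothetical.

```python
# Illustrative sketch: scan a target sequence for candidate centered sites.
# Assumptions (not from the chapter): Watson-Crick pairs only, gapless
# alignment, miRNA of at least 15 nt, no check for 3'-compensatory pairing.

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def is_paired(mirna, target, end, nt):
    """True if miRNA nucleotide `nt` (1-based) pairs with the target.

    `end` is the 0-based target index opposite miRNA nt 1, so miRNA nt i
    lies opposite target index end - (i - 1).
    """
    index = end - (nt - 1)
    return index >= 0 and COMPLEMENT.get(mirna[nt - 1]) == target[index]


def find_centered_sites(mirna, target):
    """Return target indices (opposite miRNA nt 1) of candidate centered sites."""
    hits = []
    for end in range(len(target)):
        centered = (
            all(is_paired(mirna, target, end, nt) for nt in range(4, 15))
            or all(is_paired(mirna, target, end, nt) for nt in range(5, 16))
        )
        seed = all(is_paired(mirna, target, end, nt) for nt in range(2, 8))
        if centered and not seed:
            hits.append(end)
    return hits


if __name__ == "__main__":
    mirna = "UGAGGUAGUAGGUUGUAUAGUU"  # example miRNA
    # Toy target: pairs with miRNA nt 4-15, mismatched opposite nt 2 and 3.
    target = "GGGG" + "CAACCUACUACC" + "GAA" + "GGGG"
    print(find_centered_sites(mirna, target))
```

A fully complementary site is deliberately not reported by this scan, because its seed is perfectly paired and it therefore falls outside the definition used here.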
3.3.5 Concluding Remarks
Here we have considered a diverse array of computational and experimental methods used for genome-wide identification of miRNA target genes, each of which has its own strengths and weaknesses. High-throughput approaches nevertheless require formal validation to discriminate direct from indirect targeting and to identify functional miRNA target sites among the plethora of predictions. Reporter assays can provide such information, although they should be supported by other validation analyses, notably evidence of miRNA and mRNA co-expression and of changes in targeted protein output upon modulation of miRNA expression.
The experimental methods described above highlight the existence of a large number of miRNA binding sites outside the 3′UTR of interacting mRNAs, particularly in the CDS. Although target sites in the CDS do not appear to be as effective in regulating protein output as those present in the 3′UTR [59], their contribution to the fine-tuning of gene expression has been essentially ignored so far. In addition, most of the widely used target prediction algorithms consider only sites located within the 3′UTR, which makes CDS target site analysis even more difficult. Improved computational methods will undoubtedly be developed in the future to investigate CDS target sites, together with the recently identified centered sites [76].
New models to explore miRNA function are regularly described, among which miRNA loss- and gain-of-function approaches will play an increasing role. Such models have proved useful for functional analyses of miRNA activity and target gene identification in nematodes and Drosophila, and to a lesser extent in the mouse ([79–83], see ref [84] for review). The mirKO resource [85], which was recently made available to the scientific community, should aid in deciphering new miRNA functions and targets in the mouse. Likewise, the generation of miRNA/mRNA targeting networks through computational analysis of putative target gene function [86] should provide additional hints towards functional miRNA target gene identification.
References
Lai EC (2002) Micro RNAs are complementary to 3′UTR sequence motifs that mediate negative post-transcriptional regulation. Nat Genet 30:363–364
John B, Enright AJ, Aravin A et al (2004) Human MicroRNA targets. PLoS Biol 2:e363
Kruger J, Rehmsmeier M (2006) RNAhybrid: microRNA target prediction easy, fast and flexible. Nucleic Acids Res 34:W451–W454
Kertesz M, Iovino N, Unnerstall U et al (2007) The role of site accessibility in microRNA target recognition. Nat Genet 39:1278–1284
Lewis BP, Burge CB, Bartel DP (2005) Conserved seed pairing, often flanked by adenosines, indicates that thousands of human genes are microRNA targets. Cell 120:15–20
Grimson A, Farh KK, Johnston WK et al (2007) MicroRNA targeting specificity in mammals: determinants beyond seed pairing. Mol Cell 27:91–105
Ritchie W, Flamant S, Rasko JE (2009) MicroRNA target prediction: traps for the unwary. Nat Methods 6:397–398
Ritchie W, Flamant S, Rasko JE (2010) MimiRNA: a microRNA expression profiler and classification resource designed to identify functional correlations between microRNAs and their targets. Bioinformatics 26:223–227
Krek A, Grun D, Poy MN et al (2005) Combinatorial microRNA target predictions. Nat Genet 37:495–500
Xiao C, Rajewsky K (2009) MicroRNA control in the immune system: basic principles. Cell 136:26–36
Tsang JS, Ebert MS, van Oudenaarden A (2010) Genome-wide dissection of microRNA functions and cotargeting networks using gene set signatures. Mol Cell 38:140–153
Lim LP, Lau NC, Garrett-Engele P et al (2005) Microarray analysis shows that some microRNAs downregulate large numbers of target mRNAs. Nature 433:769–773
Ritchie W, Rajasekhar M, Flamant S et al (2009) Conserved expression patterns predict microRNA targets. PLoS Comput Biol 5:e1000513
Selbach M, Schwanhausser B, Thierfelder N et al (2008) Widespread changes in protein synthesis induced by microRNAs. Nature 455:58–63
Brennecke J, Stark A, Russell RB et al (2005) Principles of microRNA-target recognition. PLoS Biol 3:e85
Lee RC, Feinbaum RL, Ambros V (1993) The C. elegans heterochronic gene lin-4 encodes small RNAs with antisense complementarity to lin-14. Cell 75:843–854
Reinhart BJ, Slack FJ, Basson M et al (2000) The 21-nucleotide let-7 RNA regulates developmental timing in Caenorhabditis elegans. Nature 403:901–906
Brennecke J, Hipfner DR, Stark A et al (2003) Bantam encodes a developmentally regulated microRNA that controls cell proliferation and regulates the proapoptotic gene hid in Drosophila. Cell 113:25–36
Orom UA, Lund AH (2010) Experimental identification of microRNA targets. Gene 451:1–5
Thomas M, Lieberman J, Lal A (2010) Desperately seeking microRNA targets. Nat Struct Mol Biol 17:1169–1174
Thomson DW, Bracken CP, Goodall GJ (2011) Experimental strategies for microRNA target identification. Nucleic Acids Res 39:6845–6853
Doench JG, Petersen CP, Sharp PA (2003) SiRNAs can function as miRNAs. Genes Dev 17:438–442
Zeng Y, Yi R, Cullen BR (2003) MicroRNAs and small interfering RNAs can inhibit mRNA expression by similar mechanisms. Proc Natl Acad Sci U S A 100:9779–9784
Kiriakidou M, Nelson PT, Kouranov A et al (2004) A combined computational-experimental approach predicts human microRNA targets. Genes Dev 18:1165–1178
Yekta S, Shih IH, Bartel DP (2004) MicroRNA-directed cleavage of HOXB8 mRNA. Science 304:594–596
Krutzfeldt J, Rajewsky N, Braich R et al (2005) Silencing of microRNAs in vivo with ‘antagomirs’. Nature 438:685–689
Ebert MS, Neilson JR, Sharp PA (2007) MicroRNA sponges: competitive inhibitors of small RNAs in mammalian cells. Nat Methods 4:721–726
Standart N, Jackson RJ (2007) MicroRNAs repress translation of m7Gppp-capped target mRNAs in vitro by inhibiting initiation and promoting deadenylation. Genes Dev 21:1975–1982
Guo H, Ingolia NT, Weissman JS et al (2010) Mammalian microRNAs predominantly act to decrease target mRNA levels. Nature 466:835–840
Farh KK, Grimson A, Jan C et al (2005) The widespread impact of mammalian MicroRNAs on mRNA repression and evolution. Science 310:1817–1821
Vinther J, Hedegaard MM, Gardner PP et al (2006) Identification of miRNA targets with stable isotope labeling by amino acids in cell culture. Nucleic Acids Res 34:e107
Ong SE, Blagoev B, Kratchmarova I et al (2002) Stable isotope labeling by amino acids in cell culture, SILAC, as a simple and accurate approach to expression proteomics. Mol Cell Proteomics 1:376–386
Baek D, Villen J, Shin C et al (2008) The impact of microRNAs on protein output. Nature 455:64–71
Zhu S, Si ML, Wu H et al (2007) MicroRNA-21 targets the tumor suppressor gene tropomyosin 1 (TPM1). J Biol Chem 282:14328–14336
Tian Z, Greene AS, Pietrusz JL et al (2008) MicroRNA-target pairs in the rat kidney identified by microRNA microarray, proteomic, and bioinformatic analysis. Genome Res 18:404–411
Iliopoulos D, Malizos KN, Oikonomou P et al (2008) Integrative microRNA and proteomic approaches identify novel osteoarthritis genes and their collaborative metabolic and inflammatory networks. PLoS One 3:e3740
Jovanovic M, Reiter L, Picotti P et al (2010) A quantitative targeted proteomics approach to validate predicted microRNA targets in C. elegans. Nat Methods 7:837–842
Gygi SP, Rist B, Gerber SA et al (1999) Quantitative analysis of complex protein mixtures using isotope-coded affinity tags. Nat Biotechnol 17:994–999
Anderson L, Hunter CL (2006) Quantitative mass spectrometric multiple reaction monitoring assays for major plasma proteins. Mol Cell Proteomics 5:573–588
Krijgsveld J, Ketting RF, Mahmoudi T et al (2003) Metabolic labeling of C. elegans and D. melanogaster for quantitative proteomics. Nat Biotechnol 21:927–931
Beitzinger M, Peters L, Zhu JY et al (2007) Identification of human microRNA targets from isolated argonaute protein complexes. RNA Biol 4:76–84
Easow G, Teleman AA, Cohen SM (2007) Isolation of microRNA targets by miRNP immunopurification. RNA 13:1198–1204
Karginov FV, Conaco C, Xuan Z et al (2007) A biochemical approach to identifying microRNA targets. Proc Natl Acad Sci U S A 104:19291–19296
Hendrickson DG, Hogan DJ, Herschlag D et al (2008) Systematic identification of mRNAs recruited to argonaute 2 by specific microRNAs and corresponding changes in transcript abundance. PLoS One 3:e2126
Landthaler M, Gaidatzis D, Rothballer A et al (2008) Molecular characterization of human Argonaute-containing ribonucleoprotein complexes and their bound target mRNAs. RNA 14:2580–2596
Wang WX, Wilfred BR, Hu Y et al (2010) Anti-Argonaute RIP-Chip shows that miRNA transfections alter global patterns of mRNA recruitment to microribonucleoprotein complexes. RNA 16:394–404
Zhang L, Ding L, Cheung TH et al (2007) Systematic identification of C. elegans miRISC proteins, miRNAs, and mRNA targets by their interactions with GW182 proteins AIN-1 and AIN-2. Mol Cell 28:598–613
Mourelatos Z, Dostie J, Paushkin S et al (2002) miRNPs: a novel class of ribonucleoproteins containing numerous microRNAs. Genes Dev 16:720–728
Tan LP, Seinen E, Duns G et al (2009) A high throughput experimental approach to identify miRNA targets in human cells. Nucleic Acids Res 37:e137
Licatalosi DD, Mele A, Fak JJ et al (2008) HITS-CLIP yields genome-wide insights into brain alternative RNA processing. Nature 456:464–469
Chi SW, Zang JB, Mele A et al (2009) Argonaute HITS-CLIP decodes microRNA-mRNA interaction maps. Nature 460:479–486
Zisoulis DG, Lovci MT, Wilbert ML et al (2010) Comprehensive discovery of endogenous Argonaute binding sites in Caenorhabditis elegans. Nat Struct Mol Biol 17:173–179
Leung AK, Young AG, Bhutkar A et al (2011) Genome-wide identification of Ago2 binding sites from mouse embryonic stem cells with and without mature microRNAs. Nat Struct Mol Biol 18:237–244
Hafner M, Landthaler M, Burger L et al (2010) Transcriptome-wide identification of RNA-binding protein and microRNA target sites by PAR-CLIP. Cell 141:129–141
Duursma AM, Kedde M, Schrier M et al (2008) miR-148 targets human DNMT3b protein coding region. RNA 14:872–877
Forman JJ, Legesse-Miller A, Coller HA (2008) A search for conserved sequences in coding regions reveals that the let-7 microRNA targets Dicer within its coding sequence. Proc Natl Acad Sci U S A 105:14879–14884
Tay Y, Zhang J, Thomson AM et al (2008) MicroRNAs to Nanog, Oct4 and Sox2 coding regions modulate embryonic stem cell differentiation. Nature 455:1124–1128
Schnall-Levin M, Zhao Y, Perrimon N et al (2010) Conserved microRNA targeting in Drosophila is as widespread in coding regions as in 3′UTRs. Proc Natl Acad Sci U S A 107:15751–15756
Schnall-Levin M, Rissland OS, Johnston WK et al (2011) Unusually effective microRNA targeting within repeat-rich coding regions of mammalian mRNAs. Genome Res 21:1395–1403
Fang Z, Rajewsky N (2011) The impact of miRNA target sites in coding sequences and in 3′UTRs. PLoS One 6:e18067
Miranda KC, Huynh T, Tay Y et al (2006) A pattern-based method for the identification of MicroRNA binding sites and their corresponding heteroduplexes. Cell 126:1203–1217
Orom UA, Lund AH (2007) Isolation of microRNA targets using biotinylated synthetic microRNAs. Methods 43:162–165
Orom UA, Nielsen FC, Lund AH (2008) MicroRNA-10a binds the 5′UTR of ribosomal protein mRNAs and enhances their translation. Mol Cell 30:460–471
Christoffersen NR, Shalgi R, Frankel LB et al (2010) p53-independent upregulation of miR-34a during oncogene-induced senescence represses MYC. Cell Death Differ 17:236–245
Hsu RJ, Yang HJ, Tsai HJ (2009) Labeled microRNA pull-down assay system: an experimental approach for high-throughput identification of microRNA-target mRNAs. Nucleic Acids Res 37:e77
Zhao Y, Samal E, Srivastava D (2005) Serum response factor regulates a muscle-specific microRNA that targets Hand2 during cardiogenesis. Nature 436:214–220
Nonne N, Ameyar-Zazoua M, Souidi M et al (2010) Tandem affinity purification of miRNA target mRNAs (TAP-Tar). Nucleic Acids Res 38:e20
Vatolin S, Navaratne K, Weil RJ (2006) A novel method to detect functional microRNA targets. J Mol Biol 358:983–996
Andachi Y (2008) A novel biochemical method to identify target genes of individual microRNAs: identification of a new Caenorhabditis elegans let-7 target. RNA 14:2440–2451
Llave C, Xie Z, Kasschau KD et al (2002) Cleavage of Scarecrow-like mRNA targets directed by a class of Arabidopsis miRNA. Science 297:2053–2056
Addo-Quaye C, Eshoo TW, Bartel DP et al (2008) Endogenous siRNA and miRNA targets identified by sequencing of the Arabidopsis degradome. Curr Biol 18:758–762
German MA, Pillay M, Jeong DH et al (2008) Global identification of microRNA-target RNA pairs by parallel analysis of RNA ends. Nat Biotechnol 26:941–946
Franco-Zorrilla JM, Del Toro FJ, Godoy M et al (2009) Genome-wide identification of small RNA targets based on target enrichment and microarray hybridizations. Plant J 59:840–850
Karginov FV, Cheloufi S, Chong MM et al (2010) Diverse endonucleolytic cleavage sites in the mammalian transcriptome depend upon microRNAs, Drosha, and additional nucleases. Mol Cell 38:781–788
Li YF, Zheng Y, Addo-Quaye C et al (2010) Transcriptome-wide identification of microRNA targets in rice. Plant J 62:742–759
Shin C, Nam JW, Farh KK et al (2010) Expanding the microRNA targeting code: functional sites with centered pairing. Mol Cell 38:789–802
Bracken CP, Szubert JM, Mercer TR et al (2011) Global analysis of the mammalian RNA degradome reveals widespread miRNA-dependent and miRNA-independent endonucleolytic cleavage. Nucleic Acids Res 39:5658–5668
Jiao Y, Riechmann JL, Meyerowitz EM (2008) Transcriptome-wide analysis of uncapped mRNAs in Arabidopsis reveals regulation of mRNA degradation. Plant Cell 20:2571–2585
Xiao C, Calado DP, Galler G et al (2007) MiR-150 controls B cell differentiation by targeting the transcription factor c-Myb. Cell 131:146–159
Zhao Y, Ransom JF, Li A et al (2007) Dysregulation of cardiogenesis, cardiac conduction, and cell cycle in mice lacking miRNA-1-2. Cell 129:303–317
Johnnidis JB, Harris MH, Wheeler RT et al (2008) Regulation of progenitor cell proliferation and granulocyte function by microRNA-223. Nature 451:1125–1129
Ventura A, Young AG, Winslow MM et al (2008) Targeted deletion reveals essential and overlapping functions of the miR-17 through 92 family of miRNA clusters. Cell 132:875–886
Patrick DM, Zhang CC, Tao Y et al (2010) Defective erythroid differentiation in miR-451 mutant mice mediated by 14-3-3zeta. Genes Dev 24:1614–1619
Park CY, Choi YS, McManus MT (2010) Analysis of microRNA knockouts in mice. Hum Mol Genet 19:R169–R175
Prosser HM, Koike-Yusa H, Cooper JD et al (2011) A resource of vectors and ES cells for targeted deletion of microRNAs in mice. Nat Biotechnol 29:840–845
Tsang JS, Ebert MS, van Oudenaarden A (2010) Genome-wide dissection of microRNA functions and cotargeting networks using gene set signatures. Mol Cell 38:140–153
Acknowledgments
The authors thank DIM Biothérapies, Stem Pole Ile-de-France, Cure The Future (Cell and Gene Trust), the Rebecca L Cooper Medical Research Foundation and the Cancer Council NSW [Project Grant 1006260] and the Australian National Health and Medical Research Council [Training Fellowship 571156] for support.
© 2013 Springer Science+Business Media Dordrecht
About this chapter
Cite this chapter
Ritchie, W., Rasko, J.E.J., Flamant, S. (2013). MicroRNA Target Prediction and Validation. In: Schmitz, U., Wolkenhauer, O., Vera, J. (eds) MicroRNA Cancer Regulation. Advances in Experimental Medicine and Biology, vol 774. Springer, Dordrecht. https://doi.org/10.1007/978-94-007-5590-1_3
DOI: https://doi.org/10.1007/978-94-007-5590-1_3
Publisher Name: Springer, Dordrecht
Print ISBN: 978-94-007-5589-5
Online ISBN: 978-94-007-5590-1
eBook Packages: Biomedical and Life Sciences (R0)