2.1 Introduction

It has been strikingly reported that more than 90% of the human genome is potentially transcribed (Carninci et al. 2005; Kapranov et al. 2007; Willingham and Gingeras 2006). However, when the total RNA fraction of human HeLa cells is run on a denaturing agarose gel, it displays mainly the 18S and 28S bands of ribosomal RNA, with only a smear that includes mRNA, tRNA, and noncoding (nc) RNA (Fig. 2.1). This observation implies that the human genome generates a vast number of ncRNAs, but that most of them are present as low-copy-number molecules. The number of ncRNA species is huge although the copy number of each is very low, suggesting that a significant fraction of the ncRNAs might be involved in the regulation of various cellular functions rather than in cellular structure. Indeed, micro (mi) RNA, one of the best-studied ncRNAs, functions as a translational repressor (Ambros 2001; Fire et al. 1998). Recently, transcription regulatory functions have been found in certain kinds of ncRNAs. Most of these ncRNAs have been reported as “long” ncRNAs, whose length is more than 200 nucleotides (Kurokawa et al. 2009; Ponting et al. 2009). The mechanisms of transcriptional regulation differ among the various kinds of ncRNAs. In this review, I survey recent papers on transcriptional regulation by long ncRNAs and discuss the heterogeneity of its mechanisms.

Fig. 2.1
Denaturing gel electrophoresis of the total RNA fraction of the human fetal kidney cell line 293. RNA was detected with ethidium bromide

2.2 Long ncRNAs

Long ncRNAs that regulate transcription are diverse molecules. This section attempts to classify them.

2.2.1 Length of Long ncRNAs

Long ncRNAs are tentatively defined as ncRNA molecules more than 200 nucleotides long. In practice, their lengths range from about 200 nucleotides up to 2.2 kb for HOTAIR and 17 kb for Xist. The name “long ncRNA” is therefore based solely on nucleotide length.
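The length-based definition above can be written as a simple threshold test. The following is a minimal sketch, not a published classification tool: the function name, the constant name, and the 150-nucleotide example are hypothetical, while the 200-nucleotide cutoff and the HOTAIR and Xist lengths are those given in the text.

```python
# Threshold from the text: long ncRNAs are "more than 200 nucleotides long".
LONG_NCRNA_MIN_NT = 200  # constant name is an assumption for illustration


def is_long_ncrna(length_nt: int) -> bool:
    """Return True if an ncRNA of this length counts as 'long' by the tentative definition."""
    return length_nt > LONG_NCRNA_MIN_NT


# Lengths for HOTAIR and Xist are taken from the text; the short RNA is hypothetical.
examples = {"HOTAIR": 2200, "Xist": 17000, "hypothetical_short_ncRNA": 150}

for name, length in examples.items():
    label = "long ncRNA" if is_long_ncrna(length) else "not a long ncRNA"
    print(f"{name} ({length} nt): {label}")
```

The test says nothing about function, strandedness, or localization, which is why the later subsections classify long ncRNAs by other criteria.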

2.2.2 Single- or Double-Stranded Long ncRNAs

Both single-stranded and double-stranded long ncRNAs have been reported. Sense and antisense strands of Alu repeats are transcribed and form a double-stranded RNA (Wang et al. 2008a). The functional consequence of the formation of a double-stranded ncRNA remains unclear. One possible explanation is that the double-stranded ncRNA cannot bind its target molecule, so that formation of the duplex represents a way of repressing the ncRNA function.

2.2.3 Subcellular Localization

Mature mRNAs are transported to the cytoplasm after processing, while most ncRNAs are known to be localized in nuclei. Some ncRNAs are localized in both nuclei and cytoplasm (Imamura et al. 2004). Only one ncRNA has been reported to be localized exclusively in the cytoplasm (Louro et al. 2009). The long ncRNAs mainly reside in nuclei, suggesting their involvement in transcription.

2.2.4 Transcription of Long ncRNAs

Many long ncRNAs show tissue-specific patterns of expression. This suggests that their expression is strictly regulated and that they are transcribed mostly by RNA polymerase II. Analysis of 1,600 ncRNAs showed that most long ncRNAs resemble authentic RNA polymerase II transcripts in the following respects (Guttman et al. 2009; Khalil et al. 2009). First, the long ncRNA genes carry trimethyl marks on histone H3 lysine (K) 4 at their promoter regions and trimethyl marks on histone H3-K36 along the length of the transcribed region, as observed for usual RNA polymerase II transcription units. These trimethyl marks are designated as a “chromatin signature (a K4-K36 domain)” (Guttman et al. 2009). Second, the long ncRNAs generally possess the 5′ cap (7-methylguanosine cap) structure at their 5′ end and a poly(A) tail at their 3′ end (Guttman et al. 2009). Third, the long ncRNAs have well-defined transcription factor binding sites, such as those for NF-kB, in their promoter regions (Martone et al. 2003). These data strongly support the view that the long ncRNAs are transcribed by RNA polymerase II (Martone et al. 2003). However, it has not been well established which types of transcription factors induce long ncRNA transcription. Thus, the regulation of long ncRNA transcription remains to be uncovered.

2.3 Long ncRNAs Regulate Transcription

Diverse mechanisms of transcriptional regulation by long ncRNAs have been reported. In this section, they are grouped into three types: (1) regulation at the level of the basic transcription factors, including RNA polymerase II; (2) regulation through histone modification; and (3) regulation through DNA methylation. The predominant type appears to be mediated through histone modification.

2.3.1 Transcriptional Regulation Through Targeting Basic Transcription Factors and RNA Polymerase II by Long ncRNAs

Direct interaction of long ncRNAs with the basic core machinery is one efficient mechanism of transcriptional repression.

2.3.1.1 Alu RNA

SINE retrotransposon elements, including Alu repeats, generate numerous species of long ncRNAs (Maraia et al. 1993). It has been reported that Alu RNAs and SINE B2 RNAs exert transcriptional repression under heat-shock conditions (Allen et al. 2004; Espinoza et al. 2007; Mariner et al. 2008). SINE B2 and Alu RNAs directly target RNA polymerase II. Furthermore, Alu RNA possesses a domain that regulates the function of RNA polymerase II (Mariner et al. 2008). Biochemical experiments demonstrated that Alu RNAs inhibit the association of RNA polymerase II with promoter DNA and thereby repress transcription (Mariner et al. 2008; see Chap. 6). SINE B2 RNA turns out to have a similar repressive effect on transcription (Mariner et al. 2008). These data suggest that the repetitive sequences that occupy about half of the human genome can be transcribed and that their transcripts, the long ncRNAs, exert transcriptional repression. This points to a biological significance of the repetitive sequences in the human genome.

2.3.1.2 Dihydrofolate Reductase ncRNA

In quiescent mammalian cells, expression of dihydrofolate reductase (DHFR) is repressed. It has been reported that a transcript from a minor promoter located upstream of the major promoter is involved in the repression of DHFR (Martianov et al. 2007). In quiescent cells, the transcript of the minor promoter was found to inhibit transcriptional initiation from the major promoter through direct binding to TFIIB of the preinitiation complex (Fig. 2.2). Alternative promoters within the same gene have been observed at various loci. It could be a general mechanism that transcripts from alternative promoters play a regulatory role in transcription from the major promoter.

Fig. 2.2
Transcriptional repression of the dihydrofolate reductase (DHFR) gene by the ncRNA transcribed from the minor promoter of the DHFR gene. The DHFR ncRNA represses DHFR gene expression by blocking the preinitiation complex through targeting TFIIB and RNA polymerase II

2.3.2 Transcriptional Regulation Through Histone Modification by the Long ncRNAs

Transcriptional regulation by long ncRNAs has been reported to occur mainly through histone modification or DNA methylation. Some long ncRNAs activate transcription, while others repress it.

2.3.2.1 Steroid Receptor RNA Activator

Nuclear receptors (NRs) form a superfamily of transcription factors consisting of more than 50 members in the human genome and regulate diverse biological functions such as homeostasis, cellular differentiation, and growth (Glass and Rosenfeld 2000). An NR activates transcription by exchanging corepressors for coactivators upon specific binding of low-molecular-weight lipophilic compounds designated as ligands. The corepressors and coactivators were all assumed to be protein molecules. However, steroid receptor RNA activator (SRA) was reported as the first example of an NR coactivator that is an RNA molecule (Hatchell et al. 2006; Lanz et al. 1999). SRA was found to activate various NRs, for example, steroid hormone receptors such as the glucocorticoid and estrogen receptors, as well as the retinoic acid, thyroid hormone, and vitamin D receptors. It has been suggested that SRA activates transcription through recruitment of steroid receptor coactivator 1 (SRC1), which has histone acetyltransferase (HAT) activity, and through release of histone deacetylase (HDAC).

2.3.2.2 Embryonic Ventral Forebrain-2

During early development, a 3.8-kb long ncRNA, embryonic ventral forebrain-2 (Evf2), is transcribed from the intergenic region between the Dlx-5 and Dlx-6 loci (Bond et al. 2009; Feng et al. 2006). The Dlx genes, which are related to the Distalless (dll) homeodomain gene of Drosophila, play a pivotal role in neuronal development. The Dlx genes are organized in bigene clusters such as Dlx5/6 and Dlx1/2. There are well-conserved enhancer regions, ei and eii, located between Dlx5 and Dlx6. Evf2 is transcribed from the ei and eii enhancer regions, binds the Dlx2 protein, and activates transcription of the Dlx5/6 genes. The Evf2 ncRNA exerts transcriptional activation through protein–protein interactions as follows (Fig. 2.3a). In the repressed state, the Dlx5/6 regions are methylated at CpG repeats, which are bound by MeCP2 and HDAC, while Evf2 activates the genes by removing MeCP2 and releasing HDAC from the CpG repeats (Bond et al. 2009).

Fig. 2.3
The long ncRNAs involved in transcriptional regulation through chromatin modification. (a) Evf2 activates transcription by removing the methyl-CpG-binding protein MeCP2 from CpG regions and releasing HDAC activity from the target gene. (b) HOTAIR represses transcription by binding PRC2 and inducing histone methylation of the HOXD locus

2.3.2.3 HOX Antisense Intergenic RNA

HOX gene clusters are essential for the formation of the body axis and segments during embryogenesis. In the human genome, four clusters of HOX genes have been identified, that is, HOXA (chromosome 7), HOXB (chromosome 17), HOXC (chromosome 12), and HOXD (chromosome 2). Tiling array analysis of these four clusters revealed 231 novel ncRNAs, including one that is highly conserved in vertebrates, the HOX antisense intergenic RNA (HOTAIR) (Rinn et al. 2007). HOTAIR is a 2.2-kb ncRNA that is transcribed from a noncoding region of the HOXC cluster and recruited to the HOXD locus upon binding the Polycomb repressive complex (PRC) 2. PRC2 contains the H3K27 histone methyltransferase (HMTase) EZH2, Suz12, and EED as components and induces histone methylation to repress gene expression. HOTAIR thus represses the transcription of HOXD by recruiting PRC2 and inducing trimethylation of histone H3-K27 (Fig. 2.3b). PRC2 is also involved in X-chromosome inactivation (discussed later, see Chap. 3), suggesting that the complex has versatile epigenetic functions in mediating transcriptional regulation by long ncRNAs.

2.3.2.4 Cyclin D1

Recently, our group reported that the RNA-binding protein TLS (Translocated in liposarcoma) inhibits the histone acetyltransferase (HAT) activity of CBP and p300 (Wang et al. 2008b). This HAT inhibitor, TLS, turns out to have specific target genes, cyclin D1 and E1, and represses the expression of cyclin D1 upon binding RNA that contains the GGUG consensus sequence (Lerga et al. 2001). Expression of the cyclin D1 gene is repressed by treatment with ionizing radiation (IR) and DNA-damaging reagents (Miyakawa and Matsushime 2001). Our search for changes in transcript levels after IR treatment demonstrated an increase in ncRNAs transcribed from the cyclin D1 promoter. These ncRNAs [promoter (p)-ncRNAs] were found to contain the GGUG consensus sequence.

Binding of the pncRNAs to TLS induces its recruitment to CBP/p300, the major HAT activity in animal cells, and inhibition of their HAT activity (Fig. 2.4). Taken together, these data suggest that expression of the cyclin D1 gene can be repressed by the pncRNAs through their binding to TLS. This mechanism resembles autorepression: a transcript from a gene represses the expression of that same gene. We propose it as an ncRNA-dependent mechanism of transcriptional repression and are investigating whether similar promoter-derived ncRNAs repress the expression of other genes in the human genome. This could constitute a genome-wide network of transcriptional repression.
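The GGUG consensus described above lends itself to a simple sequence scan. The following is a minimal sketch of how candidate TLS-binding motifs could be located in a transcript; it is an illustration only, not the method used in the cited studies. The function name and the example sequence are hypothetical, and the presence of a GGUG motif alone does not establish TLS binding.

```python
import re


def find_ggug_sites(sequence: str) -> list[int]:
    """Return 0-based start positions of every GGUG motif, including overlapping ones.

    The input may be given as RNA (U) or as the corresponding DNA (T);
    T is converted to U before scanning.
    """
    rna = sequence.upper().replace("T", "U")
    # A zero-width lookahead reports overlapping occurrences as well.
    return [match.start() for match in re.finditer(r"(?=GGUG)", rna)]


# Hypothetical sequence for illustration; it is not the cyclin D1 pncRNA.
example = "AAGGUGCCGGTGGUG"
print(find_ggug_sites(example))
```

Scanning for overlapping occurrences matters here because adjacent motifs such as GGUGGUG share a guanine and would otherwise be counted only once.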

Fig. 2.4
The cyclin D1 pncRNA-dependent transcriptional repression. Genotoxic factors such as ionizing irradiation and DNA-damaging reagents induce pncRNA transcription. The pncRNAs bind TLS and inhibit the HAT activity of CBP/p300 to exert a repressive effect on cyclin D1 expression

2.3.3 DNA Methylation

Antisense RNAs are known to induce gene silencing through DNA methylation. This points to a close relationship between ncRNAs and DNA methylation.

2.3.3.1 P15AS

Antisense RNA of the tumor-suppressor gene p15 represses the expression of p15 itself (Yu et al. 2008). In leukemia cells, the expression of p15 was reduced, while the level of the p15 antisense RNA was increased. Detailed analysis of the p15 antisense RNA in leukemia cells showed that it induces methylation of the DNA at the p15 locus and heterochromatin formation, thereby exerting transcriptional repression. In the human genome, antisense RNAs are thought to be expressed for about 70% of coding genes (Katayama et al. 2005). Taken together, these findings suggest that antisense RNAs may have a regulatory role in gene expression.

2.3.3.2 Khps1

Khps1 is an antisense RNA transcribed from the T-DMR (tissue-dependent differentially methylated region) of Sphk1 (sphingosine kinase-1). Overexpression of Khps1 stimulates demethylation of the CpG island of the T-DMR but methylation of its non-CG regions (Imamura et al. 2004). Modulation of the methylation status of the Sphk1 locus has been found to regulate expression of this locus. These data show a close relationship between long ncRNA functions and DNA methylation.

2.4 ncRNAs as a Sensor for Cellular Signals

Diverse long ncRNAs transcribed from numerous regions of the human genome have been reported. Expression of long ncRNAs is thought to be regulated by various “signals,” and the long ncRNAs are suggested to act as “sensors” of these signals. Indeed, we have found that the cyclin D1 pncRNA can work as a sensor for the genotoxic signal of ionizing radiation (Wang et al. 2008b).

X-chromosome inactivation employs an ncRNA, the 1.6-kb RepA, which is transcribed from a fragment of the Xist locus as an antisense RNA (Zhao et al. 2008). A reduction in the expression of Tsix, the full-length antisense RNA of Xist, functions as the signal. RepA, as the sensor, receives the reduction in Tsix expression as the signal, recruits PRC2 to the Xist locus, and induces X-chromosome inactivation. During embryonic development, HOTAIR also functions as a sensor and exerts a gene-silencing effect upon recruitment of PRC2 (Rinn et al. 2007). The long ncRNAs that function as sensors have been found to require histone-modifying enzymes. These observations suggest that long ncRNAs function as sensors for various biological signals and regulate gene expression through histone modification.

2.5 Mechanisms of Transcriptions of Long ncRNAs

The majority of long ncRNAs have been shown to be transcribed by RNA polymerase II, although some are generated by RNA polymerase III (Dieci et al. 2007; Liu et al. 1995; Nguyen et al. 2001). Although prevailing analyses of RNA polymerase II indicate that its major function is the precise initiation and elongation of transcription of protein-coding genes, early studies showed that RNA polymerase II can catalyze randomly initiated transcription using calf thymus DNA or other crude DNA fractions as a template (Barbiroli et al. 1977; Legraverend and Glazer 1980; Reinberg and Roeder 1987). Indeed, RNA polymerase was shown to initiate transcription from nicks, gaps, and ends of DNA molecules in a sequence-independent manner (Sekimizu et al. 1979). This led to the notion that RNA polymerase II has the potential to generate diverse transcripts from numerous discrete sites in the genome.

Biochemical approaches using rat liver nuclei indicated that RNA polymerase I resides in the nucleolus and is involved in generating ribosomal RNAs, while RNA polymerase II is located in the nucleoplasm (Roeder and Rutter 1970). RNA polymerase II was found to synthesize “DNA-like RNA,” that is, RNA with a base composition similar to that of total cellular DNA, and was predicted to transcribe the protein-coding genes (Roeder and Rutter 1970). Extensive biochemical and molecular biological studies have demonstrated that RNA polymerase II works together with multiple general transcription factors, TFIIB, TFIID, TFIIE, TFIIF, and TFIIH, and that precise initiation of transcription requires RNA polymerase II together with these essential components, that is, the holoenzyme of RNA polymerase II (Roeder 1991; Weake and Workman 2010). Thus, RNA polymerase II alone cannot initiate specific and precise transcription; for specific transcription it needs to form the holoenzyme with general transcription factors such as TFIIB and TFIID. RNA polymerase II is nevertheless able to catalyze random transcription when stimulated by certain protein fractions, as described below.

Fractions from Ehrlich ascites tumor cells (SII) and from HeLa cells (TFIIS) were shown to stimulate nonspecific transcription by RNA polymerase II (Reinberg and Roeder 1987; Sekimizu et al. 1979). These data provide a clue to understanding the heterogeneously initiated transcription of ncRNAs from diverse sites of the human genome. A biochemical assay using nuclei of mouse ascitic carcinoma Krebs II cells, with RNA polymerase II acting on endogenous DNA as the template, revealed strong transcriptional activity (Shenkin and Burdon 1966). Indeed, with 0.84 ml of the nuclear fraction, the yield of [3H] RNA ranged from 0.175 to 0.50 mg, indicating that a significant percentage of the mouse genome is potentially transcribed, at least under these experimental conditions. Taken together, these data indicate that the genome has the potential to be transcribed into diverse RNA species. An as yet unidentified protein factor may turn out to stimulate RNA polymerase II to produce the large numbers of long ncRNAs that have been identified recently.

2.6 Perspectives

The mechanisms of transcriptional regulation discussed in this review indeed appear to be heterogeneous. The majority of long ncRNAs, but not all, use histone modification to regulate transcription. One common element of transcriptional regulation by long ncRNAs is RNA–protein interaction through RNA-binding proteins. Formation of RNA–protein complexes is one of the key events in long ncRNA-dependent transcriptional regulation. More generally, ncRNAs require their specific binding proteins in order to exert their biological functions, suggesting that identifying an RNA-binding protein specific to an uncharacterized ncRNA should point to its biological significance. Why are so many long ncRNAs generated in living cells? Elucidating the mechanisms by which the long ncRNAs themselves are transcribed should be informative for understanding their diversity. Considering that 90% of the genome is transcribed, genomic DNA appears to possess an intrinsic ability to be transcribed. It is likely that the protein-coding genes were evolutionarily selected to acquire high efficiency of transcription (Fig. 2.5). The transcription mechanism of long ncRNAs is thought to be primitive compared with that of the messenger RNAs of protein-coding genes and a prototype of the more refined RNA polymerase II transcription mechanism. Learning more about the transcription of long ncRNAs will facilitate elucidation of the transcription of coding genes in eukaryotes. Employing long ncRNAs as regulators of transcription might be a way to salvage the “junk” of the genome. Intense investigation of long ncRNA transcription should provide a crucial clue to understanding the origin of the long ncRNAs and the overall structure of the human genome.

Fig. 2.5
Quantitative models of genomic DNA, protein-coding messenger RNAs, and long ncRNAs