Introduction

Extensive transcriptome analyses have revealed that up to 90 % of eukaryotic genomes are transcribed (Wilhelm et al. 2008), whereas only 1–2 % of the genome encodes proteins (Birney et al. 2007). This suggests that a large proportion of the eukaryotic genome produces an unexpected plethora of RNA molecules that have no protein-coding potential. These are collectively called noncoding RNAs (ncRNAs).

NcRNAs can be grouped into two classes according to transcript size. NcRNAs longer than 200 nucleotides are considered long ncRNAs (lncRNAs), whereas short ncRNAs are shorter than 200 nucleotides. Short ncRNAs include microRNAs (miRNAs), small interfering RNAs (siRNAs), and Piwi-interacting RNAs (piRNAs) (Ghildiyal and Zamore 2009). The roles of short ncRNAs in transcriptional and post-transcriptional regulation in eukaryotes are fairly well characterized at the mechanistic level (Chitwood and Timmermans 2010). In contrast, the molecular mechanism of gene regulation by lncRNAs is not well understood. Most lncRNAs that have been characterized participate in the regulation of gene expression. One prevailing regulatory theme is that these lncRNAs modulate transcriptional activity by interacting with regulatory protein complexes (Chaumeil et al. 2006; Rinn et al. 2007; Wang et al. 2011). LncRNAs can also act as decoys for splicing factors (Tripathi et al. 2010) and as competitors for miRNA binding (Cesana et al. 2011; Karreth et al. 2011), indicating that lncRNAs potentially play roles in various regulatory circuits in eukaryotes. In plants, a few lncRNAs have been identified and functionally characterized (Ben Amor et al. 2009; Heo and Sung 2011; Swiezewski et al. 2009). In this review, we provide an overview of epigenetic regulation of gene expression by plant lncRNAs and discuss current and future research that could shed further light on the functional roles of plant lncRNAs.

Identification of long ncRNA

Large-scale sequencing of full-length cDNA libraries unexpectedly identified a number of lncRNAs in mammals (Okazaki et al. 2002). Some lncRNAs are transcribed from intergenic regions or introns, and others overlap with protein-coding regions. They are transcribed in either the sense or the antisense direction relative to neighboring protein-coding transcripts. Some are polyadenylated and others are not. Currently, in silico identification differentiates lncRNAs from protein-coding transcripts based on the absence of discernible open-reading frames. Targeted full-length cDNA sequencing may be the best method to determine the coding potential of transcripts, although this approach is time-consuming and expensive. Alternatively, tiling DNA microarrays for genome-wide transcriptome analysis have been used to detect lncRNAs and to determine their expression in plants. For example, tiling DNA microarray analyses identified novel stress-induced lncRNAs in Arabidopsis (Matsui et al. 2010; Rehrauer et al. 2010) and intergenic lncRNAs in rice (Li et al. 2006). Using a similar approach, a group of antisense lncRNAs, collectively known as COOLAIR, was identified at FLOWERING LOCUS C (FLC) in Arabidopsis (Swiezewski et al. 2009). High-throughput deep RNA sequencing can be used to identify missing or incomplete lncRNA transcripts and to determine the levels of lncRNA expression. A similar approach also identified rare alternative splicing variants of an lncRNA, HOTAIR (Mercer et al. 2011). Genome-wide histone modification profiles indicate that a number of long intergenic ncRNAs (lincRNAs) are transcribed from a K4-K36 domain, in which trimethylation of lysine 4 of histone H3 (H3K4me3) marks the active promoter and trimethylation of lysine 36 of histone H3 (H3K36me3) marks the transcribed region, suggesting that most lncRNAs are transcribed from independent promoters (Zhang et al. 2009). Unlike that of short RNAs and proteins, the function of lncRNAs cannot simply be inferred from their sequence or structure.
In this review, we focus on a few lncRNAs whose function is relatively well characterized in plants. In particular, we describe examples of lncRNAs that function to regulate gene expression at the level of chromatin modification and in the recruitment of chromatin-modifying complexes.
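The in silico criterion mentioned in this section (a transcript longer than 200 nucleotides with no discernible open-reading frame) can be illustrated with a minimal Python sketch. The 100-codon ORF cutoff, the function names, and the toy sequences are assumptions made for illustration only; they are not taken from the studies cited here, and real pipelines typically use more elaborate coding-potential measures.

```python
# Minimal sketch of ORF-based lncRNA candidate filtering.
# MAX_ORF_CODONS is an assumed cutoff, not a value from the cited studies.
MIN_LNCRNA_LENGTH = 200   # nt; size cutoff used in this review
MAX_ORF_CODONS = 100      # assumed threshold for a "discernible" ORF
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf_codons(seq: str) -> int:
    """Return the longest ORF (ATG up to, not including, the stop codon),
    in codons, over the three forward reading frames.
    ORFs that run off the end without a stop codon are not counted."""
    seq = seq.upper().replace("U", "T")
    best = 0
    for frame in range(3):
        current = 0  # 0 means "not currently inside an ORF"
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if current == 0:
                if codon == "ATG":
                    current = 1
            elif codon in STOP_CODONS:
                best = max(best, current)
                current = 0
            else:
                current += 1
    return best


def classify_transcript(seq: str) -> str:
    """Label a transcript using only length and longest-ORF size."""
    if len(seq) <= MIN_LNCRNA_LENGTH:
        return "short transcript"
    if longest_orf_codons(seq) >= MAX_ORF_CODONS:
        return "putative protein-coding"
    return "candidate lncRNA"


# Toy sequences (hypothetical, for demonstration only).
coding_like = "ATG" + "GCT" * 120 + "TAA"
noncoding_like = "ACGT" * 75
print(classify_transcript(coding_like))
print(classify_transcript(noncoding_like))
```

Only the forward strand is scanned because transcripts are strand-specific. A filter of this kind yields candidates only; as noted above, full-length cDNA sequencing remains the more reliable way to assess coding potential.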

Role of lncRNAs in the recruitment of polycomb repressive complex 2 in animals

It has long been known that purified chromatin contains both RNA and DNA, suggesting that RNA may affect chromatin structure and gene regulation (Paul and Duerksen 1975). Earlier genetic studies showed that a few lncRNAs are associated with heterochromatin formation and genomic imprinting (Barlow et al. 1991; Brown et al. 1991). Functional analyses of identified lncRNAs demonstrate that lncRNAs are required for proper chromatin structure and for recruitment of chromatin-modifying complexes to DNA (Bernstein and Allis 2005). One well-known function of lncRNAs is to mediate epigenetic changes by recruiting chromatin-remodeling complexes to specific genomic loci. For example, Xist lncRNA is expressed from the inactive X chromosome and acts in cis: it “coats” the X chromosome, leading to the recruitment of polycomb repressive complex 2 (PRC2), which trimethylates histone H3 at lysine 27 (H3K27me3) to silence transcription of local genes (Chaumeil et al. 2006). A small internal noncoding transcript from the Xist locus, RepA, recruits PRC2 to silence one X chromosome, whereas PRC2 is titrated from the remaining active X chromosome by the antisense transcript Tsix (Zhao et al. 2008). Another study showed that Xist and Tsix form an RNA duplex that is processed by Dicer to generate siRNAs that are required for the repressive chromatin modification on the inactive X chromosome (Lander et al. 2001). Other lncRNAs, Air and Kcnq1ot1, also interact with chromatin and target repressive histone modifiers to specific cis-linked gene loci (Nagano et al. 2008). In humans, the antisense lncRNA HOTAIR, which is transcribed from the HOXC locus, regulates epigenetic changes at the HOXD locus in trans by recruiting PRC2 (Rinn et al. 2007). HOTAIR physically associates with PRC2 and modulates PRC2 activity to deposit H3K27me3 marks at target chromatin throughout the genome (Rinn et al. 2007; Tsai et al. 2010).
Additional studies of both HOTAIR and Xist revealed that the methyltransferase subunit EZH2 of the PRC2 complex physically associates with both lncRNAs (Kaneko et al. 2010; Zhao et al. 2008). Although the molecular nature of the interaction between lncRNAs and PRC2 is yet to be determined, the interaction between lncRNAs and chromatin-modifying complexes appears to be a general mechanism for epigenetic repression in animals.

Polycomb-mediated repression by vernalization in plants

Plants respond to environmental cues to trigger developmental changes (e.g., flowering) only during a certain period of the year. One example of such a cue is the prolonged cold of winter, which triggers the process known as vernalization (Sung and Amasino 2004b). Vernalization results in epigenetic silencing of FLC, and the repression of FLC is stably maintained even after the winter cold has passed. Molecular studies have revealed that both activating and repressive chromatin-remodeling complexes are involved in the regulation of FLC expression (Kim et al. 2009). A high expression level of FLC results in delayed flowering, whereas flowering is promoted when FLC is repressed by vernalization. Genetic approaches identified several protein components that are necessary for establishing the stable repression of FLC by vernalization (Sung and Amasino 2004a; Kim and Sung 2012). VERNALIZATION INSENSITIVE 3 (VIN3) was identified as a gene essential for triggering repression of FLC by vernalization (Sung and Amasino 2004b). VIN3 encodes a plant homeodomain (PHD) finger protein that is induced only during the cold. The PHD finger motif in VIN3 is often found in components of chromatin-remodeling complexes (Sung et al. 2006; Kim and Sung 2013). VIN3 was biochemically co-purified with PRC2 (De Lucia et al. 2008). This result suggests that the PHD-PRC2 association is required for the function of PRC2. Vernalization results in increased association of PHD-PRC2 with FLC chromatin and increased deposition of H3K27me3 marks at FLC chromatin (De Lucia et al. 2008; Wood et al. 2006; Kim and Sung 2013). Increased enrichment of PRC2 and H3K27me3 is a hallmark of the stable repression of FLC by vernalization.

PRC2 is a conserved repressive chromatin-remodeling complex in higher eukaryotes. In Drosophila, PRC2 contains four core subunits: SUPPRESSOR OF ZESTE-12 [Su(z)12], ENHANCER OF ZESTE [E(z)], ESC, and p55. Arabidopsis VRN2 encodes a homolog of Su(z)12, which is a core component of PRC2 (Wood et al. 2006; De Lucia et al. 2008). There are at least three homologs of E(z) (i.e., CLF, SWN, and MEDEA) and five homologs of p55 (MSI1 to MSI5). CLF, SWN, MSI1, and FIE are unambiguously identified as components of the vernalization PRC2 complex (De Lucia et al. 2008; Wood et al. 2006). FIE, a homolog of ESC, is the only PRC2 component that is represented by a single member in Arabidopsis (De Lucia et al. 2008).

Roles of long noncoding RNAs derived from FLC locus

Two classes of lncRNAs have been identified as being involved in vernalization-mediated FLC repression (Fig. 1). The first, a group of lncRNAs called COOLAIR, is transcribed from the 3’ end of FLC in the antisense direction relative to FLC mRNA. Six different COOLAIR transcripts fall into two groups based on two alternative polyadenylation sites (Swiezewski et al. 2009). The expression levels of COOLAIR increase transiently at an early stage of vernalization, and distally polyadenylated variants of COOLAIR partly overlap with the 5’ end of FLC mRNA (Fig. 1). It has been suggested that COOLAIR plays a role in transcriptional interference of FLC (Swiezewski et al. 2009). However, insertional mutants in which COOLAIR transcription is impaired still show reduced FLC expression in response to vernalization (Helliwell et al. 2011). Alternatively, it has been proposed that COOLAIR may function in “co-transcriptional” regulation during the early phase of vernalization (De Lucia and Dean 2010). In this model, pervasive transcription of antisense lncRNAs would affect sense FLC transcription. Molecular details of the function of COOLAIR in vernalization-mediated FLC repression remain to be addressed.

Fig. 1. Noncoding RNAs from FLC. Both sense and antisense non-protein-coding transcripts are detected at FLC. COLDAIR is transcribed from the first intron of FLC in the sense direction relative to FLC mRNA. Two classes of antisense ncRNAs are transcribed: COOLAIR transcripts are broadly classified by their polyadenylation sites (proximal pA and distal pA). Splicing variants are also detected for both classes of COOLAIR.

The second lncRNA is a cold-inducible intronic lncRNA, COLDAIR (Fig. 1), which is transcribed from the first intron of FLC in the sense direction relative to FLC mRNA (Heo and Sung 2011). COLDAIR is 1.1 kb long. It has a 5’ cap structure but no apparent polyadenylation tail, although it is transcribed by RNA polymerase II. COLDAIR is transiently induced by vernalization, and its induction is controlled by a cryptic cold-inducible promoter located within the vernalization response element (Heo and Sung 2011). In vitro transcription and pull-down assays showed that COLDAIR binds to CLF protein, a homolog of the E(z) component of the PRC2 chromatin-modifying complex. RNA immunoprecipitation assays verified that COLDAIR associates with PRC2 in vivo. A genetic study of COLDAIR using an RNA-interference system demonstrated that COLDAIR is a necessary component of the vernalization response in Arabidopsis. COLDAIR knockdown lines do not display a decrease in the level of FLC in response to vernalization compared to that in parental lines, and FLC is derepressed in the knockdown lines when the plants are returned to warm conditions (Heo and Sung 2011). The enrichment of CLF and H3K27me3 at FLC chromatin in response to vernalization is largely impaired in COLDAIR knockdown lines, which indicates that COLDAIR is necessary for recruiting PRC2 to FLC chromatin. Therefore, COLDAIR is similar to the HOTAIR and Xist ncRNAs in that it acts to recruit PRC2 to target chromatin. Taken together, lncRNA-mediated epigenetic gene silencing by PRC2 appears to be an evolutionarily conserved mechanism shared by plants and animals.

Role of lncRNAs in the recruitment of polycomb repressive complex 1

The primary research focus has been on PRC2-mediated epigenetic gene silencing by lncRNAs. However, there is increasing evidence that lncRNAs also play a role in the recruitment of the PRC1 complex. PRC1 recognizes the H3K27me3 mark and executes stable transcriptional repression (Zhang et al. 2007). In Drosophila, the PRC1 complex contains the two RING finger proteins posterior sex combs (Psc) and dRING, which are orthologs of the mammalian PRC1 subunits BMI1 and RING1A/RING1B, respectively (Scheuermann et al. 2010). Mammalian PRC1 is an E3 ubiquitin ligase complex that mono-ubiquitinates histone H2A at lysine 119, which results in stable repression of the target locus (Cao et al. 2005). RNA immunoprecipitation followed by sequencing or RT-PCR analysis identified thousands of lncRNAs associated with PRC2 and other histone-modifying complexes (Zhao et al. 2008; Guttman et al. 2011). Therefore, lncRNA-mediated chromatin modification is not restricted to PRC2. For example, HOTAIR functions as a modular scaffold for both PRC2 and the LSD1/CoREST complex (Tsai et al. 2010). Another example of an lncRNA acting as a modular scaffold is ANRIL, which associates with both PRC2 and PRC1 (Yap et al. 2010). ANRIL appears to recruit PRC1 together with PRC2 to repress specific target genes. A tumor suppressor locus, INK4b/ARF/INK4a, is repressed by PRC1 in an ANRIL lncRNA-dependent manner (Yap et al. 2010). CBX7, which is a component of the core PRC1 complex in mammals, is localized to inactive X chromosome (Xi) domains. RNase treatment disrupted CBX7 association with Xi domains (Yap et al. 2010), suggesting that the Xist lncRNA has a direct role in localizing PRC1 to target chromatin.

PRC1 subunits have diverged in plants. For example, the Polycomb protein does not exist in plants. Instead, LIKE HETEROCHROMATIN PROTEIN1 (LHP1) binds to H3K27me3 and functions similarly to Polycomb (Mylne et al. 2006; Sung et al. 2006; Turck et al. 2007; Zhang et al. 2007). Molecular genetic studies suggest that EMF1 and VRN1, which are non-sequence-specific DNA-binding proteins, also exhibit PRC1-like function (Aubert et al. 2001; Bastow et al. 2004). Apparent functional homologs of PRC1 RING finger proteins have been identified in Arabidopsis. They associate with LHP1 and function to repress PcG target genes in planta (Xu and Shen 2008). AtBMI1A and AtBMI1B are RING finger proteins that function similarly to mammalian BMI1 (a component of PRC1); they interact with LHP1 and EMF1 and repress PcG target genes (Bratzel et al. 2010). It will be interesting to determine whether the Arabidopsis PRC1-like complex includes lncRNA components for its function in gene repression (Fig. 2).

Fig. 2. Vernalization-mediated FLC repression by polycomb group protein complexes. FLC chromatin undergoes dynamic changes from Trithorax-dominant chromatin to polycomb-dominant chromatin. COLDAIR is required for the recruitment of PRC2 during the cold exposure.

Role of lncRNAs in activating chromatin-remodeling complexes

Some lncRNAs are implicated in establishing active chromatin states. Chromatin immunoprecipitation followed by sequencing showed that the H3K4me1 and H3K27ac histone marks are associated with known enhancer loci. Interestingly, these loci produce lncRNAs, implicating them in gene activation (De Santa et al. 2010; Kim et al. 2010). The brain-specific lncRNA Evf-2 in human is transcribed within an enhancer between Dlx5 and Dlx6, which are members of the DLX/dll homeodomain-containing protein family. Evf-2 specifically cooperates with Dlx-2 to increase the transcriptional activity of Dlx5 and Dlx6 (Feng et al. 2006). In X chromosome inactivation (XCI), Jpx RNA is involved in the induction of Xist expression. Two lncRNAs have been characterized at the HOXA locus. The first is an enhancer-like lncRNA termed HOTTIP (Wang et al. 2011). HOTTIP is transcribed from the distal 5’ end of the HOXA gene cluster and plays a role in the maintenance of H3K4me3 at the HOXA locus (Wang et al. 2011). HOTTIP physically interacts with WDR5 protein, which is a key component of the mixed-lineage leukemia (MLL)/Trx complex that catalyzes the deposition of the H3K4me3 mark on target genes (Wang et al. 2011). The second lncRNA, Mistral, also interacts with MLL and is thought to activate the neighboring HoxA6 and HoxA7 genes (Bezhani et al. 2007). Therefore, lncRNAs may also be an important component of activating chromatin-remodeling complexes.

To date, there is no evidence in plants for a role of lncRNAs in activating chromatin-remodeling complexes. In Arabidopsis, five ARABIDOPSIS TRITHORAX (ATX1−ATX5) and two ATX-RELATED (ATXR3 and ATXR7) proteins have been proposed as H3K4 methyltransferases (Ng et al. 2007). ATX1−ATX5 are TRX-group H3K4 methyltransferases. ATXR7 is an ortholog of yeast Set1, and ATXR3 is considered to be a different class of H3K4 methyltransferase (Alvarez-Venegas and Avramova 2001). Among these Arabidopsis H3K4 methyltransferases, ATX1, ATX2, ATXR3, and ATXR7 are involved in FLC activation through H3K4 methylation (Berr et al. 2009; Pien et al. 2008; Saleh et al. 2008a, 2008b; Tamada et al. 2009; Yun et al. 2012). Another TRX-like SET domain protein, EARLY FLOWERING IN SHORT DAYS (EFS), is required for di- and trimethylation of histone H3 Lys36 (H3K36) at target chromatin (Xu and Shen 2008). A knockout mutant of EFS exhibits reduced FLC expression and early flowering due to decreased H3K4me3 at FLC chromatin (Kim et al. 2005; Ko et al. 2010). This suggests that EFS is necessary for the enrichment of H3K4me3, H3K36me2, and H3K36me3 at its target chromatin. It will be interesting to determine whether lncRNAs function in these active chromatin-modifying complexes (Fig. 2).

Other classes of lncRNAs in plants

Besides lncRNAs that clearly function in chromatin modification, a few other ncRNAs play roles in other biological processes in plants. The soybean early nodulin gene, Enod40, is involved in the regulation of symbiotic interactions between leguminous plants and soil bacteria (Fujita et al. 1993). The Enod40 transcript contains only short predicted open-reading frames, which encode two small peptides (12 and 24 amino acid residues). These peptides interact with sucrose synthase, suggesting that Enod40 plays a role in the regulation of sucrose utilization in nodules (Rohrig et al. 2002). Interestingly, Enod40 transcripts themselves interact with MtRBP1 (Medicago truncatula RNA-binding protein 1) and function to re-localize MtRBP1 from nuclear speckles to cytoplasmic granules during nodulation in M. truncatula (Rohrig et al. 2002). These results suggest that Enod40 transcripts function as an lncRNA in MtRBP1 re-localization.

Another lncRNA, INDUCED BY PHOSPHATE STARVATION 1 (IPS1), sequesters miR399, which regulates the expression of PHO2. IPS1 contains a sequence complementary to the phosphate (Pi) starvation-induced miR399. IPS1 competes with PHO2 for the binding of miR399, thereby attenuating miR399-mediated repression of PHO2 (Franco-Zorrilla et al. 2007). The association of IPS1 with miR399 does not lead to cleavage of the IPS1 transcript because of a mismatched loop at the cleavage site (Franco-Zorrilla et al. 2007; Liu et al. 1997). Therefore, IPS1 acts as a mimic of an miRNA target.
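The logic of target mimicry can be sketched in a few lines of Python. The sketch assumes the commonly described rule that miRNA-guided cleavage requires uninterrupted pairing across miRNA positions 10 and 11 (counted from the miRNA 5’ end); that rule, the function names, and the toy sequences are assumptions for illustration, and the sequences are not the actual miR399 or IPS1 sequences.

```python
# Minimal sketch: does a miRNA:target duplex stay paired across the
# presumed cleavage site (miRNA positions 10 and 11)?  All sequences
# are hypothetical toy examples, not real miR399 / IPS1 sequences.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def complement(rna: str) -> str:
    """Base-by-base complement, so column i of the result pairs with column i."""
    return "".join(COMPLEMENT[base] for base in rna)


def cleavage_site_paired(mirna_aln: str, target_aln: str, site=(10, 11)) -> bool:
    """mirna_aln is written 5'->3' and target_aln 3'->5', column by column,
    with '-' marking gaps. Returns True only if both site positions are
    base-paired and sit in adjacent columns (no bulge between them)."""
    if len(mirna_aln) != len(target_aln):
        raise ValueError("aligned strings must have equal length")
    columns = {}      # miRNA position (1-based) -> alignment column
    position = 0
    for col, base in enumerate(mirna_aln):
        if base != "-":
            position += 1
            columns[position] = col
    left, right = site
    if left not in columns or right not in columns:
        return False
    if columns[right] - columns[left] != 1:
        return False  # extra target nucleotides bulge out at the site
    return all((mirna_aln[c], target_aln[c]) in PAIRS
               for c in (columns[left], columns[right]))


mirna = "AAGGCCUUAAGGCCUUAAGGC"          # toy 21-nt miRNA
perfect_target = complement(mirna)        # fully complementary site

# Mimic: three extra target nucleotides inserted between positions 10 and 11.
mimic_mirna_aln = mirna[:10] + "---" + mirna[10:]
mimic_target_aln = perfect_target[:10] + "CUA" + perfect_target[10:]

print(cleavage_site_paired(mirna, perfect_target))
print(cleavage_site_paired(mimic_mirna_aln, mimic_target_aln))
```

Under these assumptions, the fully complementary site is predicted to be cleavable, whereas the mimic still base-pairs with most of the miRNA but cannot be cleaved, which is how a transcript such as IPS1 can sequester the miRNA without being degraded.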

Perspectives

A large number of lncRNAs have been identified in animals, particularly through high-throughput transcriptome sequencing. Further high-throughput sequencing with various modifications will enable the identification and characterization of numerous lncRNAs in plants in the next few years. Indeed, such an approach has been employed to identify lncRNAs in plants and revealed that the expression of lncRNAs correlates with that of their neighboring protein-coding genes, suggesting that a majority of identified lncRNAs are cis-acting (Liu et al. 2012). Although expression patterns are correlated, it remains to be determined how lncRNAs affect the expression of neighboring genes. It is also expected that more targeted approaches (e.g., RNA immunoprecipitation followed by sequencing) will identify different classes of lncRNAs. It is also noteworthy that the regulation of lncRNA transcription and processing is poorly understood. Given that lncRNAs can be transcribed from various loci, it will be interesting to address how these transcripts are regulated and processed. For example, the homeodomain protein AtNDX has been shown to influence the transcription of COOLAIR, the antisense lncRNA transcribed from the 3’ end of FLC (Sun et al. 2013). AtNDX associates with the COOLAIR promoter and inhibits COOLAIR transcription through R-loop stabilization (Sun et al. 2013). Although our understanding of the biological function and regulation of plant lncRNAs is still in its infancy, unraveling the function of plant lncRNAs will provide fertile ground for exciting discoveries about the fundamental principles of lncRNA-mediated regulation of gene expression.