5.1 Introduction

In eukaryotes, transcriptome studies have shown that >90% of the genome is transcribed and that a myriad of the resulting transcripts are noncoding RNAs (ncRNAs) [1, 2], including long ncRNAs (lncRNAs), which are classically defined as >200 nt long with no discernible coding potential [3,4,5]. Plant genomes produce tens of thousands of lncRNAs from intergenic, intronic, or coding regions. RNA Pol II transcribes most lncRNAs (from the sense or antisense strands); plants also have Pol IV and Pol V, two plant-specific RNA polymerases that can produce lncRNAs [6, 7]. The majority of plant lncRNAs described to date are polyadenylated, whereas yeast and mammals also have many non-polyadenylated lncRNAs [8]. However, several well-studied functional plant lncRNAs are non-polyadenylated [9,10,11], and recent work in Arabidopsis found that abiotic stress induces the production of hundreds of non-polyadenylated lncRNAs [12,13,14].

Most lncRNAs can be broadly classified based on their relationships to protein-coding genes: (1) long intergenic ncRNAs (lincRNAs) (Fig. 5.1A); (2) lncRNAs produced from introns (incRNAs), which can be transcribed in any orientation relative to coding genes (Fig. 5.1B); and (3) antisense RNAs and natural antisense transcripts (NATs), which are transcribed from the antisense strand of genes (Fig. 5.1C and D) [15]. Various types of lncRNAs are also transcribed near transcription start sites (TSSs) and transcription termination sites (TTSs) or from enhancer regions (eRNAs) (Fig. 5.1G) and splice sites. For example, yeast produces cryptic unstable transcripts (CUTs) and stable unannotated transcripts (SUTs) from around TSSs [16], Xrn1-sensitive XUTs [17], and Nrd1-dependent NUTs [18, 19], and mammalian cells produce PROMPTs and upstream antisense RNAs (uaRNAs) [20] and others (Fig. 5.1E and F).
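The positional classification described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the `Gene` and `Transcript` records, the 1-based inclusive coordinates, the extra "sense-overlapping" label, and the order in which the categories are tested are all hypothetical choices made for the example, not a published annotation pipeline. Real pipelines also handle multiple isoforms and TSS-proximal, TTS-proximal, or enhancer-derived transcripts, which this sketch ignores.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Gene:
    """A protein-coding gene (hypothetical record; 1-based inclusive coordinates)."""
    chrom: str
    start: int
    end: int
    strand: str                          # "+" or "-"
    exons: Tuple[Tuple[int, int], ...]   # exon intervals sorted by start


@dataclass(frozen=True)
class Transcript:
    """A candidate lncRNA (hypothetical record; 1-based inclusive coordinates)."""
    chrom: str
    start: int
    end: int
    strand: str


def overlaps(a_start: int, a_end: int, b_start: int, b_end: int) -> bool:
    """True if two closed intervals share at least one position."""
    return a_start <= b_end and b_start <= a_end


def introns(gene: Gene) -> List[Tuple[int, int]]:
    """Intervals between consecutive exons of a gene."""
    return [(prev_end + 1, next_start - 1)
            for (_, prev_end), (next_start, _) in zip(gene.exons, gene.exons[1:])]


def classify(tx: Transcript, genes: List[Gene]) -> str:
    """Assign a transcript to one of the broad positional classes."""
    hits = [g for g in genes
            if g.chrom == tx.chrom and overlaps(tx.start, tx.end, g.start, g.end)]
    if not hits:
        return "lincRNA"        # no overlap with any protein-coding gene
    for g in hits:
        # Intronic lncRNAs may lie in either orientation relative to the gene.
        if any(s <= tx.start and tx.end <= e for s, e in introns(g)):
            return "intronic"
    if any(g.strand != tx.strand for g in hits):
        return "antisense"      # overlaps a gene on the opposite strand
    return "sense-overlapping"  # assumption: label for same-strand overlaps


# Toy annotation: one gene with two exons and a single intron (1501-2999).
genes = [Gene("chr1", 1000, 5000, "+", ((1000, 1500), (3000, 5000)))]
for tx in (Transcript("chr1", 6000, 6500, "+"),
           Transcript("chr1", 2000, 2500, "-"),
           Transcript("chr1", 3500, 4500, "-")):
    print(tx, classify(tx, genes))
```

Testing the intronic case before the antisense case reflects the statement that intronic lncRNAs can be transcribed in any orientation; a different precedence would be equally defensible.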

Fig. 5.1

Classification of lncRNAs based on their relationship to protein-coding genes. Orange boxes correspond to the protein-coding genes and pink lines correspond to lncRNAs. Arrows indicate the direction of transcription. Each panel depicts a subtype of lncRNAs: intergenic or long intergenic noncoding RNAs (lincRNAs) (A), intronic RNAs (B), antisense RNAs (C), natural antisense transcripts (NATs) (D), promoter-proximal sense (E) and upstream antisense RNAs (F), eRNAs (G)

Information about TSS-proximal lncRNAs in plants remains scant. Recent analyses of nascent RNA from Arabidopsis seedlings, which combined global nuclear run-on sequencing (GRO-seq), 5′ GRO-seq, and RNA-seq, did not detect upstream antisense TSS-proximal ncRNAs [21]. These data suggest that divergent transcription may be lacking in Arabidopsis (and likely maize), in contrast to the situation in many other eukaryotes, and that eukaryotic promoters might not be inherently bidirectional. In Arabidopsis, the TSS-proximal lncRNAs observed in RNA exosome-deficient lines include the upstream noncoding transcripts (UNTs), which are transcribed as sense RNAs and are colinear with the 5′ end of the associated protein-coding gene, extending into the first intron. The UNTs resemble yeast CUTs and mammalian PROMPTs [1].

The exosome-sensitive enhancer RNAs (eRNAs) produced from enhancer regions make up a large proportion of non-polyadenylated lncRNAs in mammalian cells (Fig. 5.1G) [8]. By contrast, information about plant enhancers has only recently started to emerge. An analysis of chromatin signatures predicted over 10,000 plant intergenic enhancers [22], but confirming their roles as transcriptional enhancers in vivo will require follow-up experiments, and eRNAs have not yet been reported in plants.

5.2 Recent Advances in Studying Plant lncRNAs

Mammalian lncRNAs are by far the best studied. However, in recent years, the identification of plant lncRNAs has largely caught up with the mammalian field. The plant databases that hold information on lncRNAs are summarized in Table 5.1.

Table 5.1 List of plant lncRNA databases

An examination of >200 transcriptome data sets in Arabidopsis identified ~40,000 candidate lncRNAs; these included NATs (>30,000) and lincRNAs (>6000) [3, 4, 30]. Most of the lincRNAs did not produce smRNAs, and, like mammalian lncRNAs, lincRNA transcript levels were 30–60-fold lower than those of the associated mRNAs. Work in Arabidopsis found that NAT pairs, in which transcripts are produced from opposite strands, occur widely: ~70% of protein-coding loci in Arabidopsis produce candidate NAT pairs 200–12,370 nt long (average length of 731 nt) [4]. Some NAT pairs show complete overlap (~60%), but others have complementary segments only at their 5′ or 3′ ends.
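The distinction between completely and partially overlapping NAT pairs can be sketched in a few lines of Python. The function name, the interval representation, and the reading of "complete overlap" as one transcript lying entirely within the other are assumptions made for this illustration; the cited study may define the categories differently.

```python
from typing import Tuple

Interval = Tuple[int, int]  # (start, end), 1-based inclusive genomic coordinates


def nat_overlap(plus: Interval, minus: Interval) -> str:
    """Describe how a plus-strand and a minus-strand transcript overlap.

    On the plus strand the 5' end is at `start`; on the minus strand it is at `end`.
    """
    ps, pe = plus
    ms, me = minus
    if ps > me or ms > pe:
        return "none"
    # Assumption: "complete" means one transcript lies entirely within the other.
    if (ps <= ms and me <= pe) or (ms <= ps and pe <= me):
        return "complete"
    # The minus-strand transcript extends to the left: the two 5' ends overlap.
    if ms < ps:
        return "5prime"
    # Otherwise it extends to the right: the two 3' ends overlap.
    return "3prime"


# Hypothetical coordinates for illustration only.
print(nat_overlap((100, 500), (200, 400)))
print(nat_overlap((100, 500), (50, 150)))
print(nat_overlap((100, 500), (450, 600)))
```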

The expression levels of many lincRNAs differ significantly depending on the tissue and also change during stress; this indicates that lncRNAs undergo dynamic regulation and act in regulation of development and stress responses [30]. The expression levels of many NATs also are tissue-specific and change in response to biotic or abiotic stresses. For example, a recent study identified ~1400 NATs that respond to light; of the NAT pairs, about half respond in the same direction, and half respond in opposite directions. For the light-responsive NATs, the associated genes also showed peaks of histone acetylation; the acetylation levels changed with the changes in NAT expression in response to light [4].

Among the lncRNAs, Arabidopsis and rice have intermediate-sized ncRNAs (im-ncRNAs), which are 50–300 nt long [31, 32] and originate from 5′ UTRs, coding regions, and introns. The genes associated with 5′ UTR im-ncRNAs tended to have higher expression and H3K4me3 and H3K9ac histone marks, which are associated with transcriptional activation. Plants that have reduced levels of some im-ncRNAs showed developmental phenotypes or detectable molecular changes [31].

While we continue to gain a better understanding of the mechanisms of lncRNA action, our knowledge of the mechanisms that regulate lncRNAs in plants remains limited. Like all transcripts, lncRNAs are regulated at the level of transcription and at the levels of biogenesis, processing, and turnover. One of the players in this regulation is the exosome complex, which plays a major role in regulating the quantity, quality, and processing of various transcripts, including lncRNAs. The exosome complex is a conserved machinery with 3′–5′ exoribonuclease activity that consists of a nine-subunit core associated with its enzymatic subunits, Rrp44 and Rrp6. Depletion of the Arabidopsis exosome allowed the identification of a number of Arabidopsis ncRNAs as well as the genomic regions where the exosome is involved in their metabolism [1].

5.3 Molecular Functions of Plant lncRNAs

lncRNAs are present at low levels and show little sequence conservation compared with mRNAs; therefore, early studies questioned their importance and necessity and also suggested that lncRNAs might result from transcriptional noise. Indeed, considerable debate remains about the functionality of lncRNAs. However, evidence has emerged in recent years to indicate that many lncRNAs function in a large number of diverse molecular processes in eukaryotic cells; these include the regulation of yeast mating type [33, 34] and modulation of embryonic stem cell pluripotency and various diseases [35]. In plants, lncRNAs function in gene silencing, flowering time control, organogenesis in roots, photomorphogenesis in seedlings, abiotic stress responses, and reproduction [5, 11,12,13,14, 36,37,38,39,40].

In regulating genes, lncRNAs act at multiple levels and through simple or complex mechanisms. lncRNAs can act in cis or in trans, function by sequence complementarity to RNA or DNA, and be recognized via specific sequence motifs or secondary/tertiary structures (Fig. 5.2a). At the simplest level, lncRNAs can serve as precursors to smRNAs (Fig. 5.2b), as in the case of RNA Pol IV transcripts [6, 42,43,44,45,46]. Some lncRNAs keep regulatory proteins or microRNAs from interacting with their DNA or RNA targets by acting as decoys that mimic the targets (Fig. 5.2d). Plant examples include the Arabidopsis microRNA target mimic IPS1 lncRNA and the decoy ASCO-lncRNA [38, 47].

Fig. 5.2

Example lncRNAs and the mechanisms of their action. (a) Specific sequence motifs or secondary structures could be required for lncRNA function. (b) lncRNAs, specifically, the double-stranded transcripts, can serve as precursors to smRNAs in the RNA interference (RNAi) pathway. (c) lncRNAs can function as scaffolds for the recruitment of chromatin-modifying factors or as a platform for assembly of protein complexes. (d) lncRNAs can function as molecular sponges or decoys for smRNAs and also act as decoys to titrate away RNA-binding proteins. (e) The eRNAs, which are expressed from enhancers, are regulated by the exosome and can interact with other regions of DNA, such as enhancers or promoters, affecting the topology of the local DNA and thus altering gene expression. Adapted from [41]. (f) lncRNAs that interact with several chromatin-remodeling proteins and chromatin regions could affect higher-order nuclear structure

In animal systems, some lncRNAs directly affect Pol II and its associated transcriptional machinery by promoting the phosphorylation of transcription factors (TFs), thereby regulating their DNA-binding activity [48]. Many lncRNAs affect different processes related to transcription, including the initiation and elongation of transcripts, by affecting the pausing of RNA Pol II. Other lncRNAs act as scaffolds to recruit enzymes that remodel chromatin and thus alter chromatin structure and nuclear organization (Fig. 5.2c) (reviewed in [49]). Examples of plant lncRNAs that regulate transcription have started to emerge; for example, HID1 binds to the promoter of the PIF3 gene to downregulate its expression [39]. However, no plant lncRNAs have yet been implicated in the regulation of transcription elongation or Pol II pausing.

Different types of lncRNAs associate with chromatin and act as scaffolds that allow the assembly of complexes of chromatin-modifying enzymes. Recruitment of these proteins may or may not require small RNAs. For example, the RNA-directed DNA methylation (RdDM) pathway, which occurs specifically in plants, requires small RNAs [37]. Other lncRNAs can recruit complexes of enzymes that remodel chromatin but do not require smRNAs. The mechanism that provides targeting specificity for these lncRNAs remains to be discovered. Work in mammalian systems showed that lncRNAs can interact with proteins of the Trithorax group and activate transcription via trimethylation of histone H3K4 [50]. Other lncRNAs interact with proteins that modify histones with repressive marks, such as Polycomb Repressive Complex 2 (PRC2), to repress transcription via methylation of histone H3K27 [51]. The best-studied RNAi-independent pathway that relies on lncRNAs interacting with Polycomb is the epigenetic regulation, via histone modifications, of the expression of Arabidopsis FLOWERING LOCUS C (FLC).

Additional examples include enhancer RNAs (eRNAs), which have been shown to be involved in the regulation of transcription initiation. Enhancers are regulatory genomic regions that regulate transcription by targeting the promoters of protein-coding genes in a tissue-specific and developmental stage-specific manner and by modulating the spatial organization of the genome [52]. Work in mammalian systems has shown that exosome-sensitive eRNAs function in the activation of transcription, consistent with enhancer function. Some eRNAs act in cis to recruit complexes of coactivator proteins that form chromosome loops connecting the enhancer with its promoter, thus activating gene expression (Fig. 5.2e) [41, 53]. However, no eRNAs have been identified in plants yet. The exosome function of resolving R-loops, which are three-stranded structures composed of an RNA-DNA hybrid and the displaced single-stranded DNA, might also reduce genomic instability in the regions expressing eRNAs [41]. R-loops form during transcription and can persist in regions that are divergently transcribed [54]. These results suggest that the exosome modulates the interactions among the key elements that regulate gene expression and the organization of the nucleus.

Examples of well-studied plant lncRNAs with established functions and mechanisms of action are listed in Table 5.2.

Table 5.2 List of plant lncRNAs

5.4 Plant lncRNAs Functioning as Molecular Sponges and Decoys

Work in Arabidopsis identified lncRNAs that compete with the targets of microRNAs (miRNAs) by mimicking them; a similar function was also identified in animal systems. For example, the IPS1 lncRNA plays a role in regulating phosphate balance and uptake by competing with the PHO2 mRNA for binding to miR399. PHO2 negatively regulates phosphate transporters and is itself downregulated by miR399-directed cleavage of its mRNA; IPS1 serves as a target mimic that cannot be cleaved by miR399 because of a mismatch but can titrate miR399 away from PHO2 [47]. Bioinformatics approaches have also predicted many additional miRNA target mimics in Arabidopsis, but the functions of many of these remain to be deciphered [63].
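The logic of a target mimic can be sketched computationally. The example below rests on several assumptions that are not taken from the cited studies: it places the disruptive mismatch opposite miRNA positions 10 and 11 (the positions flanking the canonical Argonaute cleavage site), it models the mismatch as ungapped substitutions rather than a bulge, it ignores G:U wobble pairing, and it uses an arbitrary tolerance of three mismatches. The miRNA sequence is a made-up 21-nt string, not miR399.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def pairing(mirna: str, site: str) -> list:
    """For each miRNA position (5'->3'), report whether it base-pairs with the site.

    Both sequences are written 5'->3' and must have equal length (ungapped model).
    """
    if len(mirna) != len(site):
        raise ValueError("miRNA and site must have the same length")
    return [COMPLEMENT[m] == s for m, s in zip(mirna, reversed(site))]


def classify_site(mirna: str, site: str, max_mismatches: int = 3) -> str:
    """Label a site as a cleavable target, a target mimic, or not a target."""
    paired = pairing(mirna, site)
    if paired.count(False) > max_mismatches:   # arbitrary illustrative cutoff
        return "not a target"
    # Assumption: cleavage requires pairing at miRNA positions 10 and 11 (1-based).
    if paired[9] and paired[10]:
        return "cleavable target"
    return "target mimic"


mirna = "AUGCAUGCAUGCAUGCAUGCA"          # hypothetical 21-nt miRNA
perfect_site = reverse_complement(mirna)
# Disrupt pairing opposite miRNA positions 10 and 11 (site indices 11 and 10).
mimic_site = perfect_site[:10] + mirna[10] + mirna[9] + perfect_site[12:]

print(classify_site(mirna, perfect_site))
print(classify_site(mirna, mimic_site))
```

A site that still binds the miRNA but cannot be cleaved sequesters it, which is the behavior the text attributes to IPS1; predicting real mimics would require alignment with gaps and thermodynamic scoring.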

The Arabidopsis ASCO-lncRNA functions as a decoy and regulates plant root development. ASCO-lncRNA competes with the binding of nuclear speckle RNA-binding proteins (NSRs), regulators of alternative splicing, to their targets; hijacking the NSRs changes the splicing patterns of NSR-regulated mRNA targets, resulting in the production of alternative splice isoforms and leading to a switch of developmental fates in plant roots (Fig. 5.3) [38].

Fig. 5.3

Plant lncRNAs can affect the expression of proteins that regulate alternative splicing. The ASCO-lncRNA functions as a decoy that competes with mRNAs for binding to NSR splicing regulators

5.5 Plant lncRNAs Functioning in Regulation of Transcription and Silencing

5.5.1 Regulation of PIF3 Transcription by HID1 im-ncRNA

One interesting Arabidopsis lncRNA, HIDDEN TREASURE 1 (HID1), which was classified in the original study as an im-ncRNA with a length of 236 nt, is involved in regulating the transcription of PIF3, a member of the “phytochrome-interacting factors” (PIFs), a family of basic helix-loop-helix (bHLH) transcription factors [39]. HID1 is evolutionarily conserved in land plants and functions in trans as a component of an RNA-protein complex. It interacts with the promoter region of PIF3 and suppresses PIF3 transcription. The HID1 im-ncRNA is among the rare examples of lncRNAs whose function has been shown to require their secondary structure. The secondary structure of HID1 is substantially conserved between Arabidopsis and rice, and expression of OsHID1 could complement the Arabidopsis hid1 mutant phenotype, indicating its importance in the regulation of photomorphogenesis in seedlings.

5.5.2 Role of lncRNAs in RdDM

In plants, lncRNAs also function in epigenetic silencing, acting via RNA-directed DNA methylation (RdDM) (Fig. 5.4). RdDM in plants has similar mechanisms to gene silencing mediated by siRNAs in S. pombe [64,65,66,67]. RdDM primarily functions to repress transcription of transposons and repetitive sequences and requires RNA Pol IV and Pol V, two plant-specific RNA polymerases [6], and perhaps some involvement of RNA Pol II [68]. RNA Pol IV produces ncRNAs that serve as templates for 24 nt siRNAs, and RNA Pol V transcribes lncRNAs, which act as scaffolds that the AGO-siRNA complex recognizes through sequence complementarity (reviewed in [37]). In Arabidopsis, most siRNAs are generated by Pol IV; however, Pol V and Pol II can also make siRNA templates, suggesting additional complexity in siRNA biogenesis [69,70,71,72].

Fig. 5.4

LncRNAs participating in the RdDM pathway. Transcripts produced by Pol IV are precursors for 24 nt siRNA; transcripts produced by Pol V are scaffolds and siRNA targets. SHH1 reads the H3K9me status of chromatin and recruits Pol IV; then the chromatin-remodeling protein CLSY1 assists in the passage of Pol IV [73]. Pol IV transcripts are transcribed by RDR2 into double-stranded RNAs (dsRNAs) before they are processed by DCL3 into 24 nt siRNAs and stabilized by methylation at the 3′ end by HEN1. These siRNAs associate with AGO and return to the nucleus as a part of the AGO-siRNA complex, which targets nascent Pol V scaffold transcripts. Pol V is recruited by SUVH2 or SUVH9 to its target genomic loci marked by DNA methylation [74], and Pol V transcription is facilitated by the DDR complex [75]. The IDN2-IDP complex binds to Pol V scaffold RNAs and interacts with the SWI/SNF complex, which adjusts the position of nucleosomes [76]. The AGO4-siRNA complex interacts with Pol V; in this interaction, the siRNA in the complex base pairs with the transcript produced by Pol V to target a chromatin-modifying complex that catalyzes de novo methylation at the genomic loci. Then, the silencing mediated by DNA methylation is further amplified by methylation of histone H3K9 by KYP, SUVH5, and SUVH6 (reviewed in [37]). The silencing of solo LTRs requires the exosome, which does not act via siRNAs and DNA methylation. Rather, the exosome interacts with transcripts from a nearby scaffold-producing region and acts in silencing the solo LTR by altering chromatin structure via H3K9 histone methylation, suggesting this may function in parallel with RdDM

Identification of the Pol IV- and particularly Pol V-produced lncRNAs has remained challenging until recently [57,58,59,60]. One of the recent genome-wide studies identified Pol IV/RDR2-dependent transcripts (P4RNAs) from thousands of Arabidopsis loci. Interestingly, these P4RNAs are transcribed mainly from intergenic regions; 65% of the P4RNAs overlapped with transposable elements or repeats, and 9% of the RNAs overlapped with genes [57]. The Pol IV/RDR2-dependent transcripts are non-polyadenylated and produced from the sense and antisense DNA strands. Surprisingly, instead of a 5′ triphosphate, the P4RNAs have a monophosphate [57].

Until very recently, Pol V transcripts eluded detection on a genome-wide scale because their very low accumulation made them difficult to detect using RNA-seq. Based on the analysis of individual transcripts, Pol V lncRNAs are non-polyadenylated and either triphosphorylated or capped at the 5′ ends [6]. A recent genome-wide study using RIP-seq identified 4502 individual Pol V-associated transcripts [60]. Previously annotated Pol V-transcribed regions have an average length of 689 nt. Surprisingly, the experimentally identified Pol V lncRNAs are shorter than these annotations, with median sizes ranging from 196 to 205 nt, yet together they span the entire annotated regions. These data suggested that Pol V might not transcribe the entire regions continuously but may instead initiate from internal promoters situated within the annotated regions.

Unlike RNA polymerases I, II, and III, whose promoters are identified by specific sequence elements, Pol V-transcribed regions were not found to contain specific DNA sequence elements. Instead, internal repressive chromatin modifications appeared to control Pol V transcription and contribute to initiation from internal promoters. Interestingly, Pol V produces lncRNAs bidirectionally within annotated Pol V-transcribed regions, with no overall strand preference. However, although Pol V transcribes both strands of DNA, a subset of Pol V transcripts on transposons was enriched on one strand, suggesting that a limited strand preference of Pol V at these loci may help determine the boundaries of heterochromatin on transposons.

Previous genome-wide studies using ChIP-seq identified Pol V-associated genomic regions and found Pol V may also function in pathways other than the RdDM pathway [6, 75,76,77,78,79,80]. About 75% of Pol V-occupied genomic sites are transposons and repetitive sequences that also have 24 nt siRNAs and high levels of DNA methylation, indicating that Pol V induces RdDM at these sites. The other 25% of Pol V-associated sites include many protein-coding genes that have lower methylation levels and do not associate with siRNAs. This indicates that Pol V may also function in other silencing pathways [77]. Pol II also can produce scaffold transcripts that recruit siRNAs bound by AGO [68]. However, it remains unclear how Pol II targets specific intergenic loci and how Pol II interacts with Pol IV and Pol V.

Interestingly, the exosome also appears to play some role in silencing of these regions. A genome-wide study that identified exosome targets found many polyadenylated substrates of the exosome complex that corresponded to ncRNAs from centromeric regions, repetitive sequences, and other siRNA-producing loci that undergo RdDM-mediated silencing [1]. However, when we explored the connection between the two silencing pathways, RdDM and the exosome, in Arabidopsis, we found that mutations in the core exosome subunits have only a small effect on smRNAs [81]. This differs from results in fission yeast, where the exosome prevents RNAs from spuriously entering smRNA pathways [65]. Instead, less H3K9me2 was observed at several loci controlled by RdDM in exosome-deficient lines. The exosome interacts genetically with RNA Pol V and physically associates with polyadenylated Pol II transcripts from the regions generating Pol V scaffold RNAs [81]. These observations indicate that the exosome functions in lncRNA metabolism or processing in scaffold-generating regions. The exosome may also mediate the interactions among Pol II, Pol V, and Pol IV, modulating transcriptional repression. One outstanding question is whether and how the exosome (possibly acting through lncRNAs) contributes to the silencing of loci by fine-tuning histone modifications and whether the same mechanism of action operates genome-wide.

However, Arabidopsis exosome subunits have diverse functions [1]. The additional enzymatic subunit, AtRRP6L1, functions independently of the exosome core [10]. Mutations in AtRRP6L1 affect siRNA metabolism and DNA methylation [82]. Therefore, the exosome and its additional enzymatic subunits play an important role in the regulation of ncRNAs, including siRNAs, in the RdDM pathway.

5.6 lncRNAs in the Regulation of Flowering

Because of the importance of flowering time regulation for plant adaptation to different latitudes, the lncRNAs that regulate flowering are among the best-studied functional plant lncRNAs. Work in Arabidopsis has shown that these lncRNAs regulate the initiation of flowering by modulating the expression of FLOWERING LOCUS C (FLC), which encodes a MADS-box transcription factor. FLC represses downstream genes required for flowering and thus negatively regulates flowering, acting in a dose-dependent manner. FLC functions in the vernalization pathway, which modulates flowering time in response to prolonged low temperature, and in the autonomous pathway, which modulates flowering time independently of environmental factors [83].

The regulation of flowering time involves epigenetic silencing of FLC, mainly via modification of histones. Repression of FLC requires PRC2, which is recruited to FLC and methylates histone H3K27. Alteration of chromatin, particularly changes in histone modifications that remove H3K4me3, H3K36me3, and H2Bub1 and replace those modifications with H3K27me3, epigenetically represses FLC expression (reviewed in [36]).

The lncRNAs COLDAIR, COLDWRAP, and COOLAIR are transcribed from FLC and function in FLC epigenetic silencing (Fig. 5.5) [9, 11, 84]. Vernalization induces transient transcription of COLDAIR, a 5′ capped, non-polyadenylated lncRNA, transcribed from FLC intron 1, in the same direction as FLC (Fig. 5.5). CURLY LEAF (CLF), a homolog of mammalian EZH2 (an enzymatic component of PRC2), binds to COLDAIR, and knockdown of COLDAIR decreases CLF and H3K27me3 enrichment at FLC in response to cold. This thus hampers the repression of FLC during vernalization and indicates that COLDAIR’s repression of FLC is essential for the vernalization response [9]. Previous work suggested that PRC2 recruitment to FLC requires COLDAIR for the initiation of epigenetic silencing, analogous to the functions of the mammalian lncRNAs HOTAIR and Xist [51]. However, mammalian PRC2 shows high-affinity binding to unrelated RNAs; therefore, other factors, in addition to lncRNAs, may provide the specificity that targets PRC2 to FLC [85].

Fig. 5.5

Regulatory lncRNAs produced from the FLC locus. Diagram of the FLC locus [84]. The FLC transcriptional start site is indicated by a black arrow, and the vertical bars indicate exons in the FLC sense transcript. During vernalization, the COLDAIR lncRNA (pink) is transcribed in the sense direction, starting in the first intron of FLC. Another sense lncRNA, COLDWRAP, is transcribed from the repressed promoter of FLC (green). The COOLAIR (blue) and ASL (red) lncRNA transcripts are transcribed from the indicated start sites (purple arrow) in the antisense direction; both result from alternative polyadenylation at a poly(A) site either in the sense promoter region or in intron 6. The ASL lncRNA also undergoes alternative splicing. Blue boxes indicate the exon of COOLAIR; red boxes indicate the exons of AS I and II; dotted lines indicate the spliced regions. ASL covers FLC intron 1. Yellow dotted lines indicate the R-loops in the COOLAIR promoter region, which repress COOLAIR transcription

An additional Polycomb-interacting lncRNA, cold of winter-induced noncoding RNA from the promoter (COLDWRAP), is expressed from the upstream promoter region of the FLC locus and functions in the repression of FLC (Fig. 5.5) [11]. COLDWRAP is a 316 nt lncRNA that is transcribed in the sense direction, with its transcription start located 225 nt upstream of the start of the FLC mRNA. COLDAIR and COLDWRAP both have 5′ caps, but most transcripts of COLDWRAP appear to be non-polyadenylated. Interestingly, association of the Polycomb complex with COLDWRAP appears to be specific, as native CLF binds significantly to the sense strand of COLDWRAP but only weakly to the antisense strand. In addition, the 5′ half of COLDWRAP and several stable secondary structures identified in this region are needed for RNA-protein interactions. Importantly, COLDWRAP, working cooperatively with COLDAIR, is necessary for vernalization-mediated FLC silencing. COLDWRAP functions to retain Polycomb at the FLC promoter through the formation of a repressive intragenic chromatin loop, which establishes a stable repressive chromatin structure.

COOLAIR is a set of lncRNAs transcribed from the 3′ end of FLC in the antisense direction; these transcripts are alternatively spliced and polyadenylated, giving the proximal AS I and distal AS II forms [55]. In response to cold, the locus first produces COOLAIR, then COLDAIR, before H3K27me3 accumulates; therefore, initial studies indicated that COOLAIR may act early in vernalization [55]. However, knockdown of COOLAIR did not affect the vernalization response [86]. Rather, COOLAIR increases the rate of FLC transcriptional repression during vernalization, and its function does not require PRC2 or H3K27me3 [36, 87]. The COOLAIR knockdown desynchronized the change from H3K36me3 to H3K27me3 at FLC; therefore, this switch at FLC may require COOLAIR or transcription in the antisense direction [87].

COOLAIR represses FLC in the vernalization and autonomous pathways. In the autonomous pathway, COOLAIR 3′ end processing affects the FLC chromatin [84]. The autonomous pathway factors FCA, FY, and FPA, along with the polyadenylation cleavage factors CstF64 and CstF77, and the spliceosome component PRP8, favor the production of AS I by increasing usage of the proximal COOLAIR polyadenylation site [84, 88, 89]. This increases levels of the FLOWERING LOCUS D (FLD) histone demethylase at FLC leading to H3K4me2 demethylation of FLC [90].

Unraveling the functional importance of COOLAIR transcription and the functions of COOLAIR transcripts remains challenging. Because it is difficult to determine whether COOLAIR transcription, the COOLAIR transcripts, or both are functionally important, the secondary structure of COOLAIR RNA was recently determined experimentally [91]. Despite the relatively low sequence identity between Arabidopsis and evolutionarily divergent Brassicaceae species, the structures showed remarkable evolutionary conservation. This conservation extended to multi-helix junctions and was maintained through covariation of a non-contiguous DNA sequence. The observed conservation of COOLAIR lncRNA structure in the Brassicaceae indicates that the COOLAIR lncRNA itself is very likely to function in the regulation of FLC, although the process of antisense transcription from FLC may also affect FLC regulation.

Recent work also discovered the Antisense Long (ASL) transcript in early-flowering Arabidopsis ecotypes that do not require vernalization for flowering [10]. In contrast to the other lncRNAs transcribed from FLC, ASL does not get polyadenylated, although it is alternatively spliced. The ASL transcript is >2000 nucleotides long and is transcribed from the antisense strand, starting at the same promoter as COOLAIR. The 5′ regions of COOLAIR and ASL overlap, but ASL spans intron 1 (important for maintenance of FLC silencing) and includes the COLDAIR region, which is transcribed in the sense direction. The ASL transcript physically associates with the FLC locus and H3K27me3 [10], suggesting that ASL and COOLAIR play different roles in FLC silencing and perhaps in the maintenance of H3K27me3.

Interestingly, the exosome is again involved in the regulation of the antisense transcripts, and in a surprising way. Two exosome-associated RRP6-Like (RRP6L) proteins are involved in lncRNA-mediated regulation of flowering. RRP6, one of the catalytic subunits, has both core-complex-dependent and core-complex-independent functions [92, 93]. In Arabidopsis, RRP6L1 and RRP6L2 regulate COOLAIR and ASL expression or processing in an exosome core-complex-independent way [10]. Mutations of RRP6L also derepress FLC; this delays flowering. The AS I and II downregulation observed in RRP6L multiple mutants resembled the patterns that occur in mutants of CstF64 and CstF77, which are 3′ end processing factors [10, 84], indicating that COOLAIR 3′ end processing may require RRP6Ls.

Very surprisingly, emerging work indicates that RRP6Ls have a major role in regulating the synthesis or biogenesis of ASL, as RRP6L mutants lack (or have minuscule amounts of) the ASL transcript. This finding is unexpected because RRP6 functions as a 3′–5′ exoribonuclease and RRP6 mutants generally fail to degrade or process certain RNAs; thus, these mutants usually overaccumulate certain RNAs. However, recent work found that the abundance of many yeast mRNAs also decreased in the rrp6Δ mutants [19]. Similarly, in humans, inactivation of the RRP6 homolog also causes a dramatic decrease in Xist levels [94].

Another function of RRP6Ls involves affecting the epigenetic marks at FLC; mutants of RRP6L have decreased H3K27me3 levels and decreased density of nucleosomes at FLC. These mutants therefore show increased expression of FLC and delayed flowering. RRP6L1 physically interacts with the ASL RNA and with chromatin at FLC; this indicates that RRP6Ls may regulate ASL to maintain H3K27me3 levels at FLC. Therefore, RRP6Ls regulate FLC lncRNAs, and their regulation of various antisense RNAs may affect FLC silencing [10].

R-loops that form over the COOLAIR promoter region affect COOLAIR transcription, although the effects of R-loop formation on FLC expression are not fully clear [95]. Failure of transcription termination can often produce R-loops [96], which can recruit the exosome co-transcriptionally through the noncanonical pathway for 3′ end processing [19]. Work in mammals showed that RRP6 can resolve deleterious R-loops [41]; thus, plant RRP6Ls may affect both the processing and expression of antisense transcripts from FLC in a similar manner.

In mammalian systems, lncRNAs have key roles in molding the three-dimensional organization of the nucleus (Fig. 5.2f) [97,98,99]. In plants, emerging research is beginning to reveal the role of lncRNAs in the architecture of the nucleus, and some RNA studies also indicate that lncRNAs may have similar roles in 3-D nuclear architecture in plants and animals. Several studies have also addressed genome organization in Arabidopsis using the Hi-C approach [100,101,102,103,104]. The RdDM pathway likely also affects the higher-order structure of chromatin by acting with MORC proteins. In Arabidopsis, MORC6 may have ATPase activity and interact with the DDR complex component DMS3; the action of this complex may be analogous to that of mammalian cohesin-like proteins that function in inactivation of the X-chromosome in mice. Consistent with this, MORC1 and MORC6 mutant plants have de-condensed pericentromeric heterochromatin [105]. The promoter and 3′ terminator of FLC form gene loops [106, 107], and COLDAIR and COLDWRAP lncRNAs participate in this process [11]. FLC alleles also undergo long-distance interactions, clustering during vernalization-mediated epigenetic silencing. This interaction requires VRN5 and VERNALIZATION 2, two PRC2 trans-acting factors [108]. However, we lack information on how lncRNAs function in long-distance interactions of the chromatin at FLC. As illustrated by FLC, plant lncRNAs carry out diverse and important functions. Our understanding of lncRNA functions continues to grow as new studies uncover the mechanisms controlling lncRNA transcription and processing.

5.7 Concluding Remarks

The recent discovery that genomes undergo pervasive transcription opened many questions on the functions of these RNAs. Since then, studies in the various kingdoms of eukaryotes have broadened our understanding of the biogenesis and functions of various lncRNAs. However, although various studies have identified and classified many categories of lncRNAs, the functions of lncRNAs, and how they carry out these functions, remain to be discovered. Work in plants identifying lncRNAs systematically has caught up with work in other systems. Plant studies have also discovered lncRNA functions in controlling flowering time and RdDM-mediated silencing of genes. However, many other lncRNAs remain to be examined. The regulation of plant lncRNA synthesis and biogenesis also will require further work to elucidate. Understanding the mechanisms that control plant lncRNA expression and biogenesis will require integration of bioinformatics, genetic, and biochemical data to provide a complete understanding of lncRNA function and biology. A complete understanding of the various facets of plant lncRNAs will reciprocally advance our understanding of lncRNAs in other species.