Significance Statement

As the role of epigenetics in the control of gene expression is being progressively unveiled, it is becoming apparent that epigenetic modifiers are underexploited breeding tools. This article reviews the epigenetic modifications in plants and their key molecular players, and evaluates various technologies for programmable epigenomic editing, including recent applications for specific and tunable regulation of gene expression. The review provides an overview of the latest developments in plant epigenomic editing and discusses the potential of these tools for their application in crop breeding.

Amit Dhingra, Department of Horticulture, Washington State University, Pullman, WA USA

Introduction

The phenotypic traits found in eukaryotic organisms are not only due to the polymorphisms present in their genomes. Traditionally, it was thought that only DNA sequence variants could be inherited in subsequent generations, and therefore that phenotypic variation was only due to such transmissible DNA polymorphisms. However, this paradigm was broken with the arrival of epigenetics. The modification of the chromatin structure plays a fundamental role in the functionality of the genome and the regulation of gene expression. These chromatin modifications are carried out by covalent epigenetic marks on DNA or associated histones, which are also responsible for maintaining the shape and structure of the nucleosome. It has also been shown in different studies that these epigenetic marks can be heritable (Wolffe and Matzke 1999; Henderson and Jacobsen 2007) or can vary at different stages of development or through the interaction of the organism with the environment. The study of epigenetics not only offers a vision of how modifying the structure of DNA can affect plant gene expression, but also opens the door to new gene editing technologies to generate and select phenotypes of interest in an efficient and specific way. In this review, we present how new editing technologies, in particular CRISPR systems, can be combined with epigenetic effectors and transcription regulatory domains to offer programmable systems for epigenetic editing and transcriptional regulation in plant species of agronomic interest.

The agronomic value of the study of the epigenetics in plants

Epigenetic regulatory mechanisms operate during development and also as an adaptation to the environment. The molecular basis of these adaptive systems is especially attractive from an agronomic point of view, since finding the epigenetic mechanisms that influence important agronomic traits such as yield, biomass, flowering time, resistance to abiotic stresses or fruit ripening will allow the improvement of crops in a specific and efficient way.

Heterosis or “hybrid vigour” is the improved or increased function in a biological quality occurring in a hybrid offspring. Heterosis attracts considerable attention due to its obvious agronomic value. It has been classically explained by different mechanisms such as dominance, over-dominance, and epistasis (Shull 1908; Bruce 1910; Jones 1917). However, recent studies suggest the involvement of epigenetic components that result in differential expression of genes between parents and their progeny (Groszmann et al. 2011; Barber et al. 2012; Chodavarapu et al. 2012; Zhang et al. 2016). Some authors suggest a direct relationship between the epigenetic and the phenotypic variation found in several plant species (Jin et al. 2008; Shen et al. 2012). An example of heterosis associated with dynamic epigenetic marks was described by Ni et al. (2009). They showed that the circadian clock genes CCA1 (CIRCADIAN CLOCK ASSOCIATED 1) and LHY (LATE ELONGATED HYPOCOTYL) are transcriptionally repressed through histone modifications. The presence of these marks in Arabidopsis hybrids was associated with an increase in biomass. These studies were also extended to rice (He et al. 2010), where the parental contribution of epigenetic states in the progeny was analysed and shown to affect differential gene expression in hybrids. These findings can give clues on how to improve crop yields by engineering epigenetic marks.

Flowering, one of the best studied developmental events in plant physiology, is also partially controlled by epigenetics. Overlapping environmental stimuli, such as temperature or photoperiod, regulate the timing of flowering. An example is vernalization, which causes early flowering through exposure to low temperatures (Luo and He 2020). This process has allowed plant species to adapt their reproductive cycles to the most favourable annual season and also allows the control of flowering in certain species through temperature treatment. Two of the main genes involved in vernalization in Arabidopsis, FRIGIDA (FRI) and FLOWERING LOCUS C (FLC) (Whittaker and Dean 2017; Li et al. 2018c; Zhang and Jiménez-Gómez 2020), are strong flowering repressors. FLC encodes a MADS-box domain protein that represses flowering, and FRI is required for FLC transcriptional activation. It has been shown that low temperature regulates the expression of the flowering repression genes through a complex epigenetic mechanism of histone and DNA methylation that silences the FLC locus (Sung et al. 2006; Wood et al. 2006; Shafiq et al. 2014). This mechanism is mediated by the action of Polycomb proteins (Amasino 2010; Kim and Sung 2013). The understanding of the vernalization pathway has allowed the transfer of this process to the field, obtaining early flowering in wheat through treatment with the cytosine methyltransferase inhibitor 5-azacytidine (Brock and Davidson 1994; Horváth et al. 2003). Also, one of the first methyltransferase mutants studied in Arabidopsis showed early flowering, and this phenotype was inherited by the progeny (Finnegan et al. 1998). However, these approaches may not be suitable for field application since they lack specificity and generate pleiotropic problems due to the alteration of global methylation patterns (Ronemus et al. 1996). Furthermore, the general derepression of transposable elements might be harmful for the stability of the plant genome (Underwood et al. 2017). A more specific approach is possible when only the histone epigenetic marks related to the FLC and FT genes are selectively edited (Jeong et al. 2015; Han et al. 2019).

We are starting to understand how epigenetics mediates both immediate and long-term stress responses. Plants can “memorize” a stress situation and transmit the active defensive state to the progeny (Wibowo et al. 2016; Furci et al. 2019; Cong et al. 2019). For instance, it has been found that increased salt concentrations induce differential patterns of DNA methylation in rice (Wang et al. 2011; Karan et al. 2012), and that the expression of stress responsive genes involves phosphorylation of histone H3 and acetylation of histone H4 in Arabidopsis and tobacco (Sokol et al. 2007). Cold stress is known to induce DNA demethylation in different plant species (Shan et al. 2013; Song et al. 2015a), and also upregulates histone deacetylases leading to H3 and H4 histone deacetylation and the subsequent activation of responsive genes such as COR (cold-regulated) (Zhu et al. 2008; Hu et al. 2011).

Finally, fruit ripening has also been shown to be partially controlled by dynamic epigenetic mechanisms. In tomato, the DNA demethylase DEMETER-LIKE 2 (SlDML2) regulates genes involved in fruit ripening (Lang et al. 2017). Spontaneous mutations that induce hypermethylation in the promoter region of the Colorless Non-Ripening (CNR) gene result in a non-ripening phenotype (Manning et al. 2006). In apple, DNA methylation at the MdMYB10 gene modulates its expression, influencing the pigmentation of the fruit (El-Sharkawy et al. 2015). Moreover, it has been found that the global DNA methylation levels in citrus increase during ripening (Huang et al. 2019). Understanding the mechanisms that regulate fruit ripening in different plant species of agricultural interest has made it possible to obtain late-maturing plant varieties (Liu et al. 2016a) or varieties with greater fruit quality (Liu et al. 2015; Osorio et al. 2020).

The influence of epigenetic marks on gene expression

The status of chromatin influences the accessibility of the transcriptional machinery and therefore, gene expression. The chromatin status is strongly regulated by epigenetic marks that respond to environmental stimuli or specific physiological conditions. Epigenetic marks include DNA methylation and histone modification and numerous studies have identified their link with transcriptional activation or repression of genes (Fig. 1).

Fig. 1

Schematic diagram of epigenetic regulation and the epigenetic marks associated with gene expression and gene repression. The epigenetic effectors and marks that modulate the “relaxed” or “condensed” state of chromatin are represented. The condensed representation of the chromatin includes DNA methylation, generated by the methyltransferase MET; histone mono- and dimethylation of H3K9, carried out by histone methyltransferases (HMTs); and histone deacetylation, carried out by histone deacetylases (HDACs). The transition from the condensed to a more relaxed state is carried out by JMJ histone demethylases, which remove the trimethylation of H3K27, and by ROS1, a DNA demethylase. The relaxed state of the chromatin includes histone acetylation, carried out by histone acetyltransferases (HATs), and histone trimethylation of H3K4. A representation of gene transcription is included in the relaxed section of the chromatin. Created with BioRender.com

DNA methylation

DNA methylation refers, classically, to the addition of a methyl group to the 5th carbon of the cytosines to form 5-methylcytosine (5mC), although it has been recently found in mammals that there are some additional DNA modifications based on methylation such as 3-methylcytosine (Sadakierska-Chudy et al. 2015). In plants, 5mC DNA methylation is found mainly in symmetrical CG and CHG sequences, but also to some extent in non-symmetrical CHH sequence contexts (Kumar et al. 2018), where H is A, C or T. DNA methylation plays a prominent role in gene silencing and it is mainly found in heterochromatic regions, such as pericentromeric and telomeric regions (Zhang et al. 2018b). Moreover, DNA methylation found around gene regulatory regions, usually comprising promoters and terminators, is linked to gene repression. On the other hand, demethylation of promoter regions is related to active gene expression (Zilberman et al. 2007). Although this is the general rule, it has been recently found that DNA methylation can promote gene expression through the SU(VAR)3‐9 DNA methyl-readers SUVH1 and SUVH3 (Harris et al. 2018). DNA methylation status is controlled by DNA methylases and demethylases, which can also interact with other modulators in response to different stimuli to control gene expression.

Depending on the cytosine context (CG, CHG or CHH), methylation is mediated by different types of methyltransferase. CG methylation, the most abundant in plant genomes, is carried out mostly by METHYLTRANSFERASE 1 (MET1), an orthologue of the mammalian DNA (cytosine-5)-methyltransferase 1 (DNMT1) (Cokus et al. 2008). MET1 maintains the methylation status by acting on hemi-methylated CG dinucleotides during DNA replication (Aufsatz et al. 2004). CHG methylation is maintained by CHROMOMETHYLASE 3 (CMT3) (Bartels et al. 2018), with a marked preference for CWG (W = A or T) cytosines (Gouil and Baulcombe 2016). Finally, CHH methylation is maintained by DOMAINS REARRANGED METHYLTRANSFERASE 2 (DRM2) via the RNA-directed DNA methylation (RdDM) pathway (Matzke and Mosher 2014) and by CHROMOMETHYLASE 2 (CMT2), which is mostly active in pericentromeric heterochromatin and long transposable elements (Bewick et al. 2017; Harris and Zemach 2020).
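The three sequence contexts can be made concrete with a short sketch. The Python code below is only an illustration, not part of any published pipeline: it classifies each cytosine on one strand of a sequence as CG, CHG or CHH (H = A, C or T) and pairs each context with the maintenance enzymes named in the paragraph above. The function names and the example sequence are hypothetical, and a real analysis would also scan the complementary strand.

```python
# Maintenance enzymes per context in Arabidopsis, as listed in the text.
MAINTENANCE_ENZYMES = {
    "CG": ["MET1"],
    "CHG": ["CMT3"],
    "CHH": ["DRM2", "CMT2"],
}


def cytosine_context(seq, i):
    """Return 'CG', 'CHG', 'CHH' or None for the cytosine at index i (5'->3')."""
    seq = seq.upper()
    if seq[i] != "C":
        raise ValueError("position %d is not a cytosine" % i)
    downstream = seq[i + 1:i + 3]
    if len(downstream) >= 1 and downstream[0] == "G":
        return "CG"
    # Two unambiguous downstream bases are needed to tell CHG from CHH.
    if len(downstream) < 2 or any(base not in "ACGT" for base in downstream):
        return None
    if downstream[1] == "G":
        return "CHG"
    return "CHH"


def list_contexts(seq):
    """Return (index, context) for every cytosine on the given strand."""
    return [(i, cytosine_context(seq, i))
            for i, base in enumerate(seq.upper()) if base == "C"]


if __name__ == "__main__":
    for index, context in list_contexts("ACGTCAGCTTC"):
        enzymes = MAINTENANCE_ENZYMES.get(context, [])
        print(index, context, ", ".join(enzymes))
```

A cytosine too close to the end of the sequence is reported as `None` rather than guessed, because its context cannot be determined from the available bases.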

Methylation is a reversible process and can be actively reversed by specific demethylases. In plants, active demethylation is mediated mostly by DNA glycosylases through a base-excision-repair pathway, although other mechanisms, such as passive demethylation or blocking of the de novo methylation pathway, have been reported in early stages of development and imprinting (Penterman et al. 2007). In Arabidopsis, a group of glycosylases has been found to be involved in active demethylation, namely DEMETER (DME) and REPRESSOR OF SILENCING 1 (ROS1) (Choi et al. 2002; Morales-Ruiz et al. 2006). The DEMETER-LIKE 2 (DML2) and DEMETER-LIKE 3 (DML3) proteins also play an active role in demethylation processes (Ortega-Galisteo et al. 2008).

Histone modifications

Covalent modifications in histones can alter chromatin structure and change the accessibility of the transcriptional machinery, causing changes in gene expression. The structure of the histones allows a greater variety of modifications than DNA, and therefore creates a “histone code” that is usually linked to active or repressive states of chromatin. Modifications are usually located at the N-terminal tails of histones and, in some cases, at nucleosome core regions (Zhao et al. 2019b). The most studied histone modifications involve methylation, acetylation, ubiquitination, and phosphorylation. For the purpose of this review we will focus on the description of methylation and acetylation, which are the epigenetic marks whose editing has been attempted so far.

Histone methylation is one of the most frequently observed processes in epigenetic regulation, and the proteins involved in methylation and demethylation have also been identified in plants. Histone methylation takes place at lysine and arginine residues. Lysines can present mono-, di- or trimethylation, usually located at lysines 4, 9, 27 and 36 of histone H3 (Zhang and Reinberg 2001; Cheng et al. 2020). Arginine residues present only mono- and dimethylation, at arginines 2, 8, 7, 17 and 26 in the N-terminal tail of histone H3 and at arginine 3 of histone H4 (Bedford and Richard 2005). It has been shown that lysine and arginine methylation are involved in transcriptional regulation, RNA processing, nuclear transport, DNA-damage repair and signal transduction (Liu et al. 2010). Also, the number of methyl groups has different biological significance. For example, trimethylation of H3K4 and H3K36 is present in transcriptionally active genes, whereas trimethylation of H3K27 and mono- and dimethylation of H3K9 are associated with transcriptional repression (Tariq et al. 2003; Wang et al. 2016).
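The associations between methylation marks and transcriptional state given in the paragraph above can be written down as a small lookup. The following Python sketch is illustrative only: the mark notation (for example H3K27me3), the function names and the "not covered" label are assumptions made for this example, and the table contains only the marks named in the text. It also encodes the rule that arginine residues carry only mono- or dimethylation.

```python
import re

# Histone, residue type (K = lysine, R = arginine), position, methylation degree.
MARK_PATTERN = re.compile(r"^(H2A|H2B|H3|H4)([KR])(\d+)me([123])$")

# Associations stated in the text; any other mark is reported as "not covered".
EXPECTED_STATE = {
    "H3K4me3": "active",
    "H3K36me3": "active",
    "H3K27me3": "repressive",
    "H3K9me1": "repressive",
    "H3K9me2": "repressive",
}


def parse_mark(mark):
    """Split a mark such as 'H3K27me3' into (histone, residue, position, degree)."""
    match = MARK_PATTERN.match(mark)
    if match is None:
        raise ValueError("unrecognised methylation mark: %r" % mark)
    histone, residue = match.group(1), match.group(2)
    position, degree = int(match.group(3)), int(match.group(4))
    if residue == "R" and degree == 3:
        raise ValueError("arginine residues carry only mono- or dimethylation")
    return histone, residue, position, degree


def expected_state(mark):
    """Return the transcriptional state linked to a mark in the text, if any."""
    parse_mark(mark)  # validate the notation first
    return EXPECTED_STATE.get(mark, "not covered")


if __name__ == "__main__":
    for mark in ("H3K4me3", "H3K9me2", "H3K27me3", "H3K27me1"):
        print(mark, expected_state(mark))
```

Keeping the table explicit, rather than inferring a state from the residue alone, reflects the point made in the text that the same residue can carry marks with different biological meaning depending on the number of methyl groups.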

In Arabidopsis, a wide variety of proteins are involved in controlling the histone methylation status. Histone methyltransferases (HMTs) involved in lysine methylation contain the conserved SET domain, which catalyses the transfer of one or multiple methyl groups to the ε-nitrogen of specific lysine residues (Xiao et al. 2016; Zhou et al. 2020). H3K4 methylation is mediated by various proteins of the ARABIDOPSIS TRITHORAX (ATX) family (Alvarez-Venegas et al. 2003; Saleh et al. 2008; Chen et al. 2017), with homologues found in rice and maize (Springer et al. 2003; Choi et al. 2014). H3K9 is methylated by methyltransferases of the SUVH family, such as KRYPTONITE (KYP, also known as SUVH4), SUVH5, SUVH6 or SUVR4 (Ebbs et al. 2005; Li et al. 2018b). The Polycomb group proteins that form Polycomb Repressive Complex 2 (PRC2) are responsible for the trimethylation of H3K27. Interestingly, it has been found that ARABIDOPSIS TRITHORAX RELATED 5 and 6 (ATXR5 and ATXR6) can also target H3K27 by monomethylation, producing the opposite biological effect despite targeting the same histone residue (Jacob et al. 2009; Wiles and Selker 2017; Zhou et al. 2020). Finally, the epigenetic mark H3K36 is mostly regulated by the EARLY FLOWERING IN SHORT DAYS (EFS)/SET DOMAIN GROUP 8 (SDG8) methyltransferase, but other members of the TRITHORAX family, such as SDG26, can catalyse the reaction at specific loci (Xu et al. 2008; Berr et al. 2015; Liu et al. 2019).

Histone demethylation is a dynamic reaction that shows the reversibility of the histone methylation process. Histone demethylases can remove methyl groups stepwise, from tri- to dimethylated, from di- to monomethylated and from mono- to unmethylated lysine (Xiao et al. 2016). They are classified based on the number of methylations and the target lysine. The first demethylase identified was the human Lysine Specific Demethylase 1 (LSD1) (Shi et al. 2004). This group of demethylases performs a FAD-dependent oxidation reaction to remove the methyl group from lysine residues. In Arabidopsis, four LSD1 homologs have been identified: LSD1-like 1 (LDL1), LSD1-like 2 (LDL2), LSD1-like 3 (LDL3) and FLOWERING LOCUS D (FLD) (Jiang et al. 2007). As with the human LSD1, the Arabidopsis homologs are only able to remove methyl groups from mono- and dimethylated H3K4.

The second group of demethylases belongs to the JmjC domain-containing protein family (JMJ), which is subdivided, in Arabidopsis, into five subfamilies. These demethylases can recognize and remove the methyl group from mono-, di- and trimethylated lysine residues of histones through the action of Fe(II) and α-ketoglutarate (αKG) cofactors (Duan et al. 2018). For example, JMJ14, a flowering repressor, and JMJ15 and JMJ18, flowering activators, were identified as H3K4 demethylases in Arabidopsis (Lu et al. 2010; Yang et al. 2012a, b), as was JMJ703 in rice (Cui et al. 2013). The JMJ proteins involved in H3K9 demethylation comprise JMJ25 (or IBM1) and JMJ27 in Arabidopsis (Saze and Kakutani 2007; Dutta et al. 2017) and JMJ706 in rice (Sun and Zhou 2008). H3K27 can be demethylated by several proteins, such as JMJ11 (ELF6), JMJ12 (REF6) and JMJ13, which are involved in flowering in Arabidopsis (Lu et al. 2010; Gan et al. 2015; Zheng et al. 2019), and JMJ705 in rice (Li et al. 2013). The demethylation of H3K36 is less well elucidated, but some studies relate the JMJ12 (REF6) protein to reduced H3K36me2 at the FLC locus (Gan et al. 2015) and, more recently, the protein JMJ30 has been proposed as a possible demethylation effector for H3K36 (Cheng et al. 2020).

Histone methylation also takes place at arginine residues. The arginine methyltransferases (PRMTs) are a group of evolutionarily conserved methyltransferases that can replace the 5-hydrogen of the guanidine nitrogen atoms present in the arginine by methyl moieties to form different methylated arginine isomers. Depending on the isomer and on whether the methylation formed is symmetrical or asymmetrical, the PRMTs are divided into four subgroups (Type I, II, III and IV) (Ahmad and Cao 2012; Hartley and Lu 2020). In Arabidopsis, H3R2 and H3R17 can be asymmetrically dimethylated by AtPRMT4a/b, and H4R3 can be symmetrically dimethylated by AtPRMT5, AtPRMT10 and AtPRMT1a/b. Both epigenetic marks are involved in flowering time, but perform different roles (Liu et al. 2010). The mechanisms of arginine demethylation are less well understood, although this activity has been linked to certain JmjC domain-containing proteins (Cho et al. 2012; Li et al. 2018a).

Histone acetylation takes place at lysine residues through the addition of an acetyl group, which is provided by acetyl-CoA. The acetylation neutralizes the positive charge of the amino residues and increases the size of the lysine tail, which causes relaxation of chromatin and creates a landing pad for the transcriptional machinery. In addition to increasing accessibility, histone acetylation can also recruit reader proteins, such as chromatin remodelers (Marmorstein and Zhou 2014). For this reason, the epigenetic mark of acetylation is linked to transcriptionally active genes (Liu et al. 2016b). In plants, the lysine residues that are frequently acetylated are K9, K14, K18, K23 and K27 of histone H3 and residues K5, K8, K12, K16 and K20 of histone H4. Histones can present up to four acetylations. Acetylations are often linked to specific responses in plants; for example, H3K9/14ac or H3K27ac correlate with activation of developmental and stress-induced genes (Weiste and Dröge-Laser 2014; Kim et al. 2017b). Similarly, histones can also undergo active deacetylation that represses transcription (Chen and Tian 2007), which is linked to a great variety of processes such as photomorphogenesis (Benhamed et al. 2006; Jang et al. 2011) or flowering (Yu et al. 2011).

Histone acetyltransferases (HATs) identified in plants can be divided into four families: HAC, HAF, HAG and HAM. The HAC group contains HATs of the p300/CREB binding protein family. The HAF group is formed by HATs from the TATA-binding protein-associated factor (TAFII250) family. The HAG group contains HATs of the general control non-repressible 5-related N-terminal acetyltransferase (GNAT) family. Finally, the HAM group is formed by the MYST family (Liu et al. 2016b; Jiang et al. 2020). The acetyl group can be removed by histone deacetylases (HDACs), which are linked with transcriptional repression. HDACs are well conserved proteins and can be divided into four classes. In Arabidopsis, class I contains six members: HDA6, HDA7, HDA9, HDA10, HDA17, and HDA19. Class II contains five members: HDA5, HDA8, HDA14, HDA15, and HDA18. Class III, which comprises sirtuin-like members, presents fewer homologs in Arabidopsis than in humans, with only two members, SRT1 and SRT2. The HDA2 protein is the sole member of class IV of HDACs in plants (Alinsug et al. 2009; Yu et al. 2017; Zhang et al. 2018a; Zhao et al. 2019a; Chen et al. 2020). In addition, a group of four plant-specific HDACs, HD2A, HD2B, HD2C, and HD2D (Bourque et al. 2016), is involved in biotic and abiotic stress resistance (Luo et al. 2012; Ding et al. 2012; Han et al. 2016; Park et al. 2018).

Non-catalytic domains that regulate gene expression

Whereas catalytic domains in epigenetic effector proteins directly modify the chromatin structure to modulate gene expression, other non-catalytic regulatory domains are capable of recruiting epigenetic effectors, ultimately regulating the transcriptional activity of the target gene. Other regulators do not recruit effectors but interact with the transcriptional machinery in other forms, also influencing transcription.

In general, non-catalytic regulatory domains have been isolated from transcription factors (TFs), which usually comprise, next to DNA-recognition and binding domains, an effector domain that recruits chromatin modifiers and transcriptional machinery (Ma 2011; Yamasaki et al. 2013; Song et al. 2015b). One of the first described transcriptional regulation domains came from the yeast GAL4 TF, whose trans-activation domain (TAD) was identified and isolated. Likewise, the DNA-binding domain of this TF was also isolated and synthetically fused to other potential TADs, which not only allowed the identification of more activation and repression domains, but also demonstrated the orthogonality of the TADs and, therefore, their potential to operate in various species and various genomic contexts (Keegan et al. 1986; Hope and Struhl 1986; Hebbes et al. 1988). The modular nature of many transcriptional regulation domains led to the identification of powerful viral TADs, such as the VP16 domain of the herpes simplex virus (Campbell et al. 1984; Carey et al. 1990), which proved to be a powerful activator also in plants (Moore et al. 1998; Schwechheimer et al. 1998). In addition, this domain offered the possibility of increasing its transcriptional activation potential through the fusion of several repeats in tandem, giving rise to the synthetic activation domains VP64, VP128, etc. These synthetic domains offered a greater activation range in target genes (Ordiz et al. 2002; Li et al. 2017) and also the ability to maintain this activation in the following generations (Utley et al. 1998; Guan et al. 2002).

On the other hand, endogenous regulatory domains that can offer either activation or repression have also been identified in plants. Two examples of plant TADs are the ERF2 (m) and the EDLL domains, both from the Ethylene Response Factor (ERF) family (Tiwari et al. 2012; Li et al. 2013). However, the ERF family also includes proteins with identified transcriptional repression domains (TRDs, EAR motifs), with which it has been possible to obtain efficient transcriptional repression. A remarkable example is ERF3 (Ohta et al. 2001; Tiwari et al. 2004), from which the widely used TRDs SRDX and BRD are derived (Utley et al. 1998; Hiratsu et al. 2003).

Programmable epigenetic editors for the regulation of gene expression

Epigenetic tools have emerged as important assets in crop breeding. The first attempts to take advantage of epigenetic control of gene expression involved the mutation or deletion of specific epigenetic effectors, thus generating new epigenetic mutants with potential agronomic interest. For example, ddm1 and met1 hypomethylated mutants in Arabidopsis cause transposon mobilization. These mutant backgrounds show multiple phenotypes in plant structure such as the short and compact inflorescence found in bns (BONSAI) plants (Saze and Kakutani 2007). However, the great variety of genes and transposons targeted by these effectors makes this approach very unspecific and unpredictable, generating in most cases aberrant phenotypes and developmental abnormalities (Kakutani et al. 1996; Tsukahara et al. 2009).

Target-specific epigenetic interventions require programming the DNA binding specificity of the effectors. The traditional approach for this involves the ectopic expression of heterologous TFs under the control of inducible promoters, therefore connecting promoter-specified inputs to a cascade of TF-targeted activated/repressed genes as output (Li et al. 2013; Petolino and Davies 2013). A limitation of this approach is that it does not allow free selection of the output response, as the collection of target genes is restricted by the DNA binding specificities of the TFs employed. An alternative is the engineering of specific binding sites in the promoters of the target genes (Kumar et al. 2015; Mohan et al. 2017; Dong et al. 2019; Bai et al. 2020). However, the complexity of the design and the effort required to generate a specific synthetic promoter driving each downstream gene is a factor to consider.

It was not until the discovery of artificial zinc fingers (ZFs) and transcription activator-like effectors (TALEs) that the possibility of creating custom-programmable epigenetic factors arose, opening the field to bioengineering and synthetic biology approaches. The artificial ZFs are custom specific DNA binding domains that typically recognize 3–6 nucleotide triplets. The initial engineering approach, based on producing double-strand DNA breaks (DSBs) through fusion to the FokI nuclease (Durai et al. 2005), rapidly evolved into artificial TFs containing transcriptional regulator domains or epigenetic effectors (Shrestha et al. 2018). Likewise, the engineered TALEs share many similarities in operation and structure with ZFs but offer a higher degree of specificity. They are proteins derived from Xanthomonas bacteria, which use them to aid the infection of plant species by promoting the expression of host genes. TALEs consist of a specific and customizable DNA binding domain comprising tandem repeat arrays of amino acids, which can recognize a specific DNA target sequence (Boch et al. 2009; Moore et al. 2014). However, the nature of TALEs requires a new design of the protein for each target sequence, which makes them efficient but labor-intensive tools.

In plants, several works describe ZFs and TALEs being employed as artificial TFs, thus enabling programmable gene regulation. The first ZF examples targeted the APETALA3 gene in Arabidopsis. The VP64 TAD and the mSin3 interaction domain (SID) TRD were fused to an APETALA3 ZF, yielding the expected transcription changes and generating altered floral patterns (Guan et al. 2002). In parallel, engineered TALEs also proved to be efficient programmable regulators in plants. Interesting examples are the regulation of the EGL3 and KNAT endogenous genes in Arabidopsis (Morbitzer et al. 2010), or the regulation of the AtPAP1 transgene in tobacco (Liu et al. 2014). There are few examples in plants of ZF or TALE programmable regulators using catalytic epigenetic effectors. A remarkable exception is the specific and programmable demethylation of the FWA gene in Arabidopsis obtained with ZF technology using the catalytic domain of human TEN-ELEVEN TRANSLOCATION1 (TET1) (Gallego-Bartolomé et al. 2018), a dioxygenase involved in the demethylation of DNA (Chen et al. 2014). In a related example, a ZF fused to the DNAJ1 methylation reader, which forms complexes with the SUVH1 and SUVH3 proteins, increased the expression of adjacent genes through the recognition of DNA methylation (Harris et al. 2018).

The potential of CRISPR in epigenomic editing in plants

In the last decade, the CRISPR/Cas systems emerged as new versatile programmable effectors. They offer a wide range of applications with high efficiency and specificity, avoiding the main problem that limited previous tools, since CRISPR/Cas eliminates the need to make a new protein for each target (Waryah et al. 2018; Arya et al. 2020). The simplicity and versatility of this system, based on a single guide RNA (sgRNA) and an effector Cas protein, was proved initially in human cells (Cong et al. 2013; Mali et al. 2013; Cho et al. 2013) and later extended to all types of organisms (Sanders et al. 2018; Kim et al. 2019; Shi et al. 2019; Schuster and Kahmann 2019) including plant crops, offering new possibilities for plant breeding (Veillet et al. 2019; Gao 2019; Shao et al. 2020; Zaynab et al. 2020). Moreover, CRISPR systems can work as programmable epigenetic editors and efficient transcriptional regulators. Following a similar strategy to that described for ZFs and TALEs, endonuclease-inactivated versions of the Cas9 protein, known as “dead Cas9” or dCas9, obtained by mutagenesis of specific amino acids in the RuvC1 and HNH nuclease domains (Qi et al. 2013), can be converted into easily programmable activation (CRISPRa) or repression (CRISPRi) factors, depending on the effector domain that is attached to the dCas9 protein. This approach has been tested extensively, using both catalytic and non-catalytic domains, for the regulation of gene expression with remarkable results in mammalian cells and other organisms, such as bacteria and fungi (Shakirova et al. 2020; Zhang et al. 2020), opening new perspectives for its application in plant systems.
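The practical meaning of "programmable" can be illustrated with a minimal sketch of how candidate target sites for a dCas9 effector might be listed in a promoter sequence. The Python code below rests on assumptions that are not stated in the text: it assumes the widely used Streptococcus pyogenes Cas9, which requires an NGG protospacer-adjacent motif (PAM) immediately 3' of a 20-nucleotide target, and all function names and the example sequence are hypothetical. It is not a design tool, since it ignores off-target matches and sgRNA efficiency.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case ACGT sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_protospacers(seq, spacer_len=20):
    """List candidate targets (protospacer followed by an NGG PAM) on both strands.

    'start' is the offset of the protospacer within the scanned strand,
    read 5'->3' (for '-' hits, within the reverse complement).
    """
    seq = seq.upper()
    # A lookahead keeps overlapping candidate sites from being skipped.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % spacer_len)
    hits = []
    for strand, scanned in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(scanned):
            hits.append({
                "strand": strand,
                "start": match.start(),
                "protospacer": match.group(1),
                "pam": match.group(2),
            })
    return hits


if __name__ == "__main__":
    # Hypothetical promoter fragment used only to exercise the function.
    promoter = "AAAAA" + "GATTACAGATTACAGATTAC" + "TGG" + "AAAAA"
    for hit in find_protospacers(promoter):
        print(hit)
```

Retargeting the effector to another gene then only requires choosing a different protospacer for the sgRNA, which is the contrast with ZFs and TALEs drawn in the text.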

Transcriptional regulation strategies for dCas9

The initial strategies to generate programmable transcriptional effectors in plants based on CRISPR-dCas9 made use of direct fusions to the C-terminal of the Cas9 protein (Fig. 2a). TAL, VP64 and EDLL were the first non-catalytic domains attached to dCas9 for transcriptional activation in Arabidopsis and N. benthamiana. Following the same strategy, plant-derived BRD and SDRX were employed for transcriptional repression (Piatek et al. 2015; Vazquez-Vilar et al. 2016). In all cases, the activation/repression efficiency obtained was only moderate. However, a remarkable improvement in gene regulation was achieved when several domains were fused in tandem (Fig. 2b). The activation TV (6xTAL–VP128) domain, and the activation VPR (VP64-P65-RTA) domain (Chavez et al. 2015; Li et al. 2017, 2019), were the most successful examples of CRISPRa tandem fusions used in Arabidopsis and rice, achieving up to 190 fold activation in AtCDG1 gene when employing dCas9-TV (Li et al. 2017). At the same time, in the CRISPRi approach the tandem fusion of the SRDX repressor domain was employed, obtaining repressions of up to 80% in Arabidopsis miRNA target genes (Lowder et al. 2015). Regarding catalytic epigenetic domains, so far most examples have been tested only in mammalian cells, such as the histone demethylase LSD1, the human DNA demethylase TET1, the methyltransferase DNMT3A or the plant DNA demethylase ROS1 (Kearns et al. 2015; Vojta et al. 2016; Liu et al. 2016c; Devesa-Guerra et al. 2020). These examples prove that a direct fusion of these elements to dCas9 can change the epigenetic status in the target sequence. This makes its application to plants very promising, considering the orthogonality of the catalytic domains (Ji et al. 2018). A pionerering example in plants is the successful epigenetic regulation of the AREB1/ABF2 gene in Arabidopsis (Roca Paixão et al. 2019). 
The AREB1/ABF2 gene is involved in the ABA signaling pathway, and its loss of function results in sensitivity to drought stress. In this work, which aimed to specifically increase the expression of AREB1 and obtain drought-resistant plants, a programmable epigenetic editor was developed based on the direct fusion to dCas9 of the P300 catalytic domain derived from the Arabidopsis histone acetyltransferase 1 (AtHAC1). As a distinctive feature of this approach, P300 was fused to the N-terminal end of dCas9 instead of the C-terminal end, as is usually described. Although the transcriptional activation of the gene was modest, reaching only twofold, a substantial increase in the survival rate was obtained when plants were exposed to drought stress.
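The efficiencies quoted above mix two conventions: activation is reported as a fold change and repression as a percentage. The following minimal Python sketch relates the two. It is an illustration only and not a method taken from the cited studies. The function names and the normalized expression values are hypothetical, chosen solely to reproduce the magnitudes mentioned in the text.

```python
def fold_change(treated: float, control: float) -> float:
    """Expression of the target gene relative to the control (treated / control)."""
    if control <= 0:
        raise ValueError("control expression must be positive")
    return treated / control


def percent_repression(treated: float, control: float) -> float:
    """Percentage by which expression drops relative to the control."""
    return (1.0 - fold_change(treated, control)) * 100.0


# Hypothetical normalized expression values (control set to 1.0),
# chosen only to match the magnitudes quoted in the text.
strong_activation = fold_change(190.0, 1.0)    # a 190-fold activation
modest_activation = fold_change(2.0, 1.0)      # a twofold activation
strong_repression = percent_repression(0.2, 1.0)  # about 80% repression
```

Under these assumptions, an 80% repression corresponds to a residual expression of 0.2 relative to the control, that is, a fivefold reduction. This helps when comparing CRISPRi figures with CRISPRa fold changes.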

Fig. 2 Representation of dCas9 strategies for gene regulation. a dCas9 strategy based on the direct fusion of the effector domain at the C-terminus. b dCas9 strategy based on the direct tandem fusion of effector domains at the C-terminus. c SAM aptamer-based dCas9 strategy. A direct fusion of effector domains can be added at the C-terminus. d scRNA aptamer-based dCas9 strategy. A direct fusion of effector domains can be added at the C-terminus. e SunTag strategy with ten tandem repeats of the GCN4 peptide at the C-terminus, which are recognized by an ScFv fused to an effector domain. Created with BioRender.com

In a further elaboration of the technique, two new strategies were designed to recruit an even greater number of regulatory domains, or combinations of them, to a single dCas9 molecule, thus generating a new battery of CRISPR/dCas9-based regulation tools. The first type of strategy, represented by the so-called SAM (Synergistic Activation Mediator) and scRNA (scaffolding RNA) approaches, is based on the permissiveness of the gRNA scaffold to accept RNA aptamers in its structure, which bind other proteins with high specificity, such as the coat protein of the MS2 bacteriophage (Zhang et al. 2015; Konermann et al. 2015; Zalatan et al. 2015). This provides new anchoring sites in the dCas9 complex, increasing its ability to recruit regulatory domains. In addition, SAM (Fig. 2c) and scRNA (Fig. 2d) are compatible with the direct fusion strategy discussed above, allowing different combinations of regulatory domains to be recruited simultaneously. The use of both SAM and scRNA approaches dramatically improved the activation rates in mammalian cells and in plants (Lowder et al. 2015, 2018; Park et al. 2017; Lee et al. 2019), and opened the way to further combinatorial optimizations of the system, as shown by the combination of EDLL and VPR domains in the recently described dCasEV2.1 activation system (Selma et al. 2019). The presence of an RNA aptamer also allows the incorporation of catalytic epigenetic effectors. In Arabidopsis, a scRNA strategy involving the histone acetyltransferase catalytic domain P300 produced an H3K27ac enrichment in the FT target gene. In the same work, a similar strategy using the histone methyltransferase KRYPTONITE was shown to produce an H3K9me2 enrichment in the same gene. However, although the epigenetic marks were evident, their effects on flowering were mild (Lee et al. 2019).

A second type of strategy uses multi-epitope tags to attach several regulatory domains to a single dCas9 molecule. The so-called SunTag strategy (Tanenbaum et al. 2014) fuses dCas9 to tandem GCN4 peptide repeats, which are recognized by a single-chain antibody (ScFv) fused to a regulatory domain (Fig. 2e). Although the first versions of the SunTag strategy gave good results for gene activation in mammalian cells (Gilbert et al. 2014), the same effects were not achieved in plants, where activation rates were lower than with the previously mentioned strategies (Selma et al. 2019). However, a new version of this strategy appeared shortly after. In this optimization, the size of the linker sequence separating the tandem GCN4 epitopes was changed from seven to twenty-two amino acids (SunTag22a), avoiding possible steric hindrance (Morita et al. 2016). The new SunTag22a version offered very high activation when used with the non-catalytic VP64 TAD (Papikian et al. 2019). Furthermore, in another remarkable achievement, this platform was successfully used for the catalytic modification of epigenetic marks, first in mammalian cells (Morita et al. 2016; Huang et al. 2017) and later in plants, by attaching the catalytic domains of TET1, involved in demethylation, and of the NtDRM2 methyltransferase (Gallego-Bartolomé et al. 2018; Papikian et al. 2019), respectively. In both studies, the FWA gene, involved in flowering time, was chosen as the target for epigenetic regulation. The study carried out by Gallego-Bartolomé et al. (2018), employing the dCas9-SunTag-TET1 strategy, achieved specific demethylation of the FWA promoter and, therefore, late-flowering phenotypes in Arabidopsis. Likewise, the strategy followed by Papikian et al. (2019) was based on the same molecular mechanism, but in this case dCas9-SunTag-DRM2 produced targeted methylation of the FWA promoter, generating early-flowering phenotypes.
The epigenetic mutations obtained with these systems were transferred to the next generation of plants, which showed the same flowering phenotype as their parents. The CRISPR-based programmable transcriptional and epigenetic effectors described in plants are summarized in Table 1.

Table 1 CRISPR-based programmable transcriptional and epigenetic effectors described in plants

Transcriptional regulation using dCas12a and dCas12b

After the success of Cas9, new nucleases belonging to other types of the CRISPR/Cas family have emerged. In plants, Cas12a has been presented as a good alternative to Cas9 as a genetic editor (Kim et al. 2017a; Bernabé-Orts et al. 2019), showing remarkable efficiency for targeted mutations. Following the same strategy used with dCas9, dCas12a versions have also been generated (Tang et al. 2017) (Fig. 3a) and used as programmable epigenetic editors in plants (Table 1). However, the guide RNA architecture of dCas12a does not allow the same modifications as the dCas9 sgRNA and, for this reason, the aptamer-based strategies for gene regulation cannot be applied. On the other hand, transcriptional activation and repression have been achieved using a direct fusion strategy (Tang et al. 2017; Tak et al. 2017). Although catalytic epigenetic effectors using the Cas12a SunTag strategy have not yet been reported in plants, they are very likely to work, as was shown earlier in mammalian cells (Kim et al. 2017a; Zhang et al. 2018c).

Fig. 3 Representation of programmable epigenetic editors based on dCas12a and dCas12b. a dCas12a strategy for gene regulation based on the direct fusion of an effector domain at the C-terminus. b SAM aptamer-based dCas12b strategy for gene regulation. Created with BioRender.com

More recently, a new Cas12b protein from Alicyclobacillus acidiphilus has emerged as a novel programmable effector in plants (Fig. 3b). This new Cas version can generate targeted mutations with moderate efficiency, and transcriptional regulators have been engineered using its “dead” version fused to the TV domain (Table 1). The dCas12b tool shows more flexibility than dCas12a in its sgRNA structure, allowing the addition of RNA aptamers for MS2 protein recognition and showing acceptable rates of gene activation in rice (Ming et al. 2020).

Perspectives of epigenomic editing applied to crop breeding

As reviewed above, the rapidly evolving CRISPR technologies offer new crop breeding tools in the form of programmable epigenetic editors. Examples of successful epigenetic editing are already in place in model organisms, such as the regulation of flowering time in Arabidopsis through the manipulation of DNA methylation at the FWA gene (Gallego-Bartolomé et al. 2018; Papikian et al. 2019). Fine regulation of flowering time could be applied to crops to obtain early- or late-flowering varieties, or even varieties that do not flower, as a strategy to increase biomass. Importantly, it has been demonstrated that, at least for targeted DNA methylation at the FWA locus, modifications are inherited in subsequent generations (Papikian et al. 2019). This is an important prerequisite in crop breeding, where trait stability needs to be ensured. Another promising breeding trait for epigenetic editing, especially relevant in the context of climate change, is drought stress resistance. Very recently, Arabidopsis lines with improved tolerance to drought have been generated by activating the AREB1/ABF2 gene using a dCas9 fused to a histone acetyltransferase (Roca Paixão et al. 2019). Being able to apply this type of technology to edible crops in areas with poor irrigation systems or dry climates could help to alleviate the effects of climate change on crop production.

Although most examples of directed epimutations described to date were developed in model species, it is not difficult to envision epigenetic versions of well-established alleles conferring favourable traits in crop species. Extensive research has been carried out on the activation and repression of rice genes through CRISPR-based effectors (Li et al. 2017; Lowder et al. 2018; Ming et al. 2020). Potential candidates for epigenetic down-regulation in rice affecting grain size are the GW2, GW5 or TGW6 genes. Likewise, the nutritional properties of the grain could be improved by the accumulation of essential microelements or health-related metabolites through the epigenetic activation of, e.g., genes of the nicotianamine synthase (NAS) family or of the carotenoid synthesis pathway in the endosperm (Zheng et al. 2010; Xu et al. 2008; Fiaz et al. 2019). As new epigenetic mechanisms regulating traits of interest are discovered, the number of potential targets for the new programmable editors also grows. For example, interesting targets for breeding include those mentioned previously in this review, such as the ripening genes RIN and CNR in tomato (Liu et al. 2016a; Lang et al. 2017; Li et al. 2020), the clock genes CCA1 and LHY (Ni et al. 2009), or the FLC and COR genes involved in cold resistance (Yang et al. 2014; Park et al. 2018) (Fig. 4).

Fig. 4 Representation of target traits for crop improvement using programmable epigenetic editors. Epigenetic regulation of genes involved in phenotypic traits of interest as possible targets of programmable epigenetic editors. The RIN and CNR genes are selected as targets for fruit ripening. Early ripening involves the removal of H3K27me3 marks associated with the RIN gene and DNA demethylation of the RIN and CNR promoters. Late ripening is represented by the enrichment of H3K27me3 marks associated with RIN and DNA methylation of the RIN and CNR promoters (Liu et al. 2016a; Lang et al. 2017; Li et al. 2020). The clock genes CCA1 and LHY are negative regulators of growth and biomass improvement (Ni et al. 2009) and can be repressed by the reduction of the associated active epigenetic marks H3K9Ac and H3K4Me2. Drought resistance is linked to the expression of AREB1, whose expression can be increased through the enrichment of histone acetylation (Roca Paixão et al. 2019). The FWA gene regulates flowering time. FWA activation through DNA demethylation promotes late flowering, and FWA repression through DNA methylation promotes early flowering (Yang et al. 2014; Park et al. 2018). Created with BioRender.com

Epigenetic edits could bring new sources of variation to crop breeding, enabling trait-associated features that are not readily accessible to classical and/or directed mutagenesis, such as tissue/organ specificity or environmental plasticity. In addition to functional and agronomical considerations, legal and regulatory aspects need to be considered when discussing the future of epigenetic breeding. The legal status of epigenetically modified crops is uncertain, since, by definition, they cannot strictly be considered genetically modified. Taking as reference the EU directive 2001/18/EC on the deliberate release of Genetically Modified Organisms, where GMOs are defined as those organisms whose “genetic material has been altered”, it is reasonable to argue that histones cannot be considered genetic material and, consequently, modifications in histone-based epigenetic marks should not fall within the scope of this directive. Similarly uncertain regulatory scenarios are likely to arise regarding DNA epigenetic marks. Although transgenesis is currently the most common method to deliver CRISPR/Cas and gRNAs to plant cells, new delivery methods based on nanoparticles, biolistics and viral vectors are being developed that could avoid the need for transgenic intermediates (Uranga et al. 2020; Ariga et al. 2020). In this scenario, epimutations obtained through non-transgenic editing tools could fall outside current GMO regulations, which would undoubtedly be a strong incentive for the development of new tools and techniques.