Introduction

Embryonic stem cells (ESCs) are derived from the inner cell mass (ICM) of day 5–8 blastocysts (Thomson et al. 1998; Stojkovic et al. 2004) or morula-stage embryos (Strelchenko et al. 2004). ESC are pluripotent, are capable of karyotypically stable prolonged self-renewal and are characterised by their potential to differentiate into cells of the three germ layers, both in vitro and in vivo (Evans and Kaufman 1981; Thomson et al. 1998; Stojkovic et al. 2004). In contrast to the specific gene expression programmes observed in differentiated cells, ESC are defined by their potential to activate all of the gene expression programmes that are found in embryonic and adult cell lineages (Chambers and Smith 2004). Various studies have endeavoured to understand the molecular mechanisms that give ESC their properties (Ivanova et al. 2002; Ramalho-Santos et al. 2002; Sperger et al. 2003; Margueron et al. 2005; Armstrong et al. 2006a, b) and these attempts to identify a molecular “signature” of ESC have borne some fruit, including the identification of the transcription factors OCT4 and NANOG as markers for ESC.

However, these experiments also show that there is little other overlap in gene expression between different ESC lines, or even within the same ESC line studied in different laboratories. This lack of overlap calls into question whether a common molecular identity of ESC can be uncovered (Evsikov and Solter 2003; Fortunel et al. 2003; Vogel 2003). Recent studies have nevertheless indicated that epigenetic mechanisms, i.e. processes that can alter gene function without a change in the DNA sequence, may regulate self-renewal, pluripotency and lineage-specific differentiation, giving ESC their unique characteristics. Epigenetic mechanisms include modification of histone proteins, DNA methylation, ATP-dependent remodelling, incorporation of variant histones, changes in local and higher order conformation of DNA, and RNA interference (RNAi). Through the combined action of these epigenetic mechanisms, gene expression patterns can be tightly and dynamically regulated.

Chromatin modification in ESC pluripotency and differentiation

Various observations have suggested that the genome undergoes important epigenetic alterations during mammalian development and ESC differentiation (Margueron et al. 2005; Morgan et al. 2005; Reik 2007), from an open environment rich in euchromatin in ESC, to a more compact heterochromatic structure upon differentiation (Francastel et al. 2000; Arney and Fisher 2004). Somatic cell nuclear transfer (SCNT) experiments have also demonstrated that reprogramming of the somatic nucleus to a pluripotent state requires large-scale epigenetic modifications (Armstrong et al. 2006a, b). Several groups have now begun to understand the role of epigenetics in the ESC and have suggested some interesting models.

The chromatin structure of mouse ESC (mESC) has now been demonstrated to be hyperdynamic, in that major architectural proteins, such as H1⁰, H2B, H3 and HP1α, are loosely bound to chromatin with extremely short residency times (Meshorer et al. 2006). Differentiation leads to a decrease in the dynamic nature of these proteins, as demonstrated by increased residency time, suggesting that this dynamic nature of chromatin is specific to pluripotent cells and that differentiation leads to the restructuring of the genome. Additionally, heterochromatic markers have been demonstrated to change from a dispersed localisation in mESC to more concentrated distinct foci in differentiated cells, with increased global levels of trimethylated lysine 9 H3 (TriMeK9 H3), a heterochromatic histone modification linked to gene repression, and decreased levels of acetylated histones H3 and H4 (AcH3 and AcH4), modifications linked to euchromatin and the permissivity of gene expression. Analyses of global histone modification patterns in ESC have previously suggested that the ESC genome is subject to generalised histone acetylation and lysine 4 H3 methylation (MeK4 H3; Kimura et al. 2004; Lee et al. 2004). Together, these changes in global genomic architecture and global histone modifications suggest that the chromatin environment in ESC is highly euchromatic and that the genome is therefore highly permissive for gene expression. This would allow for the pluripotent nature of ESCs, with the genome becoming more structured, condensed and heterochromatic upon differentiation, leading to loss of pluripotency.

Another study on mESC has aimed at understanding regulatory mechanisms involved in development by studying highly conserved noncoding elements (HCNEs; Bernstein et al. 2006). These elements have been studied because of their presence in regions concentrated for genes encoding developmentally important transcription factors. Large-scale chromatin immunoprecipitation (ChIP) assays have found large areas of a repressive histone modification, methylation of lysine 27 of histone H3 (MeK27 H3; Cao and Zhang 2004), alongside smaller regions of a permissive modification, MeK4 H3 (Schubeler et al. 2004; Bernstein et al. 2005), at HCNEs. This is unusual in that these modifications are generally mutually exclusive; hence, these regions have been termed “bivalent” domains. Interestingly, the bivalent domains coincide with differentiation-associated transcription factor genes expressed at extremely low levels in the ESC and they have therefore been proposed to act not only to silence such genes in order to maintain pluripotency, but also to allow them to remain poised for transcription, so that they can be rapidly activated on differentiation. Upon differentiation, these bivalent domains “resolve”, so that silent genes become concentrated for MeK27 H3 only and active genes become enriched for euchromatic modifications. The large regions that the bivalent domains cover may also provide a means of transmitting epigenetic modifications through cell division (“epigenetic memory”; Henikoff et al. 2004; van Steensel 2005) to maintain lineage-specific expression or repression of such critical genes. This would be important for the transmission of such regulatory mechanisms through differentiation and replication towards somatic cells. Approximately half of the identified bivalent domains have been found to have binding sites for at least one of the three pluripotency-associated transcription factors, viz. OCT4, NANOG and SOX2 (Boyer et al. 2005), meaning that these silenced genes are associated with pluripotency transcription factors normally involved with gene activation (Boiani and Scholer 2005; Stewart et al. 2006). NANOG, SOX2 and OCT4 are known to be able to interact with epigenetic modifiers, including Polycomb Group (PcG) proteins, and therefore may have a role to play in modifying the epigenetic environment of gene promoters (Wang et al. 2006).
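
The definition of a bivalent domain, a region carrying both the repressive MeK27 H3 mark and the permissive MeK4 H3 mark, can be illustrated with a minimal sketch. This is not the analysis pipeline used in the cited studies; the function names, the half-open interval convention and the example coordinates are all assumptions made purely for illustration.

```python
from typing import List, Tuple

# Hypothetical representation: a half-open interval [start, end) in base pairs.
Interval = Tuple[int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True if the two half-open intervals share at least one base."""
    return a[0] < b[1] and b[0] < a[1]


def classify_region(region: Interval,
                    k4_peaks: List[Interval],
                    k27_peaks: List[Interval]) -> str:
    """Label a region by which of the two marks overlap it.

    'bivalent' means both MeK4 H3 and MeK27 H3 enrichment overlap the region.
    """
    has_k4 = any(overlaps(region, peak) for peak in k4_peaks)
    has_k27 = any(overlaps(region, peak) for peak in k27_peaks)
    if has_k4 and has_k27:
        return "bivalent"
    if has_k4:
        return "K4-only"
    if has_k27:
        return "K27-only"
    return "unmarked"


# Invented example: one broad MeK27 H3 domain containing a small MeK4 H3 peak,
# plus an isolated MeK4 H3 peak elsewhere.
k27_peaks = [(1000, 9000)]
k4_peaks = [(4000, 5000), (20000, 21000)]

regions = {
    "poised_gene": (3500, 5500),
    "active_gene": (19500, 21500),
    "silent_gene": (7000, 8000),
    "other_gene": (30000, 31000),
}
labels = {name: classify_region(r, k4_peaks, k27_peaks)
          for name, r in regions.items()}
```

In this toy layout, the "resolution" of a bivalent domain upon differentiation would correspond to a region's label changing from "bivalent" to either "K27-only" or "K4-only".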

Perhaps most interestingly, the study (Bernstein et al. 2006) has also found a correlation between the chromatin environment and the underlying DNA sequence, suggesting a mechanism by which the DNA sequence can dictate histone modification patterns. A striking correlation exists between the presence of MeK4 H3 and CpG islands and the transcription start sites in ESCs. MeK27 H3 is correlated with the lack of transposon-derived sequence and such regions have been further shown to be enriched at genes encoding developmental and tissue-specific transcription factors. This mechanism would provide a means for the early chromatin environment to be organised efficiently and suitably for the purpose of pluripotency in the ESC.

Similar studies into the epigenetic basis of ESC pluripotency and differentiation have uncovered a role for PcG proteins. Genetic experiments in Drosophila have demonstrated that PcG members are required for maintaining the inactive state of homeotic and other important regulators during development (Ringrose and Paro 2004). They have also been shown to be required for early embryo development (Voncken et al. 2003; Pasini et al. 2004; Isono et al. 2005) and have been previously implicated in ESC pluripotency (O’Carroll et al. 2001). The first study utilised replication timing in S-phase as an indicator of the chromatin state of the genome (euchromatic accessible genes replicate earlier, heterochromatic condensed genes later; Schubeler et al. 2002; Azuara et al. 2003) and compared ESC and differentiated cells (Azuara et al. 2006). Surprisingly, silent-lineage-associated genes were replicated at the same time as pluripotency-associated genes in ESC, indicating similarities in their chromatin environment. However, upon differentiation, pluripotency-associated genes changed their replication timing to later, whereas lineage-associated genes replicated earlier. The finding that lineage-associated genes replicate early in ESC alongside pluripotency-associated genes suggests that they have a permissive chromatin structure while remaining silent. Further studies have demonstrated that the silent-lineage-associated genes also have the “bivalent” chromatin structure observed previously. The functionality of the MeK27 H3 modification has been demonstrated in ESC lacking EED, a member of the PRC2 PcG complex, which is required for EZH2-mediated K27 H3 methylation (Kirmizis et al. 2004); these cells show premature expression of previously silenced genes and loss of pluripotency.

The role of MeK27 H3 and its link to pluripotency in ESC has been further analysed in another study of mESC. Similar to the previous studies, TriMeK27 H3 has been found at the promoters of silent-lineage-associated genes in mESC; this is associated with the binding of components of PRC1 and PRC2 PcG complexes (Boyer et al. 2006). Another study has also revealed that the binding sites for the PcG protein SUZ12 (Lee et al. 2006) are associated with developmentally important transcription factor genes, HCNEs, TriMeK27 H3 and gene repression. Again, OCT4, NANOG and SOX2 also bind a high percentage of these genes. Upon differentiation, such genes are specifically upregulated, with the loss of TriMeK27 H3, suggesting that PcG proteins have specialised roles in silencing genes in ESC. Again, the functionality of MeK27 H3 has been shown by the lack of repression of lineage-associated gene expression in EED-deficient cells (Boyer et al. 2006). Additionally, SUZ12 binding has been assayed in differentiated somatic cells (Lee et al. 2006). SUZ12 is depleted at transcriptional regulators required for the somatic cell type in question, whereas it is still found at promoters of regulators not required in that cell type, supporting a model in which PRC2 binding in ESC represses key developmental regulators that are later expressed during differentiation.

PcG proteins have also been linked to DNA methylation (Vire et al. 2006), which is in turn linked to gene silencing and alterations of chromatin structure (Lande-Diner et al. 2007). DNA methylation patterns change during differentiation; lower DNA methylation, and therefore a more euchromatic and open chromatin environment, is apparent in ESC (Shiota et al. 2002). Additionally, DNA methylation profiling has shown that, when compared with many other cell types, ESC have a unique DNA methylation pattern that may contribute to pluripotency and development (Bibikova et al. 2006). A model whereby the long-term silencing of lineage-irrelevant genes upon differentiation could be directed by PcG proteins and given stability by DNA methylation is appealing (Shiota et al. 2002).

Chromatin bivalency provides an interesting mechanism to maintain pluripotency and regulate differentiation. However, some loci show no evidence for chromatin bivalency at the ESC stage (Anguita et al. 2004; Szutorisz et al. 2005). Szutorisz et al. (2005) have suggested another epigenetic model by which ESC can regulate pluripotency and differentiation; they have demonstrated that intergenic regions are marked by histone modifications in the ESC and aid in the temporal regulation of gene transcription (Szutorisz et al. 2005). Their study has concentrated on the λ5-VpreB1 locus in mESC and they have found that the intergenic region contains a discrete region of AcH3 and MeK4 H3 in the ESC. Further analysis has shown that, during B-cell development, the modifications spread from this discrete region, leading to the formation of a large active chromatin domain in pre-B cells (correlated to gene expression), whereas in non-haematopoietic cells, the modifications become erased (correlated to gene repression). These modifications also act as a nucleation point for general transcription factors and RNA polymerase II. Evidence for this regulatory mechanism has also been extended to some other genes (Chakrabarti et al. 2003; Chambeyron and Bickmore 2004; Vieira et al. 2004). This model suggests that the pluripotent state, viz. a state with large differentiation potential, would have more of these discrete regions of histone modifications (termed early transcription competence marks; ETCMs) than, for example, a hepatic stem cell, which is more restricted in terms of differentiation potential and which therefore would have fewer ETCMs.

Another suggested model for epigenetic regulation of the ESC is the histone “pulsing” model (Gan et al. 2007) in which each individual ESC exhibits variable histone modification patterns at individual genes over time. The model suggests that silent-lineage-associated genes, whose chromatin environment has been previously described as bivalent, have a modification patterning that changes at a higher frequency and/or a longer duration, whereas those not showing a bivalent chromatin structure have a lower frequency and/or a shorter duration of change. Under this model, two mutually exclusive modifications only appear to exist at the same gene (as for bivalency); in reality, the ChIP assay is giving a representation of two (or more) different chromatin states because of the use of a heterogeneous ESC population for the studies. Therefore, gene expression would only occur in a cell that had a permissive chromatin environment at the gene at the time of stimulation. This suggests that ESCs are a highly heterogeneous population with regard to their chromatin environment, correlating with previous data demonstrating elevated dynamics of histone residence in ESC (Meshorer et al. 2006). It also correlates with data demonstrating that the pluripotent cells of the ICM are heterogeneous (Chazaud et al. 2006); this may be attributable to the different reactions of the pluripotent cells of the ICM to a stimulatory signal, caused by a heterogeneous chromatin environment and gene expression potential. However, Re-ChIP analysis has shown that MeK4 H3 and MeK27 H3 do occur at the same time on the same gene, suggesting that bivalency is a bona fide regulatory mechanism (Bernstein et al. 2006). This does not, however, rule out the pulsing model in ESC.
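
The distinction between the two readings of the ChIP data can be made concrete with a toy simulation. The sketch below is an illustration under strong simplifying assumptions, not a model taken from the cited studies: each cell is reduced to a pair of booleans for one gene, the "pulsing" population is assumed to carry exactly one mark per cell at the moment of sampling, and the function names are hypothetical. A population-level ChIP is approximated by the fraction of cells carrying each mark, and a Re-ChIP by the fraction carrying both marks simultaneously.

```python
import random
from typing import List, Tuple

# Hypothetical per-cell state at one gene: (has MeK4 H3, has MeK27 H3).
CellState = Tuple[bool, bool]


def simulate_population(n_cells: int, model: str, seed: int = 0) -> List[CellState]:
    """Build a toy ESC population under one of two assumed models."""
    rng = random.Random(seed)
    cells: List[CellState] = []
    for _ in range(n_cells):
        if model == "pulsing":
            # Assumption: at sampling time each cell carries exactly one mark.
            has_k4 = rng.random() < 0.5
            cells.append((has_k4, not has_k4))
        elif model == "bivalent":
            # Both marks coexist at the gene in every cell.
            cells.append((True, True))
        else:
            raise ValueError(f"unknown model: {model}")
    return cells


def bulk_chip(cells: List[CellState]) -> Tuple[float, float]:
    """Fraction of cells with each mark, as seen by a population-level assay."""
    n = len(cells)
    return (sum(k4 for k4, _ in cells) / n, sum(k27 for _, k27 in cells) / n)


def re_chip(cells: List[CellState]) -> float:
    """Fraction of cells in which both marks are present at the same time."""
    return sum(1 for k4, k27 in cells if k4 and k27) / len(cells)


pulsing = simulate_population(1000, "pulsing")
bivalent = simulate_population(1000, "bivalent")

pulsing_bulk, pulsing_re = bulk_chip(pulsing), re_chip(pulsing)
bivalent_bulk, bivalent_re = bulk_chip(bivalent), re_chip(bivalent)
```

In this simplification, both populations yield a signal for both marks in the bulk assay, but only the bivalent population yields a sequential (Re-ChIP) signal, which is why a positive Re-ChIP result argues for genuine co-occurrence of the two marks.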

Chromatin remodelling and differentiation

Consistent with the notion of genomic re-organisation upon differentiation, numerous ATP-dependent chromatin remodellers are elevated in ESC (Kurisaki et al. 2005) and are required for differentiation (Kaji et al. 2007). The requirement for these proteins correlates with previous studies of the hyperdynamism of structural proteins (Meshorer et al. 2006). Loss of chromatin remodelling complex proteins has also been shown to lead to lethality at the blastocyst stage, when the ICM is being formed (Klochendler-Yeivin et al. 2000; Cao et al. 2003; Houlard et al. 2006), suggesting that some rearrangement must occur before the formation of the ICM and that epigenetics plays a role in the development of the pluripotent state.

Histone modifications influence formation of pluripotent ESC

Modifications to histones may also play a role in the initial formation of the pluripotent cell within the ICM of the mammalian embryo. Differing developmental fate and potency in the mouse embryo may be established as early as the four-cell stage (Fujimori et al. 2003; Piotrowska-Nitsche et al. 2005) and now Torres-Padilla et al. (2007) have suggested that the regulation of histone modifications might play a key role in this mechanism. Arginine methylation of histone proteins has been linked to gene activation (Chen et al. 1999; Ma et al. 2001) and this study (Torres-Padilla et al. 2007) has found that levels of methylation of specific arginines are correlated with cell fate and potency. High methylation at the four-cell stage is correlated with a contribution to the ICM and polar trophectoderm with elevated levels of global transcription, whereas low levels of methylation are correlated with a contribution to the mural trophectoderm. Overexpression of CARM1, an arginine methyltransferase, in individual blastomeres directs the progeny of these cells to become part of the ICM and also results in a dramatic upregulation of NANOG and SOX2. Again, this study links an open chromatin environment, leading to permissivity of gene expression, with pluripotent potential and identifies specific histone modifications as the earliest known epigenetic markers contributing towards the development of pluripotent cells. It also indicates that the manipulation of epigenetic information can influence cell fate determination.

The ability to manipulate the “epigenotype” of cells may have far-reaching consequences, for if we can alter histone modifications at will, we may be able to reprogram somatic genomes to a more pluripotent state. Epigenetic reprogramming is of course the basis of SCNT, which is capable of repressing the unique repertoire of somatic cell gene products (Armstrong et al. 2006a, b); however, as a means of producing patient-specific pluripotent cell lines, SCNT has many drawbacks. The value of the technique lies in the investigation of the mechanisms by which epigenetic reprogramming occurs; our increasing knowledge of the impact of epigenetics on pluripotency and the newer technologies for examining epigenetic changes by using small cell numbers are laying the groundwork for such an investigation.

Exciting new research has perhaps provided a means of generating relatively large numbers of patient-specific pluripotent cells, by the transfection of just four defined factors into somatic cells (Takahashi and Yamanaka 2006). Although the exact details of how these factors (Oct3/4, Sox2, Klf4 and Myc) reprogram normal cells into pluripotent cells are not fully understood, Myc is known to be able to induce widespread changes in the chromatin environment and to mediate immortalisation and has been suggested to be required for the other factors to mediate the reprogramming process further (Yamanaka 2007).

Testing ESC epigenetics in vivo: a problem solved?

Large-scale ChIP analyses have proved to be a vital tool in understanding the epigenetic mechanisms in cultured ESC. However, a conventional ChIP assay tends to use cells in their millions for analysis, preventing its use for cells of the ICM, which can number as few as 20 cells. The employment of a modified ChIP assay with Drosophila SL2 cells as a “carrier” for ESC (carrier ChIP) has lowered the number of ESCs needed from millions to as few as 100 cells (O’Neill et al. 2006). This has allowed the study of histone modifications at key regulator genes in small numbers of ESCs and cells of the ICM and trophectoderm of cultured blastocysts. O’Neill et al. (2006) have used this assay and found that the promoter regions of NANOG and POU5F1 (OCT3/4) are enriched for AcH4 and TriMeK4 H3 in the ICM, where they are active, but depleted in AcH4 and TriMeK4 H3 and enriched for MeK9 H3 in the trophectoderm, where they are silent, with the converse being found for CDX2, which is expressed in the trophectoderm. Carrier ChIP analyses of ICM cells and ESC correlate well, although the high levels of MeK9 H3 and low levels of AcH4 found in the ICM cells for silent genes are lower in ESC; carrier ChIP also correlates with typical ChIP analysis using ESC.

Nuclear architecture and pluripotency

Genomes are known to have a distinct structure within the cell nucleus (Parada and Misteli 2002) and different nuclear positioning can correlate with their functional output (Spector 2003). Generally, gene-rich chromosomes are located in the centre of the nucleus, whereas more gene-poor chromosomes localise towards the nuclear periphery (Gilbert et al. 2005; Sproul et al. 2005). Studies of the nuclear organisation of human ESC (hESC) have shown that the organisation of chromosomes in hESC is similar to that observed in differentiated cells but that certain chromosome regions and gene loci with a role in pluripotency have a distinct localisation (Wiblin et al. 2005). Chromosome 12p, which contains clustered pluripotency genes (including NANOG), has a more central nuclear localisation in ESC than in differentiated cells, whereas chromosome 6p shows no change in nuclear chromosome position, but the OCT4 locus is relocalised to a position outside its chromosome territory. This has also been observed for the HOXB locus in mESC upon retinoic-acid-mediated differentiation (Chambeyron et al. 2005). These genes have been shown to be associated with high levels of permissive chromatin modifications and this may therefore link to the chromosomal territory (O’Neill et al. 2006).

RNAi and pluripotency

A further suggested model of the regulation of pluripotency and differentiation in ESC involves RNAi, which has roles in heterochromatin formation (Wassenegger 2005). More genes are observed to be expressed in ESC relative to differentiated cells (Abeyta et al. 2004; Eckfeldt et al. 2005), correlating well with the globally permissive chromatin environment observed. Non-coding sequences are also active in ESC (Lehnertz et al. 2003; Martens et al. 2005); this may be vital for heterochromatin formation by RNAi upon differentiation (Meshorer et al. 2006). Indeed, perturbations to the RNAi machinery in ESC disrupt differentiation (Kanellopoulou et al. 2005). Interestingly, PcG proteins have been shown not only to colocalise with ARGONAUTE1, a core component of the RNAi machinery, at target promoters (Kim et al. 2006), but also to possess RNA-binding properties (Zhang et al. 2004), suggesting that RNAi and PcG may work together to mediate heterochromatin formation upon differentiation.

Concluding remarks

Many interesting models and mechanisms have been suggested for the way that epigenetics impacts ESC biology, but which model is utilised by ESCs? All the mechanisms suggested probably have some role to play and further studies might uncover the nature of the dynamic interaction between transcription factors, epigenetic regulators and other important effectors in the regulation of ESC biology.