Introduction

Dosage compensation mechanisms have evolved in mammals because of the divergence of the sex chromosome complement between males (XY) and females (XX). Chromosome-wide transcriptional silencing of one X in females evens out gene dosage differences between the sexes (Lyon 1961). Because X inactivation acts chromosome-wide, the majority of genes on the inactive X are silenced. However, some genes escape X inactivation (escape genes), i.e., remain expressed from both the active and inactive X alleles (Prothero et al. 2009). In this review, we will focus on this particular subset of X-linked genes, in terms of differences between species, molecular mechanisms of escape, sex differences caused by differential expression and roles in disease.

We will first briefly summarize the main features of X inactivation, as this will be extensively discussed in this issue. In mouse, X inactivation is initiated in female embryos by transcription of the long noncoding RNA (ncRNA) Xist from the X destined to become the inactive X (Xi), while Xist is repressed on the active X (Xa) by the antisense Tsix (Brown et al. 1991; Lee et al. 1999). Expression of Xist is regulated by factors that control pluripotency during mouse development and keep Xist repressed prior to X inactivation (Navarro et al. 2008). Additional factors that have been implicated in the initiation of mouse X inactivation include the ncRNA Jpx and the E3 ubiquitin ligase RNF12 (Barakat et al. 2011; Tian et al. 2010). X inactivation is initially imprinted in early mouse embryos with the paternal X (Xp) always being silenced (Takagi and Sasaki 1975; West et al. 1977). However, in the blastocyst, the Xp reactivates and random X inactivation takes place. Recent evidence suggests that a bias toward inactivation of the Xp persists in mouse neonatal brain, perhaps due to incomplete erasure of the Xp imprint and/or preferential growth of cells with the active maternal X (Xm) (Wang et al. 2010). In humans and in rabbits, X inactivation also involves increased XIST expression but it occurs later in development, is random and not imprinted (Okamoto et al. 2011). As a result of random X inactivation, somatic tissues in females are mosaic with cells that differ in the parental origin of the X that is inactivated (Migeon 2007) (Fig. 1). Following the onset of X inactivation, a series of epigenetic modifications that implement chromosome silencing and heterochromatin formation takes place (Heard and Disteche 2006). Xist cis-spreading along the X is still not well understood, but specific motifs may be involved. Notably, repeat elements such as long interspersed nuclear elements (LINE-1) are critical for the formation of a silenced compartment and AT-rich motifs may also be involved (Agrelo et al. 2009; Chow et al. 2010; Nguyen et al. 2011). Xist recruits the polycomb complex PRC2, which mediates the initial histone changes (e.g., methylation of lysine 27 of histone H3) on the inactive X (Plath et al. 2003). Additional changes include methylation at CpG islands and late replication.

Fig. 1

X inactivation and escape patterns. Before differentiation, the paternal X (Xp) and the maternal X (Xm) chromosomes are active. As cells differentiate, random X inactivation is initiated by coating one X chromosome with Xist RNA (pink cloud). This will become the inactive X, while the other X remains active (green chromosome). Since the process is random, either the Xp or the Xm is inactivated in a given cell, resulting in mosaicism in females. Some genes escape X inactivation, i.e., are expressed from both the Xi and the Xa (yellow bars) in all cells. Other genes escape from X inactivation in a subset of cells in a given tissue resulting in mosaicism of escape patterns. An additional layer of variability in escape patterns results from differences between individuals (Carrel and Willard 2005; Yang et al. 2010)

Genes that escape X inactivation are located throughout the X chromosome, but they predominate in the small regions of homology and pairing that persist on the sex chromosomes called the pseudoautosomal regions (PAR). As expected, genes within the PAR are usually not subject to X inactivation since functional, equivalent alleles are present on the X and Y chromosomes in males and on both X alleles in females. Not surprisingly, non-pseudoautosomal genes that retain a Y-linked copy also often escape X inactivation and thus have two expressed alleles in both male and female somatic tissues (Disteche et al. 2002). Nonetheless, a number of escape genes have lost or differentiated their Y-linked copy and would be predicted to have higher expression in females, which may cause phenotypic sex differences (see below). This suggests either that establishment of X inactivation lags behind Y degeneration, or that specific mechanisms exist to maintain expression of a subset of genes from the Xi due to selective advantages (Park et al. 2010).

Escape from X inactivation not only occurs in female soma, but has also been observed in male meiotic cells where meiotic sex chromosome inactivation (MSCI) takes place. In this cell type, X-linked microRNAs (miRNAs) escape silencing by MSCI, but the effects are unknown (Song et al. 2009). In the present review, we will limit our discussion of escape from X inactivation to female somatic cells. We will summarize the progress made in comparing the number and distribution of human and mouse escape genes, examine possible molecular mechanisms involved in facilitating escape from X inactivation and discuss the role of escape genes in sex differences and disease.

Escape from X inactivation in human and mouse

A systematic survey of human genes has shown that about 15% of X-linked genes consistently escape, based on expression analyses in rodent × human hybrid cell lines that retain a human Xi (Carrel and Willard 2005). In contrast to the situation in human, only 3% of mouse X-linked genes escape X inactivation, as shown by our next-generation RNA-sequencing survey of allele-specific expression, based on SNPs, in a somatic cell line derived from a cross between two mouse species (Yang et al. 2010). Thus, X inactivation in mice is more complete than in humans, although additional analyses will be necessary to extend these findings to other cell types and tissues. Measurements of the relative expression of each allele identified by SNPs in human and mouse cells demonstrate that expression from the Xi is significantly lower than from the Xa (Carrel and Willard 2005; Yang et al. 2010). Expression from the Xi ranges from a few percent of that from the Xa to an equivalent level, likely because of partial spreading of silencing in some of the escape domains. While these findings demonstrate a continuum between complete silencing and escape, genes whose expression from the Xi is at least 10% of that from the Xa are usually considered escape genes.
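The 10% criterion described above can be illustrated with a minimal sketch. It assumes that allele-specific read counts for the Xa and Xi are already available for each gene; the gene names, the read counts and the helper names (`AlleleCounts`, `xi_to_xa_ratio`, `classify`) are hypothetical and are not taken from the cited studies, which also apply coverage filters and statistical tests that are omitted here.

```python
from dataclasses import dataclass

# Conventional cutoff from the text: Xi expression of at least 10% of Xa.
ESCAPE_THRESHOLD = 0.10


@dataclass
class AlleleCounts:
    """Hypothetical allele-specific read counts for one X-linked gene."""
    gene: str
    xa_reads: int  # reads assigned to the active X allele
    xi_reads: int  # reads assigned to the inactive X allele


def xi_to_xa_ratio(counts: AlleleCounts) -> float:
    """Return expression from the Xi as a fraction of expression from the Xa."""
    if counts.xa_reads <= 0:
        raise ValueError(f"{counts.gene}: no reads from the Xa, ratio undefined")
    return counts.xi_reads / counts.xa_reads


def classify(counts: AlleleCounts, threshold: float = ESCAPE_THRESHOLD) -> str:
    """Label a gene 'escape' if its Xi/Xa ratio reaches the threshold."""
    return "escape" if xi_to_xa_ratio(counts) >= threshold else "subject"


if __name__ == "__main__":
    # Illustrative values only; these are not measured data.
    examples = [
        AlleleCounts("GeneA", xa_reads=1000, xi_reads=400),
        AlleleCounts("GeneB", xa_reads=1000, xi_reads=20),
    ]
    for c in examples:
        print(c.gene, round(xi_to_xa_ratio(c), 2), classify(c))
```

Because expression from the Xi forms a continuum, the label depends on where the cutoff is placed; passing a different `threshold` shows how the number of genes called escape would shift.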

The distribution of escape genes along the X is not random and differs between human and mouse. In humans, escape genes are clustered (as many as 13 adjacent genes in large domains ranging in size between approximately 100 kb and 7 Mb), whereas in mouse, single escape genes are embedded in regions of silenced chromatin. This suggests that escape from X inactivation in mouse may be controlled at the level of individual genes rather than chromatin domains (Carrel and Willard 2005; Tsuchiya et al. 2004; Yang et al. 2010). Most escape genes are located on the short arm of the human X, likely because this region has most recently diverged from the Y (Carrel and Willard 2005; Lahn and Page 1999; Ross et al. 2005). Another factor may be the centromeric heterochromatin of the human X, which could prevent proper cis-spreading of XIST RNA from its location on the long arm (Disteche 1999; Duthie et al. 1999). In mouse where the centromere is located at one end of the X, spreading of Xist RNA may be facilitated, contributing to fewer escape genes (Yang et al. 2010).

Little is known about the number and distribution of escape genes in other mammalian species. However, some information exists in marsupials, where key features of eutherian X inactivation are absent. Notably, Xist is absent in marsupials (Duret et al. 2006) and the paternal X is always silenced in somatic tissues (Graves 1996). Variable escape from X inactivation has been reported for a number of marsupial genes in a tissue-dependent manner that is incomplete and stochastic, suggesting that X inactivation is more heterogeneous in marsupials than in eutherian mammals (Al Nadaf et al. 2010; Deakin et al. 2009).

Variable escape

In eutherian mammals, escape from X inactivation can vary between tissues and/or individuals, but this variability appears to affect a minority of genes (Fig. 1). In human, about 10% of X-linked genes show variable escape, as shown by expression in some but not all hybrid cell lines retaining a human Xi (Carrel and Willard 2005). This variability has also been observed at the level of individuals (differences between women), and tissues or individual cells within a tissue (differences within an individual) (Anderson and Brown 1999; Carrel and Willard 1999, 2005). TIMP1 provides an example of a gene with variable escape between women and in different tissues (Anderson and Brown 1999). When analyzed in primary diploid human female cell lines, REP1 is expressed solely from the Xa in some cell lines, but is bi-allelic in others (Carrel and Willard 1999). Furthermore, expression from the Xi may change over time during development or adulthood: indeed, escape genes may be initially silenced, followed by reactivation during development or with aging (Schoeftner et al. 2009). For example, Kdm5c, a gene that consistently escapes X inactivation in adult mouse cells and tissues, is silenced in a number of embryonic cells, the number of which progressively diminishes during development (Lingenfelter et al. 1998). Changes in expression from the Xi can also be observed during normal aging. Reactivation of Otc has been detected in individual cells within tissues of aged mice (Wareham et al. 1987). Analyses of the chromatin structure of mouse escape genes in embryos and adult tissues also suggest changes in escape from X inactivation during development (Yang et al. 2010) (see below). Thus, while a subset of X-linked genes are stably inactivated, or stably escape X inactivation, another category of genes displays plasticity in terms of allelic expression. It has been suggested that this may contribute to phenotypic differences between women (Fig. 1) (Carrel and Willard 2005; Migeon 2007).

Molecular mechanisms of escape

While many features of the initiation of X inactivation are now understood, the process of heterochromatization of the entire X is still not fully clarified. How does Xist RNA coat the Xi, and how is the silent compartment formed? The study of escape genes can significantly help in figuring out the process of spreading and heterochromatin formation. Indeed, one would expect that escape genes are depleted of epigenetic changes characteristic of genes silenced by X inactivation. Importantly, escape genes, for example Kdm5c and Kdm6a, appear to be devoid of Xist RNA coating (Murakami et al. 2009) (Fig. 2). The distribution of repeat elements may also be important in establishment of domains of escape and inactivation. Overrepresentation of LINE-1 repeats on the X has been proposed as an important factor that may serve to anchor Xist RNA, thus aiding its spreading (Lyon 1998) (Fig. 2). Furthermore, a recent study in mouse shows that LINE-1 repeats themselves recruit Xist and are essential for the formation of a condensed heterochromatic core (Chow et al. 2010). Studies of escape genes have lent support to the LINE-1 hypothesis by demonstrating that such genes have fewer LINE-1 repeats (Bailey et al. 2000; Carrel et al. 2006; Ross et al. 2005).

Fig. 2

Molecular characteristics of escape and inactivated X-linked genes. This model shows that silenced regions on the inactive X are coated with the non-coding Xist RNA (pink cloud). Nucleosomes in the inactive X territory contain histones decorated with modifications associated with transcriptional repression and chromatin condensation, for example, H3K27me3 and H3K9me3, as well as the variant histone macroH2A (green circles). In addition, the CpG islands at the 5′ end of inactivated genes are methylated (solid black circles). Silenced chromatin also harbors specific DNA motifs (e.g., LINE-1 elements and AT-rich motifs) (yellow and blue boxes, respectively) to facilitate Xist RNA binding as well as binding of specific proteins such as SATB1. In contrast, genes that escape X inactivation are not coated with Xist, contain nucleosomes with modifications associated with active transcription, for example, H3 and H4 acetylation and H3K4me3, and their CpG islands are unmethylated (open circles). Chromatin between inactivated and escape genes may be bound by the chromatin insulator CTCF preventing the spread of heterochromatin into the escape regions or, vice versa, the spread of euchromatin into the silenced regions (Filippova et al. 2005)

In addition to LINE-1 elements, other repeats and sequence motifs may also play a role in escape. In mouse, long terminal repeats (LTRs) are depleted at escape genes (Tsuchiya et al. 2004). There is also evidence that AT-rich motifs are depleted at escape genes (Fig. 2). Wang et al. (2006) reported that the majority of 3-mers and 5-mers enriched around human genes subject to X inactivation, but not around escape genes, are AT-rich. This is consistent with our recent observations that mouse escape genes contain fewer AT-rich motifs than genes subject to X inactivation (Nguyen et al. 2011). The role of AT-rich motifs in the process of silencing by X inactivation is supported by findings that SATB1, a protein that specifically binds AT-rich DNA, both binds to the Xi and causes disruption in the silencing function of Xist if mutated in mouse cells (Agrelo et al. 2009). Taken together, these observations suggest that the distribution and content of DNA sequence motifs on the X influence silencing and escape by controlling Xist RNA recruitment and/or binding of specific proteins that silence genes on the Xi. In addition to Xist, other ncRNAs could be involved in establishing escape and inactivation domains, as suggested by a study in which genes that transcribe ncRNAs with female-biased expression were found adjacent to known protein-coding escape genes (Reinius et al. 2010).

Among the epigenetic marks associated with X silencing, tri-methylation at lysine 27 of histone H3 (H3K27me3) is a repressive chromatin modification catalyzed by the PRC2 complex (Plath et al. 2003). Other repressive marks enriched on the Xi include tri-methylation at lysine 9 of histone H3 (H3K9me3) and incorporation of the histone variant macroH2A (Changolkar and Pehrson 2006; Heard et al. 2001; Peters et al. 2002) (Fig. 2). Such repressive marks are depleted at escape genes (Boggs et al. 2002; Yang et al. 2010; Goto and Kimura 2009; Changolkar et al. 2010). In contrast, active marks such as acetylation of H3 and H4 and tri-methylation at lysine 4 of histone H3 (H3K4me3) are lost from silenced chromatin, but are retained within euchromatic regions of escape (Goto and Kimura 2009; Jeppesen and Turner 1993; Marks et al. 2009; Khalil and Driscoll 2007) (Fig. 2).

How do escape and silenced regions on the Xi co-exist adjacent to each other? There is evidence of boundary elements, such as chromatin insulator proteins that may block either the spreading of heterochromatin into escape regions or, vice versa, the spreading of euchromatin into the silenced regions. The chromatin insulator protein CTCF binds to the transition region between the escape gene Kdm5c and the inactivated gene Iqsec2 in mouse (Filippova et al. 2005). In contrast, CTCF does not bind to the corresponding transition region in human where both genes escape X inactivation. Another region of transition between the escape gene Eif2s3x and the silenced gene Klhl15 also binds CTCF. We have also determined that the CpG island at the 5′ end of Kdm5c remains hypomethylated throughout mouse development, possibly because it is bound by CTCF (Filippova et al. 2005) (Fig. 2). Kdm5c retains its escape status when it and its flanking regions are inserted into inactivated loci on the Xi, indicating that the gene itself together with limited flanking sequences is sufficient to elicit escape (Li and Carrel 2008). However, there is evidence that CTCF-binding alone is not sufficient to protect from silencing. Indeed, insertion of CTCF-binding sites from the HS4 insulator site (from the chicken β-globin gene cluster) at each end of a short reporter gene does not prevent its silencing when inserted within an inactivated gene on the mouse Xi (Ciavatta et al. 2006). Further studies are needed to sort out the role of CTCF in escape from X inactivation. Spatial orientation of chromatin is probably critical in maintaining escape from X inactivation. One attractive hypothesis is that escape results from looping of specific domains out of the condensed Xi territory (Heard and Bickmore 2007; Filippova et al. 2005) (Fig. 2). This hypothesis is further supported by evidence that LINE-1 elements are important for the formation of a condensed core of heterochromatin, which would nucleate X-linked repeats, around which silenced genes would be arranged within the Xist cloud, while escape genes would occupy the outermost layer (Chow and Heard 2009). CTCF could anchor chromatin loops that contain the escape genes, thus ensuring that flanking regions remain within the condensed heterochromatin; in this case, the role of CTCF would be to protect silenced domains from unraveling and from reactivation.

As discussed above, a subset of escape genes appears to have variable patterns of escape (Fig. 1). That such genes would also have tissue-specific chromatin modifications is supported by our chromatin profiles of the mouse X, which reveal changes in H3K27me3 in a developmental stage and tissue-specific manner (Yang et al. 2010). For example, Mid1 is enriched in H3K27me3 in embryos, but not in liver. Specific enzymes such as the histone demethylases KDM6A and KDM6B may play an important role in removing H3K27me3 at escape genes in specific cell types (Agger et al. 2007; Hong et al. 2007; Lan et al. 2007).

Escape genes and sex differences

Dosage compensation has evolved to ensure equal expression from the X chromosome and autosomes between the sexes. However, evidence exists that sexually dimorphic genes are predominant on the X (Arnold and Burgoyne 2004; Saifi and Chandra 1999; Vawter et al. 2004; Yang et al. 2006). Sexual dimorphisms are largely attributed to the action of sex-specific steroid hormones. Nevertheless, prior to the presence of serum testosterone in rodent embryos, sex differences are already detectable in transcription of specific genes, arguing for the involvement of additional factors in sexual differentiation (Burgoyne et al. 1995; Carruth et al. 2002; Dewing et al. 2003; Reisert and Pilgrim 1991; Sanchez and Vilain 2010). Development of the four-core genotype (FCG) mouse model has made it possible to study the phenotypic effects of sex chromosome complement independent of sex-specific hormones. Briefly, when gonadal males with the Sry gene deleted from the Y chromosome and inserted on an autosome (XY⁻Sry) are mated with XX females, four-core genotypes are produced: XX females, XY⁻ females, XXSry males and XY⁻Sry males. The FCG system has clearly shown that the sex chromosome complement as well as gonadal sex can cause sex differences (reviewed in Arnold and Chen 2009). In addition to Y-linked genes, clearly unique to males, X-linked genes, especially escape genes, would also contribute to sex differences. Differential expression between the sexes makes escape genes plausible candidates for causing sex differences in development and physiology (Arnold 2009; Jazin and Cahill 2010; Xu and Disteche 2006; Bermejo-Alvarez et al. 2011). Furthermore, the sex chromosome complement also influences expression of autosomal genes, suggesting a global effect (Wijchers et al. 2010).

For genes outside the PAR, escape from X inactivation results in higher expression in females because of bi-allelic expression compared to mono-allelic expression in males. Because expression from the Xi is significantly lower than from the Xa (see above), the total expression level in females is usually less than twice that in males. The contribution of escape genes to sex differences has been studied by comparing expression levels in male and female cell lines and tissues in human and mouse (Talebizadeh et al. 2006; Yang et al. 2006; Johnston et al. 2008). In human lymphoblastoid cell lines, the contribution of escape genes to sex differences in gene expression has been found to be consistent between multiple individuals, but differences between the sexes are relatively modest (Johnston et al. 2008).
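The arithmetic behind this expectation can be made explicit with a small sketch. It assumes, as a simplification not stated in the source, that the Xa is expressed at the same level in both sexes and that no Y-linked paralogue contributes in males; `xi_fraction` (expression from the Xi relative to the Xa) and the function name are hypothetical.

```python
def female_to_male_ratio(xi_fraction: float) -> float:
    """Expected female/male expression ratio for a non-PAR escape gene.

    Simplifying assumptions: males express one Xa allele (level 1.0),
    females express one Xa allele plus one Xi allele at `xi_fraction`
    of the Xa level, and no Y-linked copy contributes.
    """
    if not 0.0 <= xi_fraction <= 1.0:
        raise ValueError("xi_fraction must lie between 0 and 1")
    male = 1.0                  # Xa only
    female = 1.0 + xi_fraction  # Xa plus partially expressed Xi
    return female / male


if __name__ == "__main__":
    # Illustrative Xi fractions spanning the escape continuum.
    for r in (0.1, 0.3, 0.5, 1.0):
        print(f"Xi/Xa = {r:.1f} -> female/male = {female_to_male_ratio(r):.1f}")
```

Under these assumptions the ratio equals 1 + Xi/Xa, so it reaches 2 only when the Xi allele is expressed as strongly as the Xa allele; an Xi fraction at the 10% cutoff gives a female excess of only about 10%.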

Escape genes may be especially important in brain function and this could lead to sex differences (Xu and Disteche 2006). About 10% of X-linked genes but only 3% of autosomal genes cause intellectual disability when mutated (Ropers 2008; Ropers and Hamel 2005; Zechner et al. 2001). This makes the X chromosome an attractive target for study with regard to brain development and function. This enrichment for brain-related functions among X-linked genes seems to be especially pronounced among escape genes. Despite fewer escape genes in mouse than human (Berletch et al. 2010; Lopes et al. 2010; Yang et al. 2010), the majority of the mouse escape genes identified thus far play a role in brain development and are conserved in human where they have been implicated in intellectual disability syndromes (Ropers 2010). For example, Kdm5c is a key player in mouse brain development, which regulates neuronal differentiation, neuronal cell survival and dendritic outgrowth (Iwase et al. 2007; Tahiliani et al. 2007). In humans, KDM5C mutations result in intellectual disability, epilepsy, aggression and autism (Abidi et al. 2009; Adegbola et al. 2008; Jensen et al. 2005). Other examples are Ube1x, a ubiquitin activating enzyme that protects the functional integrity of synapses in response to chemical imbalance, and Ddx3x, an RNA helicase that helps transport mRNAs from the nucleus to active synapses for local protein translation (Kanai et al. 2004; Wishart et al. 2007). These escape genes are transcribed more highly in females than males, and the difference is not compensated by their Y-linked paralogues, which are either not transcribed (e.g., Ube1y and Usp9y), transcribed at significantly lower levels than the X paralogues (e.g., Kdm5d) or not translated (e.g., DDX3Y) in the brain (Ditton et al. 2004; Koopman et al. 1989; Xu et al. 2002, 2008a, 2008b). A number of ncRNAs also escape X inactivation in particular regions of the mouse brain, suggesting that they may have a role in mental processes specific to these regions (Reinius et al. 2010).

X inactivation and escape could not only enhance phenotypic differences between males and females, but could also enhance variability within the female sex due to mosaicism for cells with the Xm or Xp inactivated and to variable escape from X inactivation (Fig. 1). For example, some females can perceive significantly more colors in comparison to males, due to mosaicism for expression of the two X-linked retinal photopigment alleles (Jameson et al. 2001; Migeon 2007). In addition, monozygotic female twins are less similar than monozygotic male twins in traits such as social interaction and verbal skill (Loat et al. 2004).

Escape genes and disease

Escape genes, including PAR genes, play important roles in human diseases as women with a single X (X chromosome monosomy; 45,X) have Turner syndrome with severe phenotypes including ovarian dysgenesis, short stature, webbed neck and other physical abnormalities. In addition, as many as 99% of 45,X embryos die in utero (Hook and Warburton 1983). Deficiency in escape genes is thought to play a major role in phenotypes observed in Turner individuals (Zinn and Ross 2001). Since the Y chromosome protects men from these deficiencies, the most likely Turner syndrome candidate genes would have a Y copy, except for genes that control female-specific phenotypes such as ovarian function or processes such as X inactivation itself. Few specific genes have been directly identified as causing Turner syndrome. Thus far, the PAR gene SHOX that encodes a homeodomain transcription factor important for limb development in human has been implicated in the short stature phenotype (Clement-Jones et al. 2000). Interestingly, early lethality of 45,X embryos may be due to a defect in placenta development, which is supported by the finding that many placental genes have much higher expression in 46,XX versus 45,X cells (Urbach and Benvenisty 2009). Urbach and Benvenisty (2009) posit that placental defects in 45,X fetuses could result from the significantly lower (ninefold) expression of the PAR gene CSF2RA that encodes a receptor for a hematopoietic differentiation factor.

The fact that few escape genes exist in mouse is consistent with the significant differences in the impact of X monosomy in female mice and in women (Yang et al. 2010). Candidate genes for the severe phenotypes in Turner syndrome would likely be either genes located within the large human PAR (e.g., SHOX) but located on autosomes in mouse where the PAR contains a single gene (Sts), or non-pseudoautosomal genes that escape from X inactivation in human but are subject to X inactivation in mouse. Interestingly, deletion mapping has identified an 8.3-Mb interval at Xp22.3 involved in the neurocognitive symptoms of Turner syndrome, which contains two human escape genes, STS and NLGN4X, both involved in synaptic/dendritic development (Jamain et al. 2003; Laumonnier et al. 2004; Reed et al. 2005; Zinn et al. 2007).

Escape from X inactivation can also cause phenotypes in individuals with additional copies of the X chromosome. While these supernumerary X copies are inactivated, genes that escape would be present in triple rather than double dose. For example, phenotypes associated with trisomy X (47,XXX) could be a result of over-expression of genes escaping X inactivation (Tartaglia et al. 2010b). Neurological studies of individuals with sex chromosome aneuploidy suggest that X inactivation and escape make important contributions to brain development and function. Indeed, X aneuploidy causes disruptions in cognitive, emotional, and/or behavioral development, for example in 47,XXY Klinefelter’s syndrome (Geschwind et al. 2000; Tartaglia et al. 2010a).
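The dosage reasoning for sex chromosome aneuploidies can be sketched as follows. This is a simplified additive model under stated assumptions, not a method from the cited studies: one X is active, every additional X is inactivated, each Xi contributes `xi_fraction` of the Xa level, and each Y contributes `y_fraction` (1.0 for a PAR gene with an equivalent Y allele, 0.0 for a gene without a functional Y copy). The function and parameter names are hypothetical.

```python
def relative_dosage(n_x: int, n_y: int = 0,
                    xi_fraction: float = 1.0,
                    y_fraction: float = 0.0) -> float:
    """Expected expression of an X-linked gene, in units of one Xa allele.

    Assumes one active X, (n_x - 1) inactive X copies that each
    contribute `xi_fraction`, and n_y Y copies that each contribute
    `y_fraction`. Contributions are simply summed.
    """
    if n_x < 1 or n_y < 0:
        raise ValueError("need at least one X and a non-negative Y count")
    return 1.0 + (n_x - 1) * xi_fraction + n_y * y_fraction


# Karyotype -> (number of X chromosomes, number of Y chromosomes)
KARYOTYPES = {
    "45,X": (1, 0),
    "46,XX": (2, 0),
    "46,XY": (1, 1),
    "47,XXX": (3, 0),
    "47,XXY": (2, 1),
}

if __name__ == "__main__":
    for name, (n_x, n_y) in KARYOTYPES.items():
        # PAR-like gene: full escape and an equivalent Y allele.
        par = relative_dosage(n_x, n_y, xi_fraction=1.0, y_fraction=1.0)
        # Escape gene with no Y copy and an illustrative Xi fraction of 0.3.
        esc = relative_dosage(n_x, n_y, xi_fraction=0.3, y_fraction=0.0)
        print(f"{name}: PAR-like = {par:.1f}, escape without Y = {esc:.1f}")
```

In this sketch a PAR-like gene is at single dose in 45,X, double dose in 46,XX and 46,XY, and triple dose in 47,XXX and 47,XXY, which matches the dosage argument in the text; genes with partial escape deviate from the euploid level by smaller amounts.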

Another potential role for escape from X inactivation is in aging. A recent study has found epigenetic alterations including X reactivation in a mouse model of accelerated aging due to telomere shortening (Schoeftner et al. 2009). As mentioned above, inappropriate reactivation of Otc has been reported in tissues of aged mice (Wareham et al. 1987). So far, no such reactivation of X-linked genes has been observed in human (Migeon 2007). It will be important to determine whether environmental factors could cause inappropriate escape from X inactivation due to changes in epigenetic marks. This could result in disease phenotypes due to increased expression of genes that are normally repressed, or to expression of X-linked mutations in female carriers.