Introduction

After the elucidation of the genetic code in the late 1950s, the information encoded by genetic material (DNA and mRNA nucleotide sequences) was considered unchangeable, passing faithfully from one cell to another and from one generation to the next except when genetic mutations occurred. However, a few decades ago a new code emerged, the epigenetic code, which revealed that the message harbored by the nucleotide sequence of DNA can differ depending on the epigenetic marks it carries. Epigenetic marks can thus be defined as modifications of the DNA or of associated proteins such as histones, other than changes in the DNA sequence itself, that are heritable through cell division. DNA methylation, histone modifications, non-coding RNAs and other forms of higher-order chromatin remodeling affect how tightly DNA is packaged in chromatin and, consequently, its accessibility to the transcriptional machinery [155].

The epigenome is stable, but also flexible, and can change rapidly during cell development and in response to environmental factors. Therefore, the epigenetic status of a gene is much more dynamic than its DNA nucleotide sequence [360]. Epigenetic marks are considered to be relatively stable during cell growth and development due to somatic maintenance (mitotic transmission), but during reprogramming events in the gametes and early embryos, there is a global erasure of epigenetic marks, which will be further re-established. Some loci can escape from this clearing and the epigenetic pattern will then be transmitted from one cell generation to the next, a process referred to as transgenerational inheritance [60].

Epigenetic changes are crucial for normal development, and disruptions of the epigenetic program can trigger pathogenic processes. In 1983, cancer became the first disease shown to have altered epigenetic marks: colorectal cancers exhibited a global loss of DNA methylation in comparison with their normal counterparts [91]. This hypomethylation resulted in genomic instability and chromosome rearrangements, and induced aberrant activation of certain genes [90]. Subsequent studies found that DNA hypermethylation of the promoter region of tumor suppressor genes is a frequent event in tumors [140]. Later, other epigenetic mechanisms, such as histone modifications, were also found to be abnormally regulated [98].

The exact mechanisms that lead to the aberrant epigenetic pattern in cancer remain unknown. Interestingly, numerous single-gene disorders of the epigenetic machinery result in impaired gene expression, suggesting that mutations in genes coding for epigenetic enzymes might have a major role in the pathogenesis of cancer and other complex diseases. In support of this, epigenetics can help to explain the origin of some diseases with non-Mendelian inheritance whose etiology goes beyond genetic abnormalities [361]. Hence, both the genome and the epigenome can contribute to phenotypic variation and disease susceptibility, and the complex outcome of common and rare diseases may be compounded by genetic and epigenetic anomalies as well as by the interaction between them. Currently, genetic variants within genes with epigenetic function are being analyzed in order to find evidence of the particular role of epigenetics in the etiology of a wide range of diseases. Genetic association studies are also a very useful tool for identifying polymorphisms that vary systematically among individuals who differ in phenotype and that could represent risk-enhancing or protective alleles for a certain disease [14].

In this chapter, we will focus on epigenetic diseases associated with (i) the disruption of DNA methylation patterns by genetic defects in DNA methyltransferases, as in the case of the immunodeficiency, centromeric instability, and facial anomalies (ICF) syndrome and cancer; (ii) the inability to recognize methylated DNA because of altered functioning of MBD proteins, as is the case in Rett syndrome or systemic lupus erythematosus; (iii) genetic variants of one-carbon metabolism enzymes, which coordinate the availability of the methyl group; (iv) other mutations that may have an impact on DNA methylation; (v) erroneous histone modifications caused by mutations in histone-modifying enzymes, which can also cause rare diseases such as the Rubinstein-Taybi, Coffin-Lowry, and Sotos syndromes; and finally, (vi) the altered distribution and/or function of chromatin remodeling proteins, which might affect chromatin structure with the consequent deregulation of gene expression, as happens in α thalassemia X-linked mental retardation, CHARGE, Cockayne, and facioscapulohumeral syndromes, and cancer (Table 1).

Table 1 Altered epigenetic mechanisms observed in genetic syndromes

Genetic control of DNA methylation

Methylation of DNA at the carbon 5 of cytosine residues that precede guanines (usually referred to as CpG dinucleotides) is the most studied and best-known epigenetic modification in mammals and other vertebrates. In healthy cells, methylated cytosines represent 70–80 % of total cytosines, but they are not randomly distributed. Instead, there are CpG-rich and CpG-poor regions. Those areas of the genome with higher concentrations of CpGs are called CpG islands and are located at the promoter region of approximately 50 % of genes in human and also follow tissue- and cell-specific patterns [79]. In healthy cells, most of the CpG islands are unmethylated, whereas repetitive sequences, intergenic regions and the body of the genes are usually heavily methylated [85]. The methylation outside CpG islands is related to the chromosomal stability maintenance, translocation prevention, and silencing of endoparasitic inserted sequences [26]. In some cases, gene-promoter regions can also show dense methylation in normal physiological status. This would result in transcriptional silencing and could be the case of genes regulated by genomic imprinting, genes located in the X-chromosome in women, and tissue-specific repressed genes. DNA methylation patterns are established in the early stages of embryonic development and further maintained after each round of cell division by the DNA methyltransferases.
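As a rough illustration of what "CpG-rich" means in practice, the sketch below computes the GC fraction and the observed/expected CpG ratio of a sequence. The thresholds used (length of at least 200 bp, GC fraction above 0.5, observed/expected ratio above 0.6) follow one commonly used heuristic for calling CpG islands and are assumptions, not values taken from this chapter; the function names are hypothetical.

```python
def cpg_stats(seq):
    """Return (GC fraction, observed/expected CpG ratio) for a DNA sequence."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("empty sequence")
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")            # CpG dinucleotides on this strand
    gc_fraction = (c + g) / n
    expected = c * g / n             # CpGs expected if C and G were independent
    obs_exp = cpg / expected if expected > 0 else 0.0
    return gc_fraction, obs_exp


def looks_like_cpg_island(seq, min_len=200, min_gc=0.5, min_obs_exp=0.6):
    """Apply the assumed length / GC / observed-expected thresholds."""
    if len(seq) < min_len:
        return False
    gc_fraction, obs_exp = cpg_stats(seq)
    return gc_fraction > min_gc and obs_exp > min_obs_exp


if __name__ == "__main__":
    rich = "CG" * 100   # artificial CpG-dense sequence
    poor = "AT" * 100   # artificial sequence without any CpG
    print(looks_like_cpg_island(rich), looks_like_cpg_island(poor))
```

Real island callers scan the genome with sliding windows and merge adjacent hits; this sketch only scores a single, pre-selected sequence.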

Alterations of the normal DNA methylation pattern are associated with human disease [286], like infertility [272], genetic syndromes, autoimmune disorders [226], cancer [90, 165], and aging [27] (Table 1). However, the molecular mechanisms involved in the establishment of aberrant DNA methylation are still poorly understood. Animal models like the agouti and axin-fused alleles in mice and the white gene in Drosophila melanogaster are good examples of epigenetic disturbance with a genetic basis. In addition, genetic defects in the enzymes that transfer methyl groups to DNA, or those that participate in one-carbon metabolism might be possible sources of abnormal DNA methylation, as discussed later.

Animal models of mutation-induced epigenetic changes

The term “metastable epialleles” has been coined to define those alleles that display phenotypic diversity in absence of genetic heterogeneity to distinguish them from “normal” alleles. Thus, the activity of these alleles could be attributed to their epigenetic status [273]. One example of a metastable epiallele is the viable yellow agouti (A vy) allele in murine models. The wild-type murine agouti gene encodes a protein that determines the relative amount of black (eumelanin) and yellow (phaeomelanin) pigment in hair. The transcription start site is located within exon 2, and the expression of the gene varies with development, resulting in a sub-apical yellow band on each black or brown hair, which is the typical brown agouti coat color of wild-type mice. The A vy allele results from the insertion of a contraoriented intracisternal A particle (IAP) retrotransposon upstream of the transcription start site of the agouti gene [74]. As a consequence, constitutive and ectopic expression of agouti is induced not only in hair follicles but in every cell. The murine phenotype of the A vy allele is characterized by yellow fur, obesity, diabetes, and higher susceptibility to tumors [372]. Interestingly, CpG methylation of the newly established A vy IAP promoter is inversely correlated with agouti expression and the degree of methylation causes variation in coat color that ranges from yellow (unmethylated) to pseudoagouti (methylated) among isogenic A vy/a mice. The density of methylated CpG depends on maternal nutrition and environmental exposures during early development [72]. Another metastable epiallele is the axin-fused (Axin Fu), identified in 1937 [278]. The role of the axin protein is to regulate embryonic axis formation in vertebrates [378]. Similarly to the A vy allele, the Axin Fu contains an IAP retrotransposon within intron 6 [343]. The typical phenotype of Axin Fu is the kinked tail, due to axial duplications during embryogenesis. 
However, the Axin Fu phenotype has variable expression, which ranges from normal to kinked tail and correlates with differential DNA methylation at the IAP [274] and exhibits epigenetic plasticity to maternal diet supplementation [359]. The epigenetic state of both the agouti and Axin Fu alleles can be inherited transgenerationally after parental transmission, maternal in the case of agouti, and both maternal and paternal in the case of axin-fused, resulting in the inheritance of the phenotype, probably due to an inefficient reprogramming of the epigenetic marks during gametogenesis [236, 274].

Also, changes in the activity of a gene or the distribution of its product within a tissue and/or a developmental stage could be due to position effects. H. Muller was the first to describe in 1930 the position-effect variegation in the white gene of Drosophila melanogaster [240], attributing the inactivation of the gene in some cells to its abnormal juxtaposition with heterochromatin. The white gene of Drosophila melanogaster, located at the distal end of the X chromosome and subjected to gene dosage compensation, is related to the pigmentation of Malpighian tubules, testes and eyes of the fruit fly, and expressed in a tissue- and time-specific manner. Chromosome rearrangements place the wild-type gene next to a block of centromeric heterochromatin in some cells, dramatically altering the expression of the gene which, in the case of the eye, gives rise to a variegated phenotype composed of patches from rearranged (colorless) and wild-type (red) alleles. Replacement of the white gene in different euchromatic sites within the genome reversed the effect, demonstrating that the colorless phenotype is due to a position effect and not to mutations within the gene [168].

Epigenetic changes induced by genetic alterations at DNA methyltransferases

DNA methylation is established by a family of proteins called DNA methyltransferases (DNMTs). These enzymes catalyze the addition of methyl groups to cytosine residues at CpG [285], using S-adenosylmethionine (SAM) as the methyl group donor. Three active DNMTs have been described in mammals: DNMT1, DNMT3A, and DNMT3B. DNMT1 exhibits a strong preference for hemimethylated DNA and is therefore responsible for maintaining methylation patterns after DNA replication [18, 24, 126, 271, 314, 346]. DNMT3A and DNMT3B are the de novo methyltransferases: they can establish the DNA methylation pattern, and they show equal preference for unmethylated and hemimethylated DNA [197, 254]. A third homologue of this family, DNMT3L, lacks methyltransferase capacity but is known to assist the other members of the family in methylation reactions by interacting with their catalytic domains [125]. DNMT3L also interacts with histone deacetylases [1]. Each active DNMT is essential for life, as shown by homozygous knockout models: loss of either Dnmt1 or Dnmt3b causes embryonic lethality in mice, and Dnmt3a-deficient mice die soon after birth [198, 253]. Experiments in Dnmt3l-deficient mice have shown that the heterozygous progeny of Dnmt3l knockout females exhibit loss of imprinting, stochastic imprinting, and biallelic expression of imprinted genes [31].
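The division of labor between maintenance and de novo methylation can be sketched with a deliberately simplified toy model. Each CpG site is a boolean (methylated or not), replication produces an unmethylated nascent strand, a DNMT1-like step copies marks only where the template strand is methylated, and a DNMT3A/3B-like step methylates chosen positions regardless of their prior state. All function names are hypothetical, and the model ignores enzyme kinetics, errors, and chromatin context.

```python
def replicate(template):
    """Semi-conservative replication: the nascent strand starts unmethylated."""
    return list(template), [False] * len(template)


def maintain(template, nascent):
    """DNMT1-like step: methylate nascent CpGs only at hemimethylated sites."""
    return [n or t for t, n in zip(template, nascent)]


def de_novo(strand, targets):
    """DNMT3A/3B-like step: methylate the chosen positions, whatever their state."""
    targets = set(targets)
    return [m or (i in targets) for i, m in enumerate(strand)]


if __name__ == "__main__":
    pattern = [True, False, True, True]      # parental methylation pattern
    template, nascent = replicate(pattern)

    # With maintenance, the nascent strand recovers the parental pattern.
    print(maintain(template, nascent) == pattern)

    # Without maintenance, the nascent strand stays bare (passive loss).
    print(any(nascent))

    # De novo methylation can add a mark at a previously unmethylated site.
    print(de_novo(pattern, targets=[1]))
```

In this model, skipping the maintenance step for one round is enough to erase the pattern in the lineage that inherits the nascent strand, which is the intuition behind passive demethylation.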

Genetic defects in DNMT3B: Immunodeficiency, centromeric instability, and facial anomalies syndrome

The immunodeficiency, centromeric instability, and facial anomalies (ICF) syndrome (OMIM #242860) is a rare autosomal recessive disorder that was first described in the 1970s [150, 331], and very few cases have been detected worldwide. Affected children present with variable immunodeficiency, which results in recurrent infections, mild facial anomalies, and chromosomal abnormalities in the juxtapericentromeric heterochromatin of chromosomes 1, 9, and 16 [150, 331], and in the heterochromatic region of the Y and the inactive X chromosomes [136].

The ICF syndrome is, so far, the only disorder known to be caused by germline mutations in a gene encoding a DNMT, the de novo methyltransferase DNMT3B, as first described by Hansen et al. [137]. Nevertheless, two types of ICF syndrome have been described based on the presence or absence of mutations in the DNMT3B gene: ICF type 1, which corresponds to patients who exhibit biallelic DNMT3B mutations [160], and ICF type 2, in which mutations in the exons of DNMT3B have not been detected, although most patients show hypomethylation at the centromeric Satα tandem repeat [160]. Unless otherwise indicated, ICF will be used to denote the type 1 disease.

The typical symptoms of ICF children are probably due to DNA methylation defects and diverge in three aspects. Firstly, important decreases in serum immunoglobulins are seen in spite of the normal presence of B cells, which predispose to recurrent infections responsible for high mortality rates in early childhood [147] although the immunodeficiency can range from mild reduction of the immune response to severe agammaglobulinemia [132]. Secondly, facial anomalies, which are also variable, are not severe but frequent [132]. Typically, an ICF child has a broad flat nasal bridge, very widely spaced eyes, epicanthic folds, and low-set ears [310]. Less frequent but often related to the syndrome are the presence of a small jaw and macroglossia (protrusion and enlargement of the tongue). In addition to facial anomalies, 50 % of the affected fail to thrive properly and have lower weights at birth [132]. About one-third of the patients can also suffer from mental retardation or other neurologic defects affecting motor and cognitive function [311]. Finally, and used for diagnosis, lymphocytes of ICF patients exhibit hypomethylation of satellite DNA sequences (Sat2 and Sat3) in the juxtacentromeric heterochromatin of chromosomes 1, 9, and 16 [158, 334] because of defects in DNMT3B. Acrocentric NBL2 and subtelomeric D4Z4 tandem repeats are also less methylated but still appreciable in comparison with normal cells [80]. This hypomethylation is likely to originate characteristic rearrangements of the chromosomes mentioned above, including breaks, deletions of whole arms and multibranched chromosomes, and chromatin decondensation [78]. These cytogenetic features observed in ICF cells were very similar to the ones induced in normal cells treated with 5-azacytidine, which led researchers to combine efforts in the study of DNA methylation defects in ICF patients [302, 345]. 
More recently, it has been shown that in ICF-derived cells, the levels of telomeric repeat-containing RNA (TERRA), involved in the formation of telomeric heterochromatin [67], are abnormally elevated and telomeres are particularly shortened. This might be due to mutations in the DNMT3B gene and the consequent defects in DNA methylation [66].

Most of the reported patients with ICF syndrome exhibited heterozygous and varied missense mutations within the part of the gene that contains the ten conserved motifs responsible for the catalytic activity of DNMT3B [137, 366]. As a result, catalytic activity may be impaired, although the way in which this happens is genetically heterogeneous [362]. Occasionally, ICF patients have mutations located in the amino-terminal region of the catalytic domain [337, 365] that do not affect the capacity to transfer methyl groups but instead affect the interaction with DNMT3L, which stimulates DNA methylation by DNMT3B [320]. Other point mutations also affect the localization of the enzyme [337], substrate binding, and oligomerization [232].

Several mouse models were designed to better understand the ICF syndrome. A deletion of the catalytic domain of Dnmt3b (null mutation) resulted in embryonic lethality [337], similar to that observed in Dnmt3b-deficient mice [253]. However, other murine models carrying two human ICF-like missense mutations in the catalytic domain allow the embryos to develop to term. Post-natal mice showed an ICF-like phenotype, characterized by low body weight at birth, facial anomalies (shorter nose and wider nasal bridge), and hypomethylation of repeats [337]. These observations lead to the assumption that patients retain residual activity of this methyltransferase; otherwise, homozygous null DNMT3B mutations would have led to early abortions.

To conclude, it appears that deregulation of DNMT3B induces hypomethylation of satellite DNA-rich regions located mainly in the juxtapericentromeric heterochromatin of certain chromosomes, which in turn alters gene expression due to abnormal scavenging of transcription factors, changes in nuclear architecture, and expression of non-coding RNAs, as proposed by several works [80, 106, 159]. Thus, further research needs to focus on the exact function of this region of heterochromatin and the target genes affected, which might be related to immune function, development, and neurogenesis [161] by altered transcription factor binding and other trans-acting effects. Deeper knowledge into the molecular physiopathology of murine ICF syndrome model will definitely shed light on this.

Genetic defects in DNMTs are associated with cancer

One of the first reports that establish a clear connection between DNA methylation and cancer was published by A. Feinberg and B. Vogelstein in 1983, who found extensive hypomethylation in certain genes of tumor samples in comparison with the corresponding normal tissue [91]. DNA hypomethylation affects mainly repetitive sequences dispersed within transposable elements, introns, and coding regions [85]. The functional meaning of this loss of DNA methylation results in chromosome instability and reactivation of transposable elements. When transposable elements are hypomethylated, they would become transcriptionally active and could mediate genomic rearrangements and act as alternative promoters altering the expression of genes implicated in the regulation of cell cycle, apoptosis, and DNA repair. Also, imprinted genes can be reactivated [86]. However, what has really shed light on epigenetic cancer hallmarks is the aberrant site-specific DNA hypermethylation. The de novo hypermethylation of CpG islands located within the promoter region of approximately half of the genes, which are usually unmethylated in normal tissues, results in transcriptional silencing of tumor suppressor genes (TSG), cell cycle control genes, apoptosis-related genes, and DNA repair genes [234], and this is probably a crucial step to consolidate tumor appearance and development. These epigenetic alterations could depend on genetic mutations or polymorphisms in DNMTs and certain genes encoding enzymes of the one-carbon metabolism pathway, altering their normal function.

DNMT genes are usually up-regulated in several types of cancer [162, 204, 261, 287, 303]. In particular, significant over-expression of DNMT3B is observed while over-expression of DNMT1 and DNMT3A is more modest [261, 287]. DNMT1 and DNMT3B seem to be directly implicated in the altered distribution of DNA methylation in cancer cells [20] and chromosomal instability [172]. In particular, lack of Dnmt3b blocks the transition from microadenoma to tumor in the murine Apc Min/+ colon cancer model, demonstrating the vital role of this DNMT in cancer progression [206].

DNMT3B plays an important role in the establishment of aberrant DNA methylation in malignant tumors because it contributes to gene promoter silencing through DNA hypermethylation of tumor suppressor genes and genes involved in cell cycle, apoptosis, and DNA repair. Also, methylation of CpG dinucleotides might facilitate the occurrence of C-to-T transition mutation in genes involved in carcinogenesis [123] and increase susceptibility to environmental carcinogens [68, 374].
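The C-to-T transitions mentioned above can be pictured with a small sketch. It relies on a background fact that is not spelled out in this chapter and is therefore labeled as an assumption: deamination of 5-methylcytosine yields thymine, so a methylated CpG can be read as TpG after mutation. The function names are hypothetical and the sequence is artificial.

```python
def cpg_positions(seq):
    """Return the 0-based indices of cytosines that are followed by a guanine."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def deaminate(seq, methylated_positions):
    """Return seq with C replaced by T at the given methylated CpG cytosines.

    Assumption: every listed position is a methylated CpG cytosine that
    undergoes deamination; real mutation rates are far below 100 %.
    """
    seq = seq.upper()
    bases = list(seq)
    for i in methylated_positions:
        if seq[i:i + 2] != "CG":
            raise ValueError(f"position {i} is not a CpG cytosine")
        bases[i] = "T"
    return "".join(bases)


if __name__ == "__main__":
    seq = "ACGTCGA"
    sites = cpg_positions(seq)
    print(sites)                       # CpG cytosines in the toy sequence
    print(deaminate(seq, sites[:1]))   # mutate only the first one
```

The sketch only shows where such transitions would fall; it does not model how often they occur or how they are repaired.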

The DNMT3B gene is located on chromosome 20q11.2. Its approximately 47 kb of the genomic DNA correspond to 23 exons and 22 introns. DNMT3B has approximately 345 single-nucleotide polymorphisms (SNPs) but only a few, of very low frequency, cause amino acid change [208]. SNPs are the most common form of human genetic variation, and they might increase the risk of an individual to develop a certain disease. Although the functional meaning of these SNPs is still a matter of debate, it has been hypothesized that some variants could influence DNMT3B activity and, consequently, DNA methylation, thereby modulating genetic susceptibility to cancer. Another possibility is that SNPs at the promoter region could alter transcription factors binding sites, which in turn will affect gene expression [368]. Indeed, only three SNPs in the promoter region of DNMT3B, −149 C > T (rs2424913), −283 T > C (rs6058870), and −579 G > T (rs1569686), have been studied as possible genetic susceptibility factors for several types of cancer. The first two SNPs appear to significantly increase the promoter activity of DNMT3B [192, 306], but in regard to the −579 G > T no such evidence has been reported [192]. Anyhow, these SNPs have been evaluated in different populations as risk factors that predispose to cancer.

For instance, carriers of the T allele at position −149 relative to the transcription start site were at significant risk of suffering from lung cancer in a non-Hispanic white population [306]. This −149 C > T polymorphism also significantly increased the genetic susceptibility to prostate cancer [309], head and neck squamous cell carcinoma [208, 355], and hereditary non-polyposis colorectal cancer [164]. Curiously, in the case of breast cancer, the very same T allele exhibited a significant protective effect [235]. However, other studies could not find a significant association between the presence of the −149 T genotype and colorectal cancer (Bao et al. [88]), head and neck squamous cell carcinoma [44], hepatocarcinoma [87, 364], gastric cancer [10, 148], gastric cardiac adenocarcinoma [358], and acute leukemia [200].

A similar situation occurs with the −579 G > T polymorphism. Some studies have established a clear link between this SNP and some cancers [16, 88, 129, 143, 148, 192, 306] while others point out that genetic polymorphisms at the DNMT3B gene are not risk-enhancing factors [44, 88, 208, 316].

Little is known about the −283 T > C SNP (numbered from the exon 1A transcription start site). So far, carriers of the −283 T genotype were at decreased risk of lung cancer compared with individuals carrying the C allele in a Korean population, because the SNP resulted in increased transcriptional activity of the DNMT3B gene [192].

Studies of multiple SNPs have been, in general, inconclusive [45, 371]. SNPs may alter susceptibility to a certain type of cancer, but cumulative evidence suggests that many other factors have to be considered together with genetic variants, such as race, ethnicity, geographic area, diet, and environmental factors. The frequency distribution of the different polymorphisms of DNMT3B depends on the population considered. Moreover, the effect of SNPs in DNMT genes on their outcome (DNA methylation) is subject to the influence of modifier genes. In addition, different histopathological types of cancer may have different etiologies regarding genetic susceptibility [192]. Furthermore, in the case of DNMT3B, the alternative splice variants that may alter catalytic activity are expressed in a tissue-specific manner [253, 294, 354], so it is important to consider the complex interplay between DNMT3B splice variants and different types of tumors.
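Association studies of this kind typically compare how often an allele is carried by cases and by controls. A minimal sketch of that comparison is an odds ratio from a 2 × 2 table with an approximate 95 % confidence interval on the log scale. The counts below are invented for illustration and do not come from any of the cited studies; the function name is hypothetical.

```python
import math


def odds_ratio(case_carriers, case_noncarriers,
               control_carriers, control_noncarriers, z=1.96):
    """Return (odds ratio, lower bound, upper bound) for a 2x2 table.

    The interval uses the standard error of log(OR),
    sqrt(1/a + 1/b + 1/c + 1/d); all four cells must be positive.
    """
    a, b = case_carriers, case_noncarriers
    c, d = control_carriers, control_noncarriers
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    estimate = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se)
    upper = math.exp(math.log(estimate) + z * se)
    return estimate, lower, upper


if __name__ == "__main__":
    # Hypothetical counts: 60/100 cases and 40/100 controls carry the allele.
    estimate, lower, upper = odds_ratio(60, 40, 40, 60)
    print(f"OR = {estimate:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

An interval that includes 1 is compatible with no association, which is one reason why small studies of the same SNP in different populations can reach opposite conclusions.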

Another possible source of altered patterns of DNA methylation in cancer, together with SNPs, is truncated proteins. DNMT3B transcripts that originate from a promoter within intron 4 correlate with the promoter DNA methylation status of p16 and RASSF1A in non-small-cell lung cancer [354], and some of them encode truncated DNMT3B proteins that could interfere with the normal function of this DNMT [356]. Ostler et al. identified numerous aberrant DNMT3B transcripts in a diverse population of cancer cell lines and primary acute leukemia cells; these transcripts encode truncated DNMT3B proteins that lack the C-terminal catalytic domain and are catalytically inactive [255]. They further generated a stable transgenic cell line overexpressing DNMT3B7, the most common transcript overexpressed in the analyzed sample. Microarray expression analysis revealed that the introduction of this aberrant splice variant changed the expression profile of genes usually located on chromosomes 1, 9, and X. These data led the researchers to conclude that truncated DNMT3B proteins play an important role in the abnormal distribution of DNA methylation in cancer cells. Further investigation is needed to determine the molecular mechanisms, but three hypotheses were proposed: (1) that the DNMT3B7-truncated protein may interfere with the normal DNA methylation machinery by binding any of the DNMT3B-interacting molecules; (2) that DNMT3B7 could itself bind to DNA, preventing the appropriate DNMTs from acting; and (3) that histone modifications could be altered as a consequence of altered DNA methylation [256].

Variations in the DNA methylation profile are a common phenomenon among healthy individuals. These can be gender-related [82], age-related [100], or tissue-specific [77], and can be caused by environmental [96] or genetic factors. In relation to this, DNMTs are highly polymorphic genes. Indeed, a combined study of polymorphisms in the coding region of DNMT genes and the levels of DNA methylation in a healthy population revealed the existence of 111 SNPs within DNMT genes. Of all the genes considered, only one variant, the rare R277Q variant of DNMT3L, was associated with DNA hypomethylation. This SNP, always present in the heterozygous state, leads to the production of a protein with reduced ability to stimulate DNA methylation because of defects that impede adequate DNMT3A-complex formation. The DNA hypomethylation did not correspond to a global loss of DNA methylation but affected specific CpG islands, repetitive in nature, with a telomeric location and away from genes (and promoters) or, to a lesser extent, within introns [83]. These data support previous hypotheses that DNMT3L is involved in the maintenance of the epigenetic identity of telomeres, which in turn is an important factor regulating telomere length [28]. Although the individual carrying the rare R277Q variant of DNMT3L was healthy at the time of the study, he was young, and further analysis would be necessary to investigate the predisposition to disease at an older age. No crucial SNPs affecting the rest of DNMT functions were found, probably because they lead to deleterious effects involved in abnormal development and serious disease, like the ICF syndrome in the case of DNMT3B.

Epigenetic changes induced by genetic alterations in MBD proteins

The information that DNA methylation contains is read and translated into a functional state in chromosomes by three distinct families of proteins: the MBD (methyl-CpG binding domain) family, the SRA family, and the Kaiso and Kaiso-like family [296]. The MBD family is highly conserved and characterized by the presence of an MBD that allows for the interaction with methylated DNA [138]. Of all the members, methyl-CpG binding protein 2 (MeCP2) was the first one identified [244], and four more members were later included in the family: MBD1, MBD2, MBD3, and MBD4 [138]. MBD proteins mainly play an active role in heterochromatin formation and transcriptional regulation, but recent research suggests that MeCP2 is an especially multifunctional protein involved in the pathogenesis of genetic disorders like Rett syndrome and autoimmune diseases like systemic lupus erythematosus.

Genetic defects in MeCP2 (I): Rett syndrome

The MeCP2 gene was first associated with gene silencing, as its gene product binds to methylated DNA and recruits Sin3A (a transcriptional repressor) and histone deacetylases (HDACs) [166, 245, 249], thus creating a repressive chromatin environment. The presence of (A/T)n sequences next to methylated cytosines, as well as methylation density and the length of the sequence, promotes high-affinity binding and confers some sequence specificity [180]. However, MeCP2 can also bind to unmethylated DNA (see [135] for a review), and recent findings suggest that it can also function as a transcriptional enhancer through its interaction with the transcription factor CREB at the promoters of active genes [41]; mediate RNA splicing by interacting with the RNA-binding protein YB1 [375]; induce compaction of chromatin at unmethylated DNA regions [108]; and stabilize large chromatin loops [146]. For instance, there is a dynamic interplay between MeCP2 and linker histone H1 that modulates chromatin structure [113]. Through DNA binding, they both induce chromatin condensation by reducing the linker DNA entry-exit angle. When expression levels of MeCP2 are lower than those of H1, MeCP2 would be expected to bind to specific sites, but when both genes are equally or similarly expressed, MeCP2 could exert both gene-specific and global functions related to chromatin binding [113]. Interestingly, LINE-1 retrotransposition is enhanced by the absence of MeCP2 in murine models, suggesting a new mechanism by which loss of MeCP2 function alters gene expression and could lead to disease [241].

Genome-wide analyses show that MeCP2 has many target genes involved in brain development, like BDNF (brain-derived neurotrophic factor) [49], DLX5 (distal-less homeobox 5) [146], clusterin and cytochrome c oxidase subunit 1 [119], and protocadherins PCDHB1 and PCDH7 [231], among others. These genes must be carefully regulated to ensure the correct development of the neural system, reinforcing the cross-talk between DNA methylation, chromatin remodeling, and gene expression.

In 1999, mutations within the MeCP2 gene were found to cause Rett syndrome (OMIM #312750) [4]. Since then, numerous genetic defects in MeCP2 have been unequivocally associated with this syndrome, which predominantly afflicts females. Rett syndrome is an X-linked neurodevelopmental disorder whose symptoms become evident after an uneventful pregnancy and delivery and normal development until 6–18 months of age, when patients start showing a progressive neurological dysfunction, probably due to an impairment of neuronal development and a reduction in brain size [119]. The regression process involves clinical manifestations like mild learning disabilities, autistic behavior, stereotyped hand movements, and encephalopathy. Severe mental retardation and motor impairments affecting the respiratory tract are also commonly seen in Rett syndrome patients [42].

MeCP2 mutations are detected in both sporadic and familial cases of Rett syndrome, the most pathogenic being those affecting the MBD and the TRD (transcriptional repression domain) [95]. Even so, MeCP2 mutations causing Rett syndrome occur throughout the whole sequence and are varied in nature: point mutations, missense mutations, frameshift mutations in the C-terminal region, etc. Among males, MeCP2 duplication is the most common MeCP2 dysfunction; it is also associated with neurological defects and has defined a new syndrome, the MeCP2 duplication syndrome [275]. Male individuals carrying mutations in MeCP2 often have neurodevelopmental delay, which does not necessarily fit within Rett syndrome. Nonetheless, MeCP2 mutations have different impacts on protein function, depending on where the mutation lies. Indeed, the large phenotypic variability observed in Rett syndrome patients might be due to the specific mutations together with skewing of X chromosome inactivation, which is normally random and balanced [112].

MeCP2, located on chromosome Xq28, is ubiquitously expressed [195], but particularly high levels of the protein are found in the brain, in a time-specific manner depending on neuronal maturation and synaptogenesis [55]. Specific loss of MeCP2 in the central nervous system causes Rett syndrome and other autistic-like behaviors, whereas its overexpression results in profound motor dysfunction [213], progressive neurological disorders and premature death [56, 307, 338], suggesting that MeCP2 must be tightly regulated in order to maintain correct brain functioning. Mice deficient in MeCP2 exhibit a Rett-like phenotype that mimics the human syndrome, and this phenotype can be reversed by restoring MeCP2 expression [130].

Genetic defects in MeCP2 (II): lupus erythematosus

Systemic lupus erythematosus (SLE, OMIM #152700) is an autoimmune and inflammatory disease characterized by the production of autoantibodies against multiple nuclear antigens that affects numerous organs and is associated with high rates of morbidity and mortality. It affects predominantly females [266].

The etiology of SLE remains elusive. Cumulative evidence suggests that it is quite complex, and involves not only genetic and environmental factors but also the interaction among them [15, 124, 352, 377]. Studies in monozygotic twins with SLE showed disease discordance and point to epigenetic factors as possible contributors [157]. Indeed, strong evidence supports an important role for aberrant T cell DNA methylation [298], which results in the elevated expression of methylation-sensitive genes like ITGAL (CD11a), TNFSF7 (CD70), PRF1 (perforin), and TNFSF5 (CD40LG) [300]. Similarly, the use of demethylating drugs like 5-azacytidine induces genome-wide hypomethylation and aberrant expression of many genes [282], and also increases autoreactivity in CD4+ cells [280]. Interestingly, when these cells were transferred to mice, they induced an SLE-like disease [376].

In SLE T cells, DNMT1 activity is decreased [65], contributing to the hypomethylation of the aforementioned genes. However, so far, no significant association has been found between SNPs in the DNMT1 gene and susceptibility to SLE [260]. In relation to this, DNMT1 needs the cooperation of MeCP2 to maintain adequate DNA methylation levels [177]. MeCP2 plays a vital role in the epigenetic transcriptional regulation of methylation-sensitive genes by binding to methylated DNA and recruiting HDACs. As a result, chromatin adopts a closed configuration and becomes inaccessible to the transcription machinery [166]. SNPs within the MeCP2 gene showed a strong association with SLE [299], reinforcing the importance of this gene in the disease. The role of MeCP2 in SLE supports previous data suggesting that an X-linked gene is involved in the pathogenesis. For instance, males with Klinefelter’s syndrome (47,XXY) have a risk of SLE similar to that of females [305].

In addition to this, epigenetics is fundamental in the autoreactivity of SLE CD4+ T cells. The Regulatory Factor X-box 1 (RFX1) binds specifically to regulatory regions in healthy CD4+ T cells, and acts as a transcriptional co-repressor by recruiting SUV39H1, HDAC1, and DNMT1 [383]. Thus, H3K9 tri-methylation, histone hypoacetylation and DNA hypermethylation cooperate to silence RFX1 target genes, among which are CD11a, which together with CD18 forms the adhesion molecule lymphocyte function-associated antigen 1 (LFA-1), and CD70 [251, 281]. The typical downregulation of RFX1 in SLE CD4+ T cells in turn reduces the aforementioned epigenetic marks, leading to the overexpression of CD11a and CD70 and triggering autoimmune responses [383].

More generally, changes in the DNA methylation profile may be associated with different SLE onsets, as reported by Javierre et al. in monozygotic twins discordant for SLE [157]. Notably, they provide a list of epigenetically deregulated genes that are relevant in autoimmune inflammatory diseases (cell activation, immune response, cell proliferation, cytokine production), some of which were previously described as altered in SLE. Further investigation of these methylation-sensitive genes will probably improve understanding of SLE and other autoimmune diseases.

Epigenetic changes induced by genetic alterations in one-carbon metabolism enzymes

Aberrant DNA methylation patterns in human disease can arise not only from anomalies in DNMTs and associated proteins, like MBD proteins, but also from anomalies in the enzymes of one-carbon metabolism, which regulate SAM supply within cells.

SAM is the main methyl group donor at the cellular level, and SAM-binding proteins have varied functions, notably the methylation of DNA, RNA, histones, and other proteins and small molecules [209], which may have an important impact on epigenetic regulation. Apart from this, SAM plays a significant role in the regulation of hepatocyte growth, death and differentiation. SAM-dependent reactions, which are part of one-carbon metabolism, are coupled to polyamine synthesis and to the folate cycle. SAM is also a precursor of important metabolites like glutathione, through transsulfuration reactions (Fig. 1). Chronic liver injury results in decreased levels of hepatic SAM, whereas an excess of SAM induces liver damage and increases predisposition to steatosis and hepatocellular carcinoma (HCC) [222]. Therefore, it is crucial to maintain SAM levels in the liver within a tight range. Methionine adenosyltransferases (MATs) and glycine N-methyltransferase (GNMT) are the main enzymes responsible for SAM synthesis and catabolism, respectively [182, 322], and alterations in either of them are related to liver pathologies.

Fig. 1

Schematic illustration of one-carbon metabolism. Highlighted by the grey rectangle is the S-adenosylmethionine cycle, which is coupled to polyamine synthesis, transmethylation and transsulfuration pathways, and the folate cycle. MAT deficiency leads to hypermethioninemia and SAM deficiency, causing neurological symptoms and liver injury. GNMT deficiency in humans can be asymptomatic, although some cases have been reported to result in hepatomegaly. MAT methionine adenosyltransferase, SAM S-adenosylmethionine, GNMT glycine N-methyltransferase, SAH S-adenosylhomocysteine, MT methyl transferases, THF tetrahydrofolate, 5′,10′-MTHF 5′,10′-methylene tetrahydrofolate, 5′-MTHF 5′-methyltetrahydrofolate, MTHFR 5′,10′-methylene tetrahydrofolate reductase, MTHFS 5′,10′-methylene tetrahydrofolate synthase, MS methionine synthase, R methyl group acceptors

Genetic variants in MATs lead to SAM deficiency

SAM synthesis occurs in all mammalian cells, but mainly in the liver [222]. SAM is synthesized from methionine and ATP in a reaction catalyzed by MATs. In mammals, MATs are encoded by two different genes, MAT1A and MAT2A, whose gene products are MATI and MATIII, and MATII, respectively [221]. MAT1A is primarily expressed in liver and encodes the α1 subunit found in both native MATI and MATIII. In contrast, MAT2A is more ubiquitously expressed and encodes the α2 subunit found in native MATII [182]. MAT2A is very abundant in fetal liver but, during development, is progressively replaced by MAT1A [120, 145]. MAT1A expression is reduced in liver diseases and is almost absent in HCC, while MAT2A is transcriptionally active in HCC [36]; thus, it might facilitate liver cancer cell growth [35]. SAM can influence MAT2A expression: at low levels of SAM, MAT2A expression is induced; conversely, an increase in SAM correlates with MAT2A downregulation. However, no such effect has been seen in the case of MAT1A [218, 369].

The most common cause of inherited hypermethioninemia is MAT deficiency (Fig. 1) [58, 107, 176]. Depending on the mutation, the inheritance can be autosomal recessive or dominant. Around 30 different mutations causing MAT deficiency have been described in the MAT1A gene, R264H being the most frequent and dominantly inherited [17]. MAT deficiency is benign in most cases, but neurological symptoms are occasionally seen [17].

Disruption of MAT1A in mice produced high levels of plasma methionine and SAM deficiency in liver, which increase the susceptibility of the liver to oxidant-induced cell death and predispose it to further injury, to spontaneous HCC, and to impaired liver regeneration because of an inability to upregulate cyclin D1 [47, 210, 217]. Recently, it was found that forced expression of MAT1A in cells derived from human liver cancer reduced tumorigenesis in vivo, supporting the role of MAT1A deficiency in cancer progression [199]. In addition, MAT1A is necessary for correct VLDL (very low density lipoprotein) assembly and plasma lipid homeostasis in mice, and the MAT1A knockout mouse model develops non-alcoholic fatty liver disease, which could be mediated, at least in part, by SAM deficiency [37].

Genetic variants in GNMT increase susceptibility to liver disease

SAM transfers its methyl group to a wide range of acceptors through the action of methyltransferases, and the common product of these reactions is S-adenosylhomocysteine (SAH). Most SAM-dependent methylation reactions are inhibited by increases in SAH and decreases in SAM [220]. GNMT is considered of particular importance because GNMT deficiency appears to be the only methyltransferase deficiency that causes SAM accumulation [17]. GNMT is the most abundant methyltransferase in liver, but it is also present in prostate and pancreas [252], and it optimizes methylation reactions by maintaining a constant SAM/SAH ratio, which is an indicator of the methylation potential of cells [174]. It is also a major folate-binding protein [57]. Interestingly, a specific form of folate, 5-methyltetrahydrofolate pentaglutamate, which binds to GNMT in vivo, was found to inhibit GNMT activity [351, 373], suggesting that GNMT is an important regulator of one-carbon metabolism [13]. Indeed, by modifying tissue folate status, GNMT could promote chromosome breakage or abnormal DNA methylation [75]. In addition, it participates in detoxification pathways and might protect against carcinogen exposure by reducing DNA adduct formation [48].

Mutations in the GNMT gene, mapped to chromosome 6p12 [50], have been reported in humans, resulting in GNMT deficiency. Two Italian siblings, who displayed mild hepatomegaly and elevated serum transaminases, were diagnosed with GNMT deficiency [238]. Later, Luka et al. demonstrated that both individuals were compound heterozygotes and carried the same two missense mutations within the GNMT gene that affect enzymatic activity [215]. Another patient, reported by Augoustides-Savvopoulou [9], was asymptomatic but, as with the other two patients, he had increased levels of methionine and SAM in plasma and normal SAH and sarcosine [9, 238].

Although GNMT deficiency in humans appears to be a benign disorder, the lack of GNMT in mouse models leads to different conclusions. Two GNMT knockout mouse models have been reported in the literature, and phenotypic differences between them may arise as a consequence of the knockout strategy [51]. One of them spontaneously developed chronic hepatitis and HCC, as well as glycogen storage disease [203, 207]. 2-D PAGE and real-time experiments revealed that the absence of GNMT led to the downregulation of proteins involved in the antioxidant/detoxification response, glycolytic energy metabolism and the one-carbon metabolism pathways [202]. The other model was initially found to be normal [214], but it was later discovered that the lack of GNMT resulted in liver steatosis, fibrosis, and HCC [219], as well as impaired liver regeneration [341]. In this case, GNMT seemed to modulate DNA and histone methylation in critical carcinogenic pathways, like the JAK/STAT pathway, resulting in epigenetic alterations in HCC [219].

In relation to this, loss of heterozygosity has been reported in approximately 40 % of human HCC. Hence, GNMT has been proposed to be a tumor-susceptibility gene because its expression is diminished in tumorous tissues and in HCC cell lines [333]; it is also downregulated in the liver of patients at risk of developing HCC, such as those with hepatitis C virus infection or alcohol-induced cirrhosis [11]. In addition, it is able to bind benzo[a]pyrene (BaP), reducing the formation of BaP-DNA adducts [48].

Thus, depending on the defect that affects SAM levels, treatment with SAM could be considered. For instance, SAM supplementation, accompanied by methionine restriction, could be helpful when the liver disease is associated with MAT impairment. However, when the defect affects GNMT, for instance, SAM administration is probably useless and might even become toxic as SAM accumulates [17]. In the latter case, treatment with nicotinamide, a substrate of nicotinamide N-methyltransferase (present in liver), can prevent the SAM accumulation caused by GNMT deficiency and normalize the expression of GNMT-regulated genes involved in diverse pathways that could lead to a pathogenic phenotype [342]. In conclusion, one-carbon metabolism genes need to be carefully regulated, and early detection of genetic defects in the enzymes that regulate SAM levels will contribute to the development of adequate treatments and help preserve human health.

Epigenetic changes induced by nucleotide expansion and position effect

CGG trinucleotide expansion affects DNA methylation: Fragile X syndrome

The fragile X syndrome (FXS, OMIM #300624) is the most common heritable form of impaired intellectual ability worldwide (Jacquemont 2007). FXS is an X-linked neurodevelopmental disorder of dominant inheritance characterized by cognitive and behavioral difficulties and facial dysmorphisms (elongated face, large and protruding ears), which can manifest in mild to severe forms. Affected females exhibit symptoms but usually to a lesser extent, due to the presence of a normal allele [103].

Cytogenetic analysis revealed that FXS patients have an expansion of a single trinucleotide sequence (CGG) at the promoter of the fragile X mental retardation gene (FMR1), mapped at Xq27.3. Healthy individuals carry between 5 and 44 CGG repeats, while affected patients with full mutations carry more than 200 repeats. A premutation state is attributed to carriers of between 55 and 200 repeats, and the grey zone denotes expansions of between 45 and 54 repeats. The larger the number of repeats, the more unstable the alleles are. The length of the expansion also determines the risk of inheriting the full mutation allele [131]. Full mutation and premutation alleles can coexist within an individual, giving rise to FXS mosaicism, which may make the diagnosis difficult [229].
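The repeat-size categories above can be summarized as a short Python sketch. It is only an illustration of the thresholds quoted in this section: the function names and the label for counts below 5 are assumptions introduced here, and the code is not a diagnostic tool.

```python
def classify_fmr1_allele(cgg_repeats: int) -> str:
    """Map a CGG repeat count to the allele category quoted in the text."""
    if cgg_repeats < 0:
        raise ValueError("repeat count cannot be negative")
    if cgg_repeats < 5:
        # The text gives no category below 5 repeats; this label is an assumption.
        return "below reported normal range"
    if cgg_repeats <= 44:
        return "normal"
    if cgg_repeats <= 54:
        return "grey zone"
    if cgg_repeats <= 200:
        return "premutation"
    return "full mutation"


def is_size_mosaic(allele_sizes: list[int]) -> bool:
    """True when premutation and full mutation alleles coexist in one sample."""
    categories = {classify_fmr1_allele(n) for n in allele_sizes}
    return {"premutation", "full mutation"} <= categories
```

For example, `classify_fmr1_allele(60)` falls in the premutation range, and `is_size_mosaic([120, 350])` flags the coexistence of premutation and full mutation alleles described above.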

The expansion of CGG repeats results in the methylation of the affected DNA, which in full mutation alleles leads to the epigenetic silencing of FMR1 and the lack of its product, the fragile X mental retardation protein (FMRP) [103]. FMRP is ubiquitously expressed and especially abundant in neurons, where it regulates synaptic plasticity and neuron maturation. FMRP has RNA-binding properties and regulates multiple processes related to RNA metabolism, especially the transport and localization of mRNA to dendrites and synapses [186]. The loss of this protein causes defects in neuronal morphology and physiology, giving rise to the neurodevelopmental delay typical of FXS patients.

Position effects affect DNA methylation: Facioscapulohumeral muscular dystrophy

Facioscapulohumeral muscular dystrophy (FSHD, OMIM #158900) is an autosomal dominant myopathy characterized by progressive, commonly asymmetric, weakness and wasting of the facial, shoulder and upper arm muscles [325]. FSHD patients can also exhibit extramuscular features like hearing loss, retinopathy, mental retardation, and epileptic seizures [326]. Mounting evidence suggests that FSHD is not caused by defects in a single gene; instead, it appears that deregulation of epigenetic mechanisms results in aberrant transcription of multiple disease-related genes.

Most FSHD cases are linked to molecular rearrangements in the subtelomeric region of the long arm of chromosome 4 (4q35), which maps to a 3.3-kb tandem-repeated macrosatellite, D4Z4 [339]. Healthy individuals carry between 11 and 100 repeats, whereas FSHD patients have a reduced number of D4Z4 copies, ranging from 1 to 10. Although the phenotypic variability of FSHD is remarkable, there seems to be a correlation among the number of residual D4Z4 repeats, DNA hypomethylation, and the age of onset and severity of the disease, with larger deletions being related to an earlier onset and a more rapid progression [340]. Interestingly, the complete deletion of the D4Z4 repeat and 200 kb of nearby DNA is not associated with FSHD [335], suggesting that a reduced number of D4Z4 copies implies a gain-of-function mutation, probably involving the region surrounding the repeat array [33].
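As a rough illustration of the copy-number ranges just described, the following Python sketch sorts a 4q35 D4Z4 repeat count into the categories quoted above. The function name and the labels are assumptions made for this example; the count on its own is not diagnostic, since chromosomal context and methylation status also matter.

```python
def classify_d4z4_array(repeat_units: int) -> str:
    """Sort a 4q35 D4Z4 repeat count into the ranges quoted in the text."""
    if repeat_units < 0:
        raise ValueError("repeat count cannot be negative")
    if repeat_units == 0:
        # Complete deletion of the array is reported as not associated with FSHD.
        return "complete deletion"
    if repeat_units <= 10:
        return "contracted (FSHD-associated range)"
    if repeat_units <= 100:
        return "healthy range"
    # The text gives no category above 100 repeats; this label is an assumption.
    return "above reported range"
```

Under these assumptions, an array of 6 units falls in the contracted range, whereas an array of 0 units is reported separately because complete deletion is not associated with the disease.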

D4Z4 is highly methylated in healthy subjects but, interestingly, all FSHD patients show DNA hypomethylation in that region. In some patients, the DNA hypomethylation correlates with contraction of the array, which suggests that the defects observed in FSHD can be dependent on (FSHD type 1) or independent of (FSHD type 2) D4Z4 deletions [62]. From now on, unless otherwise noted, we will refer to FSHD type 1 as FSHD.

On chromosome 10q26, there is a D4Z4 repeat array that shares 98 % identity with the one on chromosome 4q35 [34]. However, it appears that a short D4Z4 array is not in itself responsible for FSHD, because almost identical arrays on 10q26 are non-pathogenic even when they are equally short [194]. In addition, asymptomatic carriers also show reduced D4Z4 DNA methylation at the contracted locus [62]. These observations indicate that D4Z4 hypomethylation is not sufficient to cause FSHD, but that it could alter the expression of nearby genes. Indeed, FSHD patients are characterized by overexpression of FSHD region gene 1 (FRG1), FSHD region gene 2 (FRG2), ANT1, and DUX4, which are located proximal to D4Z4 [102]. Thus, it seems that the role of D4Z4 is to suppress the expression of nearby genes, and that D4Z4 hypomethylation, whether dependent on or independent of D4Z4 deletions, results in the up-regulation of candidate genes. Several other epigenetic mechanisms also appear to be involved in the progression of FSHD.

Apart from the loss of DNA methylation, the histone modifications of D4Z4 on 4q35 in FSHD patients resemble those of facultative heterochromatin or silent euchromatin rather than constitutive heterochromatin [381]. D4Z4 normally carries both H3K9me3 and H3K27me3 as repressive marks, which are lost in FSHD patients. The loss of H3K9me3 prevents the binding of the heterochromatin-binding protein HP1γ and the sister chromatid cohesion complex, cohesin, to D4Z4, which contributes to the loss of gene repression [379]. Furthermore, D4Z4 and the promoter of FRG1 are bound by the transcription factor YY1, which in turn recruits EZH2 [29]. The latter is an H3K27 methyltransferase that belongs to the Polycomb group proteins involved in transcriptional silencing in higher eukaryotes [308]. As a consequence of the reduced binding of YY1 and EZH2 due to the D4Z4 contraction, the repressive H3K27me3 mark is also depleted [29], which in turn contributes to the loss of repression of genes usually silenced by D4Z4 in healthy individuals.

In addition, Gabellini and colleagues found a 27-bp sequence in each D4Z4 unit that specifically binds a repressor complex composed of YY1, HMGB2, and nucleolin [102]. This complex usually binds to D4Z4, mediating transcriptional repression of 4q35 genes, and also interacts with DNMTs, HDACs, and HP1, which participate in heterochromatin formation and gene silencing. When the number of D4Z4 repeats is reduced, the complex cannot bind and an open chromatin state is induced, facilitating overexpression of target genes. Moreover, there is a nuclear matrix attachment site in the vicinity of D4Z4, which participates in the organization of DNA loop domains. This matrix attachment is much weaker in cells from FSHD patients than in those from healthy individuals, altering chromatin structure. The repeat and upstream genes reside in two loops in healthy myoblasts but in only one loop in FSHD myoblasts, enhancing cis-regulation of affected genes [270].

More recently, the group of Gabellini described a new epigenetic mechanism that contributes to the loss of gene silencing. In the healthy population, the Polycomb group targets D4Z4 to repress gene expression. In FSHD patients, the deletion of D4Z4 is associated with reduced Polycomb silencing. In particular, the non-coding RNA DBE-T, which is selectively produced in individuals affected by FSHD, recruits Ash1L to the FSHD locus, remodeling chromatin structure and ultimately leading to active transcription of the surrounding genes [32].

Taking the above into consideration, FSHD could arise as a toxic gain of function of some genes due to the altered chromatin structure. Among the candidate genes, researchers first focused on FRG1 because it is located near the D4Z4 repeat array on 4q35, it is overexpressed in FSHD patients [102], and only its induced expression in mice gives rise to a phenotype that resembles human FSHD; no such phenotype appeared when other candidate genes like FRG2 and ANT1 were induced [101]. The exact function of FRG1 has not yet been elucidated. Hanel et al. recently demonstrated that this protein is a developmentally regulated sarcomeric protein responsible for musculature and vasculature damage in FSHD [133], and this function could be attributable to its involvement in multiple aspects of RNA biogenesis [321]. RNA interference against FRG1 corrects myopathic features in mice expressing toxic levels of FRG1, making it a promising remedy for dominantly inherited myopathies like FSHD [353]. Thus, the severity of the disease could be mediated, at least in part, by FRG1 gene expression.

Other lines of research have focused on the retrotransposed gene encoding DUX4 (a putative double homeodomain gene), which is present in each subunit of D4Z4. The DUX4 gene was considered functionally inactive because it lacks introns and polyadenylation signals and there was no evidence for active in vivo transcription [71]. Interestingly, the open reading frame and its organization within the array are highly conserved. Expression studies detected both DUX4 mRNA and protein in FSHD-derived primary myoblasts but not in healthy myoblasts, suggesting that D4Z4 could influence disease progression through aberrant production of DUX4 [71]. The D4Z4 repeat does not contain a polyadenylation signal, and the DUX4 mRNA is generated exclusively by transcription of the most distal region of the array, the pLAM sequence, which itself contains a polyadenylation signal [71]. The pLAM polyadenylation signal can stabilize DUX4 transcripts and, notably, a group of FSHD patients were found to carry the same SNP in the pLAM sequence, allowing expression of the DUX4 gene [193]. The DUX4 gene gives rise to two alternative splice variants. Healthy individuals typically express low levels of a short DUX4 variant in muscle, which encodes a truncated protein, while a full-length DUX4 isoform was found in FSHD cells [313]. The full-length DUX4 mRNA is expressed only in certain developmental stages and is suppressed in most somatic tissues [313]. Although little is known about DUX4 function, transfection experiments suggest that it may have pro-apoptotic effects and might affect myogenesis [317].

Epigenetic defects in a specific genetic background trigger FSHD in humans. Although some of the genes mentioned above could create a permissive background for FSHD, a new case of FSHD carrying a D4Z4 deletion on chromosome 10 was recently reported by Lemmers and colleagues [193], indicating that although FRG1 and DUX4 can contribute to FSHD progression, the genes responsible for FSHD remain unknown.

Genetic control of histone modifications

Another important epigenetic factor is histone modification. Two copies of each of the core histones (H2A, H2B, H3, and H4), around which approximately 146 bp of DNA wrap, constitute the nucleosome, which is the structural subunit of chromatin. Histones not only exert a structural function but also affect chromatin function. Histone tails (the N-terminal regions of histone proteins) protrude from the nucleosome and are subject to posttranslational modifications at different residues, such as acetylation, methylation, phosphorylation, poly-ADP ribosylation, ubiquitination, and glycosylation [184]. The presence of these chemical groups in histone tails strongly influences histone–DNA interactions and also the interaction of non-histone proteins with chromatin, giving rise to what has been defined as the “histone code” [336], a term that reflects the combination of histone modifications at a certain region of chromatin. Because chromatin structure affects gene expression and is itself affected by histone modifications, gene expression can be altered by histone modifications.

Modification of histone tails is a dynamic and reversible process. Usually, each modification is added to a specific residue of the histone tail by one group of enzymes and removed by an antagonistic group of enzymes. Histone acetylation is the most studied of all histone modifications. Histone acetyltransferases (HATs) are the enzymes that catalyze the addition of acetyl groups, using acetyl-coenzyme A as donor. Acetylation occurs mainly at lysine residues of H3 and H4. As a consequence of the addition of the acetyl group, lysine loses its positive charge and is therefore less strongly attracted to the negatively charged DNA. As is the case for other epigenetic features, histone acetylation can be reversed, in this case by HDACs. Lysine residues (K4, K9, K27, K36, and K79 of H3, and K20 of H4), which can be monomethylated, dimethylated, or trimethylated, and arginine residues can become methylated. The methyl group is incorporated into histone tails by histone methyltransferases and can be removed by histone demethylases. In this case, the effect depends on both the modified residue and the extent of methylation. For instance, methylation of histone H3 on lysines 4 and 36 (H3K4 and H3K36) is related to an open chromatin structure and, thus, an active transcriptional state, whereas the opposite occurs in the case of H3K9 and H3K27 methylation [184].

The high number of possible modification sites and the variety of enzymes involved reveal a complex regulatory system which, when altered, can contribute significantly to the onset of numerous diseases (Table 1).

Genetic defects in CBP and EP300: Rubinstein-Taybi syndrome

Rubinstein-Taybi syndrome (RSTS, OMIM #180849) is an autosomal dominant disease with a heterogeneous genetic origin. Affected individuals are characterized by growth and psychomotor development delay, unusual facies including down-slanting palpebral fissures, broad nasal bridge, beaked nose and micrognathia, and skeletal abnormalities in extremities, like radially diverted phalanges and broad and duplicated distal phalanges of thumbs and halluces [290]. Other symptoms, like talon cusps, congenital heart defects, and skin problems can be displayed in RSTS [139]. An increased susceptibility to tumors is also a common feature of RSTS patients [230].

Cytogenetic analyses revealed that de novo mutations in two genes, the cAMP response element-binding protein (CREB)-binding protein gene (CREBBP or CBP), located at 16p13.3 [268], and the E1A-associated protein p300 gene (EP300), located at 22q13 [289], account for about half of RSTS cases. The mutational spectrum is wide: frameshift, nonsense, splice site and missense mutations often occur in the CBP gene [288], and large deletions, inversions, and translocations have also been reported, although less frequently [267]. Mutations in EP300 are considered rare because of their low incidence [384]. All mutations are heterozygous, and even mosaic mutations are found in the literature [110].

Both gene products enhance transcription in numerous processes [121] through recruitment of the RNA polymerase II complex to promoter regions and through their intrinsic histone acetyltransferase activity [43], and although they may overlap in some functions, there is clear evidence that the two proteins are distinct [169]. Notably, out of all the possible mutations, those affecting the HAT domain of CBP are sufficient to cause RSTS [170, 242], suggesting that chromatin remodeling is a key factor in the development of this disorder and that CBP and EP300, together with other possible candidate genes, may converge in molecular pathways. Homozygous deficiency of Cbp or Ep300 in mice results in embryonic lethality, as does the heterozygous double knockout, revealing that both genes are needed for normal development [370].

Because of the diverse genetic nature of RSTS, only approximately 50 % of patients have an identified genetic lesion. Hence, current studies aim to identify new genes related to RSTS or RSTS-like features [111].

Genetic defects in RSK2: Coffin-Lowry syndrome

Coffin-Lowry syndrome (CLS, OMIM #303600) is a rare form of mental retardation with X-linked inheritance. Affected males show growth and psychomotor developmental delay, peculiar facial characteristics like widely spaced and down-slanting eyes, prominent jaw and thick lips, microcephaly, short stature, and skeletal deformities, like hands and feet with tapering digits [375], as cardinal features of the syndrome. Female carriers are at risk of suffering from impaired learning abilities and physical anomalies.

CLS is caused by loss-of-function mutations in the ribosomal protein S6 kinase 2 (RSK2), a serine/threonine kinase which maps to Xp22.2 [332] and acts in the mitogen-activated protein (MAP) kinase pathway, together with the other three members of the RSK family: RSK1, RSK3, and RSK4. RSK proteins are activated by extracellular signal-regulated kinases 1 and 2 (ERK1 and ERK2) in response to a wide range of signals, like stimulation with insulin, growth factors, neurotransmitters, UV-irradiation and malignant transformation, and participate in important cellular processes like proliferation, differentiation, apoptosis, and response to stress by phosphorylating multiple substrates [154]. Mutations in RSK2 are extremely heterogeneous (missense, nonsense, frameshift, splice site, short deletions and inversions, and large deletions) [154]. Neurologic symptoms of CLS might be due to reduced or even absent activity of RSK2, which impedes correct hormone release in neurocrine cells [380], provokes dysregulation of neurite growth [93], perturbs the differentiation of neural precursors into neurons [73] and induces dysfunction of the AMPA glutamate receptor, which mediates fast excitatory synaptic transmission in the central nervous system [227]. By contrast, RSK2 is often upregulated in tumors [52, 54, 171], suggesting that this protein needs to be properly regulated to prevent disease.

The epigenetic nature of this syndrome arises from the fact that RSK2 affects chromatin structure, and thereby gene expression, through two independent actions: the direct phosphorylation of histone H3 and the interaction with CBP [61, 228]. In addition, RSK2 can phosphorylate other proteins, so the number of genes under RSK2 regulation could be higher than expected, and further research is required to identify them and their role in CLS.

Genetic defects in NSD1: Sotos syndrome

Sotos syndrome (OMIM #117550) is an overgrowth condition in children, first described as “cerebral gigantism” by Sotos et al. in 1964 [315]. It was not until 2002 that Kurotaki and colleagues found the main cause of Sotos syndrome, namely mutations and deletions in the NSD1 (nuclear receptor SET domain containing gene 1) gene [188], which belongs to a family of mammalian histone lysine methyltransferases relevant in multiple aspects of development.

Children affected with Sotos syndrome have a large body size and early accelerated growth, advanced bone age, a characteristic facial gestalt (i.e., macrocephaly, prominent jaw, down-slanting palpebral fissures, high hairline with sparse hair growth) and developmental delay, usually accompanied by learning difficulties. Many other features are also associated with the syndrome, such as neonatal jaundice, hypotonia, seizures, scoliosis, cardiac defects, and genitourinary anomalies. Growth in height and weight tends to normalize at puberty, probably due to epiphyseal fusion, and adults are close to normal average height and weight (see [19] for a review). Sotos syndrome, like other overgrowth conditions, is typically associated with a higher frequency of neoplasms than that of healthy individuals, recently estimated at about 3 %, and some patients are prone to develop neural crest tumors, sacrococcygeal teratomas and hematological malignancies [324].

The NSD1 gene, located at 5q35, encodes a histone lysine methyltransferase with multiple functional domains. These include the SET (Su(var)3-9, enhancer-of-zeste, trithorax) domain, the SAC (SET-associated Cys-rich) domain, five PHD domains and two PWWP domains, all of them related to chromatin regulation, and two nuclear receptor interaction domains, NID-L and NID+L, related to bimodal transcription co-regulators [149, 277]. The NSD1 gene is expressed in several tissues, including fetal and adult brain, kidney, skeletal muscle, spleen, and thymus [187]. It has recently been reported that NSD1 regulates gene expression by modulating the levels of the various forms of methylation at H3K36 [201, 212]. Target genes, like the bone morphogenetic protein 4 (BMP4), participate in various processes, consistent with the role of NSD1 in certain diseases such as cancer and Sotos syndrome. Indeed, the NSD1-dependent methylation activity promotes recruitment of RNA polymerase II onto promoters, like that of BMP4, to initiate transcription and elongation [212]. NSD1 also methylates nonnucleosomal targets like K218 and K221 of the p65 subunit of NF-κB [211, 212].

Most Sotos syndrome cases are sporadic and originate from haploinsufficiency caused by de novo mutations (point mutations, 5q35 microdeletions, and partial deletions) throughout the gene, although some cases of autosomal dominant inheritance have been reported in the literature [46, 134, 300, 311, 363, 385]. At any rate, the varied phenotype of Sotos syndrome suggests a heterogeneous origin, with NSD1 intragenic mutations being the most plausible cause. In any case, mutations, which may vary among populations, lead to truncated proteins that impede the correct functioning of NSD1. Homozygous Nsd1-null mice cannot complete gastrulation and exhibit apoptosis during early embryonic development [277].

Genetic defects in histone-modifier enzymes lead to altered histone modifications in cancer

Although histone covalent modifications are of varied nature, acetylation and methylation of specific amino acid residues at the tails of histones H3 and H4 remain the most closely related to cancer. Certain histone marks have been extensively described in cancer, like the loss of H4K16Ac and/or H4K20me3 in repeated sequences [97]. Also, aberrant histone modifications in promoter regions, such as deacetylation of histones H3 and H4, loss of H3K4me3 and gain of H3K9me and H3K27me3, lead to a deregulation in gene expression, in conjunction with DNA methylation [167]. Not only are histone marks deregulated in cancer, but the expression pattern of histone-modifying enzymes also differs between tumor samples and their normal counterparts, and these differences depend on the tumor type [256]. This evidence suggests that the histone modifiers in question could carry out a specific role in cancer.

For instance, altered HAT activity has been reported in hematological and solid cancers, due to mutations within the gene or deregulation in gene expression by viral oncoproteins. In leukemia, chromosomal translocations affecting histone-modifying enzymes, such as HATs, alter their activity. For instance, translocations of HAT or related genes, like TIF2, MOZ and MORF, HDACs, create fusion proteins, which possibly deregulate the expression of target genes by chromatin remodeling [84]. The family of mixed-lineage leukemia (MLL) histone methyltransferases is frequently mutated in human cancer. Rearrangements of MLL1 have been related to deregulation of Hox genes in leukemia [141], and inactivating mutations have been found in MLL2 and MLL3 in primary tumors of medulloblastoma and in non-Hodgkin lymphoma [237, 263]. Specifically, in medulloblastoma, a malignant brain tumor, mutations, deletions and amplifications in genes that modulate H3K9 methylation have been described [250]. These alterations affect not only other histone methyltransferases, like EHMT1 and SMYD4, but also histone demethylases containing the jmjC domain, such as JMJD2B and JMJD2C [250]. In recent decades, there has been a significant increase in the number of publications providing evidence for a specific role for histone posttranslational modifications in human tumorigenesis.

Epigenetic changes induced by nucleotide expansion

CAG trinucleotide expansion affects histone posttranslational modifications: Huntington disease

Huntington disease (HD, OMIM #143100) is a fatal neurological disease caused by an autosomal dominant mutation on chromosome 4p. The life course of HD lasts approximately 20 years and individuals carrying the mutation are usually asymptomatic until mid-life. Progressively, the brain degenerates and patients show motor impairments (chorea and dystonia), and psychiatric and behavioral disturbances together with a cognitive decline which eventually triggers dementia.

The gene affected in HD is IT15, which encodes the protein huntingtin [127]. Huntingtin is ubiquitously expressed, although the brain, and in particular the striatum, is the organ affected in this disease [350]. The mutation consists of an expansion of the CAG trinucleotide within exon 1, which is translated into a polyglutamine (polyQ) tract. After this discovery, HD was categorized within a group of neurological disorders called polyQ diseases, all of which are characterized by the presence of CAG repeat expansions [265]. In HD, those unaffected have 5–36 CAG repeats, while expansions of 40 or more CAG repeats imply complete penetrance. The HD repeat length is inversely correlated with age of onset of the disorder and gradually increases from youth to the age of 70 [76]. The presence of 36–39 CAG repeats indicates incomplete penetrance and carriers are at high risk of developing HD [291]. The abnormal polyQ chains hinder normal huntingtin folding with a consequent loss of function [297]. In addition, mutant huntingtin is resistant to proteolytic cleavage and, together with the proteins it interacts with, accumulates in protein aggregates, which are toxic to brain cells and trigger brain dysfunction [59, 70].
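The repeat-length ranges described above can be expressed as a simple rule. The following Python fragment is a minimal sketch for illustration only, not a diagnostic procedure: the function names `longest_cag_run` and `classify_cag_repeats` are hypothetical, counting the longest uninterrupted run of CAG triplets is a simplification of how repeat length is actually measured, and assigning a count of exactly 36 to the incomplete-penetrance class is a choice made here because the ranges quoted in the text meet at that value.

```python
import re


def longest_cag_run(sequence: str) -> int:
    """Return the number of triplets in the longest uninterrupted CAG run.

    Simplifying assumption: the repeat tract is a pure, uninterrupted
    series of CAG triplets in the given DNA string.
    """
    runs = re.findall(r"(?:CAG)+", sequence.upper())
    return max((len(run) // 3 for run in runs), default=0)


def classify_cag_repeats(repeat_count: int) -> str:
    """Map a CAG repeat count to the penetrance classes named in the text."""
    if repeat_count < 0:
        raise ValueError("repeat count cannot be negative")
    if repeat_count >= 40:
        return "complete penetrance"
    if repeat_count >= 36:
        # Assumption: 36 is placed in the incomplete-penetrance class.
        return "incomplete penetrance"
    return "unaffected range"


if __name__ == "__main__":
    # Toy sequence with 42 consecutive CAG triplets (illustrative only).
    toy_exon = "ATG" + "CAG" * 42 + "CCG"
    count = longest_cag_run(toy_exon)
    print(count, classify_cag_repeats(count))
```

Keeping the counting step separate from the classification step means the thresholds can be adjusted without touching the sequence handling.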

Even though the genetic origin of HD is well established, not all the phenotypic variation can be explained by genotypic variation. For instance, parent-of-origin effects and genetic anticipation are some of the non-Mendelian features that cannot be explained by conventional genetics [279, 283, 328]. In 1992, Sabl and Laird postulated a hypothesis based on position effect variegation and epigene conversion, which provided a new framework in which the molecular mechanisms underlying HD could be better explained [293]. The role of epigenetics in HD was supported by the fact that monozygotic twins, who share identical CAG repeat lengths, exhibit marked differences in HD symptoms [5, 99, 109, 122].

Histone acetylation in this disease represents a robust modulator of genomic stability. By using budding yeast and cultured human astrocytes, Debacker et al. have recently demonstrated that specific HDACs promote CAG repeat expansion, while the HAT activity of CBP and p300/CBP-associated factor, which directly bind to mutant huntingtin, inhibits the elongation [63], supporting earlier works in Drosophila [258, 318]. Different HDAC inhibitors administered to transgenic mouse models of HD ameliorate varied symptoms, e.g., improvements in motor performance or reduction of the neural atrophy typically observed [92, 104, 142, 330]. All these reports provide solid evidence for the role of histone modifications in the etiology of HD, and the use of HDAC inhibitors represents a promising field for this, so far, incurable disease. However, other mechanisms have also been described in HD, like increased methylation levels at H3K9 in humans and transgenic mice due to abnormally elevated expression of the histone methyltransferase ERG-associated protein with SET domain (ESET) [292]. Also, mutant huntingtin has been found to induce widespread transcriptional dysregulation [6], a phenomenon that could be explained by the increased number, and varied nature, of posttranslational modifications to which huntingtin can be subjected, unless mutated [81].

Genetic control of chromatin remodeling

The way DNA and histones are packaged into nucleosomes and higher-order structures is fundamental to fitting a whole genome into the nucleus of a cell but, at the same time, it represents an obstacle for transcription, replication, DNA repair, and recombination [196]. Distinct nuclear machineries are in charge of controlling the interactions within chromatin to allow specific transcription factors to bind to the target sequence in DNA and carry out adequate gene transcription. Chromatin remodeling proteins, together with DNA methylation and histone tail modifiers, thoroughly regulate these processes. Some of the chromatin remodeling proteins are characterized by the presence of a DNA-dependent ATPase domain, which belongs to the SNF2 (sucrose non-fermenting 2) family and allows them to use the energy of ATP hydrolysis to modify interactions between histones and DNA [312]. Depending on the sequence homology of their conserved ATPase domain, these chromatin remodeling proteins have been divided into different families like SWI2/SNF2 (switching/sucrose non-fermenting chromatin remodeling complex), CHD, and so on [94].

Mutations in genes encoding enzymes that participate in chromatin remodeling severely affect chromatin structure, leading to a deregulation of gene expression and possibly to inadequate protein expression. As a consequence, numerous epigenetic disorders arise, like α thalassemia X-linked mental retardation (ATR-X) syndrome, CHARGE syndrome, and Cockayne B syndrome, and cancer susceptibility increases (Table 1).

Genetic defects in ATRX: α thalassemia X-linked mental retardation syndrome

The α thalassemia X-linked mental retardation syndrome (ATR-X, OMIM #301040) is a rare X-linked recessive developmental disorder that affects mainly males, who are typically characterized by a global developmental delay, facial dysmorphism, mild α thalassemia, and skeletal and urogenital anomalies [114].

Commonly, an ATR-X patient displays a global developmental delay, which in 95 % of the cases accounts for mental retardation. Sometimes, there is also an autistic-like behavior. At any rate, expressive language is limited and patients exhibit hypotonia. Characteristic facial anomalies are often seen, including upswept frontal hair, telecanthus and epicanthic folds, flat nasal bridge, small triangular upturned nose, tented upper lip, full and everted lower lip, and protruding tongue. Typically, ATR-X individuals also suffer from skeletal anomalies, probably secondary to the hypotonia, gastroesophageal reflux, and genital anomalies, which range from undescended testes or deficient prepuce to ambiguous genitalia, depending on the ATRX mutations. Mild α thalassemia, characterized by the formation of hemoglobin H inclusion bodies in red blood cells and caused by defects in α globin genes, is not always present as previously thought (reviewed in [114]). Intriguingly, female carriers usually have normal physical and intellectual capabilities. This is due to skewed X-chromosome inactivation, which is quite common in female carriers of X-linked mutations and preferentially affects the chromosome bearing the abnormal allele [117]. This skewing does not occur during early embryogenesis but is established during development by tissue-specific selection [239].

Most of the reported cases of ATR-X syndrome contain mutations in the ATRX gene, which is located on chromosome Xq13 and encodes a chromatin remodeling protein [118]. The ATRX gene was first described in 1995 [116]. The ATRX protein belongs to the SNF2 family, whose members are involved in the regulation of transcription, cell cycle, DNA repair, and chromosome segregation during mitosis and meiosis, and seem to facilitate these functions by chromatin remodeling [118]. Two domains have been clearly identified in the ATRX protein. The first is the ADD domain (ATRX-DNMT3-DNMT3L, because of the sequence homology with the family of de novo DNMTs) at the N-terminus of the protein, which is composed of an N-terminal GATA-like Zn finger, a PHD-like finger and a long C-terminal α helix, giving rise to a globular domain that may have a DNA binding function [7], as previously suggested by Cardoso et al. [39]. The other main domain is a SWI/SNF-like ATPase/helicase domain, which shows weak chromatin remodeling activity but strong DNA translocase activity in vitro [367].

Mutations in the ATRX gene cluster mainly in the ADD domain (50 %) and the helicase domain (30 %) [118]. The majority are missense mutations. Those occurring in the ADD domain can affect buried residues, altering the structure of the protein, or surface residues, damaging potential transcription factor binding sites and protein interaction sites [7]. Even large intragenic duplications with a consequent loss of function have been detected in ATR-X patients [329].

The ATRX protein is present mainly in pericentromeric heterochromatin [224]. Usually, ATRX is recruited by HP1α [23] through a histone H3-binding module, whose binding is promoted by H3K9me3 and inhibited by H3K4me3 [153] and H3K4me2 [69]. Mutations affecting either ATRX or H3K9me3 result in defective binding and pericentromeric heterochromatin localization [153, 183]. A yeast two-hybrid assay demonstrated the interaction between the Polycomb group protein EZH2, a histone-lysine N-methyltransferase, and ATRX [40]. In promyelocytic leukemia nuclear bodies, ATRX complexes with Daxx, a death-domain-associated protein, and participates in chromatin remodeling for genes that are controlled by Daxx-interacting sequence-specific transcription factors [367]. MeCP2-deficient neurons exhibit ATRX diffuse nuclear distribution suggesting that MeCP2, another chromatin remodeling protein, is responsible for ATRX recruitment in mature neurons [243]. Atrx null mutations in mice result in embryonic lethality and induce DNA methylation changes [105]. Nevertheless, the variability of phenotypic ATR-X syndrome features is influenced by the location of the mutations affecting the ATRX gene but in any case, none of the reported human mutations are true nulls.

Mutations in the human ATRX gene were at first related to decreases in α globin gene expression, which was responsible for the α thalassemia phenotype. However, some ATR-X syndrome patients were completely normal at hematologic level. Interestingly, DNA methylation changes have been observed, affecting mainly rDNA, Y-specific satellites (DYZ2) and subtelomeric repeats [115]. Current knowledge of the ATR-X syndrome establishes a robust link between chromatin remodeling and DNA methylation with gene expression.

Allelic mutations in the ATRX gene are commonly associated with mental retardation. Indeed, ATR-X syndrome is not the only one caused by mutations in the ATRX gene. Other syndromes related to ATRX dysfunction, which combine mental retardation with dysmorphic features, include the Juberg-Marsidi [348], Carpenter-Waziri [2], Holmes-Gang [319], Smith-Fineman-Myers [347] and Chudley-Lowry [3] syndromes.

So far, several ATRX cofactors have been discovered: HP1 [23], EZH2 [40], Daxx [367], MeCP2 [243], and Cohesin [173]. Tang and colleagues have recently identified 12 transcription factor binding sites in the 5′ regulatory region and 5′UTR of the mammalian ATRX gene that could be bona fide candidates for regulation of ATRX gene expression [323]. ATRX is fundamental for proper development not only in the brain and multiple parts of the central nervous system; it also regulates somatic and germ cell function and gametogenesis during both testicular and ovarian development [151]. Altogether, these data suggest that ATRX is more involved in regulating specific target genes than in behaving as a global regulator. Further experiments are needed to corroborate this hypothesis, which will probably cast light on ATRX-mediated molecular pathogenesis and its relation to mental retardation. At any rate, ATRX establishes a clear link between chromatin remodeling and the establishment and maintenance of DNA methylation patterns, although the mechanisms are still elusive.

Genetic defects in CHD7: CHARGE syndrome

CHARGE syndrome (OMIM #214800) is an autosomal dominant complex genetic disorder characterized by the conjunction of multiple birth defects. In 1981, Pagon et al. coined the acronym CHARGE to summarize the main symptoms of the syndrome, which are coloboma of the eyes, heart defects, atresia of the choanae, severe retardation of growth and development, genital abnormalities and/or hypogonadism, and ear abnormalities and/or deafness [257].

The gene encoding chromodomain helicase DNA-binding protein 7 (CHD7), located at 8q12.2, is mutated in two of every three patients [259]; this association was first identified by Vissers et al. [349]. CHD7 belongs to the CHD superfamily, which groups ATP-dependent chromatin remodeling proteins. The CHD7 protein contains two N-terminal chromodomains, which mediate binding to methylated histones; a central SWI2/SNF2-like ATPase/helicase domain with chromatin remodeling activity; a SANT domain, which is involved in binding DNA and/or modified histones; and two BRK domains of unknown function [382]. Chromatin remodeling enzymes usually exert their function by interacting with other proteins. This also applies to CHD7. It associates with PBAF (Polybromo and BRG-associated factor-containing complex), a chromatin remodeling subcomplex of the SWI/SNF family, which is essential for the activation of the transcriptional program associated with the formation of the multipotent migratory neural crest, a transient cell population with multilineage differentiation potential [12]. CHD7 is ubiquitously expressed at early stages of development. Expression then increases in CHARGE syndrome-related organs in both humans and mice, suggesting that the expression pattern of CHD7 is both spatially and temporally regulated [382].

CHD7 is recruited to specific sites of the chromatin depending on the cellular context. The pattern of occupancy is highly correlated with the location of H3K4 methylation. This, together with the facts that CHD7 binding sites are predominantly distal to the transcription start site, DNase hypersensitive, frequently conserved and near highly expressed genes, suggests that CHD7 is involved in chromatin remodeling and gene expression by recognizing histone modifications and altering chromatin structure [304]. Recently, it was found that CHD7 binds to hypomethylated rDNA and could be acting as a positive regulator of rRNA synthesis [381].

Some mutations are missense, but the majority are nonsense and frameshift mutations that arise de novo and produce a truncated protein, which might result in haploinsufficiency of CHD7. Mice with Chd7 homozygous null mutations die at an early embryonic stage, but heterozygous mice are born with malformations that resemble human CHARGE syndrome [30]. Chromosomal abnormalities have also been reported in patients with CHARGE syndrome, like an interstitial deletion of 8q12.2-q13 [8] and a balanced translocation affecting 8q12 [163]. Mounting evidence indicates that CHD7 mutations causing sporadic CHARGE syndrome are predominantly of paternal origin [264]. Mutations in CHD7 are also related to other rare genetic syndromes, such as the Kallmann [175] and DiGeorge [295] syndromes.

The clinical manifestations of CHARGE syndrome are extremely variable, and mutations in CHD7 could be of help for diagnosis but may not predict the phenotype [22]. In addition, approximately one-third of the patients do not present CHD7 mutations, suggesting that other chromosomal regions or candidate genes may be involved in the etiology of this disorder. The study of CHD7 target genes and potential candidate genes will provide vital information with regard to this syndrome.

Genetic defects in CSB: Cockayne syndrome B

Cockayne syndrome (CS) is a rare autosomal recessive genetic disorder characterized by growth failure within the first years of life, progressive neurological dysfunction, and varied symptoms related to premature aging, like hearing loss, retinal degeneration, and skin problems [246].

CS is classified into two genetic complementation groups, A (CSA, OMIM #216400) and B (CSB, OMIM #133540). The majority of CS patients carry mutations in the CSB gene and a smaller proportion carry mutations in the CSA gene. The CS phenotype can also arise with mutations in xeroderma pigmentosum genes (XPB, XPD, and XPG) [276].

CSA and CSB proteins participate in the nucleotide excision repair mechanism. In particular, the CSB protein is involved in the transcription-coupled repair process. The CSB protein, also known as “group 6 excision repair cross-complementing protein” (ERCC6), belongs to the SWI2/SNF2-like DNA-dependent ATPases, is known to wrap DNA [21] and acts as a chromatin remodeler [53]. Mutations causing CSB are predominantly located beyond exon 5, affecting ATPase and/or C-terminal regions and producing truncated proteins, which could impede the correct functioning of wild-type CSB protein by hindering CSB interactions [144]. At the cellular level, hypersensitivity to UV light [301] and inability to synthesize RNA at normal rates after UV exposure are the main features of cells from CS individuals [223]. After UV exposure, the ATPase and the C-terminal domains are essential for a stable CSB-chromatin association at DNA lesion-stalled transcription sites, but in a normal physiological state, the N-terminal domain of CSB inhibits this association by conformational changes [190]. Curiously, Newman and coworkers found that the patterns of gene deregulation in CSB-null cells were very similar to those observed after treatment with epigenetic drugs like HDAC inhibitors and demethylating agents. Furthermore, CSB does not form complexes with histone modifiers like other SWI2/SNF2 ATPase-like chromatin remodeling proteins; instead, it acts directly on chromatin and its effect is seen on particular genes [248].

The piggyBac transposable element 3 (PGBD3) is integrated within the CSB intron 5 and alternative splicing generates a fusion protein, CSB-PGBD3, which is normally expressed in CSB patients and may contribute to the pathogenesis of the disease by unknown mechanisms [247].

Genetic defects in SNF5 are associated with cancer

SWI/SNF is probably the most studied family of chromatin remodeling complexes and is highly conserved from yeast to humans. SNF5 (INI1), present in all variants of the SWI/SNF complex [357], is essential for normal cell viability, and its loss due to mutation contributes significantly to tumorigenesis; hence, it has been widely considered a tumor suppressor gene. Unfortunately, the molecular basis through which loss of SNF5 leads to cancer is still to be elucidated.

Inactivating mutations in SNF5 were first described in malignant rhabdoid tumors, one of the most aggressive and lethal of childhood tumors [25, 344]. Also, familial cancers arise from inherited mutant alleles of SNF5, a condition termed “rhabdoid predisposition syndrome” [327]. Apart from specific biallelic mutations in the SNF5 gene, tumor cells have normal karyotypes, and this genomic stability in SNF5 deficient cancers suggests that transcription changes of epigenetic origin are key drivers of malignant rhabdoid carcinogenesis [225].

In Drosophila, loss of this gene within the context of the Brahma (SWI/SNF) complex leads to alterations in RNA polymerase elongation, pre-mRNA splicing regulation and chromatin accessibility of ecdysone hormone-regulated genes, demonstrating that multiple pathways could be affected by loss of SNF5 function [386]. In mice, homozygous deletion of Snf5 causes embryonic lethality, and although heterozygous animals appear normal, they develop tumors early in postnatal life [128, 178, 284]. In mouse embryonic fibroblasts, inactivation of the Snf5 gene leads to cell cycle arrest and apoptosis [152]. Conditional silencing of Snf5 and p53 accelerates tumor formation, suggesting that Snf5 tumor suppressor activity could be mediated by p53 [64, 152, 179]. In light of these findings, it has been assumed that the antitumorigenic activity of SNF5 could be mediated by its role in cell cycle regulation. In support of this, further works have shown that other cell cycle-related genes, like p21 [189] and Aurora Kinase A [191], could also mediate SNF5 function. Taking into account that malignant rhabdoid tumor cell lines are highly invasive, Caramel and colleagues reported that inhibition of migration is crucial for the protective role of SNF5 against cancer [38]. Another molecular pathway that has been linked to oncogenic processes under loss of SNF5 is the Hedgehog pathway, which is activated during tumorigenesis [156].

Genetic defects within the SNF5 gene sequence have been found in other types of malignancies [181, 185, 205, 233]. The different target genes whose expression is regulated by SNF5, together with the cellular context of each tumor type, are major obstacles to determining the exact mechanisms by which SNF5 mutations lead to cancer, but cumulative evidence suggests that the cell cycle deregulation that follows SNF5 loss of function may be a vital event for tumor development.

Concluding remarks

It is well known that variation in the DNA sequence can predispose to disease, and this fact has allowed scientists to decipher the molecular origin of many illnesses of Mendelian inheritance. However, the mechanisms underlying complex and non-Mendelian disorders are more difficult to elucidate. Adopting an epigenetic perspective could help in interpreting the molecular processes that might account for this type of disorder (Fig. 2).

Fig. 2

Representative illustration of several genes which have epigenetic function and whose deregulation leads to disease

Nevertheless, several reasons make it difficult to establish a robust link between epigenetics and complex diseases. Firstly, there is physiological epigenetic variation in response to internal and external stimuli, so the epigenetic differences observed may not necessarily account for the disease. In addition, the epigenetic status of cells is tissue-specific: the epigenetic profile varies significantly among different cell types. The epigenome of easily accessible tissues like blood or buccal cells may therefore not reflect the epigenome of the functional organ of interest, and we should focus on the tissues and organs that contribute to the phenotype being studied. Additional problems are the difficulty of acquiring samples of those tissues and the fact that the tissue of interest may contain a variety of cell populations, so relevant epigenetic changes can be missed because of cellular heterogeneity. It should also be noted that the disease itself and associated factors like the treatment can further induce epigenetic variation, because epigenetic changes, in contrast to genetic ones, can be reversed. Hence, it would be interesting to study not only the organ of interest but also the unaffected ones, in healthy individuals as well as in those carrying the disease, in order to exclude normal phenotypic variation [269]. Moreover, since most epigenetic modifications are reversible, the field of epigenetic drugs, such as demethylating agents or HDAC inhibitors, is rapidly growing and becoming a therapeutic approach to treat or prevent epigenetic-related diseases. Although DNA methylation and histone acetylation have been thoroughly studied, the field of epigenetics is expanding and very little is known about aberrations in the rest of the epigenetic layers, like other histone modifications, chromatin remodeling proteins, and microRNAs.

The emerging field of “epigenetic epidemiology”, as recently proposed by Feinberg [89] to describe the measurement and cataloguing of epigenetic variation within and across populations and to characterize the correlation of epigenetic factors with disease, is rapidly growing and becoming more relevant in the study of epigenetic diseases. In the last few years, there has been massive technological development in epigenetic experimental procedures that will facilitate the work on the challenges ahead, like genome-wide association analyses, DNA methylation microarray platforms and next-generation sequencing [216, 262]. A deeper knowledge in this field will allow scientists to discover epigenetic modifications that could serve as biomarkers indicative of disease and will improve diagnosis, prognosis and therapies for patients in the future.